Oh well, I know I shouldn't poke directly at people, but sometimes they deserve it (at least in my very personal opinion). Heck, I even gave myself a 12-hour window so this wouldn't be a hot-headed opinion.
Those who followed MySQL development at Facebook probably know how much we focus on actual performance on top of mixed-composition I/O devices (flashcache, etc.), rather than retreating to the comfortable zone of in-memory (or pure-flash) data.
I feel somewhat sad that I have to put this truism out here: disks are way more cost-efficient, and if used properly they can support far more long-lived products, not just real-time data. Think of Wikipedia without its history; think of comments that disappear on old posts, along with the old posts themselves; think of all the 404s you hit on articles you remember from the past and want to read again.
Building a web that lasts is a completely different task from what people in academia imagine building the web to be.
I already had this issue with another RDBMS pioneer (there's something in common among top database luminaries): he also suggested that disks are a thing of the past and that now everything has to be in memory, because memory is cheap. And that data can be whatever unordered clutter, because CPUs can sort it, because CPUs are cheap.
They probably missed Al Gore's message. Throwing more and more hardware at a problem without fine-tuning for actual operational efficiency requirements is wasteful and harms our planet. Yes, we do lots of in-memory efficiency work to reduce our I/O, but at the same time we balance the workload so that the I/O subsystem delivers the long tail as efficiently as possible.
What happens in the real world if one gets a 2x efficiency gain? Twice as much data can be stored, and twice as many data-intensive products can be launched.
What happens in the academia of in-memory databases if one gets a 2x efficiency gain? A paper.
What happens when the real world doesn't read your papers anymore? You troll everyone via GigaOM.
Sure, there's some operational overhead in handling the sharding and availability of MySQL deployments, but at large scale that becomes a somewhat constant cost, whereas operational efficiency gains scale linearly with the amount of data.
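To make that concrete, here's a back-of-the-envelope sketch of the argument. All numbers are illustrative assumptions I made up for the example, not actual figures from any deployment:

```python
# Toy cost model: a fixed operational overhead (sharding/failover tooling)
# plus a hardware cost that grows linearly with the data stored.
# Every number below is an assumption for illustration only.

def total_cost(data_tb, ops_fixed, cost_per_tb):
    """Fixed operational overhead plus hardware cost proportional to data size."""
    return ops_fixed + data_tb * cost_per_tb

ops_fixed = 500_000   # assumed: yearly cost of sharding/availability tooling
cost_per_tb = 100     # assumed: yearly cost per TB stored on disk

for data_tb in (1_000, 10_000, 100_000):
    baseline = total_cost(data_tb, ops_fixed, cost_per_tb)
    # A 2x storage-efficiency gain halves the linear term, not the fixed one.
    improved = total_cost(data_tb, ops_fixed, cost_per_tb / 2)
    print(f"{data_tb:>7} TB: baseline ${baseline:,.0f}, "
          f"with 2x efficiency ${improved:,.0f}")
```

The bigger the deployment, the more the fixed overhead is amortized away and the more the linear term dominates, so efficiency work keeps paying off at scale.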
Update: Quite a few people pointed out that I was dissing a person who has made an incredible number of contributions, or that I'm anti-academia. I'm neither, and I greatly value the work people do wherever they are, though I do apply critical thinking to whatever they say.
In my text above (I don't want to edit and hide what I said) I don't mean that "a paper" is useless. My colleagues and I do read papers and try to understand the direction of computer science and how it applies to our work (there are indeed various problems yet to be solved). I'd love to come up with something worth a paper (and quite a few of my colleagues have).
Still, if someone does not find that direction useful, that is no reason to portray them the way the original GigaOM article did.