Brian published some of his internal letters about the death of replication (at the hands of memcached). Back when he wrote this, I responded a bit too. Now that quite a few people really want to bury replication, let me point out some of the reasoning why it will live.
First of all, both MySQL and memcached are slow (or fast, however you look at it) – in a proper gigabit environment both respond in a millisecond or so (well, MySQL is closer to 1.5ms). The major task becomes getting as much work done in that round trip as possible.
Replication lag? The major problem with it was fixed by the Google patches back in 4.0, finally hitting stock MySQL in 5.1. Now the replication thread doesn’t get queued behind other connections due to concurrency limits, and always enters execution. Use binary log position serialization for reading users, and they will never notice replication lag.
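The binlog-position trick above can be sketched as routing logic: after a user writes, remember the master’s binlog position for that session, and only read from a slave that has already applied it. This is a toy sketch with integer positions and plain objects – real MySQL uses binlog file/offset coordinates, and `Replica`, `pick_server` and the session position store here are hypothetical stand-ins, not any actual client API.

```python
# Toy model of binlog-position serialization for read-your-writes.
# Positions are plain integers here; real binlog coordinates are
# file + offset pairs. All names are illustrative, not a real API.

class Replica:
    def __init__(self, name, applied_pos):
        self.name = name
        self.applied_pos = applied_pos  # last binlog position this node applied

def pick_server(session_pos, replicas, master):
    """Return any replica that has caught up to the user's last write."""
    for r in replicas:
        if r.applied_pos >= session_pos:
            return r
    return master  # nobody caught up yet: fall back to the master

master = Replica("master", 1000)
replicas = [Replica("slave1", 990), Replica("slave2", 1000)]

# This user last wrote at position 995: slave1 still lags, slave2 is safe.
print(pick_server(995, replicas, master).name)   # slave2
print(pick_server(1001, replicas, master).name)  # master
```

The user’s reads stay consistent with their own writes, while everyone else’s reads can still spread across lagging slaves.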
More servers? Also more performance. Putting everything into memcached? Lots of stuff still has to be written to the database. Once it is in the database anyway, one can query it from the database too. In low-hit-rate situations using memcached will be 3x slower than just fetching the data from the database (get/get/set vs get). I’ve seen lots of code that enthusiastically used memcached, but the authors never actually profiled what the hit ratios were.
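The round-trip arithmetic is easy to count. A cache miss costs cache-get + db-get + cache-set (three trips) where going straight to the database costs one – so at low hit rates the cache is pure overhead. This is a dict-backed sketch, not real memcached or MySQL clients:

```python
# Read-through cache sketch with round-trip counters.
# A cold read costs 3 trips (get miss, db read, set) vs 1 for direct db.
# cache/db are plain dicts standing in for real clients.

cache, db = {}, {"user:1": "alice"}
trips = {"cache": 0, "db": 0}

def cached_get(key):
    trips["cache"] += 1            # memcached get
    if key in cache:
        return cache[key]          # hit: one trip, done
    trips["db"] += 1               # miss: read the database
    value = db[key]
    trips["cache"] += 1            # memcached set to populate the cache
    cache[key] = value
    return value

cached_get("user:1")   # cold read
print(trips)           # {'cache': 2, 'db': 1} - three trips for one value
```

Only once hits dominate does the per-read cost drop below the plain database read – which is exactly what profiling the hit ratio tells you.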
The major problem with memcached is that it is a hash table. All it supports for data retrieval is asking for a key and getting a value. That works great in situations where one just has a key and wants a value. Now if 50 keys are needed, memcached will need 50 lookups – quite often routed to 50 different servers. That’s a single database B-Tree read. How does one fetch all keys from 1 to 10000 with memcached? That’s right – ask for every one of them. Of course, it is easy to resolve some of the inefficiency by having separate memcached clusters for different tasks, or appending information to multiple tracking objects, but that’s where the ease of distribution starts fading, and development and administration needs surface.
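The fan-out is a direct consequence of how clients route keys: each key hashes independently to a node, so a 50-key fetch touches many servers, while the equivalent B-Tree range scan is one database call. A sketch of that routing, assuming a hypothetical 8-node cluster and simple modulo hashing (real clients typically use consistent hashing, but the fan-out effect is the same):

```python
# Sketch of memcached client-side key routing: every key hashes to a
# node independently, so a multi-key fetch fans out across the cluster.
# Node names and modulo hashing are illustrative assumptions.
import hashlib

NODES = ["mc%d" % i for i in range(8)]  # hypothetical 8-node cluster

def node_for(key):
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return NODES[h % len(NODES)]

keys = ["obj:%d" % i for i in range(1, 51)]   # "fetch keys 1..50"
servers_hit = {node_for(k) for k in keys}
print(len(servers_hit))  # nearly every node gets involved
```

The database answers the same range request with one request to one server, walking one B-Tree.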
memcached APIs are now starting to support replication too – but flapping hosts can then get the environment out of sync quite fast (a host disappears, the failover host starts getting traffic, the host comes back with stale data…). The solution – object generation management, reading from multiple hosts, etc. – here again, solving a simple problem already needs quite some complexity.
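One way that generation management is commonly done: embed a generation counter in every cache key, and bump the counter to invalidate. Stale copies – including ones a returning flappy host still holds – simply become unreachable instead of needing to be found and deleted. A minimal sketch with dict stand-ins (the helper names are mine, not any library’s):

```python
# Generation-number invalidation sketch. cache is a dict standing in
# for memcached; gen_key/cache_set/cache_get/invalidate are
# illustrative helpers, not a real client API.

cache = {}
generation = {"user:1": 1}   # current generation per logical key

def gen_key(key):
    return "%s:g%d" % (key, generation[key])

def cache_set(key, value):
    cache[gen_key(key)] = value

def cache_get(key):
    return cache.get(gen_key(key))

def invalidate(key):
    generation[key] += 1     # old copies on any host are now dead weight

cache_set("user:1", "alice")
invalidate("user:1")
print(cache_get("user:1"))   # None - the stale value can't be read back
```

The cost, of course, is that the generation counters themselves need to live somewhere reliable – which is exactly the kind of creeping complexity the paragraph above is complaining about.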
Add the ACID properties of databases, which quite often make development much easier – something that ends up quite difficult to achieve in a completely distributed ‘get/set’ environment.
And by the way – memcached can be outgunned. Hot objects can be cached directly in local application-server stores, like the APC object cache, the file system, etc. New application servers nowadays have lots of memory… :) Need global state? Just broadcast it to all of them.
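That local-store idea is just a two-tier lookup: check the in-process cache (APC-style) first, fall back to the shared memcached cluster, and promote whatever you fetched. A dict-backed sketch, with both tiers as plain dicts standing in for the real stores:

```python
# Two-tier cache sketch: a per-process "local" store absorbs hot keys
# before the shared memcached cluster is asked. Both tiers are dicts
# here, standing in for APC and a real memcached client.

local, memcached = {}, {"front_page": "<html>...</html>"}

def get(key):
    if key in local:                 # hot object: no network at all
        return local[key], "local"
    value = memcached.get(key)       # fall back to the shared cluster
    if value is not None:
        local[key] = value           # promote for the next request
    return value, "memcached"

print(get("front_page")[1])  # memcached - first hit goes over the wire
print(get("front_page")[1])  # local - now served from process memory
```

For truly hot objects this removes the network round trip entirely; the broadcast-for-global-state variant just pushes updates into every local tier instead of waiting for pulls.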
There’s much more that replicated databases can provide – more complex views, all indexed and snappy; a single-row change doesn’t need invalidation of hundreds or thousands of objects around it. It all comes down to interactivity and serving users’ needs better: a single-row change is immediately visible to all the users around.
Brian suggests using job queue systems and pushing everything into memcached – which makes it a dump of stuff instead of a cache. Putting in more information that might be needed ends up with unnecessary evictions, which decrease the efficiency of the system too. Building those objects needs reading from the database (or another persistent store), and eventually they end up in the database too. Surprise – they can be served from the database as well! :)
Anyway, memcaching ideas are moving forward, and so is database replication. There is lots of room for replication to evolve yet – making it more asynchronous, parallel, relaxed. The whole MySQL protocol might be made better too – right now it is all synchronous and boring.
Though replication has storage overhead – more copies are usually saved – it also allows utilizing those copies in different ways, with different indexing schemes, while still maintaining the same image of the data on different nodes. Even better, such application-specific ‘roles’ of slaves can migrate from one node to another, heating up different segments of the data.
The role of database replication will remain core to scaling out reads for various workflows. A database allows incremental changes to the infinite datasets various applications require. Replication simply multiplies the system’s capacity for presenting those datasets. That’s good. If it were up to me, I’d let it live.
P.S. Our current memcached cluster has 80 nodes, each providing 2GB of storage. When used properly, memcached is a great tool too. :)