I don’t like stupid benchmarks, as they waste my time. I don’t like stupid marketing, as it wastes my time too. Sometimes I succumb to those things, and now in return I want to waste your time a bit.
So, this MemSQL thing, written by some smart guys has been making rounds in press and technical community. Centerpiece of all the communication was:
“MemSQL, the database they have developed over the past year, is thirty times faster than conventional disk-based databases”
Though I usually understand that those claims don’t make any sense, I was wondering what did they do wrong. Apparently they got MySQL with default settings running and MemSQL with default settings running, then compared the two. They say it is a good benchmark, as it compares what users get just by installing standard packages.
That is already cheating, because systems are forced to work in completely different profiles. For example, memory used for data buffering, is essentially unbound on MemSQL, yet InnoDB has it limited to 128MB on 5.5 (and that is 16x the default setting used on 5.1).
For write benchmarks MemSQL will write out snapshot at 2G log mark, InnoDB is configured with 10MB transaction log, so it will start checkpointing pretty much immediately.
Still, for any benchmark, most important thing is durability. See, MemSQL claims that they support ACID, and durability is core part of that. MySQL’s InnoDB (I don’t assume other engines are usable) is durable by default, making sure that if it says that transaction returned ‘ok’, it is on disk and will be there after a crash. MemSQL is also “durable by default”, which means that it will write a transaction log, but it doesn’t really mean that it will hit the disk.
See, MemSQL also has “transaction-buffer” setting, which, in default “full durability mode” will asynchronously return “ok” until 128M buffer is full (or background log flusher thread writes it out). Essentially this is something similar to innodb_flush_log_at_trx_commit=2. In my opinion not durable.
What happens if you really enable full durability on MemSQL? Absolute sadness does. Apparently each commit will wait for background thread to wake up and write out transaction log. How often does background thread wake up? Every 50ms. Well, it actually does time accounting magic, to flush every 50ms, and calls very exact sleep.
Claim #1: MemSQL is 500x slower at durable transactions a second than InnoDB.
It is relatively easy to back that up – with decent RAID controller that has write-behind caching, InnoDB can easily sustain 10k transactions a second from a single thread, as it doesn’t sleep for 50ms between fsyncs. There is some commit grouping there, two threads will have 40tps, ten threads will have 200tps, but as I get to choose my own benchmark I claim that MemSQL is 500x slower at single-thread durable transaction rate.
Now that we established MySQL superiority (ha ha), let’s look at read performance. I sure agree, that it is where MemSQL should shine. I do agree, that its execution speeds for table scans are not bad – 8M rows scanned a second (for SELECT COUNT(*) query) from single thread is sure a great achievement.
To be honest, I didn’t want to spend my time in benchmarking what an in-memory database should excel at (I’m sure it does random point reads on skiplist just fine). Instead I decided to test my favorite query:
SELECT * FROM table ORDER BY id DESC LIMIT 5;
You know, the query that is all around the web – showing you heads of various lists. MySQL does that by pointing a cursor at an index position then walking record by record in index order. Not MemSQL, it will actually have to traverse whole table and sort it to return you the answer. Even “SELECT MAX(id)” does crawl whole table.
Claim #2: MemSQL is thousand times slower than MySQL. Or million times slower. At simple read queries. (I have been corrected on this – apparently indexes in MemSQL are unidirectional, so you have to define separate index for each direction you are going to read the table in).
Well, once we establish that MemSQL will have O(N) performance on some common operation, all we need is just find an N that is large enough ;-)
I don’t know how much we should be blaming MemSQL guys and how much that should be directed at journalists that were hyping on the technology. If we get back to ACID, we’d see that A for atomicity is done only at statement level and BEGIN/COMMIT are ignored. Isolation is only READ COMMITTED (difficult to have REPEATABLE READ with no real transactions). Durability is flawed, and I didn’t check C part. I got to admit, MemSQL FAQ states that “Yes, MemSQL supports fully ACID transactions”. This is on them, then.
The 80000 queries a second on MemSQL number isn’t anything impressive, compared to what modern MySQL can do (especially with HandlerSocket) – people are approaching million queries a sec there :-)
Though, definitely, for things that it is doing well, it is fastest MySQL protocol speaking thing at the moment, though it isn’t that far ahead of MySQL Cluster, and people talking to NDB are having also quite good performance (really amazing performance, that is).
I’m sure, that my both claims can be fixed with some engineering work. Write performance needs proper real time synching with group commit (there has been some great development in MySQL world about that lately too – though when binlog is involved things are way more complicated).
Read performance needs proper optimizations for most common patterns – index order reads, for example. Memory is fast, but not fast enough if high concurrency environment would need to do this over and over again. Even for what it does well, I’m somewhat sure that it wouldn’t overperform InnoDB 30x at in-memory workloads. I’m too lazy to benchmark today, but this ‘Claim #3’ is not that difficult to prove :-)
Anyway, we wouldn’t need this post if there was a decent disclosure of behaviors and proper benchmarking. Now we get multiple conflicting claims that are way too easy to spot within few minutes of testing. Way too easy.
P.S. Harrison also has discussed this on Quora