MySQL is bazillion times faster than MemSQL

I don’t like stupid benchmarks, as they waste my time. I don’t like stupid marketing, as it wastes my time too. Sometimes I succumb to those things, and now in return I want to waste your time a bit.

So, this MemSQL thing, written by some smart guys has been making rounds in press and technical community. Centerpiece of all the communication was:

“MemSQL, the database they have developed over the past year, is thirty times faster than conventional disk-based databases”

Though I usually understand that those claims don’t make any sense, I was wondering what did they do wrong. Apparently they got MySQL with default settings running and MemSQL with default settings running, then compared the two. They say it is a good benchmark, as it compares what users get just by installing standard packages.

That is already cheating, because systems are forced to work in completely different profiles. For example, memory used for data buffering, is essentially unbound on MemSQL, yet InnoDB has it limited to 128MB on 5.5 (and that is 16x the default setting used on 5.1).

For write benchmarks MemSQL will write out snapshot at 2G log mark, InnoDB is configured with 10MB transaction log, so it will start checkpointing pretty much immediately.

Still, for any benchmark, most important thing is durability. See, MemSQL claims that they support ACID, and durability is core part of that. MySQL’s InnoDB (I don’t assume other engines are usable) is durable by default, making sure that if it says that transaction returned ‘ok’, it is on disk and will be there after a crash. MemSQL is also “durable by default”, which means that it will write a transaction log, but it doesn’t really mean that it will hit the disk.

See, MemSQL also has “transaction-buffer” setting, which, in default “full durability mode” will asynchronously return “ok” until 128M buffer is full (or background log flusher thread writes it out). Essentially this is something similar to innodb_flush_log_at_trx_commit=2. In my opinion not durable.

What happens if you really enable full durability on MemSQL? Absolute sadness does. Apparently each commit will wait for background thread to wake up and write out transaction log. How often does background thread wake up? Every 50ms. Well, it actually does time accounting magic, to flush every 50ms, and calls very exact sleep.

Claim #1: MemSQL is 500x slower at durable transactions a second than InnoDB.

It is relatively easy to back that up – with decent RAID controller that has write-behind caching, InnoDB can easily sustain 10k transactions a second from a single thread, as it doesn’t sleep for 50ms between fsyncs. There is some commit grouping there, two threads will have 40tps, ten threads will have 200tps, but as I get to choose my own benchmark I claim that MemSQL is 500x slower at single-thread durable transaction rate.

Now that we established MySQL superiority (ha ha), let’s look at read performance. I sure agree, that it is where MemSQL should shine. I do agree, that its execution speeds for table scans are not bad – 8M rows scanned a second (for SELECT COUNT(*) query) from single thread is sure a great achievement.

To be honest, I didn’t want to spend my time in benchmarking what an in-memory database should excel at (I’m sure it does random point reads on skiplist just fine). Instead I decided to test my favorite query:

SELECT * FROM table ORDER BY id DESC LIMIT 5;

You know, the query that is all around the web – showing you heads of various lists. MySQL does that by pointing a cursor at an index position then walking record by record in index order. Not MemSQL, it will actually have to traverse whole table and sort it to return you the answer. Even “SELECT MAX(id)” does crawl whole table.

Claim #2: MemSQL is thousand times slower than MySQL. Or million times slower. At simple read queries. (I have been corrected on this – apparently indexes in MemSQL are unidirectional, so you have to define separate index for each direction you are going to read the table in).

Well, once we establish that MemSQL will have O(N) performance on some common operation, all we need is just find an N that is large enough ;-)

I don’t know how much we should be blaming MemSQL guys and how much that should be directed at journalists that were hyping on the technology. If we get back to ACID, we’d see that A for atomicity is done only at statement level and BEGIN/COMMIT are ignored. Isolation is only READ COMMITTED (difficult to have REPEATABLE READ with no real transactions). Durability is flawed, and I didn’t check C part. I got to admit, MemSQL FAQ states that “Yes, MemSQL supports fully ACID transactions”. This is on them, then.

The 80000 queries a second on MemSQL number isn’t anything impressive, compared to what modern MySQL can do (especially with HandlerSocket) – people are approaching million queries a sec there :-)

Though, definitely, for things that it is doing well, it is fastest MySQL protocol speaking thing at the moment, though it isn’t that far ahead of MySQL Cluster, and people talking to NDB are having also quite good performance (really amazing performance, that is).

I’m sure, that my both claims can be fixed with some engineering work. Write performance needs proper real time synching with group commit (there has been some great development in MySQL world about that lately too – though when binlog is involved things are way more complicated).

Read performance needs proper optimizations for most common patterns – index order reads, for example. Memory is fast, but not fast enough if high concurrency environment would need to do this over and over again. Even for what it does well, I’m somewhat sure that it wouldn’t overperform InnoDB 30x at in-memory workloads. I’m too lazy to benchmark today, but this ‘Claim #3′ is not that difficult to prove :-)

Anyway, we wouldn’t need this post if there was a decent disclosure of behaviors and proper benchmarking. Now we get multiple conflicting claims that are way too easy to spot within few minutes of testing. Way too easy.

P.S. Harrison also has discussed this on Quora

This entry was posted in mysql. Bookmark the permalink.

27 Responses to MySQL is bazillion times faster than MemSQL

  1. Arthur says:

    Nice post, however I’m wondering why the MySQL defaults are so low.

  2. Nice says:

    Ha ha! Very thorough and awesome. I like you.

  3. vipulvpatil says:

    Will Solid state drives change everything all over again?

  4. MemSQL CTO here. Great article- Domas has done a good job of digging into the internals of MemSQL! A few questions/comments:
    1.) The range query issue you pointed out can be explained by a well known limitation of skip lists. Unlike B-Trees, skip lists are unidirectional. By default, our indexes are ascending, so indeed you have to skip to the end to run a MAX() or “ORDER BY id DESC” query. To fix this, just change the primary key to be descending in your schema:
    CREATE TABLE x (id int, …, PRIMARY KEY id (id) DESC, …)

    This is explained here http://developers.memsql.com/docs/1b/indexes.html#skip-list-…. If you want both behaviors in your app, you’ll have to create two skip list indexes (for now).
    2.) The transaction-buffer > 0 setting is really where we shine. Synchronous durability is something we have for flexibility, but I’ll be the first to admit that it’s not really optimized to perform well. The customers that we’re working with are okay with this. And it’s inspired by what modern companies do. Maybe it’s changed in the time since I’ve left (Domas?) but Facebook certainly does not write comments synchronously to disk.
    3.) Our synchronous durability does indeed flush in 50 MS cycles, so you’ll see poor performance on single threaded inserts. However, as you pointed out, we’re optimized for the case with multiple concurrent writers. Since MemSQL implements group commit on high parallel load throuput picks up. Sure, writing your own very specific benchmark you can show us writing poorly, but we’ve never worked with a customer that’s needed single threaded, synchronous writes to shine. If this is your use case, unless you need substantial read performance, MemSQL is not the database for you.

    • As mentioned on HN discussion, at Facebook we run with full durability on (and that has been standard operating practice before I joined in 2009, and has stayed that way – we did have a certain situation where we’d downgrade, but are not doing that anymore on our masters). We also have transactional crash recovery on slaves and quite a few other things, like group commit optimizations.

      Regarding transaction buffer, InnoDB has a setting for that too, less granular though, but still, you were talking about ACID and durability, yet forgot to tell that you don’t really provide those guarantees to clients.

      Yeah, I admit I didn’t account for Skiplist behavior, OTOH, it is there, and can be easily be exploited for the argument, especially if I engineer benchmark for that being memory efficiency issue, or some up with some other index behavior (loose scan, maybe?)

      Again, I will reiterate something I’ve said in multiple places – it definitely can perform really well in various tests, but actually coming up with decent tests that showcase that is something what is way more welcoming than whatever PR that was ongoing lately.

    • Emmanuel says:

      I think you should open source MemSQL. What’s the point if I can’t hack it?

  5. Pingback: Quora

  6. kenn says:

    I think 128M buffering is way less durable than innodb_flush_log_at_trx_commit=2, unless you have 128M writes per second.

    • their 20hz flusher can make it more durable tha innodb_flush_…=2 – this one relies on one second synch interval plus whatever synchs come via checkpointing and such.

  7. @kenn, it really depends on the rate of inserts. we flush at the speed of disk using group commit. you need to have an incredibly high rate of inserts to fall behind i/o. now 128Mb sounds like a lot, but you can write way more per second on a modern hard drive. We had a long discussion on what should be this setting: time or buffer size. We believe that buffer size if more honest, b/c you cannot control how long fsync will take and how much data can be written in a second.

  8. Povilas says:

    A little off-topic here, but looking forward to a reply.
    Since you are like the best database expert I know (don’t be very proud – I don’t know much :D), I would like to ask what about MsSQL vs. MySQL? There are indefinite rants going both ways. Like there are a lot of nasty things in 2k5-2k8 mssql, like there is no normal paging support. Let’s say in mysql you have LIMIT 0,5 and in mssql you have nothing (not nothing but everything you have is hack or workaround). Still, I haven’t seen any bad performance issues in MsSQL, never db layer was a bottleneck of any kind, and there was no lost transactions. Load tests that there are no scalability or other issues, everything performs really great there (well it is considered an enterprise solution isn’t it ;)). MySQL has all these nice people using it, various braggings about how great it is, how you can tune in the engines and stuff.
    Although I feel that you would be pro-mysql, but what would be the bottom line here?

    P.S. I have not taken into account that mysql is somewhat better than mssql because it’s open source. Personally (and professionally) I don’t really think I’ll be going deep inside mysqls engine – that’s why I don’t care that mssql is closed source.

    • I have no idea about MS SQL and I can’t say much. I hear it is a great product, and it had lots of smart minds behind it. We’re not working on MySQL for sole purpose of ranting against any ‘competition’, we work on it to make it best technology for our business.

  9. James Day says:

    Arthur, we pick something which has at least a good chance of starting and working well enough to let people get going without being too bad for things like shared hosting with MySQL, Apache and whatever else running on the same box. Then we hope that people with more serious production power needs will do a little reading and settings changing to set it up for what they are going to use it for.

    Views are my own; for an official Oracle view consult a PR person.

    James Day, MySQL Senior Principal Support Engineer, Oracle

  10. Shirish Jamthe says:

    Nice work Domas.

  11. bobx says:

    Writing a post bragging about MySQL performance is a pretty risky, perhaps masochistic endeavor. MySQL doesnt perform. I guess you found a way to make it seem like it does against this tool, but MySQL is the biggest pain in the ass open source has ever unleashed on the world.

  12. Justin says:

    Baller post.

  13. kedar says:

    I wonder if a memory engine with a trigger to disk table can replace MemSQL!

    Take away is every product has its use-case and you should choose one matching your exact need!! Stay away from baseless comparisons and benchmarks!

    Quite informative & good links.
    Thanks Domas.

    PS: Just joking @ line 1 ;)

  14. Pingback: Thursday Linkage | I'm James Hall.

  15. Pingback: Why are we spending so much time refuting?

  16. Pingback: Sysadmin Sunday 86 | Server Density Blog

  17. Golan Zakai says:

    I made a simple mysql heap table vs memsql:

    keder, if you are using replication you can use memory table on the master and innodb on the slave ;)

  18. It has been stated before that the compiled sql syntax adds little improvement, so I am not quite sure what if any benefit it has over Alchemy Database which is free and open source. Alchemy Database did everything they claim to do plus a whole lot more http://code.google.com/p/alchemydatabase/

Comments are closed.