Checksums again, some I/O too

When I was doing data loading tests, I realized that the checksum calculation, which usually shows up as a low CPU percentage, is actually the blocking factor. See, usually when background writers do the flushing, it gets parallelized, but if an active query is forcing a checkpoint, it all happens in the ‘foreground’ thread, checksum computation included. This is where more Sun-ish wisdom (these people tune the kernel with a debugger all the time) comes in:

gdb -p $(pidof mysqld) -ex "set srv_use_checksums=0" --batch

Puff. Everything becomes much faster. Of course, one would be able to restart the server with --skip-innodb-checksums, but that would interrupt the whole process, etc. Proper people would implement a tunable parameter (5 lines of code, or so), but anyone with Solaris experience knows how to tune stuff with debuggers, hahaha.

Odd though, I was used to -O3 builds optimizing checksums enough that they disappeared from profiles, so either this doesn’t work, or the checksum path just got so much emphasis that it actually matters. This is why I told Heikki at the Users Conference that checksums ‘must go’ or be fixed in one way or another. Even if they’re a small part of the operation and cost nothing as long as they don’t block primary operations (e.g. they happen in parallel), people keep accessing SSDs, and there the relative cost of a checksum is insane.

It is quite easy to check that: just run InnoDB with a tiny buffer pool and OS caching, and try some benchmarking with checksums enabled and disabled. There will be a huge difference, which may simply explain the performance difference between caching in the buffer pool and in OS buffers. Turn checksums off, and OS caching may even be tolerable for your workloads.
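
To put the relative-cost argument into rough numbers, here is a minimal Python sketch. It is not a MySQL benchmark: zlib.adler32 is only a stand-in (InnoDB uses its own checksum function), the 16 KB page size is InnoDB's default, and the per-page access latencies are hypothetical round figures picked for illustration. It assumes the checksum is verified serially after the page is fetched, as in the ‘foreground’ case.

```python
import os
import time
import zlib

PAGE_SIZE = 16 * 1024  # InnoDB's default page size


def checksum_cost_per_page(pages=2000, page_size=PAGE_SIZE):
    """Return the average number of seconds spent checksumming one page.

    zlib.adler32 is a stand-in here; it is NOT InnoDB's checksum function.
    """
    page = os.urandom(page_size)
    start = time.perf_counter()
    for _ in range(pages):
        zlib.adler32(page)
    elapsed = time.perf_counter() - start
    return elapsed / pages


def checksum_share(checksum_seconds, io_seconds):
    """Fraction of a serialized fetch-then-verify spent on the checksum."""
    return checksum_seconds / (checksum_seconds + io_seconds)


if __name__ == "__main__":
    per_page = checksum_cost_per_page()
    # Hypothetical per-page access latencies, for illustration only.
    scenarios = [
        ("disk seek, ~5 ms", 5e-3),
        ("SSD read, ~100 us", 100e-6),
        ("OS cache hit, ~5 us", 5e-6),
    ]
    for label, io_seconds in scenarios:
        share = checksum_share(per_page, io_seconds)
        print(f"{label}: checksum is {share:.1%} of the page fetch")
```

The absolute numbers depend on the machine and on the stand-in checksum, but the trend is the point: the cheaper the page access gets, the larger the share of each fetch that goes to the checksum.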

Another interesting issue here is that MarkC has lots of I/O path optimization in his fourth (and four is a good number!) patch. Now… maybe everyone will soon be caching pages in the OS, once some more work is done on I/O access path cost?