On checksums

InnoDB maintains two checksums per buffer pool block: one computed with the old formula and one with the new formula. Both are read, both are written. I guess this was meant to be some kind of transition period, but it has obviously lasted too long (or was forgotten). Anyway, disabling the checksum code entirely makes a single-threaded data load 7% faster, though under parallel activity lock contention leaves some spare CPU resources for checksum calculation.

Leaving just a single version of the checksum would cut this fat in half without abandoning the feature entirely. Probably worth trying.

Update: I benchmarked the InnoDB checksum against Fletcher. The results were interesting (milliseconds for 10000 iterations):

Compiler flag	InnoDB (ms)	Fletcher (ms)
(none)	826	453
-O2	316	133
-O3	42	75

So, although Fletcher is roughly twice as fast at the lower optimization levels, -O3 optimizes the InnoDB checksum much better (42 ms versus 75 ms). How many folks actually run a mysqld compiled with -O3?
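For readers unfamiliar with Fletcher, here is a minimal sketch of the common Fletcher-32 variant, written in Python for readability. It is not the code that was benchmarked above: the post does not say which Fletcher variant was used, so the word size (16-bit little-endian), the zero padding of odd-length input, and the name `fletcher32` are all assumptions made for illustration.

```python
def fletcher32(data: bytes) -> int:
    """Fletcher-32 over little-endian 16-bit words (illustrative sketch only)."""
    # Assumption: odd-length input is padded with a single zero byte.
    if len(data) % 2:
        data += b"\x00"

    sum1 = 0  # running sum of the words
    sum2 = 0  # running sum of the successive sum1 values
    for i in range(0, len(data), 2):
        word = data[i] | (data[i + 1] << 8)
        sum1 = (sum1 + word) % 65535
        sum2 = (sum2 + sum1) % 65535

    # The second sum goes in the high half, the first sum in the low half.
    return (sum2 << 16) | sum1


if __name__ == "__main__":
    print(hex(fletcher32(b"abcde")))   # 0xf04fc729
    print(hex(fletcher32(b"abcdef")))  # 0x56502d2a
```

The inner loop is just two additions and two reductions per word, which is why it is cheap even without aggressive compiler optimization. A C implementation would normally defer the modulo and reduce only every few hundred words.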

6 thoughts on “On checksums”

Mark Callaghan says:

2008/05/29 at 22:56

I hope you get less than half because the old style checksum is computed over much less of the page than the new style checksum.
Arjen Lentz says:

2008/05/29 at 23:56

Perhaps a good moment to mention the checksum algorithm as well; people will be curious and one can always discuss… just paste a code snippet w/ comments?
Peter Zaitsev says:

2008/05/30 at 07:43

Indeed, it is not a 2x gain, but there is something to save.
Another question is whether the checksumming algorithm is as efficient as possible.
Perhaps it could be replaced with some other SSE-based algorithm that avoids polluting the CPU cache, for better performance.
I know the Linux kernel uses these for RAID checksum computation when available.
Domas Mituzas says:

2008/05/30 at 12:25

Interesting, SSE4.2 adds CRC32-on-chip. It also adds more nice functions, that can be used for strlen() and similar optimizations. Nehalem will be kickass for databases, if they manage to use the feature set.
Matt Ingenthron says:

2008/05/30 at 19:09

Is there an option to turn it off entirely if the filesystem is checksumming? I keep meaning to look into this.

For instance, ZFS always checksums, and it uses encryption-like algorithms and CPU features to be extremely efficient about it. With that big non-Open Source DB, turning off the DB checksum was a good win. Neel has a blog on this somewhere…
Domas Mituzas says:

2008/05/30 at 23:27

Matt, --skip-innodb-checksums

Yeah, ZFS can be fast, but so can userland checksums, if implemented properly. The way InnoDB currently does it may not be that easy to optimize at the compiler level. I’ll have to check it more.
