how MySQL engineering broke the backups

MySQL has an exceptional track record of introducing minor fixes that cause major breakages. Usually I could blame the naiveté of engineers who never really had to deal with production implications, but lately I have started sensing various business motivations working against the open-source offerings.

As the original author of mydumper I really cannot get out of my mind that the 5.5 and 5.6 metadata locking changes are there to screw with anyone who is building a backup solution using stable snapshot views of MySQL (for example, mysqldump --single-transaction, the gold standard of backing things up in the MySQL world).
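For reference, this is roughly the ritual such tools rely on; the exact statements a given mysqldump version issues may differ, and "mydb" is just a placeholder:

# consistent, non-blocking logical dump of one schema
mysqldump --single-transaction --quick mydb > mydb.sql

# which, on the server side, boils down to approximately:
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION WITH CONSISTENT SNAPSHOT;
-- ... SELECT all rows of each table within the same snapshot ...
COMMIT;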

As seen in bug #71017 (a palindrome!) filed by my esteemed colleague Eric, the newly introduced behaviors gobble up every lock possible, even when that makes absolutely no sense for backup/ETL/migration/etc. scenarios.
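To illustrate the effect (table name "t" is made up; behavior as I understand 5.5+ metadata locking):

-- session 1: the "backup" connection
START TRANSACTION WITH CONSISTENT SNAPSHOT;
SELECT COUNT(*) FROM t;  -- grabs a metadata lock on t, held until COMMIT

-- session 2: any DDL now stalls in "Waiting for table metadata lock"
ALTER TABLE t ADD COLUMN c INT;

-- session 3: even plain queries on t queue up behind the blocked ALTER
SELECT * FROM t LIMIT 1;

-- session 1: only this finally unblocks everybody
COMMIT;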

The only supported way out of that is MySQL Enterprise Backup, which is proprietary software and does not produce logical backups that would allow selective data restores, ETL, or anything else. You get complete vendor lock-in, with no way to get your data out of the system in a consistent manner, unless, of course, you restrict yourself to a “no metadata changes allowed in production” mode.

On hot backups

A few years ago I was looking at crash recovery code and realized that InnoDB had removed all the comments related to replay of the transaction log from the code [this assumption is debunked by Heikki in the comments section]. Judging by the high quality of comments in the remaining codebase, I concluded it was all done to obscure any efforts to build another InnoDB hot backup solution – a competitor to the first standalone Innobase offering.

I was enjoying the moment when Percona launched their own implementation of the tool. Since its inception it has become more and more robust and feature-rich.

We have used xtrabackup in our environment a lot – just… not for backups. The major use case right now is cloning server instances, whether for building new replicas, shadow servers, or replacing masters – and it allows us to do that without interrupting any operations.

Now, what makes the whole situation way more interesting: Oracle/MySQL announced at the conference that InnoDB Hot Backup will be part of the Enterprise offering, which makes it way more available to the MySQL customer community than when it required quite expensive per-server licenses.

Of course, open-source xtrabackup is way easier to tweak for our environment (O_DIRECT support, posix_fadvise(), flashcache hints, etc. were all added after release), and it will be interesting to see how the Oracle-provided tool evolves. Right now xtrabackup already supports streaming operation, which makes it much more usable in large-database-on-small-hardware (read: sharded) environments and gives users flexibility. Oracle, of course, owns much more in-house expertise on both current internals and all the future changes that will happen, so we may see leadership in the field coming from their side too.
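For the cloning use case, the recipe is roughly this (hostnames and paths are made up; exact flags depend on the xtrabackup/innobackupex version at hand):

# on the donor: stream a hot backup straight to the new replica
innobackupex --stream=tar /tmp | ssh backup@new-replica "tar -xif - -C /data/mysql-clone"

# on the new replica: replay the transaction log to make the files consistent
innobackupex --apply-log /data/mysql-clone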

One of our reasons for not using a physical backup solution is simply that it is not space efficient. There may be multiple ways to approach that, from robust incremental backups to partial backups that don't include secondary indexes or cover only a limited set of tables.
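Incremental runs already exist in xtrabackup, something along these lines (directories are hypothetical; flag names vary between releases):

# one full base backup
innobackupex --no-timestamp /backups/full

# subsequent runs copy only pages whose LSN advanced past the base
innobackupex --no-timestamp --incremental /backups/inc1 --incremental-basedir=/backups/full

Note, though, that this still reads every page to compare LSNs, which is exactly the cost discussed below.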

Some changes may actually require extended MySQL/InnoDB support: on multi-terabyte instances one may not want to rescan the whole dataset for each incremental backup run when the resulting diff would be just a hundred gigabytes or less. This would require support for an always-running backup agent that aggregates information about block changes and allows for more efficient backup operation.

Discarding secondary indexes is a way more attractive option with the 5.1/Plugin ability to do fast independent index builds, which don't require one-row-at-a-time B-tree builds for all indexes at once (and which, of course, hit severe memory penalties on large tables or under parallel workloads).
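The restore-side payoff looks something like this (table and index names invented; fast builds assume the 5.1 plugin or newer):

-- restore the table with just its primary key, then add
-- secondary indexes with fast, sorted builds:
ALTER TABLE orders ADD INDEX idx_customer (customer_id);
ALTER TABLE orders ADD INDEX idx_created (created_at);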

Having always-ready backups is important not only for the ability to rebuild a box (and we have replicas for machine failures); the real value is when backups can be used for massive-scale extraction of subsets of table rows across thousands of machines. For that, one cannot just ship full instance data around from backup storage, so recovery tools will have to be way more flexible.

Probably the core feature for that kind of operation would be the ability to import tables directly from a hot backup into online instances. Unfortunately, restarting a database instance is still costly (though we're doing quite some work in that direction too).
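One building block for this already exists in the transportable-tablespace direction, roughly (table name invented, and stock InnoDB puts tight restrictions on when the import is allowed):

-- on the destination instance
ALTER TABLE clicks DISCARD TABLESPACE;
-- copy a clicks.ibd prepared from the hot backup (e.g. via xtrabackup --export)
-- into the database directory, then:
ALTER TABLE clicks IMPORT TABLESPACE;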

I'm extremely happy that InnoDB has started fixing operational issues like crash recovery performance, but there is still a wide area of problems not properly touched yet, especially in the disaster recovery space, and I'm eager to see developments in this field, both from Oracle and from community members.

mydumper

Last weekend I ended up working on a small pet project, and today I'm kind of releasing it.

So, I had this idea that there is no good tool to do logical dumps of MySQL data for large sites: mysqldump can't generate enough I/O pressure, mk-parallel-dump is closer, but it doesn't do consistent snapshots, uses the same mysqldump, and is written in Perl (haha!), and… I just wanted something new to hack on as a proof of concept. For a while one had to edit constants in the code to use it, but my colleague Mark contributed options support, so it no longer needs a recompile to run :)

So, let me introduce mydumper. It doesn't dump table definitions; all it does is extract data and write it to files, fast.
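A typical invocation looks something like this (flags as in current builds, after Mark's options work; the very first cut may have differed):

# dump with several parallel worker threads into per-table files
mydumper --host=dbhost --user=backup --password=... --threads=8 --outputdir=/backups/mydump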
