Few years ago I was looking at crash recovery code, and realized that InnoDB has removed all the comments from the code [this assumption is debunked by Heikki in comments section], related to replay of transaction log. Judging by high quality of comments in the remaining codebase, I realized that it was all done to obscure any efforts to build another InnoDB hot backup solution – competitor to first Innobase standalone offering.
I was enjoying the moment when Percona launched their own implementation of the tool. Since the inception, it became more and more robust and feature rich.
We have used xtrabackup in our environment a lot – just… not for backup – the major use case right now is for cloning server instances – either for building new replicas, shadow servers, or replacing masters – and allows us to do that without interrupting any operation.
Now what makes the whole situation way more interesting – Oracle/MySQL announced at the conference, that InnoDB Hot Backup will be part of the Enterprise offering – which makes it way more available to MySQL customer community, than when it required quite expensive per-server licenses.
Of course, open source xtrabackup is way easier to tweak for our environment (O_DIRECT support, posix_fadvise(), flashcache hints, etc – was all added after release) – and it is interesting, how Oracle-provided tool will evolve. Right now xtrabackup already supports streaming operation, which makes it much more usable in large-database-on-small-hardware (read: sharded) environments, and provides flexibility to the users. Oracle of course owns much more of in-house expertise of both current internals operation, as well as all the future changes that will happen, so we may see leadership in the field coming from their side too.
One of our reasons for not using physical backup solution is simply that it is not space efficient. There may be multiple ways to approach that – from robust incremental backups, to partial backups, that wouldn’t include secondary indexes or have limited set of tables taken.
Some changes may actually require extended MySQL/InnoDB support – on multiple-terabyte instances one may not want to rescan the dataset for each incremental backup run – as resulting diff would be just a hundred gigabytes or less. This would require support for always-running backup agent that would aggregate information about block changes and allow for more efficient backup operation.
Discarding secondary indexes is way more attractive option with 5.1/Plugin ability to do fast independent index builds, that don’t require one row at a time B-Tree builds for all indexes at once (and of course, hit severe memory penalties on large tables or in parallel workloads).
Having always ready backups is important not only for ability to rebuild a box (and we have replicas for machine failures) – the real value is when backups can be used for massive-scale thousands of machines subset of table rows extraction. For that one cannot just ship full instance data around from backup storage – so recovery tools will have to be way flexible.
Probably core feature for that kind of operation would be ability to import tables directly from hot backup to online instances – unfortunately, restarting database instance is still costly (though we’re doing quite some work in that direction too).
I’m extremely happy that InnoDB started fixing operational issues like crash recovery performance, but there’s still a wide area of problems not touched properly yet – extremely in disaster recovery space, and I’m eager to see developments in this field – both from Oracle, and community members.