on removing files

If you remove a file, the file system generally just marks in its metadata that the previously occupied blocks can now be used for other files – that operation is usually cheap, unless the file has millions of segments (a case so rare it has only been seen in experimental InnoDB features that Oracle thought were a good idea).

This changes a bit with SSDs – if the file system updates the underlying device's metadata to say which blocks are no longer in use, the device can do smarter compaction / grooming / garbage collection underneath. Linux file systems have a 'discard' mount option that one should use on top of SSDs – it will extend the lifetime of the storage quite a bit by TRIM'ing the underlying blocks.
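For the batch-oriented, the same hint can also be delivered periodically instead of on every deletion. Here is a minimal sketch, assuming Linux and root privileges, of the FITRIM ioctl that fstrim(8) uses under the hood – it asks the file system to walk its free space and issue discards:

#include <fcntl.h>
#include <linux/fs.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s MOUNTPOINT\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror(argv[1]);
        return 1;
    }

    /* Trim all free extents, of any size, on the whole file system */
    struct fstrim_range range = {
        .start  = 0,
        .len    = (__u64)-1,
        .minlen = 0,
    };

    if (ioctl(fd, FITRIM, &range) < 0) {
        perror("FITRIM");
        close(fd);
        return 1;
    }

    /* The kernel updates range.len with the number of bytes trimmed */
    printf("trimmed %llu bytes\n", (unsigned long long)range.len);
    close(fd);
    return 0;
}

Running that periodically (e.g. from cron) is the usual alternative to the discard mount option – it moves the whole TRIM cost to a time of your choosing.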

Now, each type of storage device will react differently to that: some of them support large TRIM commands, some of them can sustain a high rate of them, some of them can't, etc. – so one has to take that into account when removing files in production environments.

Currently the Linux block layer sees TRIM commands in the same shape as write commands, so if you are truncating a terabyte, it is seen as a terabyte of write activity (and managed in a similar fashion). That may make your writes (and/or reads) suffer.

Probably the correct place to handle this better would be the file system – but it doesn't have good feedback signals, and I/O scheduling isn't really its job. Currently it issues the data discard at the same place where it returns blocks – in a path that is expected to be fast – so it may hold a file system transaction for the duration of the operation. This may or may not stall other file system activity at the time.

So now that we know that devices can be stupid, device drivers are stupid, the block layer is stupid and file systems are stupid, we have to somehow address this issue. An obvious solution is to delete files slower.

Lots of performance engineering can be done by adding the right sleeps in appropriate places – so one can do that to 'rm' too: it can sleep a bit after a certain amount of bytes has been removed. What to do with large files? We have to slowly truncate them before we unlink them – the core loop is sketched below.
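In C, the idea looks roughly like this – a minimal sketch rather than the actual slowrm code, with the chunk size and sleep interval hardcoded for brevity:

#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

#define CHUNK    (128LL * 1024 * 1024)  /* bytes released per step */
#define SLEEP_NS 100000000L             /* 0.1s pause between steps */

static int slow_unlink(const char *path)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0) {
        perror(path);
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) < 0) {
        perror(path);
        close(fd);
        return -1;
    }

    struct timespec pause = { 0, SLEEP_NS };
    off_t size = st.st_size;
    while (size > 0) {
        size = size > CHUNK ? size - CHUNK : 0;
        /* Each ftruncate() frees (and, with -o discard, TRIMs)
         * only one chunk's worth of blocks */
        if (ftruncate(fd, size) < 0) {
            perror(path);
            close(fd);
            return -1;
        }
        nanosleep(&pause, NULL);  /* let other I/O breathe */
    }

    close(fd);
    return unlink(path);  /* file is empty by now, unlink is cheap */
}

int main(int argc, char **argv)
{
    for (int i = 1; i < argc; i++)
        if (slow_unlink(argv[i]) < 0)
            return 1;
    return 0;
}

Each step only hands the block layer a chunk's worth of discards, and the sleep gives it (and the device) time to digest them before more arrive.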

So I built a slower rm:

https://github.com/midom/slowrm

Usage:
  slowrm [OPTION...] PATH [PATH] ...

Help Options:
  -h, --help               Show help options

Application Options:
  -r, --recursive          Dive into directories recursively
  -c, --chunk=128          Chunk size in megabytes
  -s, --sleep=0.1          Sleep time between chunks
  -f, --force              Continue on errors (by default bail on everything)
  -x, --one-file-system    Only operate on one file system
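
For example, to slowly remove an old database directory without crossing file system boundaries (the path here is just an illustration):

 slowrm -r -x -c 128 -s 0.1 /var/lib/mysql/old_db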

We do what we got to do.