We're using rsnapshot for backups. It keeps lots of snapshots of the backed-up files, but it does delete old ones. This is good. However, it's taking about 7 hours to do an rm -rf on a massive directory tree. The filesystem is XFS. I'm not sure how many files are there, but it probably numbers in the millions.
Is there any way to speed it up? Is there any command that does the same as rm -rf and doesn't take hours and hours?
No.
rm -rf does a recursive depth-first traversal of your filesystem, calling unlink() on every file. The two operations that cause the process to go slowly are opendir()/readdir() and unlink(). opendir() and readdir() are dependent on the number of files in the directory. unlink() is dependent on the size of the file being deleted. The only way to make this go quicker is to either reduce the size and number of files (which I suspect is not likely) or change the filesystem to one with better characteristics for those operations. I believe that XFS is good for unlink() on large files, but isn't so good for large directory structures. You might find that ext3+dirindex or reiserfs is quicker. I'm not sure how well JFS fares, but I'm sure there are plenty of benchmarks of different filesystem performance.
Edit: It seems that XFS is terrible at deleting trees, so definitely change your filesystem.
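If you want to confirm where the time goes on your own tree, one quick sketch is to let strace count the syscalls rm makes (testdir is a placeholder for a directory you can afford to delete):

strace -c rm -rf testdir

The resulting summary is typically dominated by unlink()/unlinkat() and getdents() calls, the syscalls behind unlink() and readdir(), which is why file count and directory size drive the runtime.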
As an alternative, move the directory aside, recreate it with the same name, permissions, and ownership, and restart any apps/services that care about that directory.
You can then "nice rm" the original directory in the background without having to worry about an extended outage.
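A minimal sketch of that approach (the path, ownership, and permissions below are placeholders for whatever your directory actually uses):

mv /backups/daily.0 /backups/daily.0.delete    # a rename on the same filesystem is fast
mkdir /backups/daily.0                         # recreate an empty directory with the same name
chown root:root /backups/daily.0               # restore the original ownership
chmod 755 /backups/daily.0                     # restore the original permissions
nice -n 19 rm -rf /backups/daily.0.delete &    # delete the old tree in the background at low priority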
Make sure you have the right mount options set for XFS.
Using -o logbufs=8,logbsize=256k with XFS will probably triple your delete performance.
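For example, a sketch assuming the backup volume is /dev/sdb1 mounted at /backups (both placeholders):

mount -o logbufs=8,logbsize=256k /dev/sdb1 /backups

or, to make it permanent, an /etc/fstab line such as:

/dev/sdb1  /backups  xfs  logbufs=8,logbsize=256k  0 0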
It's good to use ionice for IO-intensive operations like that, regardless of the filesystem used.
I suggest this command:
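A sketch of such a command, using the idle IO scheduling class (-c3) and a placeholder path:

ionice -c3 rm -rf /path/to/old/snapshot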
It will play nicely with other operations on a server under heavy IO load.
If you are doing the rm effectively at the file level then it will take a long time. This is why block-based snapshots are so good :).
You could try splitting the rm into separate areas and doing them in parallel, although I wouldn't expect it to make much of an improvement. XFS is known to have issues deleting files, and if that is a large part of what you do then a different filesystem for that might be an idea.
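If you do want to try the parallel-split idea, a rough sketch, assuming the tree's top-level subdirectories can be removed independently (the path is a placeholder):

for d in /backups/old-snapshot/*/; do
    rm -rf "$d" &    # one rm per subdirectory, all running in parallel
done
wait                 # wait for all background removals to finish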
I know this is old, but I thought I'd toss in a suggestion. You are deleting those files sequentially; executing parallel rm operations might speed things up.
http://savannah.nongnu.org/projects/parallel/ parallel can commonly be used in place of xargs,
so if you're deleting all the files in deltedir:
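A sketch of what that might look like, assuming GNU parallel is installed:

cd deltedir
find . -type f -print0 | parallel -0 -X rm -f    # delete files in parallel, batching arguments like xargs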
That would leave you with just empty directory structures to delete.
Note: You will likely still hit the filesystem limitations as noted above.
Would an alternative option here be to separate the data in such a way that you can junk and rebuild the actual filesystem instead of doing the rm?
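For example, if the snapshots lived on their own volume (a sketch; the device and mount point are placeholders), recreating the filesystem is nearly instant compared to a recursive delete:

umount /backups
mkfs.xfs -f /dev/vg0/backups    # -f overwrites the existing filesystem
mount /dev/vg0/backups /backups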
How about decreasing the niceness of the command? Like:
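A sketch of that (-n -20 requests the highest CPU priority and requires root; the path is a placeholder):

nice -n -20 rm -rf /path/to/old/snapshot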