I've been working with QEMU raw images and have a few questions about using tar with them.
From what I've read, bsdtar on kernel >= 3.1 can handle sparse image files much more quickly than GNU tar because it can take advantage of the kernel's SEEK_HOLE functionality. I tested it, and it is significantly quicker than GNU tar.
My question is this: my image file is 260G at full size. Since it is sparse and not full, it actually takes up only 38G on disk. When I run tar -cvSf test.img.tar test.img
it takes a long time (~10 minutes), but I end up with a file that's 20G. If I untar it, it goes back up to 38G. When I run bsdtar -cvf test.img.tar test.img it
goes much quicker (~2.5 minutes), but the file size is 38G instead of the 20G that GNU tar gave me.
What's the difference? Why is the archive smaller with GNU tar? I expected GNU tar to behave like bsdtar, because I thought tar -S only made tar treat the file as sparse instead of expanding it, so I don't understand why the result is smaller.
Thanks in advance!
From the GNU tar manual (info):
(emphasis added)
I.e., it's slower because it reads the file twice: the first time to analyze the contents, and the second time to actually archive them.
This approach to detecting sparseness probably also explains why the archive ends up even smaller: quite possibly the image contains significant runs of zeroes that are allocated on disk rather than stored as holes, and GNU tar leaves those out of the archive as well.
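
To make the difference concrete, here is a rough Python sketch of the two detection strategies. It is not the actual code of either tool. The function names and the 4096-byte block size are my own assumptions for illustration, and it needs a platform where os.SEEK_DATA and os.SEEK_HOLE are available (e.g. Linux).

```python
import errno
import os

BLOCK = 4096  # assumed block size, chosen only for this illustration


def data_extents_by_seek(path):
    """Return (offset, length) data extents as reported by the filesystem.

    Only real holes are skipped; zeroes that were actually written to disk
    still count as data.
    """
    extents = []
    with open(path, "rb") as f:
        fd = f.fileno()
        size = os.fstat(fd).st_size
        pos = 0
        while pos < size:
            try:
                start = os.lseek(fd, pos, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:  # no more data after pos
                    break
                raise
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end - start))
            pos = end
    return extents


def data_extents_by_scan(path, block=BLOCK):
    """Return (offset, length) extents of blocks holding a non-zero byte.

    Every byte is read, so an all-zero block is skipped whether or not it
    is allocated on disk.
    """
    extents = []
    with open(path, "rb") as f:
        offset = 0
        while True:
            chunk = f.read(block)
            if not chunk:
                break
            if any(chunk):  # at least one non-zero byte in this block
                if extents and sum(extents[-1]) == offset:
                    # Adjacent to the previous extent: merge them.
                    extents[-1] = (extents[-1][0], extents[-1][1] + len(chunk))
                else:
                    extents.append((offset, len(chunk)))
            offset += len(chunk)
    return extents
```

On a file where zeroes were explicitly written rather than left as holes, the scanning version should report less data than the seek version, which would match the 20G versus 38G you are seeing. The scan also has to read the whole file just to find the zeroes, which fits with it being slower.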