I am having a difficult time working out the correct way to read file sizes, since each command gives different results. I also came across a post at http://forums.devshed.com/linux-help-33/du-and-ls-generating-inconsistent-file-sizes-42169.html which states the following:
du gives you the size of the file as it resides on the file system (i.e. it will always give you a result that is divisible by 1024).
ls will give you the actual size of the file.
What you are looking at is the difference between the actual size of the file and the amount of space it takes on disk (also called file system efficiency).
What is the difference between "as it resides on the file system" and "actual size of the file"?
This is called slack space:
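So, if your filesystem allocates space in units of 64 KB, and you store a 3 KB file, then the file's actual size is 3 KB, it occupies a full 64 KB unit on disk, and the remaining 61 KB is slack space.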
Note: Some filesystems support block suballocation, which helps to mitigate this issue by assigning multiple small files (or the tail ends of large files) into the same block.
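To make the arithmetic concrete, here is a minimal sketch of the rounding involved. It assumes a filesystem that always allocates whole units and does no suballocation; the function names allocated_size and slack_space are made up for illustration.

```python
KB = 1024


def allocated_size(file_size: int, unit: int) -> int:
    """Round file_size up to a whole number of allocation units."""
    units_needed = -(-file_size // unit)  # ceiling division
    return units_needed * unit


def slack_space(file_size: int, unit: int) -> int:
    """Bytes allocated on disk but not used by the file's contents."""
    return allocated_size(file_size, unit) - file_size


if __name__ == "__main__":
    # The 3 KB file stored in 64 KB allocation units from the example above.
    print(allocated_size(3 * KB, 64 * KB) // KB)  # 64
    print(slack_space(3 * KB, 64 * KB) // KB)     # 61
```

On a real system the numbers can differ from this model, for instance when the filesystem packs small files together or stores them inline.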
There's another option here that hasn't been covered: sparse files. In this case, du will show a smaller size than a simple ls -l would, because ls is reporting the "size" of the file as being the apparent size (the number of bytes you could read, if you wanted a whole lot of zeroes), while du will continue to use the actual number of disk blocks in use.

Fun trick: Create a great many large sparse files, then impress your friends with how much disk space you have ("look, I'm storing eleventy-gazillion 1TB files on my hard drive!"). OK, maybe not so fun then.
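If you want to see this for yourself, the following is a rough sketch that creates a sparse file and reads both numbers back. It assumes a Unix-like system where st_blocks is available and counted in 512-byte units (true on Linux), and a filesystem that supports sparse files; the helper names are mine, not from any tool.

```python
import os
import tempfile


def make_sparse_file(path: str, size: int) -> None:
    """Create a file of the given apparent size without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size)  # extends the file; no data blocks are written


def sizes(path: str) -> tuple:
    """Return (apparent_bytes, allocated_bytes) for path."""
    st = os.stat(path)
    # st_size is the apparent size that ls -l reports.
    # st_blocks * 512 is the allocated space that du is based on
    # (assumption: 512-byte units, as on Linux).
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "sparse.bin")
        make_sparse_file(path, 10 * 1024 * 1024)
        apparent, allocated = sizes(path)
        print("apparent size (ls -l):", apparent)
        print("allocated on disk (du):", allocated)
```

On a filesystem with sparse file support the allocated figure should come out far smaller than the apparent one; on a filesystem without it, the two will be roughly equal.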
Filesystems are made up of blocks, and files don't have to fit neatly into them. If a file were 1024 bytes, its size in both ls and du would be 1024. If the file were 1025 bytes, its size would be 1025 in ls and 2048 in du, because the extra byte needs a second block.
Note that the example above assumes a block size of 1024 bytes. Larger block sizes are the norm these days, e.g.
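One way to check what your own filesystem does is to ask it for its block size and then compare the two sizes of a 1025-byte file. This is only a sketch: it assumes a Unix-like system (os.statvfs and st_blocks are not available on Windows) and 512-byte st_blocks units, and the function names are invented for the example.

```python
import os
import tempfile


def filesystem_block_size(path: str) -> int:
    """Block size the filesystem holding path reports via statvfs."""
    return os.statvfs(path).f_frsize


def apparent_and_allocated(path: str) -> tuple:
    """Return (size as ls shows it, space on disk as du counts it)."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512  # assumption: 512-byte units


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "example.bin")
        with open(path, "wb") as f:
            f.write(b"x" * 1025)  # one byte more than 1024
        apparent, allocated = apparent_and_allocated(path)
        print("filesystem block size:", filesystem_block_size(d))
        print("apparent size:", apparent)
        print("allocated on disk:", allocated)
```

The allocated figure depends on the filesystem, so treat whatever it prints as a property of your machine rather than a general rule.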
There's still one more reason they may be different: hard links. du (the -h flag only changes how the sizes are printed) knows when it sees the same file under another name (hard links, as opposed to symlinks). Each name shows the file's full size when examined on its own, but du adds that size only once to the total of the common parent directory.
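The hard-link behaviour can be imitated in a few lines. This is a simplified sketch, not how du is actually implemented: it only sums regular directory entries, ignores the space used by the directories themselves, and assumes 512-byte st_blocks units. The function name du_like_total is hypothetical.

```python
import os
import tempfile


def du_like_total(directory: str) -> int:
    """Sum allocated bytes under directory, counting each inode only once."""
    seen = set()
    total = 0
    for root, _dirs, files in os.walk(directory):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            key = (st.st_dev, st.st_ino)  # identifies the file, not the name
            if key in seen:
                continue  # another hard link to a file already counted
            seen.add(key)
            total += st.st_blocks * 512  # assumption: 512-byte units
    return total


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "a.bin")
        with open(original, "wb") as f:
            f.write(b"x" * 8192)
        os.link(original, os.path.join(d, "b.bin"))  # second name, same file
        print("link count:", os.stat(original).st_nlink)
        print("total counted once:", du_like_total(d))
```

Summing the sizes of a.bin and b.bin by hand would give double the figure, which is one way an ls-based total and a du total end up disagreeing.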