What's the reason behind the difference in reported file sizes?
[root@localhost]# ls -lah sendlog
-rw-rw-r-- 1 mail mail 1.3T Aug 15 17:30 sendlog
[root@localhost]# du -m sendlog
24M sendlog
This came to our attention when a server's backup kept failing due to quota issues, so it wasn't only ls that was reporting this wrong size.
Terms like "sparse files" and "block assignment" are coming to mind, but I'm not sure why it would happen or the real reason behind it. Obviously there is a difference in the ways the two commands check size, am I right always trusting du?
FYI, this should be a pretty standard mail log file.
The difference between the values is as follows. From the manual of stat(2): the size reported by ls is st_size, while the size reported by du is st_blocks * 512.
The value reported by du is the number of bytes the file actually occupies on the filesystem/disk, and the value reported by ls is the size/length of the file as you see it when you interact with it. (In addition to working with on-disk usage, du also only counts hardlinked files once.)
Which value is the "right one" depends on context. If you're after disk usage, du is correct; if you're wondering how many bytes are in the file, ls / st_size is correct. In addition, various options change this behaviour: du --apparent-size uses the size reported by st_size, and ls -s reports the number of blocks used.
Your assumption that your logfile is a sparse file sounds plausible; why that happened, however, I don't know.
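To illustrate the options mentioned above (a sketch assuming GNU coreutils; the stat format string is just one way to print both values):
stat -c 'apparent: %s bytes, allocated: %b blocks of %B bytes' sendlog
du -h --apparent-size sendlog   # same notion of size as ls -l (st_size)
du -h sendlog                   # allocated space on disk (st_blocks * 512)
ls -ls sendlog                  # first column: allocated size in blocks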
Just as Kjetil explained, you have a sparse file. Blocks of blank data inside the file are not allocated on disk until you actually write to those blocks. How that happened in a log file is a mystery. You have to check your audit logs from the last time sendlog had a correct size up to the time when it got this huge hole. Perhaps the answer is in the log file itself.
Perhaps someone did that intentionally to cause havoc in your system. Or it was some software error.
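If you want to see where the hole actually is (for instance, to correlate the byte offset with timestamps inside the log), one option is filefrag from e2fsprogs; a sketch, assuming an extent-based filesystem such as ext4:
filefrag -v sendlog   # lists the allocated extents with their logical offsets
# gaps between the logical offsets of consecutive extents are the unwritten holes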
You can create your own terabyte-sized file easily, for example with GNU dd or truncate (the file name here is just an example):
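dd if=/dev/zero of=sparse-file bs=1 count=0 seek=1T   # writes no data, just sets the length to 1 TiB
# or, with GNU coreutils:
truncate -s 1T sparse-file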
That file will allocate only a few kilobytes of disk space in any current Linux version with a filesystem with support for sparse files.
Your backup solution needs a replacement. Any serious backup system nowadays handles sparse files efficiently. Even the simplest solution using GNU tar supports them (the -S or --sparse option).
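For example (the archive name is illustrative):
tar -cSf sendlog-backup.tar sendlog   # -S/--sparse: detect the holes and store the file compactly
# on extraction, GNU tar recreates the member as a sparse file again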
Maybe your du doesn't support such large numbers? Your filesystem could also be corrupted (or the disk could have physical problems). You should run fsck ASAP (on the unmounted partition) and see what happens to those numbers afterwards.
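A minimal sketch of that check (the device name and mount point are hypothetical; the filesystem must be unmounted first):
umount /var            # or wherever sendlog lives; nothing may be using it
fsck -f /dev/sdb1      # force a full consistency check, then remount and compare the sizes again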