Standard UNIX/Linux systems have supported sparse files from the beginning: a sparse file contains unused space that is not allocated until needed. To review, you can generate one via a C program: create a file, seek to the 2 GB offset, write ONE byte, and close the file. An ls -l shows the size to be 2 GB; however, ls -ls shows the allocated block count to be closer to that of a one-byte file. If a program reads the file logically and writes it back out (e.g. cat sparse_file > xxx), the resulting file xxx really will occupy a fully allocated 2 GB.
I have created sparse files in the past as a testing vehicle for some of my applications. However, their existence has caused a few problems.
The main problem is that, outside of the 'dump' program, backup programs and general procedures access this type of file logically, so for a 1-byte sparse file you get a backup with 2 GB of zeroed data. This has upset the backup folks more than once.
Any good solutions for this type of situation?
GNU tar has the --sparse (-S) option, which makes working with sparse files simple.
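For example (file names here are made up; truncate stands in for the C program above as a quick way to make a sparse file):

```shell
# make a demo sparse file: 2 GB logical size, almost no blocks allocated
truncate -s 2G sparse_file

# -S tells GNU tar to detect the holes and archive only the real data
tar -cSf backup.tar sparse_file
ls -ls backup.tar        # the archive stays tiny, not 2 GB

# extraction recreates the holes rather than writing out the zeroes
tar -xf backup.tar
```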
Use a backup program that is capable of detecting and handling sparse files correctly. There are plenty of them around (a la Jeremy's suggestion of tar with -S); just make it a checklist item in your backup system evaluation.
rsync-based backup programs should be able to handle sparse files just fine (rsync has the --sparse/-S option).
The star program is much faster on sparse files than GNU tar; it requires the -sparse option when handling such files. For plain copying, use cp --sparse=auto.
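To illustrate the cp side of this (file names are made up; the star invocation is shown only as a comment since star may not be installed):

```shell
# make a demo sparse file: 2 GB logical size
truncate -s 2G sparse_file

# star would archive it with something like:
#   star -c -sparse f=backup.star sparse_file

# for a plain copy, cp can detect and preserve the holes:
cp --sparse=auto sparse_file sparse_copy

ls -ls sparse_file sparse_copy   # allocated blocks stay near zero
```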