I have a rather large file (~50GB) and it takes some time to run
tar xvf file.tar.bz2
on it. I'm aware of programs that can do parallel compression for bzip2 files but unaware of programs that can do parallel decompression for bzip2 files.
Are there any programs that can achieve this? What is the exact syntax of the command to use to extract from the file?
I'm using Ubuntu 12.04.
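lbzip2 and pbzip2 are tools you can use for parallel compression and decompression. With either one, the -d option is used for decompression, for example lbzip2 -d file.tar.bz2 or pbzip2 -d file.tar.bz2.
To install these packages on Ubuntu, run sudo apt-get install lbzip2 or sudo apt-get install pbzip2.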
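If you are not sure which of these tools is installed on a given machine, a small helper can pick one for you. This is only a sketch: the function name pick_bzip2_tool and the preference order (lbzip2, then pbzip2, then plain bzip2) are my own assumptions, not something either package provides.

```shell
# Print the name of the first bzip2-compatible tool found on PATH.
# Preference order is an assumption: lbzip2, then pbzip2, then plain bzip2.
pick_bzip2_tool() {
    for tool in lbzip2 pbzip2 bzip2; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    echo "no bzip2-compatible tool found" >&2
    return 1
}

# Usage: decompress with whichever tool is available, keeping the original (-k).
# "$(pick_bzip2_tool)" -d -k file.tar.bz2
```

All three tools accept -d and -k, so the same command line works whichever one is found.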
You can uncompress your archive with a single command using the tar -I option, which lets you use any compression utility that supports the -d option, for example tar -I lbzip2 -xvf file.tar.bz2.
This is very useful when dealing with big archives, as you don't need to have twice the uncompressed size available on the target filesystem (the temporary tar file and the output files). It's also faster, as you need far less disk I/O.
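The single-command extraction can be wrapped in a small function if you do it often. This is a minimal sketch assuming GNU tar; extract_with, its argument order, and the default of lbzip2 are hypothetical choices of mine.

```shell
# Extract a .tar.bz2 archive into a directory, decompressing through a
# parallel tool via GNU tar's -I (--use-compress-program) option.
# Arguments: archive, destination directory, tool (defaults to lbzip2).
extract_with() {
    archive=$1
    dest=$2
    tool=${3:-lbzip2}
    mkdir -p "$dest" &&
    tar -I "$tool" -xf "$archive" -C "$dest"
}

# Usage:
# extract_with file.tar.bz2 ./out          # uses lbzip2
# extract_with file.tar.bz2 ./out pbzip2   # uses pbzip2 instead
```

tar streams the decompressed data straight from the tool, so no intermediate .tar file is written to disk.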
Of course, that works when compressing too, for example tar -I lbzip2 -cvf file.tar.bz2 dir_to_compress/.
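The compression direction looks much the same. Again a sketch assuming GNU tar, with compress_with being a made-up helper name:

```shell
# Create a .tar.bz2 archive, compressing through a parallel tool via -I.
# Arguments: output archive, tool, then the paths to put in the archive.
compress_with() {
    archive=$1
    tool=$2
    shift 2
    tar -I "$tool" -cf "$archive" "$@"
}

# Usage:
# compress_with backup.tar.bz2 lbzip2 dir_to_compress/
```

The resulting file is an ordinary bzip2 stream, so it can be listed or extracted with any bzip2-aware tar.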
Check tar --help for more options.
You can use pbzip2 with the -d flag to decompress, for example pbzip2 -d myfile.tar.bz2. From the manpage:
This example will decompress the file "myfile.tar.bz2" into the decompressed file "myfile.tar". It will use the autodetected # of processors (or 2 processors if autodetect not supported).
After decompressing, you need to untar the file with tar xf myfile.tar.
A tar file is just a container to which different compression algorithms can be applied; for example, a ".tar.gz" and a ".tar.bz2" are the same kind of container compressed with different algorithms. So pbzip2 will only uncompress the archive, not extract the files; use tar to extract them. Tar shouldn't take long, since the archive is already uncompressed and it only has to extract the files. (Note that we are not using the 'z' or 'j' flag in the tar command; those flags tell tar to decompress the file as well.)
lbzip2 seems a lot better than pbzip2 in your case, as it is able to speed up decompression of standard .bz2 files, while pbzip2 doesn't do that. (Just tested it: 17 seconds for lbzip2 vs 56 seconds for pbzip2 on a partially loaded quad core.)
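If you want to repeat that kind of comparison on your own archive, something like the following should do. It is a rough sketch: time_decompress is a hypothetical helper, it only has one-second resolution, and it skips any tool that is not installed.

```shell
# Time how long each installed tool takes to decompress the same archive.
# -dc writes the decompressed data to stdout, which is discarded, so the
# measurement covers decompression only and not disk writes.
time_decompress() {
    archive=$1
    for tool in lbzip2 pbzip2 bzip2; do
        command -v "$tool" >/dev/null 2>&1 || continue
        start=$(date +%s)
        "$tool" -dc "$archive" >/dev/null || return 1
        end=$(date +%s)
        echo "$tool: $((end - start)) s"
    done
}

# Usage:
# time_decompress file.tar.bz2
```

The numbers will vary with the number of cores, the current load, and how the archive was compressed, so treat them as a rough guide.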