We rotate and compress our Apache logs each day, but it's become apparent that this isn't frequent enough. An uncompressed log is about 6 GB, which is getting close to filling our log partition (yep, we'll make it bigger in the future!) and also takes a lot of time and CPU to compress each day. We have to produce a gzipped log for each day for our stats processing. Obviously we could move our logs to a partition with more space, but I also want to spread the compression overhead throughout the day.
Using Apache's rotatelogs we can rotate and compress the log more often -- hourly, say -- but how can I concatenate all the hourly compressed logs into a running compressed log for the day, without decompressing the previous logs? I don't want to uncompress 24 hours' worth of data and recompress it because that has all the disadvantages of our current solution.
Gzip doesn't seem to offer any append or concatenate option, but perhaps I've missed something obvious. This question suggests straight shell concatenation "works" in that the archive can be decompressed, but the fact that gzip -l doesn't work seems a bit dodgy.
Alternatively, perhaps this is still a bad way to do things. Other suggestions are welcome -- our only constraints are our relatively small log partitions and the need to provide a daily compressed log.
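The gzip man page should have what you want, but you can concatenate the compressed files directly with cat.
Compression is not as good as if it were all compressed as one file, but you can recover the full contents by decompressing the combined file as usual, for example with gzip -dc or zcat.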
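A minimal sketch of that approach, assuming hourly files named access.00, access.01 and so on (the names and the temporary directory are made up for illustration). Each hour's compressed file is appended to the running daily file, and nothing is ever decompressed along the way:

```shell
#!/bin/sh
# Sketch: grow a daily gzip file from hourly gzip files without recompressing.
# All file names here are hypothetical stand-ins.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Stand-ins for two hourly logs.
printf 'hour 00 request\n' > access.00
printf 'hour 01 request\n' > access.01
gzip -c access.00 > access.00.gz
gzip -c access.01 > access.01.gz

# Append each compressed hour to the running daily file.
cat access.00.gz >> access.day.gz
cat access.01.gz >> access.day.gz

# Check integrity, then decompress: the hourly logs come out back to back.
gzip -t access.day.gz
gzip -dc access.day.gz
```

In a real setup only the `cat ... >> access.day.gz` step would run each hour, once the hourly file has been closed and compressed.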
gzip doesn't care. You can concatenate gzipped files, and decompressing the result gives exactly what you would get if you had concatenated the originals and then gzipped them.
Just tar the gzipped files together. It's effectively a concatenate, and keeps them grouped logically together. The difference in file size between doing this and decompress/recompressing them together is virtually zero.
As in, with non-trivial log files, tar'ing 24 gzipped log files together will produce a file virtually identical in size to a single gzipped archive of all 24 original files.
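As a rough illustration of the tar suggestion, assuming hourly files named access.00.gz, access.01.gz and so on (hypothetical names), something like the following should work. Note that tar is used without -z, since the members are already compressed:

```shell
#!/bin/sh
# Sketch: group already-gzipped hourly logs into one daily tar archive.
# File names are hypothetical stand-ins.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Stand-ins for three compressed hourly logs.
for hour in 00 01 02; do
    printf 'request during hour %s\n' "$hour" | gzip -c > "access.$hour.gz"
done

# Plain tar, no extra compression pass.
tar -cf access.day.tar access.00.gz access.01.gz access.02.gz

# List the members, then read one hour back without unpacking the rest.
tar -tf access.day.tar
tar -xOf access.day.tar access.01.gz | gzip -dc
```

The result is a tar archive rather than a single gzip stream, so it is worth checking that the stats processing can read it.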
The CustomLog directive allows you to specify a command that the logs are piped into instead of the usual log file. You could, for example, write a shell script that simply gzips everything it gets on stdin to a file you specify as an argument.
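The original script is not shown here, so the following is only a sketch of what such a script might look like. The function name gzip_log is invented for illustration. It appends rather than overwrites, on the assumption that a restart should add a new gzip member to the same file instead of truncating it:

```shell
#!/bin/sh
# Sketch of a piped-log helper (hypothetical): compress everything read
# from stdin and append it to the file named by the first argument.
gzip_log() {
    # >> appends a new gzip member, so earlier data in the file is kept.
    gzip -c >> "$1"
}

# When invoked as a script with a target file, act as the log consumer.
if [ "$#" -ge 1 ]; then
    gzip_log "$1"
fi
```

Saved as an executable file, it would be run as `script /path/to/access.log.gz` with the log lines arriving on stdin.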
It is probably not a good idea to combine this with rotatelogs, since that might corrupt the archive, but you can relatively easily emulate its behaviour. Then you point Apache's CustomLog directive at the script.
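The configuration itself is missing from this answer, so what follows is an assumed example rather than the original. Both paths are hypothetical, and "combined" is assumed to be a log format nickname already defined by a LogFormat directive in your configuration:

```apache
# Hypothetical paths: pipe the access log into the gzip helper script,
# which writes to the file given as its argument.
CustomLog "|/usr/local/bin/gziplog.sh /var/log/apache2/access.log.gz" combined
```

Apache starts the piped program itself and feeds it log lines on stdin, so the script must be executable by the user that starts the server.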
Test this! The buffering of gzip could be an issue: recent entries may not be on disk yet, and the file is not a complete archive until gzip exits.