My apache log files are getting too big and I'm looking for ways to make them more manageable.
I know I can use conditional logging to only log access to specific types of files, but it seems to make more sense to log a random sample of the requests, so that I can still get an idea of what's going on without having to log every single request.
Is there something like that available?
I'm on Ubuntu 8.04 with Apache 2, and I'm using cronolog for log rotation.
Why not rotate the logs more often? If weekly rotation gives you files that are too big, rotate daily; if daily is still too much, rotate hourly. The drawback of this approach is that log analyzers, such as webalizer, need to be configured to match the rotation schedule.
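Since you already pipe through cronolog, switching to hourly files is just a change to the filename template in the Apache config (the paths below are an assumption, adjust to your layout):

```apache
# hourly log files, e.g. /var/log/apache2/2009/05/14/16/access.log
# cronolog expands the strftime-style specifiers and opens a new file each hour
CustomLog "|/usr/bin/cronolog /var/log/apache2/%Y/%m/%d/%H/access.log" combined
```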
I prefer to log everything: when you have a problem, you can never have too much information. And at current disk prices, capacity is not an issue for me.
You can control the log format via the LogFormat directive.
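A trimmed-down format can noticeably shrink each line; as a sketch, dropping the referer and user-agent fields from the usual combined format might look like this (format name is arbitrary):

```apache
# log only client, time, request line, status and response size
LogFormat "%h %t \"%r\" %>s %b" short
CustomLog /var/log/apache2/access.log short
```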
If volume is the problem, consider webalizer (http://www.mrunix.net/webalizer/), which you can run from a cron job and which produces nice graphs. I think it even looks inside older logs that have been compressed by logrotate.
Assuming error messages are randomly distributed within the log file, you could just print every 20th line of the log, e.g. with awk.
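A minimal sketch of both variants, assuming the log lives at access.log (adjust the path and the modulus / probability for other sampling rates):

```shell
# Deterministic sampling: print every 20th line (NR is awk's line counter)
awk 'NR % 20 == 0' access.log

# True random sampling: keep each line with ~5% probability
awk 'BEGIN { srand() } rand() < 0.05' access.log
```

The deterministic version always yields exactly 1/20 of the lines; the random version avoids any aliasing with periodic traffic patterns but gives a slightly variable sample size.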
I can think of three options to reduce the logfile size.
One possible way of doing this is the conditional logging you mentioned. Conditional logging uses Apache's SetEnvIf directive, which sets an environment variable when a request attribute matches a regular expression.
So how about using that to write an expression that matches only the 'even' (or 'odd') IP addresses in Remote_Addr? You can reduce the volume further by narrowing the matched IP ranges.
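As a sketch of that idea (the log path and variable name are assumptions), a request is logged only when its client IP ends in an even digit, which samples roughly half the traffic:

```apache
# set sample_log when the client IP's last digit is even
SetEnvIf Remote_Addr "[02468]$" sample_log
# write only those requests; everything else is dropped
CustomLog /var/log/apache2/access_sampled.log combined env=sample_log
```

Narrowing the character class (e.g. "[04]$") would shrink the sample further, at the cost of a less uniform slice of your client population.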
Of course, you could also look at the reason behind your question: what makes the logfiles 'too big' and 'unmanageable'? What information do they hold for you?