What is a sensible logs policy?
On one hand, I would like to keep everything forever. On the other hand, I don't want to waste time on administrative tasks, and I must avoid disks filling up on production servers.
What is a sensible logs policy? What tools are there (free or not) to help you implement the policy?
Are you rotating your logs? That will probably be your best plan of action. Using logrotate makes it really easy to save old logs, compress them if you want, and keep them for as long as you want.
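Something along these lines (the log paths and the postrotate command shown here are illustrative stand-ins, not my exact configuration):

```
/var/log/maillog {
    daily
    rotate 7
    missingok
    compress
    postrotate
        /usr/bin/killall -HUP rsyslogd
    endscript
}

/var/log/lab-submit.log {
    monthly
    rotate 5
    compress
}
```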
That's a snippet of one of my logrotate files. The first stanza rotates the mail log every day, keeping old copies for seven days. "missingok" means that logrotate will ignore the file if it isn't where it's supposed to be. The postrotate ... endscript section contains commands that are run after the file has been rotated. "compress" is self-explanatory; the default is gzip. You can change the compression program with the "compresscmd" directive.
The lab submit log is rotated once a month and kept for 5 months.
I hope this helps. I am assuming (obviously) that you aren't currently rotating your logs, that you're running some kind of Linux, and that you would want to use logrotate. Depending on your distro and the type of log, you might not want to use logrotate. If any of my assumptions are incorrect, let me know and I will try to revise my answer.
My general course of action depends on the amount of disk space I can comfortably set aside for log data, while still allowing for the once-in-a-while catastrophic debug event that can cause a dramatic increase in disk usage.
Remote logging, always:
On the central server, keep logs as long as you think is necessary (or as long as you are required to). I generally hold onto [compressed] logs for between 6 and 12 months for trending, but 1 or 2 months may be fine for you.
Local logging and rotation:
The local logging keeps you covered just in case you lose network connectivity at some point.
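A minimal sketch of the client side, assuming rsyslog and a central server named "loghost" (both names here are placeholders):

```
# /etc/rsyslog.conf on each client: forward everything to the central server.
# A single @ forwards over UDP; @@ uses TCP.
*.*    @@loghost:514
```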
For certain things, I want to hold onto logs for a long time - for example, my Apache logs, for historical interest. But even there, I have a cron job running every day and/or week to do a simple analysis of unique visitors that gets mailed to a Gmail account I established just for that sort of thing. However, my general approach is that I don't want or need most of the data in those logs going back more than a few days.
I already know that I'm never going to "get around tuit" when it comes to doing any graphing or historical analysis because, frankly, I'm too busy doing my "real" job :)
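A sketch of that sort of daily analysis, assuming an Apache combined-format access log (the path and address below are placeholders, not my actual setup):

```shell
#!/bin/sh
# unique_visitors FILE: count distinct client IPs in an Apache
# combined-format access log (the client IP is the first field).
unique_visitors() {
    awk '{ print $1 }' "$1" | sort -u | wc -l
}

# A crontab entry for the daily mail might look like (placeholder path/address):
#   5 0 * * * echo "Unique visitors: $(unique_visitors /var/log/apache2/access.log.1)" \
#       | mail -s "daily visitor count" stats@example.com
```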
If you're running a syslog collector, you may need to hold onto those logs longer - just because it's grabbing everything from however many servers you're collecting from. The last time I had a syslog server setup, we had a pair of old DL180s with 18GB hard drives running Ubuntu. Both cross-mounted the other via NFS (<othersys>/path/to/log @ <currentsys>/path/to/backup). We rotated our logs daily, compressing via bzip2. When the drive space hit >90% used, we'd drop the oldest file. It's been mentioned before*, but you may also want to investigate a log analyzer such as epylog or Splunk as a component of your log policy.
Well, out in the real world a lack of money and time tends to get in the way; but here are the primary concerns, IMHO:
a) Collect logs in a central repository, i.e. in one secure location.
b) Use realtime search & filtering to slice and dice log data when needed.
c) Set up alerts. Set up meaningful alerts for your system. This has a great deal of overlap with systems such as Munin or Nagios; they can do much the same thing. Which system you prefer to alert you will be a matter of opinion and case-by-case circumstances.
d) Keep everything for at least 90 days. You can throw away less important data after 90 days if you need to, but you may not need to. For example, if you use MySQL with the ARCHIVE storage engine for historic data, you can store large amounts of data cheaply, at the cost of it being mostly read-only with poor indexing. Dividing the data into "hot" and "near-line" tiers may work well.
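For the MySQL route, a hypothetical near-line table might look like this (table and column names are made up for illustration):

```sql
-- ARCHIVE tables compress rows and allow only INSERT and SELECT,
-- which suits append-only historic log data.
CREATE TABLE log_archive (
    logged_at  DATETIME NOT NULL,
    host       VARCHAR(64) NOT NULL,
    facility   VARCHAR(16),
    message    TEXT
) ENGINE=ARCHIVE;
```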
Good & cheap systems that enable the above? I'm still searching. The solutions seem divided into two camps, IMHO: 1) open source / cheap systems based on a database and some scripting, and 2) large 'enterprisey' we-help-you-with-regulatory-compliance systems that are expensive. Neither seems quite right to me.