Once a day, I want to run AWStats on webserver log files generated by multiple load-balanced servers. I want an efficient way of transferring them to one place. Is there already a tool that can do this?
Otherwise, I was thinking of using a cron job to grep out the current day's entries, then tar and gzip the files before sending them over so I can merge and analyze them. Is this a good approach, or can you suggest a better one?
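Roughly what I have in mind, as a sketch only (paths and hostnames below are placeholders):

    # run from cron on each web server, late in the day
    TODAY=$(date +%d/%b/%Y)        # matches the Apache log date format, e.g. 20/Oct/2009
    grep "$TODAY" /var/log/apache2/access.log > /tmp/access-$(hostname).log
    tar czf /tmp/access-$(hostname).tgz -C /tmp access-$(hostname).log
    scp /tmp/access-$(hostname).tgz analysis-host:/srv/incoming/
    # if the job ran after midnight, it would need "date -d yesterday" (GNU date) instead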
Thanks!
Just rsync the logs over to your analysis machine; it saves a hell of a lot of unnecessary logic.
Use rsync
I'd use rsync in a cron job. Fast, reliable, simple.
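For example, a minimal sketch assuming daily logrotate on the web servers and SSH access from the analysis box (hostnames and paths are made up):

    # crontab on the analysis machine: pull yesterday's rotated log from each web server
    30 0 * * * rsync -az web1:/var/log/apache2/access.log.1 /srv/logs/web1/
    35 0 * * * rsync -az web2:/var/log/apache2/access.log.1 /srv/logs/web2/

Because rsync only transfers what has changed, re-running it is cheap, and -z compresses the data over the wire, so there is no need to tar and gzip first.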
There are tools designed to act as a central log management system for Linux, so the logs don't need to be copied, at least not in the sense you are talking about: you either set up NFS mounts or install clients on the machines.
An easy one with a nice web interface is Splunk; it is free for up to 500 MB a day of indexing, though the free version has no authentication on the web interface.
The classic, more manual method is syslog-ng, which might already be on your system. Here is a tutorial on setting up a central log server with it.
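For reference, the receiving side of a syslog-ng setup can be as small as the sketch below; it assumes the web servers already forward their access logs to syslog (for example by piping Apache's CustomLog through logger), which is an extra step not shown here:

    # /etc/syslog-ng/syslog-ng.conf on the central server (sketch only)
    source s_network { udp(ip("0.0.0.0") port(514)); };
    destination d_per_host { file("/var/log/remote/${HOST}/access.log"); };
    log { source(s_network); destination(d_per_host); };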
Among the AWStats tools you can find a Perl script (logresolvemerge.pl) for merging log files in a load-balancing setup. This script can help you AFTER the copy of the web servers' logs (rsync is a good choice; see the answers from jodiek and KPWINC).
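A typical invocation once the copies are in place (the script's install path varies by distribution, so the one below is only an example):

    # merge the per-server logs into one chronologically ordered file for AWStats
    perl /usr/share/awstats/tools/logresolvemerge.pl \
        /srv/logs/web1/access.log /srv/logs/web2/access.log > /srv/logs/merged/access.log

AWStats can also run the merge on the fly if LogFile in awstats.conf points at the script with a trailing pipe; see the load-balancing section of the AWStats documentation.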
Have you considered using NFS on the servers to mount a directory from another server?
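One possible shape for that, with purely illustrative names, is to have the analysis machine export a directory and let each web server drop its rotated logs into it:

    # on each web server: mount the share exported by the analysis host
    sudo mount -t nfs analysis-host:/export/weblogs /mnt/weblogs
    # a logrotate postrotate hook or a small cron job can then copy access.log.1 into /mnt/weblogs/$(hostname)/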