I don't want to give an external company like S3stat access to my logs. I know that AWS writes S3 and CloudFront logs in a format readable by AWStats. Has anyone used AWStats to analyze them?
S3stat used to offer a hosted version of their software that was in beta but I believe it has been discontinued.
I am not tied to AWStats; I will consider other self-hosted web log analysis software.
I don't use AWStats with S3, but I'd suggest there are three problems to solve when processing the logs:
You need to obtain the data - it is stored on S3
With CloudFront, AWS lets you choose which bucket receives the logs - it does not have to be the source (origin) bucket. You can easily set up a dedicated bucket for your logs and mount it via s3fs - this should provide the simplest access to the files, retaining the timestamps, etc. that are often needed for incremental processing of logs. Alternatively, if you don't wish to mount a bucket as a local file system, you could use s3cmd, aws, or one of the SDKs to download the files. (There is a Python script (using boto) for this purpose - here - although I can't vouch for its effectiveness.)
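If you go the SDK route, the download step could look roughly like the sketch below. It is not the script linked above; it assumes the newer boto3 library and only uses the S3 client's `get_paginator("list_objects_v2")` and `download_file` calls. The function name `download_new_logs` and its parameters are made up for illustration. Files that already exist locally are skipped, which gives a crude form of incremental fetching.

```python
import os


def download_new_logs(client, bucket, prefix, dest_dir):
    """Download log objects that are not yet in dest_dir; return the new local paths."""
    os.makedirs(dest_dir, exist_ok=True)
    downloaded = []
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page has no "Contents" key when nothing matches the prefix.
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                continue  # folder placeholder object, not a log file
            local_path = os.path.join(dest_dir, os.path.basename(key))
            if os.path.exists(local_path):
                continue  # already fetched on an earlier run
            client.download_file(bucket, key, local_path)
            downloaded.append(local_path)
    return downloaded
```

With boto3 installed and credentials configured, this could be called as `download_new_logs(boto3.client("s3"), "my-log-bucket", "cloudfront/", "cf-logs")`, where the bucket name, prefix and directory are placeholders.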
You need to decompress and combine the logs
CloudFront logs are compressed (gzipped) and stored as multiple files - the filenames contain the date and hour (e.g. XXXXXXXXXXXXX.YYYY-MM-DD-HH.XXXXXXXXX), although there can be multiple files per hour. The files can be decompressed with gunzip and combined with logresolvemerge.pl (a tool provided with AWStats).
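If you would rather not shell out to gunzip and logresolvemerge.pl, the same decompress-and-combine step can be approximated in Python. This is a stand-in for those tools, not a port of them: it assumes every record starts with the date and time columns (so a plain string sort is chronological) and that the logs fit in memory. The function name `merge_gzipped_logs` is hypothetical.

```python
import glob
import gzip
import os


def merge_gzipped_logs(src_dir, out_path):
    """Decompress every *.gz log in src_dir into one time-ordered file; return the record count."""
    records = []
    for path in sorted(glob.glob(os.path.join(src_dir, "*.gz"))):
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
            for line in fh:
                if line.startswith("#") or not line.strip():
                    continue  # skip "#..." header lines and blank lines
                records.append(line.rstrip("\n"))
    # Assumption: each record begins with "date<TAB>time", so sorting the
    # raw strings orders the records chronologically.
    records.sort()
    with open(out_path, "w", encoding="utf-8") as out:
        for record in records:
            out.write(record + "\n")
    return len(records)
```

Note that the header lines are dropped here, so the merged file contains only records.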
You need to provide a custom log format to AWStats
The file format is tab-separated and resembles:
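One way to inspect the layout of your own files, sketched here under assumptions: CloudFront log files start with comment lines, including a `#Fields:` line naming the columns, so the column names can be read from the file rather than hard-coded. The function name `read_cloudfront_log` and the sample record in the comment are invented for illustration.

```python
def read_cloudfront_log(lines):
    """Yield one dict per record, keyed by the names on the '#Fields:' header line."""
    fields = None
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("#Fields:"):
            # Column names on the header line are separated by spaces.
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            # Records themselves are tab-separated.
            yield dict(zip(fields, line.split("\t")))


# Usage with a made-up, shortened record (real files have more columns):
# sample = ["#Version: 1.0", "#Fields: date time sc-bytes", "2014-01-01\t00:30:00\t1045"]
# for row in read_cloudfront_log(sample):
#     print(row["date"], row["time"], row["sc-bytes"])
```

Printing the keys of the first record shows the column order that a custom AWStats log format has to match.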
You would, therefore, set up AWStats with: