We have a few web servers and I am planning to create a dashboard that shows real-time stats: IP address, geolocation, and other custom data based on database lookups. Splunk fits almost perfectly, but I am wondering whether there are any open source alternatives. I have looked at logstash and graylog2, but to my knowledge they are more log analysis tools. Piwik is interesting, except that I cannot put any JavaScript on the web pages. All I have access to is the Apache web log. Any recommendations, please?
Good ol' AWStats is a real-time log analyzer that has dashboards and widgets and wingdings and portals and panes o' glass and other such things. You can even customize it with plugins to your liking.
Visitors has a real-time mode and can show you basic information such as most visited pages, the hottest hours/days and even visual path analysis.
You can also feed your Apache logs to MySQL with syslog-ng and then use front-ends such as logzilla (previously known as php-syslog-ng) for querying the data.
An interesting question, by the way -- I'm all ears for better solutions! +1 to your question because of that. :)
To what end?
There are really two branches of web analytics: marketing information and performance information (and user interface design, which kinda spans both).
Google Analytics, Open Web Analytics, Piwik and, to a lesser extent, AWStats, Analog et al. are primarily about gathering marketing information (who your customer base is, where they are, what browsers they use, what the conversion ratio is...).
The performance side doesn't offer as much choice, but statsd + graphite provides a stonking backend for storing and presenting data from multiple sources (logs, JavaScript bugs). I'm currently planning an installation using this at the back end and Yahoo Boomerang to collect page load times. Have a look at Graphene for an example of what it can do. Writing, say, an awk script to parse the logs and feed the backend would be trivial.
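To give an idea of what such a log feeder could look like, here is a minimal sketch in Python rather than awk. It assumes the logs use Apache's "combined" format and that a statsd daemon listens on UDP port 8125 on localhost; the metric names and the `web` prefix are made up for illustration.

```python
import re
import socket

# Apache "combined" log format is assumed; adjust to the LogFormat in use.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def line_to_metrics(line, prefix="web"):
    """Turn one access-log line into statsd counter strings.

    The metric names are hypothetical; lines that do not match are skipped.
    """
    match = LOG_PATTERN.match(line)
    if match is None:
        return []
    status_class = match.group("status")[0] + "xx"  # e.g. 2xx, 4xx, 5xx
    metrics = [
        f"{prefix}.requests:1|c",
        f"{prefix}.status.{status_class}:1|c",
    ]
    size = match.group("size")
    if size.isdigit():  # Apache logs "-" when no body was sent
        metrics.append(f"{prefix}.bytes:{size}|c")
    return metrics


def send_metrics(metrics, host="127.0.0.1", port=8125):
    """Send each metric as its own UDP datagram in the statsd line format."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for metric in metrics:
            sock.sendto(metric.encode("ascii"), (host, port))
    finally:
        sock.close()


def feed(lines, host="127.0.0.1", port=8125):
    """Parse an iterable of log lines and forward the resulting counters."""
    for line in lines:
        send_metrics(line_to_metrics(line), host, port)
```

Calling `feed(sys.stdin)` at the bottom of the script and piping `tail -F` on the access log into it would give a continuous feed. Since statsd traffic is fire-and-forget UDP, a missing or restarting daemon does not block the parser.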
There are also tools like PastMon, which can sniff and report on lots of low-level network stats. Or mrtg.
As you mentioned, there is Piwik, which has a flexible tracking API: you can either insert JavaScript with a <noscript> tag or insert a simple image in your pages. Insert the following code, as suggested on the official Piwik tracking API page:
No need for JS for basic features. :)
Thanks for all the advice. I have currently set up logstash on the clients to send their access logs to a central RabbitMQ server, and I am using another instance of logstash to parse the data into Elasticsearch. With the REST API of Elasticsearch I was able to build a few interesting dashboards (like the current location of users accessing the web server).
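For anyone wanting to try something similar, here is a rough sketch of the kind of query a "users by country" panel could run against the Elasticsearch REST API. This is not my exact setup. It assumes a recent Elasticsearch with the aggregations API, indices named `logstash-*`, and a logstash geoip filter that populates a `geoip.country_name.keyword` field; all of those names are assumptions and will differ with other versions and mappings.

```python
import json
import urllib.request


def build_country_query(minutes=15, size=10,
                        country_field="geoip.country_name.keyword",
                        time_field="@timestamp"):
    """Build a search body counting recent hits per country.

    Field names are assumptions based on common logstash defaults.
    """
    return {
        "size": 0,  # only the aggregation is needed, not the raw hits
        "query": {"range": {time_field: {"gte": f"now-{minutes}m"}}},
        "aggs": {
            "by_country": {"terms": {"field": country_field, "size": size}}
        },
    }


def parse_country_counts(response):
    """Reduce a search response to a {country: hit_count} mapping."""
    buckets = response["aggregations"]["by_country"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


def fetch_country_counts(base_url="http://localhost:9200", index="logstash-*"):
    """POST the query to the _search endpoint (URL and index are assumed)."""
    body = json.dumps(build_country_query()).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as resp:
        return parse_country_counts(json.load(resp))
```

A dashboard page can poll `fetch_country_counts()` every few seconds and redraw a map or table from the returned dictionary.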