Possible Duplicate:
Alternatives to Splunk?
This has been discussed, but it has been several months, so it may be time to revisit it:
Earlier discussion RE Splunk alternatives
For the record, Splunk rocks. But the pricing is simply beyond what we can consider: when I spoke with Splunk today, the cost for a system to index 5 GB/day of data was over $30,000.
That is more than we spend on SQL Server (by a large multiple), more than we spend on a rack of servers (by a multiple), and so on.
The Splunk sales team is correct that for $30K we get more value and functionality than if we spent the same amount building our own system, but it doesn't matter. The Splunk cost is simply too high (by a multiple).
Soooooo, we are looking around!
Is anyone out there building a Splunk-like system?
Our basic need:
- Able to listen for syslog messages on multiple UDP ports
- Able to index the incoming data in an async way
- Some kind of search engine
- Some kind of UI
- An API to the search engine (to embed in our console)
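To make the list above concrete, here is a minimal sketch in Python using only the standard library. It is an illustration of the shape of such a system, not any product's design: the class and function names (`LogIndex`, `SyslogProtocol`, `indexer`, `serve`) are made up for this example, the index is a toy in-memory inverted index with no persistence or UI, and syslog messages are stored as raw text without parsing.

```python
import asyncio
import re
from collections import defaultdict

# Hypothetical tokenizer: runs of letters, digits, '_', '.' and '-'.
TOKEN = re.compile(r"[A-Za-z0-9_.-]+")


class LogIndex:
    """Toy in-memory inverted index: token -> ids of messages containing it."""

    def __init__(self):
        self.messages = []                # message id is the list position
        self.postings = defaultdict(set)  # token -> set of message ids

    def add(self, text):
        msg_id = len(self.messages)
        self.messages.append(text)
        for token in TOKEN.findall(text.lower()):
            self.postings[token].add(msg_id)
        return msg_id

    def search(self, query):
        """Return messages that contain every token in the query (AND search)."""
        tokens = TOKEN.findall(query.lower())
        if not tokens:
            return []
        ids = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        return [self.messages[i] for i in sorted(ids)]


class SyslogProtocol(asyncio.DatagramProtocol):
    """Receives UDP datagrams and queues them; no indexing work happens here."""

    def __init__(self, queue):
        self.queue = queue

    def datagram_received(self, data, addr):
        self.queue.put_nowait(data.decode("utf-8", errors="replace").strip())


async def indexer(queue, index):
    """Consumes queued messages so indexing runs asynchronously from receiving."""
    while True:
        index.add(await queue.get())
        queue.task_done()


async def serve(ports, index, host="0.0.0.0"):
    """Listens on every port in `ports` and feeds one shared index."""
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    for port in ports:
        await loop.create_datagram_endpoint(
            lambda: SyslogProtocol(queue), local_addr=(host, port))
    await indexer(queue, index)
```

Something like `asyncio.run(serve([5514, 5515], index))` would start listeners on two example ports (chosen arbitrarily here; the standard syslog port 514 usually needs elevated privileges), and `index.search("failed root")` stands in for the search API. A real system would replace `LogIndex` with a proper search engine or database.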
We currently need to index 3-5 GB/day, but need to be able to scale to 10 GB/day or more. We do not need a lot of history (30 days is fine).
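As a rough back-of-the-envelope check on those figures, the sketch below estimates raw storage for the retention window and the average message rate. The 200-byte average message size is purely an assumption for illustration, and index overhead and compression are ignored.

```python
def retention_estimate(gb_per_day, retention_days=30, avg_message_bytes=200):
    """Return (raw GB kept on disk, average messages per second).

    avg_message_bytes is an assumed figure, not a measurement.
    """
    raw_gb = gb_per_day * retention_days
    messages_per_second = gb_per_day * 1e9 / avg_message_bytes / 86_400
    return raw_gb, messages_per_second


for rate in (3, 5, 10):
    raw_gb, mps = retention_estimate(rate)
    print(f"{rate} GB/day: {raw_gb} GB raw over 30 days, about {mps:.0f} msgs/s")
```

At 10 GB/day with 30 days of history, that is 300 GB of raw log data before any index overhead.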
We use Windows 2008 and 2003 servers.
Thanks for your thoughts!
UPDATE: We spent two weeks researching commercial and open source options. Our conclusion: write our own (we are a software company... we know how to write things). We built a great system on MongoDB and .NET that gives us the functions we needed in about one engineering week. We have now completed our implementation. We use two MongoDB servers (master and slave), and are able to log and index any amount of log data (5 GB/day, 15 GB/day, etc.), limited only by disk space.
UPDATE TO THE UPDATE (December, 2012): We continue to use our MongoDB solution, and it works great! If we were building it today, we would strongly consider building it on top of Elasticsearch.
OBSERVATIONS: This space needs a solid solution that is $1000-3000 flat rate. The licensing models used by the commercial firms are based on a "milk the data center ops guys" model. That is their right (of course!), but it leaves a HUGE space open for someone to come in underneath them. My guess is that in another year or two there will be a good open source solution that will be really usable.
Thank you all for your input (even if it was self promotion).
https://www.elastic.co/products/logstash
It's still rather early in development, but it sounds very promising and moves fast.
I don't have a comparison matrix in mind for the following, especially when it comes to comparing them with Splunk, but these are some fully operational tools:
Octopussy: http://www.octopussy.pm
Logreport: http://www.logreport.org/
Snare: http://www.intersectalliance.com/projects/index.html
Log surfer: http://www.crypt.gen.nz/logsurfer/
Log Analyser: http://loganalyzer.adiscon.com/
Log 2 timeline: http://log2timeline.net/#download (this is more of a "timeline" analysis tool)
Finally, if you want to do some coding yourself but possibly have a more scalable solution (the following are tools to collect log data; they don't necessarily have all the functionality out of the box to search through the data):
Honu https://github.com/jboulon/Honu
Chukwa http://wiki.apache.org/hadoop/Chukwa
Flume http://archive.cloudera.com/cdh/3/flume/
Edit: Added this comparison link: http://csgrad.blogspot.com/2010/07/guided-tour-of-hadoop-zoo-getting-data.html
Edit: Added Graylog2. Added Logstash. Logstash is probably the best positioned today to become the "open source Splunk replacement."