I'm running Snort in conjunction with MySQL for logging, which is generating ENORMOUS datasets (currently the event table is over 2.5 million, I don't know exactly how much because it only goes up to 2.5 million before it clunks out from using too much memory).
Unfortunately, this data is not very useful anymore because I can't pull it out anywhere else (the stored procedure causes the server to crash).
My question is, is there a way to optimize MySQL for these huge datasets or is this beyond the technical capabilities of MySQL and I need to go to something like Oracle, MS SQL, or PostgreSQL?
We have both an Oracle and a MS SQL Server instance but both of these are business critical production servers and it would be very bad news to knock either one of those offline or inhibit their abilities.
Any thoughts on the matter?
like others say - 2.5M is not a huge number of rows. look at your schema design - can it be that your reporting runs full table scans where indexes can be used [warning: introducing new indexing will decrease insert performance].
did you try to optimize innodb? make sure that at least indexes fit in buffer pool memory. try mysqltuner.pl or if you have more time - dive into mysqlperformanceblog.com.
2.5 million records should be no problem. Sharing the schema would help. Also, mysqltuner.pl (mentioned in another answer) will warn you about some my.cnf issues - such as innodb_buffer_pool being smaller than the size of your indexes. Definitely run that. innodb_buffer_pool should be set as high as possible.
If you have any TEXT columns, any queries that involve scanning a lot of rows will perform much better if you move those columns out into a separate table. Even better, use InnoDB plugin, Percona Server, or MariaDB and enable compression for those new text column tables.
Maybe innodb is not the best choice for logs?
I've got a centralized syslog server and it's set up so that every month the data goes to a different/new table and there is a view with all these tables joined. The old logs are then compressed with myisampack so they take up a lot less space, are read quicker and become read-only. It works very fast.