The Context:
We ran Zenoss 2.2 for way too long and decided it was time to upgrade to the current version, 3.2. I followed the upgrade steps here, and checked that everything was working after each incremental version upgrade. I restarted whenever I was unsure, and verified that all of the database migrations (and the preupgrade ZenPack) applied properly. Everything worked, with the exception of a couple of outdated ZenPacks that we weren't using much anyway. We run CentOS 5, so I used the old RPMs available on SourceForge for the intermediate upgrades, and then made the last "hop" to 3.2 using yum. For some reason yum didn't install the core ZenPacks for 3.2, so I did that manually, and then everything was fine: Zenoss worked great, no issues. Well... except one.
The Problem:
We schedule backups of Zenoss' events database using cron and mysqldump. Those backups run daily, and usually take up around 700-800MB of space. After the Zenoss upgrade, those backups are taking up more than 200GB (with a 'G'!) of space, though I haven't seen one finish yet. To run the backup, we are using
mysqldump -A | gzip > dump-12-28-2011.sql.gz
We have MySQL 5, so --extended-insert is a default parameter for mysqldump. Zenoss is the only thing using the database (and the "events" database is the only one present besides the mysql one), and I've turned Zenoss off for the duration of the dump. My ibdata1 file (we only have one) is around 13GB now. I don't know how big it was before the upgrade. I have tried this solution to delete old events detail entries. The query ran forever, but afterwards the dumps ballooned in size the same way they did before. Why is this happening? I have backups from before the upgrade; should I revert?
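One way to narrow down where the size comes from is to measure each table's dump separately instead of writing the whole dump to disk. The following is only a sketch: the function name table_dump_sizes is made up for illustration, and it assumes the MySQL credentials come from ~/.my.cnf (as the mysqldump -A command above implies). Nothing is written to disk, but a huge table will still take a long time to stream.

```shell
#!/bin/sh
# Print "<bytes> <table>" for every table in one database, largest first.
# The database name defaults to "events"; credentials are assumed to be
# supplied by ~/.my.cnf (an assumption, not something stated in the post).
table_dump_sizes() {
    db="${1:-events}"
    # -N drops the column header, -B gives plain tab-separated output.
    mysql -N -B -e 'SHOW TABLES' "$db" | while read -r table; do
        # Stream the dump of a single table into wc instead of a file.
        bytes=$(mysqldump "$db" "$table" </dev/null | wc -c | tr -d ' ')
        printf '%s %s\n' "$bytes" "$table"
    done | sort -rn
}
```

Calling table_dump_sizes events prints one line per table, so a single table that accounts for most of the dump stands out at the top.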
TL;DR:
Why are my Zenoss "events" database dumps 1000X larger after a multiversion upgrade?
After a couple of days of digging, I figured out the problem: InnoDB corruption. Our events database really WAS quite large (we were retaining a year's worth of old events, and had a ton of Windows computers reporting tetchy little things, so we had a lot of data), but that wasn't the issue. I started running
$ZENHOME/Products/ZenUtils/ZenDeleteEvents.py -n 60
to trim our events history back to 60 days, and MySQL crashed after it got about halfway done. I looked in the MySQL logs, and there were a ton of InnoDB corruption errors. This was the eventual solution:
1. mysqldump -uzenoss -pzenoss events > dump.sql
   For some reason, after the pruning via ZenDeleteEvents.py, the dump worked, and didn't grow unmanageably large.
2. zeneventbuild localhost zenoss zenoss events
   (these parameters might be different for others: the syntax is zeneventbuild <dbhost> <dbuser> <dbpassword> <dbname>)
3. mysql, then
   grant super on *.* to 'zenoss'@'localhost' identified by 'zenoss';
4. mysql -uzenoss -pzenoss events < dump.sql
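The steps above can be sketched as a single function that stops at the first failure. This is a rough sketch, not a tested recovery tool: the function name rebuild_events_db and the empty-dump check are my own additions, the defaults are simply the values used above, and the GRANT step assumes the account running it has MySQL admin rights (for example via ~/.my.cnf).

```shell
#!/bin/sh
# Sketch of the four recovery steps as one function.
# Defaults match the values used in the post; pass your own if they differ.
rebuild_events_db() {
    dbhost="${1:-localhost}"
    dbuser="${2:-zenoss}"
    dbpass="${3:-zenoss}"
    dbname="${4:-events}"
    dumpfile="${5:-dump.sql}"

    # 1. Dump the (already pruned) events database.
    mysqldump -u"$dbuser" -p"$dbpass" "$dbname" > "$dumpfile" || return 1

    # Added safety check (not in the original steps): never rebuild
    # the database when the dump file is empty.
    [ -s "$dumpfile" ] || { echo "empty dump: $dumpfile" >&2; return 1; }

    # 2. Recreate the events database.
    zeneventbuild "$dbhost" "$dbuser" "$dbpass" "$dbname" || return 1

    # 3. Grant SUPER to the Zenoss user, run as a MySQL admin account.
    mysql -e "GRANT SUPER ON *.* TO '$dbuser'@'localhost' IDENTIFIED BY '$dbpass';" || return 1

    # 4. Reload the dump into the rebuilt database.
    mysql -u"$dbuser" -p"$dbpass" "$dbname" < "$dumpfile"
}
```

Run it with Zenoss stopped, and keep a copy of dump.sql somewhere safe before step 2, since zeneventbuild replaces the existing events database.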
After that, everything worked fine, though I didn't have a lot of historical events left (which I wasn't using anyway). This thread in the Zenoss forums was instrumental in helping me figure out what was going on.
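Since the corruption only became visible in the MySQL error log, it may be worth checking that log before trusting a dump. A minimal sketch, assuming the log lives at /var/log/mysqld.log (that path is my assumption; check log-error in my.cnf) and using a made-up helper name, count_innodb_errors:

```shell
#!/bin/sh
# Count log lines that mention InnoDB together with "corrupt" or "error"
# (case-insensitive). The default log path is an assumption; pass the
# real one as the first argument.
count_innodb_errors() {
    logfile="${1:-/var/log/mysqld.log}"
    grep -i 'innodb' "$logfile" | grep -ci -e 'corrupt' -e 'error'
}
```

A count that is not zero is a hint to read the matching lines before relying on the backups.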