After some problems with UTF8 and none-UTF8 strings, we're standardising on UTF8. One thing I need to do is check that everything is in UTF8 in the MySQL database? What do I need to check?
- Server default characterset
- Default character set of each database
- Does each text column have a character set? How do I check for that?
I was thinking of putting all this in a nagios plugin to check that everything is in UTF8. Suggestions?
Rory, First of all, you are correct for wanting to monitor what gets created in you databases. While we all implement steps to prevent mistakes, you cannot assume that the mistakes won't creep in. I do a very similar thing as most of our infrastructure demands UTF8.
The following queries are good for checking stats:
The following queries would be good from a cron job that emails you if any results are found:
These queries work for MySQL >= 5.0. If you need help writing the cron job, let me know.
I also have a problem with my developers (mostly the Ruby on Rails team) creating InnoDB tables when they don't need them. To keep this under wraps I monitor the staging database with a cron that calls:
I'd do something like mysqldump --all-databases --no-data | grep -i character
Jure1873 suggests using grep, and I commented that awk was more appropriate. Here is what I threw together.
Here it is with increased readability:
Nagios is really the wrong sort of tool for this. Nagios is more for service monitoring, not checking configuration is wrong. A better tool would be some sort of configuration management tool like puppet or cfengine to make sure that the configuration files have the right settings.
As far as actually configuration, the MySQL manual has a very comprehensive chapter on character sets and encoding.