This morning I was reinstalling Percona-Server-server (v5.5.13) via Yum on my CentOS box and decided to dig into the my.cnf file a bit to make sure I had everything set up correctly. Lo and behold, I did not. A few of my charset and collation settings were using latin1 when they should have been using utf8.

Being a DBA newb, I tried to set server variables like ‘character_set_database’, only to see the following error message: “110726 20:24:04 [ERROR] /usr/sbin/mysqld: unknown variable ‘character_set_database=utf8’”. I read the docs and couldn’t see what the problem was. The variable is listed as a ‘Global’ option; however, if you scroll up to “Table 5.2” you will notice that the cmd-line column for character_set_database is blank, which I assume means it cannot be set via the my.cnf file. I believe this setting can be set in a ‘CREATE DATABASE’ statement, but I didn’t want to have to do this every time I created a database.
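
For reference, if you do want to pin the charset on a single database, the per-database form looks roughly like the sketch below. The function name create_utf8_db_sql and the database name example_db are made up for illustration; the function only prints the statement, so nothing touches the server until you pipe it into the mysql client yourself.

```shell
# Hypothetical helper: print a CREATE DATABASE statement that sets the
# charset and collation explicitly instead of relying on server defaults.
create_utf8_db_sql() {
    # $1 = database name (not validated or quoted here; keep it simple)
    printf 'CREATE DATABASE %s CHARACTER SET utf8 COLLATE utf8_general_ci;\n' "$1"
}

# Example (needs a running server, so it is left commented out):
# create_utf8_db_sql example_db | mysql -u mpurcell -p
```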

So I did some toying around and found that the ‘character_set_server’ and ‘collation_server’ settings actually control the defaults for the child settings (database, table, etc). So to ensure consistent utf8 for your mysql server, all you have to do is add the following to the [mysqld] section of your my.cnf file:

character_set_server = utf8
collation_server = utf8_general_ci
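
If you want to double-check that both lines actually landed in the [mysqld] section (and not under [client] or somewhere else), a small shell sketch along these lines could do it. The function name mysqld_option is made up for illustration, and it assumes a simple option file without !include directives.

```shell
# Hypothetical helper: print the value of one option from the [mysqld]
# section of an option file. Dashes and underscores in option names are
# treated as the same, and the last occurrence wins.
mysqld_option() {
    # $1 = path to the option file, $2 = option name in underscore form
    awk -v want="$2" '
        # A section header switches the "inside [mysqld]" flag on or off.
        /^[ \t]*\[/ { in_mysqld = ($0 ~ /^[ \t]*\[mysqld\][ \t]*$/); next }
        in_mysqld {
            line = $0
            sub(/^[ \t]+/, "", line)
            # Skip blank lines and comments.
            if (line == "" || line ~ /^[#;]/) next
            eq = index(line, "=")
            if (eq == 0) next
            name = substr(line, 1, eq - 1)
            value = substr(line, eq + 1)
            gsub(/[ \t]/, "", name)
            gsub(/-/, "_", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == want) found = value
        }
        END { if (found != "") print found }
    ' "$1"
}

# Example: mysqld_option /etc/my.cnf character_set_server
```

An empty result means the option is not set in [mysqld] at all, which is worth knowing before you restart anything.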

Be sure to restart the daemon; then you can issue the following commands to check the result:

# Show me collation settings
mpurcell@dev1 ~ $ -> mysql -e "show variables" -u mpurcell -p | fgrep -i collat
Enter password:
collation_connection    utf8_general_ci
collation_database      utf8_general_ci
collation_server        utf8_general_ci

# Show me charset settings
mpurcell@dev1 ~ $ -> mysql -e "show variables" -u mpurcell -p | fgrep -i char
Enter password:
character_set_client    utf8
character_set_connection    utf8
character_set_database    utf8
character_set_filesystem    binary
character_set_results    utf8
character_set_server    utf8
character_set_system    utf8
character_sets_dir    /usr/share/mysql/charsets/
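
Rather than eyeballing the two listings, you could let a small filter flag anything that is still not utf8. This is only a sketch: the function name non_utf8_vars is made up, and it assumes the name/value format that `mysql -e "show variables"` prints. It skips character_set_filesystem (binary) and character_sets_dir (a path), since neither is expected to be utf8, and it accepts any value starting with utf8.

```shell
# Hypothetical helper: read "name<whitespace>value" lines on stdin and
# print every charset or collation variable whose value is not utf8.
non_utf8_vars() {
    awk '
        # Not expected to be utf8: one is "binary", the other a directory.
        $1 == "character_set_filesystem" || $1 == "character_sets_dir" { next }
        ($1 ~ /^character_set_/ || $1 ~ /^collation_/) && $2 !~ /^utf8/ { print $1, $2 }
    '
}

# Example (needs a running server, so it is left commented out):
# mysql -e "show variables" -u mpurcell -p | non_utf8_vars
```

No output means every charset and collation variable it looked at is on utf8.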

Now your mysql server is set up to store and collate in utf8.