I need help with character encoding issues while transferring a forum database between MySQL servers.
I’m moving a phpBB forum database from an older server to a newer one. The source server runs MySQL 5.0.45 and the destination has MySQL 5.5.17.
The original database uses latin1 charset with latin1_swedish_ci collation, but it contains Polish characters. While these special characters show up fine on the actual forum website, they appear corrupted in database management tools:
- ą shows as ±
- ę appears as ê
- ń displays as ñ
- ó becomes ó
I’ve tried creating database exports using both phpMyAdmin and command line mysqldump, but when I import them into the new server, the Polish characters get mangled. For example, the character ą should have UTF-8 hex value C4 85 but in my dumps it shows as C3 B1.
What’s the proper way to handle this migration while preserving the special characters? Should I convert the dump file somehow before importing? I attempted using iconv but couldn’t get it working correctly.
Note that the forum’s HTML pages use iso-8859-2 encoding which displays Polish text properly on the frontend.
sounds like a double encoding issue. try dumping with --single-transaction --routines --triggers and ensure your connection charset matches the actual data encoding. sometimes data stored as utf8 might be marked as latin1, which confuses mysql during export/import.
I encountered a similar issue when migrating a forum with special characters. It seems like the data might already be in UTF-8 but is misinterpreted as latin1 during migration. One approach is to ensure that your mysqldump command specifies the correct character set using --default-character-set=utf8. After exporting your data, make sure that the new database is set to utf8 as well. You might want to check the collation settings before importing. If the characters are still corrupted after import, consider running an ALTER TABLE command to modify the charset and collation configurations for the affected tables. This should help preserve the special characters without any corruption.
Those hex values show exactly what’s wrong - you’ve got UTF-8 data stuck in latin1 columns. Seeing C3 B1 instead of C4 85 means the database thinks those UTF-8 bytes are separate latin1 characters. Don’t dump yet - fix the column definitions first. Connect to your source database and run ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci on each table. This makes MySQL read the existing UTF-8 data correctly instead of treating it like latin1. Once you’ve fixed the source schema, then use mysqldump with --default-character-set=utf8. I dealt with this exact problem migrating a Polish forum last year - worked perfectly, no corruption.