I need to move my MySQL database to PostgreSQL and I’m looking for some help. The database is running MySQL 5.7.44 with about 20GB of data using latin1 encoding. I want to keep my app running while doing this migration and only have a short downtime when switching to the new PostgreSQL setup.
Since I’m new to database work, I tried a few things in my test setup:
Used pgloader to move the initial data over
Made a mysqldump of new data that came in after the first migration
I’m doing the second part because I don’t know if pgloader can handle just the new changes or if it copies everything again (which would mean copying all 20GB).
There’s also an issue with XML data in my database. The MySQL dump contains escape sequences like \n that make it hard to convert to PostgreSQL format.
I also tried Chameleon (pg_chameleon), but it produces table names in camelCase, which doesn’t work well for me.
Does anyone know tools that can convert mysqldump files to work with PostgreSQL? Is there a better way to do this whole migration?
Different approach - skip the mysqldump conversions entirely and use pgloader with a staged migration. Set up PostgreSQL alongside your existing MySQL, then run pgloader in chunks by table groups instead of syncing everything at once. That way you can verify data integrity for each group before moving on to the next batch.

For the latin1 encoding issue, create a conversion mapping file upfront to handle the character set differences. PostgreSQL defaults to UTF8, and mismatches will silently corrupt your data if you don’t handle this properly. Fix the XML escape sequences during the pgloader run with custom transformation rules rather than preprocessing the whole dump; in my experience this catches encoding errors immediately, which is far more reliable than post-processing.

For minimal downtime, keep MySQL as your write target until you’ve migrated the last table group, then do a quick final sync and switch your app config.
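If it helps, here’s a minimal sketch of the per-group verification step using pymysql and psycopg2. The table groups, database names, and credentials are placeholders, not anything from your setup:

```python
# Per-group verification sketch (hypothetical table groups and connection
# details). Compares MySQL and PostgreSQL row counts after pgloader finishes
# each batch. Needs: pip install pymysql psycopg2-binary
import pymysql
import psycopg2

TABLE_GROUPS = {
    "group1": ["customers", "orders"],        # placeholder table names
    "group2": ["order_items", "invoices"],
}

mysql_conn = pymysql.connect(host="localhost", user="app", password="...",
                             database="appdb", charset="latin1")
pg_conn = psycopg2.connect("dbname=appdb user=app host=localhost")

def count_rows(conn, table):
    # Table names come from the trusted TABLE_GROUPS list above, so simple
    # string interpolation is acceptable here.
    with conn.cursor() as cur:
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        return cur.fetchone()[0]

def verify_group(group):
    ok = True
    for table in TABLE_GROUPS[group]:
        src, dst = count_rows(mysql_conn, table), count_rows(pg_conn, table)
        status = "OK" if src == dst else "MISMATCH"
        ok = ok and src == dst
        print(f"{group}/{table}: mysql={src} postgres={dst} [{status}]")
    return ok

if __name__ == "__main__":
    for group in TABLE_GROUPS:
        if not verify_group(group):
            raise SystemExit(f"Row counts diverge in {group}; stop and investigate")
```

Row counts are a coarse check; once they line up, compare per-table checksums or spot-check a sample of rows for more confidence.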
try ora2pg instead. i know the name says oracle, but it handles mysql too and deals with encoding issues way better than other tools. point ORACLE_DSN at your mysql instance (a dbi:mysql DSN) and it’ll convert your schema and data without screwing up the case of your table names. much less painful than pgloader.
I did this exact migration last year with a 15GB MySQL database.

pgloader actually handles incremental syncing well if you set it up right - use timestamp-based filtering or primary key ranges instead of doing full reloads.

For the XML escaping mess, I wrote a quick Python script that preprocesses the mysqldump file before importing. It fixes those \n escape sequences and converts them to a format PostgreSQL accepts. Took me about an hour to write but saved days of manual cleanup.

For downtime, try logical replication if you can upgrade MySQL to 8.0 first; it keeps both databases in sync during the migration. Can’t upgrade? Check out AWS DMS or similar tools - they handle continuous replication between different database engines.

The camelCase thing with Chameleon is annoying, but you can fix it with custom mapping rules in the config file to keep your original naming conventions.
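A stripped-down sketch of that kind of preprocessing script (not the exact one I used; the file names and dump encoding are assumptions, and it only handles the common backslash escapes rather than doing full SQL parsing):

```python
# Rough sketch of a mysqldump preprocessor: rewrites the MySQL-style backslash
# escapes inside INSERT statements into text PostgreSQL accepts with
# standard_conforming_strings = on. File names and the dump encoding are
# assumptions, and this is not a full SQL parser -- spot-check the output.
import re

ESCAPES = {
    r"\n": "\n",    # embedded newline
    r"\t": "\t",    # embedded tab
    r"\r": "",      # drop carriage returns
    r"\'": "''",    # PostgreSQL doubles single quotes instead of backslash-escaping
    r'\"': '"',     # double quotes need no escaping inside single-quoted strings
    r"\\": "\\",    # literal backslash
}
# Rare escapes such as \0 and \Z are deliberately not handled here.
pattern = re.compile(r"""\\[ntr'"\\]""")

def fix_escapes(line: str) -> str:
    # Single left-to-right pass, so an escaped backslash followed by "n"
    # correctly becomes a literal backslash plus the letter n.
    return pattern.sub(lambda m: ESCAPES[m.group(0)], line)

# Assumes the dump really is latin1; writing the result as UTF-8 doubles as the
# character set conversion step.
with open("mysqldump.sql", encoding="latin-1") as src, \
        open("postgres_ready.sql", "w", encoding="utf-8") as dst:
    for line in src:
        if line.startswith("INSERT INTO"):
            line = fix_escapes(line)
        dst.write(line)
```

Run it against a copy of the dump and diff a few INSERT statements by eye before trusting the output.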
You’re migrating a 20GB MySQL 5.7.44 database (latin1 encoding) to PostgreSQL, aiming for minimal downtime and encountering issues with XML escape characters and tool-specific limitations (pgloader, Chameleon). You’ve attempted using pgloader for initial data transfer and mysqldump for incremental updates, but are struggling with data conversion and inconsistent naming conventions.
Understanding the “Why” (The Root Cause):
Manual database migrations at this scale are notoriously error-prone and time-consuming. The core issue isn’t just about choosing the right tool; it’s about establishing a robust, automated workflow to minimize human intervention and ensure data integrity. Using separate tools for initial and incremental syncs increases complexity and risk. Furthermore, inconsistent handling of character sets (latin1 to UTF8) and XML escape sequences can lead to data corruption if not addressed proactively. The most efficient approach is to automate the entire process, including data validation, error handling, and the final switchover.
Step-by-Step Guide:
Automate the Migration with a Workflow: The most effective solution is to build (or adapt an existing) automated workflow. This workflow should handle the following steps:
Initial Data Transfer: Use a tool like pgloader for the initial data migration from MySQL to PostgreSQL. Address latin1 to UTF8 conversion upfront using pgloader’s transformation capabilities to create a mapping file. This avoids silent data corruption and allows for immediate error detection.
Incremental Synchronization: Continue using pgloader, but configure it for incremental syncing based on timestamps or primary key ranges. This avoids redundant full copies (a minimal sketch of the timestamp-watermark approach appears after this guide). Handle XML escape sequences within the pgloader transformation rules instead of preprocessing the entire dump.
Automated Validation: Integrate automated checks before and after each synchronization step to ensure data integrity.
Automated Switchover: Automate the process of switching your application’s connection string from MySQL to PostgreSQL once the migration is complete and validated. This minimizes downtime.
Rollback Mechanism: Implement a rollback mechanism to revert to MySQL if the migration or switchover fails.
Error Handling and Logging: Implement comprehensive logging to track the entire process and readily identify any errors.
Preprocessing (If Necessary): If you hit issues that can’t be handled within the pgloader transformation rules (these should be rare with a properly configured mapping file), use a lightweight preprocessing script (Python, sed, or awk) to fix the remaining problems, such as XML escape sequences or stray character set conversions, before the data is loaded. Keep this as a last resort and favor automation within pgloader whenever possible.
Address Encoding Issues: Carefully plan your character set conversion from latin1 to UTF8. Getting this conversion right is crucial; getting it wrong can silently corrupt your data. Use pgloader’s transformation capabilities and verify data integrity after each step.
Test Thoroughly: Test your application extensively after the migration to catch any compatibility issues between MySQL and PostgreSQL. Queries that worked in MySQL might behave differently or fail entirely in PostgreSQL due to stricter type handling.
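To make the incremental synchronization step concrete, here is a minimal sketch of the timestamp-watermark idea. It assumes every synced table has an indexed updated_at column and an integer primary key named id; the connection details and table list are placeholders, and deletes are not captured:

```python
# Timestamp-watermark sync sketch (hypothetical schema: each synced table has an
# indexed updated_at column and an integer primary key named id). Connection
# details and the table list are placeholders; deletes are not captured.
# Needs: pip install pymysql psycopg2-binary
from datetime import datetime

import pymysql
import psycopg2
from psycopg2.extras import execute_values

TABLES = {"orders": ["id", "customer_id", "total", "updated_at"]}  # placeholders

def sync_table(mysql_conn, pg_conn, table, columns, since):
    col_list = ", ".join(columns)
    with mysql_conn.cursor() as src:
        # Only rows changed since the last watermark.
        src.execute(f"SELECT {col_list} FROM {table} WHERE updated_at > %s",
                    (since,))
        rows = src.fetchall()
    if not rows:
        return 0
    updates = ", ".join(f"{c} = EXCLUDED.{c}" for c in columns if c != "id")
    with pg_conn.cursor() as dst:
        # Upsert so re-running a batch is harmless.
        execute_values(
            dst,
            f"INSERT INTO {table} ({col_list}) VALUES %s "
            f"ON CONFLICT (id) DO UPDATE SET {updates}",
            rows,
        )
    pg_conn.commit()
    return len(rows)

if __name__ == "__main__":
    mysql_conn = pymysql.connect(host="localhost", user="app", password="...",
                                 database="appdb", charset="latin1")
    pg_conn = psycopg2.connect("dbname=appdb user=app host=localhost")
    # In a real run, load the watermark from durable storage instead of hardcoding.
    since = datetime(2024, 1, 1)
    for table, columns in TABLES.items():
        n = sync_table(mysql_conn, pg_conn, table, columns, since)
        print(f"{table}: upserted {n} rows changed since {since}")
```

In practice, persist the watermark (for example in a small control table) and advance it only after a batch commits, so a failed run can safely be retried.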
Common Pitfalls & What to Check Next:
Data Type Mismatches: PostgreSQL is stricter about data types than MySQL. Carefully examine potential mismatches during the migration (a small audit sketch follows this list).
Schema Differences: Ensure your schema (table and column definitions) is compatible with PostgreSQL.
Application Compatibility: Verify your application’s compatibility with PostgreSQL’s features and behavior.
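For the data type pitfall, here is a small audit sketch that scans MySQL’s information_schema for column types that commonly need an explicit decision in PostgreSQL; the credentials and schema name are placeholders:

```python
# Audit sketch: flag MySQL column types that usually need an explicit decision
# when moving to PostgreSQL. Credentials and the schema name are placeholders.
import pymysql

SUSPECT = {
    "tinyint": "often a boolean in disguise; decide between boolean and smallint",
    "enum": "PostgreSQL enums behave differently; consider text plus a CHECK constraint",
    "set": "no direct PostgreSQL equivalent; usually becomes text[] or a join table",
    "year": "no YEAR type in PostgreSQL; usually becomes integer",
}

conn = pymysql.connect(host="localhost", user="app", password="...",
                       database="information_schema")
with conn.cursor() as cur:
    cur.execute(
        """
        SELECT table_name, column_name, column_type, column_default
        FROM columns
        WHERE table_schema = %s
        """,
        ("appdb",),  # placeholder schema name
    )
    for table, column, col_type, default in cur.fetchall():
        base = col_type.split("(")[0]
        if base in SUSPECT:
            print(f"{table}.{column}: {col_type} -- {SUSPECT[base]}")
        if "unsigned" in col_type:
            print(f"{table}.{column}: {col_type} -- unsigned; needs a wider signed type")
        if default in ("0000-00-00", "0000-00-00 00:00:00"):
            print(f"{table}.{column}: zero-date default; map to NULL or a real date")
```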
Still running into issues? Share your (sanitized) config files, the exact command you ran, and any other relevant details. The community is here to help!
honestly, pg_chameleon isn’t bad once you set it up right. you can turn off the camelCase thing in settings - check the identifier_case_policy option. for 20GB, i’d just do a full pgloader run during off-peak hours instead of messing with incremental syncing, especially if you’re new to this. much simpler approach and 20GB won’t take forever anyway.
Ran into this exact problem migrating a production MySQL system to PostgreSQL 6 months back. That latin1 encoding issue you’re dealing with? Way worse than the XML escaping - seriously, nail down the character set conversion or your data will get trashed.

For minimal downtime, I’d set up PostgreSQL as a replica first using SymmetricDS. It handles MySQL to PostgreSQL replication in real time, so you can keep syncing while your app runs on MySQL, then just flip the connection strings when you’re ready.

For those mysqldump XML headaches, sed or awk will fix most escape sequence problems before import.

Here’s what bit me hard - PostgreSQL is way stricter about data types than MySQL. Test your app like crazy after migrating, because queries that ran fine in MySQL might break under PostgreSQL’s type system.
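Since the encoding point keeps coming up in this thread, here’s a rough sketch of one way to spot-check your latin1 columns before the migration (the table and column names are placeholders). It looks for the classic trap where UTF-8 bytes ended up stored in columns declared latin1, which is exactly the data that gets double-encoded if you convert blindly:

```python
# Spot-check sketch for the latin1 -> UTF-8 conversion (placeholder table and
# column names). Pulls the raw bytes of a text column and counts values that
# are valid UTF-8 despite living in a latin1 column -- the classic setup for
# double-encoded garbage after migration.
import pymysql

conn = pymysql.connect(host="localhost", user="app", password="...",
                       database="appdb", charset="latin1")

CHECKS = [("customers", "notes"), ("products", "description")]  # placeholders

with conn.cursor() as cur:
    for table, column in CHECKS:
        # CAST(... AS BINARY) returns raw bytes regardless of connection charset.
        cur.execute(f"SELECT CAST({column} AS BINARY) FROM {table} "
                    f"WHERE {column} IS NOT NULL LIMIT 1000")
        suspicious = 0
        for (raw,) in cur.fetchall():
            if isinstance(raw, str):              # defensive: some drivers decode anyway
                raw = raw.encode("latin-1", errors="replace")
            if not raw or all(b < 0x80 for b in raw):
                continue                          # pure ASCII tells us nothing
            try:
                raw.decode("utf-8")
                suspicious += 1                   # valid UTF-8 despite the latin1 label
            except UnicodeDecodeError:
                pass                              # looks like genuine latin1
        print(f"{table}.{column}: {suspicious} of the sampled rows look like "
              f"UTF-8 stored in a latin1 column")
```

If a column shows more than a handful of these, fix the column’s declared character set (or tell your migration tool the real encoding) before loading into PostgreSQL.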