I have a MySQL table that includes these columns: id, updated_time, large_data, and a few others. The column updated_time gets modified frequently through this query:
UPDATE my_table SET updated_time = NOW() WHERE id IN (...)
On the other hand, the large_data column is updated infrequently but holds a significant amount of data (approximately 5-50 KB per record). Recently, I've observed high CPU usage during write operations. I'm curious whether frequently updating a small column, while a rarely updated large column sits in the same row, is contributing to this problem.
When I only change updated_time, does MySQL rewrite the entire row including the large large_data column? If that is the case, would it improve performance to store large_data in a separate table?
Even though you're only changing a small column, MySQL may rewrite the entire row depending on the storage engine, particularly with InnoDB. Frequent updates can also fragment pages, so CPU time goes into managing row versions and locking rather than the write itself. In my experience, moving the large_data column into a separate table significantly reduced CPU usage under heavy update volumes while keeping retrieval efficient, at the cost of slightly more complex queries.
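A minimal sketch of that split, reusing the table and column names from your question (the MEDIUMBLOB type and the `_data` suffix are my assumptions; adjust to your actual types):

```sql
-- Narrow table keeps the hot columns; wide payload moves out.
CREATE TABLE my_table_data (
    id BIGINT NOT NULL PRIMARY KEY,
    large_data MEDIUMBLOB,
    CONSTRAINT fk_my_table_data FOREIGN KEY (id) REFERENCES my_table (id)
);

-- The frequent timestamp update now only touches the narrow table:
UPDATE my_table SET updated_time = NOW() WHERE id IN (1, 2, 3);

-- Join in the payload only when you actually need it:
SELECT t.*, d.large_data
FROM my_table t
JOIN my_table_data d ON d.id = t.id
WHERE t.id = 1;
```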
Honestly, just test it yourself. Create a duplicate table, split out the large_data column, and run benchmarks against your real workload. I've found the join overhead usually outweighs any benefit unless you're hammering those tables with lots of concurrent updates.
MySQL’s behavior depends on your storage engine and row format. With InnoDB using DYNAMIC row format, variable-length columns like large_data get stored off-page when they hit certain size thresholds. So updating just updated_time usually won’t rewrite the large column data itself. But the row header and metadata still need updates, which creates overhead. I’ve seen similar issues where the real problem was actually the WHERE clause with IN operations on huge datasets. Make sure you’ve got proper indexing on your id column, and check your slow query log to see if it’s a query execution problem rather than row rewriting. That CPU spike might be from lock contention due to frequent updates, not physical data movement.
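You can confirm all three of these points directly. A quick diagnostic pass might look like this (the `information_schema` view is `INNODB_TABLES` on MySQL 8.0; on 5.7 it's `INNODB_SYS_TABLES`):

```sql
-- 1. Check which row format the table actually uses:
SELECT NAME, ROW_FORMAT
FROM information_schema.INNODB_TABLES
WHERE NAME LIKE '%my_table%';

-- 2. Verify the UPDATE's WHERE id IN (...) resolves via the primary key,
--    not a scan:
EXPLAIN UPDATE my_table SET updated_time = NOW() WHERE id IN (1, 2, 3);

-- 3. Capture slow statements to rule out query execution cost:
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 0.5;  -- seconds; tune for your workload
```

If the EXPLAIN shows a full scan or the slow log fills up with these UPDATEs, the problem is query execution, not row rewriting.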
Those CPU spikes are likely from InnoDB's MVCC system, not actual row rewrites. When you constantly update updated_time, MySQL creates new row versions for transaction isolation, and maintaining the undo logs plus purging old versions gets expensive even when the large column isn't touched. I've dealt with similar workloads and found that moving timestamp tracking to a MEMORY engine table or Redis really cuts the load on the main table. Also check your buffer pool size: if it's too small for your dataset, all that row versioning generates heavy disk I/O as pages get evicted and reloaded.
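A few quick checks for the versioning and buffer pool theories (all standard MySQL commands, no assumptions about your schema):

```sql
-- Current buffer pool size in MB; compare against your data + index size:
SELECT @@innodb_buffer_pool_size / 1024 / 1024 AS buffer_pool_mb;

-- In the TRANSACTIONS section of this output, a steadily growing
-- "History list length" means purge is falling behind your update rate:
SHOW ENGINE INNODB STATUS\G

-- Buffer pool effectiveness: Innodb_buffer_pool_reads (disk) should be
-- tiny relative to Innodb_buffer_pool_read_requests (logical):
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
```

If the history list keeps climbing during your update bursts, that's the MVCC/purge overhead showing up directly.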