Script for Identifying and Managing Large MySQL Tables to Save Disk Space

I’m creating a bash script aimed at checking and managing the disk space used by MySQL databases. This script identifies big table files (greater than 1GB) in the MySQL data directory and displays their sizes before any optimization process.

#!/bin/bash
df -h / | head -n2
echo "===================="
for large_file in $(find /var/lib/mysql/ -name "*.ibd" -size +1000M -exec ls -lh {} \; | awk '{print $9}');
do
    echo "Current size: " $(ls -lh $large_file | awk '{print $5}')
    database_name=$(echo $large_file | cut -d'/' -f5)
    table_name=$(echo $large_file | cut -d'/' -f6 | cut -d'.' -f1)
    echo "Database: $database_name, Table: $table_name"
done

I’m concerned about whether it’s safe to run this script on production MySQL servers during peak usage. Are there alternative methods to find out which tables consume the most disk space without accessing the filesystem directly?

Your script’s fine for production - you’re just reading metadata, not actual data files. The filesystem operations are tiny and won’t hurt MySQL performance. But honestly, I’d go with INFORMATION_SCHEMA instead. It’s cleaner and you don’t need filesystem access. Try this query: SELECT table_schema, table_name, ROUND(((data_length + index_length) / 1024 / 1024 / 1024), 2) AS 'DB Size in GB' FROM information_schema.tables WHERE (data_length + index_length) > 1073741824 ORDER BY (data_length + index_length) DESC; Gets you the same info through MySQL’s built-in stats without touching the filesystem. I’ve used this for years on production systems - zero issues, and it works across different MySQL setups and storage engines.

This topic was automatically closed 4 days after the last reply. New replies are no longer allowed.