Discord bot CSV file modifications not persisting on Heroku deployment

I’m building a Discord bot that tracks song statistics by writing data to CSV files. The bot monitors music playback and stores information in a CSV file to count how many times each song gets played per month.

Here’s my function that adds new song entries:

import datetime
import pandas as pd

async def track_song_data(message, track_title):
    # Shift the message timestamp back 8 hours to local time, then build a "month-year" key
    local_time = message.created_at - datetime.timedelta(hours=8)
    current_period = f'{local_time.month}-{local_time.year}'
    data_frame = pd.read_csv('song_tracking.csv', index_col=0)
    new_entry = pd.DataFrame([[current_period, track_title, 1]],
                            columns=["Period", "Track Title", "Play Count"])
    # Append the new play as one row and write the whole file back
    data_frame = pd.concat([data_frame, new_entry], ignore_index=True)
    data_frame.to_csv('song_tracking.csv')

And here’s how I retrieve the statistics:

import calendar

@bot.command(aliases=["stats", "period"])
async def show_stats(ctx):
    local_time = ctx.message.created_at - datetime.timedelta(hours=8)
    current_period = f'{local_time.month}-{local_time.year}'
    track_data = pd.read_csv('song_tracking.csv', index_col=0)
    # Exact match: str.contains would also pick up '11-2024' when looking for '1-2024'
    this_period = track_data[track_data['Period'] == current_period]
    # Sum plays per track and keep the five most played
    results = (this_period.groupby('Track Title', as_index=False)['Play Count'].sum()
               .sort_values('Play Count', ascending=False).head(5))
    await ctx.send(f'Top 5 tracks for {calendar.month_name[local_time.month]}')
    for rank, row in enumerate(results.itertuples(index=False), start=1):
        await ctx.send(f'{rank}: {row[0]} - {row[1]}')

Everything works perfectly when running locally. But after deploying to Heroku, the CSV files don’t seem to update permanently. The bot shows correct data immediately after adding new songs, but after some time it reverts back to the original data from when I first deployed.

I suspect Heroku resets the filesystem periodically, which would explain why my CSV changes don’t persist. Is there a better approach for storing this data on Heroku? Should I use a database instead of CSV files?

yeah heroku wipes files when dynos restart - learned this the hard way too lol. quick fix is using a google sheet as your database with the gspread library. works surprisingly well for simple tracking and it’s free forever
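Rough sketch of what the gspread route could look like, not a drop-in implementation. It assumes a Google service account whose JSON key file you supply, and a spreadsheet whose first row holds the headers Period, Track Title, and Play Count. The function names, the credentials path, and the sheet title are all made up for illustration.

```python
from collections import Counter


def open_tracking_sheet(credentials_path, sheet_title):
    """Open the first worksheet of a spreadsheet shared with the service account."""
    import gspread  # third-party: pip install gspread

    client = gspread.service_account(filename=credentials_path)
    return client.open(sheet_title).sheet1


def record_play(worksheet, period, track_title):
    """Append one play as a new row: period, title, count of 1."""
    worksheet.append_row([period, track_title, 1])


def top_tracks(worksheet, period, limit=5):
    """Return (title, plays) pairs for one period, most played first."""
    totals = Counter()
    # get_all_records() returns one dict per row, keyed by the header row
    for row in worksheet.get_all_records():
        if str(row["Period"]) == period:
            totals[row["Track Title"]] += int(row["Play Count"])
    return totals.most_common(limit)
```

In the bot you would call open_tracking_sheet(...) once at startup and pass the worksheet to the other two functions. Each call is a network request to the Sheets API, so this suits low-traffic bots best.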

Had the exact same issue with my Discord bot last year. Heroku’s filesystem isn’t persistent - any file changes get wiped when the dyno restarts. For a simple stats tracker like yours, the smallest code change is SQLite using the built-in sqlite3 module. Make a table with your three columns and use INSERT/SELECT instead of pandas operations. One big caveat: a SQLite database is still just a file on the dyno, so it gets wiped on Heroku too unless you back it up somewhere persistent (Heroku Postgres, for example) and restore it on startup. Want something more robust? Skip that step and jump straight to Heroku Postgres with psycopg2. The migration from pandas is pretty straightforward either way.
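Here’s a minimal sketch of the table plus INSERT/SELECT idea using only the standard library. The table name song_plays, the column names, and the helper functions are assumptions for illustration, and the demo uses an in-memory database. On Heroku a real SQLite file would still be wiped on restart, so treat this as the shape of the SQL, not as a persistence fix by itself.

```python
import sqlite3


def init_db(conn):
    """Create the three-column table if it does not exist yet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS song_plays (
               period TEXT NOT NULL,
               track_title TEXT NOT NULL,
               play_count INTEGER NOT NULL DEFAULT 1
           )"""
    )
    conn.commit()


def record_play(conn, period, track_title):
    """Insert one play; parameters are bound, never string-formatted."""
    conn.execute(
        "INSERT INTO song_plays (period, track_title, play_count) VALUES (?, ?, 1)",
        (period, track_title),
    )
    conn.commit()


def top_tracks(conn, period, limit=5):
    """Return (title, plays) rows for one period, most played first."""
    rows = conn.execute(
        """SELECT track_title, SUM(play_count) AS plays
           FROM song_plays
           WHERE period = ?
           GROUP BY track_title
           ORDER BY plays DESC, track_title
           LIMIT ?""",
        (period, limit),
    )
    return rows.fetchall()


# Demo with an in-memory database
conn = sqlite3.connect(":memory:")
init_db(conn)
for title in ["Song A", "Song B", "Song A"]:
    record_play(conn, "1-2024", title)
record_play(conn, "11-2024", "Song C")
print(top_tracks(conn, "1-2024"))  # [('Song A', 2), ('Song B', 1)]
```

The WHERE period = ? clause is an exact match, so January ("1-2024") never picks up November ("11-2024") rows. The same three statements carry over to Postgres through psycopg2 with small changes, mainly %s placeholders instead of ?.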

You’re absolutely right - Heroku’s ephemeral filesystem is the problem. I hit this same issue when I started deploying on Heroku. Any file changes get wiped when dynos restart, which happens at least once every 24 hours. I’d switch to PostgreSQL through the Heroku Postgres add-on, which works great for Discord bots (Heroku dropped its free plans in November 2022, so check current pricing). Just create a simple table with period, track_title, and play_count columns. Moving from pandas to SQL isn’t bad - swap your CSV reads for database queries and use INSERT statements instead of concatenating DataFrames. Want to keep your pandas workflow? Use Heroku Postgres with a SQLAlchemy engine and pandas’ to_sql() and read_sql_query(). You can treat database tables almost like CSV files. Minimal code changes and your persistence problem is solved.
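To show how close the pandas version stays to the original CSV code, here is a hedged sketch. The helpers take any connection pandas accepts, and the demo passes an in-memory sqlite3 connection as a stand-in so it runs without a server. On Heroku you would instead pass a SQLAlchemy engine created from the DATABASE_URL config var (that wiring is an assumption and is not shown). The table name and function names are hypothetical.

```python
import sqlite3

import pandas as pd

TABLE = "song_plays"  # hypothetical table name
COLUMNS = ["Period", "Track Title", "Play Count"]


def record_play(con, period, track_title):
    """Append one row to the table, creating the table on first use."""
    row = pd.DataFrame([[period, track_title, 1]], columns=COLUMNS)
    row.to_sql(TABLE, con, if_exists="append", index=False)


def top_tracks(con, period, limit=5):
    """Load the table into a DataFrame and aggregate it like the CSV version."""
    plays = pd.read_sql_query(f"SELECT * FROM {TABLE}", con)
    this_period = plays[plays["Period"] == period]
    return (
        this_period.groupby("Track Title", as_index=False)["Play Count"].sum()
        .sort_values("Play Count", ascending=False)
        .head(limit)
        .reset_index(drop=True)
    )


# Demo: sqlite3 in memory stands in for the Postgres engine
con = sqlite3.connect(":memory:")
for title in ["Song A", "Song B", "Song A"]:
    record_play(con, "1-2024", title)
record_play(con, "11-2024", "Song C")
top = top_tracks(con, "1-2024")
print(top)
```

Reading the whole table on every command is fine for a small bot. If the table grows, moving the filter and aggregation into the SQL query avoids pulling every row into memory.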