I’m trying to combine music chart data from multiple CSV files into one pandas DataFrame. The dataset has folders organized by year, and each folder contains several CSV files with chart information.
import pandas as pd
import glob
import re
from datetime import datetime

start_row = 0
data_frames = []
for csv_file in glob.glob('MusicData/*/202?/*.csv'):
    country = csv_file.split('/')[1]
    date_matches = re.findall(r'\d{4}-\d{2}-\d{2}', csv_file.split('/')[-1])
    chart_data = pd.read_csv(csv_file, header=start_row, sep='\t')
    chart_data['start_date'] = datetime.strptime(date_matches[0], '%Y-%m-%d')
    chart_data['end_date'] = datetime.strptime(date_matches[1], '%Y-%m-%d')
    chart_data['country'] = country
    data_frames.append(chart_data)
combined_charts = pd.concat(data_frames)
When I execute this code, I get a ValueError saying “No objects to concatenate”. The error happens at the pd.concat() line. I can see that the glob pattern should match my file structure, but it seems like no DataFrames are being added to my list. What could be causing this issue and how do I fix it?
That ValueError: No objects to concatenate means your glob pattern isn't finding any files, so data_frames stays empty. I've hit this before with quarterly sales data in nested folders. Your 202? pattern only catches 2020-2029 - might be too narrow if you have older years. Try 20?? or 20[0-9][0-9] instead (Python's glob doesn't do {2020,2021}-style brace expansion, so don't bother with that). Double-check you're in the right directory too - os.getcwd() will show you. Also, those music chart files might have encoding issues or weird delimiters that make pd.read_csv() blow up. Wrap your read_csv calls in try/except blocks to catch what's breaking. And honestly? Just manually check if those CSV files actually exist where you think they do first.
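A minimal sketch of both checks at once - counting what glob finds and surfacing read failures - assuming the same MusicData layout and that you run it from the project root:

```python
import glob

import pandas as pd

# Same pattern as the question; zero hits here means the path or cwd is wrong.
files = glob.glob('MusicData/*/202?/*.csv')
print(f'glob found {len(files)} file(s)')

frames = []
for csv_file in files:
    try:
        frames.append(pd.read_csv(csv_file, sep='\t'))
    except Exception as exc:
        # a wrong delimiter or bad encoding shows up here, not later at concat
        print(f'could not read {csv_file}: {exc}')
```

If the count is zero, fix the pattern or working directory before worrying about the concat.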
Yeah, that empty data_frames list is your problem. But honestly, all this manual file matching and date parsing is a nightmare.
I had the same headache processing log files from different servers. Instead of fighting with glob patterns and DataFrame merging, I built a workflow in Latenode that watches folders and auto-processes new CSVs as they show up.
It reads each file, pulls country and date info from the path, adds those columns, and dumps everything into a master dataset. No more glob debugging or concat errors.
You can schedule it or trigger it when new files hit your folders. Way cleaner than managing all that Python mess.
Quick fix though - throw in some debug prints to see what glob’s actually finding. Your path separators might be wrong depending on OS.
Try adding print(csv_file) inside your loop to see if the glob pattern's working. Also, make sure your folder structure actually matches what you're expecting with those wildcards. Good luck!
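Something as small as this is enough to tell you whether the pattern matches anything (same pattern as in the question, run from wherever your script runs):

```python
import glob
import os

print('cwd:', os.getcwd())            # the pattern is resolved relative to this
pattern = 'MusicData/*/202?/*.csv'
matches = glob.glob(pattern)
for csv_file in matches:
    print(csv_file)                   # on Windows these contain backslashes
print(f'{len(matches)} file(s) matched {pattern!r}')
```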
Had this exact error last month with financial data files. Your glob pattern isn't matching any files - that's the problem. First, check your directory structure. Forward slashes in the pattern are actually fine on Windows, but glob returns paths with backslashes there, so your csv_file.split('/') calls would break - use os.path.split() or pathlib instead of splitting on '/'. Also make sure your CSVs are actually tab-separated since you're using sep='\t'. Try running glob.glob('MusicData/*/202?/*.csv') by itself first to see what it returns. If it's empty, your path pattern's wrong. Also check you're running the script from the right spot relative to your MusicData folder. I'd add a quick check like if not data_frames: print('No files found') before the concat line - makes debugging way easier.
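For what it's worth, here's one way the whole loop could look with pathlib, which sidesteps the separator issue entirely. Folder layout is assumed to be MusicData/<country>/<year>/<file>.csv as in the question, and the no-files case prints instead of crashing at concat:

```python
import re
from pathlib import Path

import pandas as pd

data_frames = []
for csv_file in Path('MusicData').glob('*/202?/*.csv'):
    # pathlib splits the components for us, so this also works with Windows paths
    country = csv_file.parts[1]       # MusicData/<country>/<year>/<file>.csv
    dates = re.findall(r'\d{4}-\d{2}-\d{2}', csv_file.name)
    chart_data = pd.read_csv(csv_file, sep='\t')
    chart_data['start_date'] = pd.to_datetime(dates[0])
    chart_data['end_date'] = pd.to_datetime(dates[1])
    chart_data['country'] = country
    data_frames.append(chart_data)

if not data_frames:
    print('No files matched; check your working directory and glob pattern')
else:
    combined_charts = pd.concat(data_frames, ignore_index=True)
```

ignore_index=True also gives the combined frame a clean 0..n index instead of repeating each file's row numbers.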