I’m working on building a music database using a streaming API, but the performance is poor. I have to loop over many genres, and each one takes a long time to process.
```python
import numpy as np

music_data = {
    "playlist_ids": [],
    "status_logs": []
}

for genre_id in music_genres_df['genre_ids']:
    genre_playlists = fetch_genre_playlists(genre_id, max_results=50, start=0)
    playlist_items = genre_playlists['data']['results']
    status_msg = genre_playlists['status']
    # Convert to a numpy array and merge with the data gathered so far
    new_ids = np.array([playlist['id'] for playlist in playlist_items])
    music_data["playlist_ids"] = np.concatenate((music_data["playlist_ids"], new_ids))
    music_data["status_logs"].extend([status_msg] * len(playlist_items))
```
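One thing I already suspect is the repeated `np.concatenate` inside the loop, which copies the whole array on every iteration. Here is the alternative I mean, as a runnable sketch: `fetch_genre_playlists` is replaced by a stub that only mimics the response shape, and the genre list is hard-coded, since the real API call isn't shown here.

```python
import numpy as np

def fetch_genre_playlists(genre_id, max_results=50, start=0):
    # Stub standing in for the real streaming-API call (hypothetical
    # response shape matching the snippet above).
    return {
        "status": "ok",
        "data": {"results": [{"id": f"{genre_id}-{i}"} for i in range(max_results)]},
    }

playlist_ids = []   # plain Python list; append/extend is amortized O(1)
status_logs = []

for genre_id in ["rock", "jazz"]:   # placeholder for music_genres_df['genre_ids']
    resp = fetch_genre_playlists(genre_id, max_results=3)
    items = resp["data"]["results"]
    playlist_ids.extend(p["id"] for p in items)
    status_logs.extend([resp["status"]] * len(items))

# Convert to a numpy array once, after the loop, instead of calling
# np.concatenate on every iteration (which re-copies the whole array
# each time, O(n^2) overall).
playlist_ids = np.array(playlist_ids)
```

Even with that change, though, the total time barely moves, which makes me think the bottleneck is elsewhere.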
The issue is that my genre list has over 50 categories and I fetch 50 playlists for each one, with each playlist containing around 70 tracks. Just gathering the playlist data takes over 30 seconds. What are some effective strategies to make these loops more efficient?
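Since the per-genre requests are independent of each other, I've been wondering whether overlapping them with threads is the right direction. Something like the sketch below is what I have in mind; `fetch_genre_playlists` is again a stand-in stub for the real network call, and the genre list is hard-coded for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_genre_playlists(genre_id, max_results=50, start=0):
    # Stub for the real network call; only the response shape
    # matters here (hypothetical).
    return {
        "status": "ok",
        "data": {"results": [{"id": f"{genre_id}-{i}"} for i in range(max_results)]},
    }

genre_ids = ["rock", "jazz", "blues"]  # placeholder for music_genres_df['genre_ids']

# Each fetch is network I/O, so a thread pool lets the waits overlap
# instead of paying the full request latency 50+ times in sequence.
# pool.map preserves the input order of genre_ids.
with ThreadPoolExecutor(max_workers=8) as pool:
    responses = list(pool.map(lambda g: fetch_genre_playlists(g, max_results=2), genre_ids))

playlist_ids = [p["id"] for r in responses for p in r["data"]["results"]]
```

Would this kind of concurrency be the right fix here, or is there a better approach (batch endpoints, caching, etc.)?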