Hi there! I’m facing a tough challenge with my n8n automation setup (running locally) and would love some guidance from the community.
My Setup:
I’ve got a massive Google Spreadsheet containing 130,105+ business names. My n8n workflow pulls company names from job postings and I need to verify if these companies are in my master list.
Current Issues:
The Google Sheets connector fails when handling this much data. When I use Get Rows, I encounter:
- `Maximum call stack size exceeded` errors
- Complete timeouts with no response
What I’ve Attempted:
- Column-specific filtering with Get Rows on the “Company Name” field - same crash issues
- Exported the dataset as `.json` using a local Python script (roughly the sketch shown after this list)
- Various n8n import methods:
  - `File Read/Write` nodes - JSON parsing issues with binary data
  - `HTTP Request` from a GitHub raw file - works, but parsing is extremely slow and I can't pin the data due to its size (~12MB)
- Set node with hardcoded values - browser memory crashes
- Code node with workflow static data storage - no persistence between executions
- Considered batch processing or sub-workflows - neither solves the initial data-loading problem
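For context, the local export script is roughly the following (a simplified sketch: the `master_list.csv` filename and the `Company Name` header are placeholders for my actual sheet export):

```python
import csv
import json

# Read a CSV export of the master sheet and keep only the company-name column.
# "master_list.csv" and "Company Name" are placeholders for my actual file/header.
with open("master_list.csv", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    companies = [row["Company Name"].strip() for row in reader if row.get("Company Name")]

# Dump a flat JSON array of names (this is the ~12MB file mentioned above).
with open("companies.json", "w", encoding="utf-8") as f:
    json.dump(companies, f, ensure_ascii=False)
```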
What I Need:
A fast, efficient method to:
- Validate company existence in the large dataset
- Avoid re-downloading/processing 130k+ rows every execution
- Stay within n8n memory constraints
Ideas I’m Considering:
- Better caching strategies in n8n
- Moving to a simple database solution (rough sketch after this list)
- Alternative file hosting approaches
- Your experience with large static datasets
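To make the database idea concrete, this is the kind of approach I'm picturing: an assumed one-off load of the exported JSON into SQLite with an indexed column, then cheap exact-match lookups per company (file, table, and column names below are illustrative):

```python
import json
import sqlite3

# One-time setup: load the exported JSON list into an indexed SQLite table.
conn = sqlite3.connect("companies.db")
conn.execute("CREATE TABLE IF NOT EXISTS companies (name TEXT PRIMARY KEY)")
with open("companies.json", encoding="utf-8") as f:
    names = json.load(f)
conn.executemany(
    "INSERT OR IGNORE INTO companies (name) VALUES (?)",
    ((n.strip().lower(),) for n in names),  # normalise case for matching
)
conn.commit()

def company_exists(name: str) -> bool:
    """Exact-match lookup against the indexed table (fast even at 130k+ rows)."""
    row = conn.execute(
        "SELECT 1 FROM companies WHERE name = ?", (name.strip().lower(),)
    ).fetchone()
    return row is not None

print(company_exists("Acme Corp"))
```

The idea would be for n8n to query this per company (e.g. via a small local HTTP service or an Execute Command node) instead of loading all 130k+ rows into the workflow on every run.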
Any suggestions would be amazing! How do you handle similar large dataset lookups?