Hi folks! I’m having a tough time with my n8n automation workflow running on my local machine and could really use some guidance.
My Setup:
I’ve got a Google Spreadsheet containing 130,000+ business entries. My n8n automation pulls business names from job postings and I need to verify if each business is in my spreadsheet.
The Issue:
The Google Sheets integration breaks down because of the huge dataset. When I use Fetch All Rows it either:
Throws a call stack overflow error
Gets stuck and never responds
What I’ve Tested:
Column-specific queries on the “Company Name” field - same crash issues
Converted everything to a JSON export using Python scripts
Loading attempts in n8n:
File Operations node - can’t handle the JSON format properly
HTTP Request node fetching the JSON from a GitHub repo - works, but parsing is extremely slow and the 12MB+ payload is too large to pin
Manual Set node with hardcoded entries - browser crashes from memory overload
Function node with workflow static data - doesn't persist between executions
Considered batch processing or sub-workflows - still doesn't solve the initial data loading problem
What Would Help:
I need a fast way to:
Look up whether a business name exists in this huge list
Avoid reloading all 130k+ records every time the workflow runs
Split this into two phases to avoid the loading issues. First, process your 130k records offline into a simple text file - just business names, one per line. Then use n8n’s Execute Command node with grep or findstr to search that file directly. This won’t load everything into n8n’s memory, and filesystem searches are crazy fast for exact matches. No database setup needed, no external services, runs locally, and won’t crash your browser. I’ve done this for similar reference data problems and it always beats trying to push spreadsheet tools past their breaking point.
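For the offline step, something like the sketch below would produce the lookup file; the file names, the "Company Name" column, and the sample company are placeholders for whatever your export actually uses. The comment at the end shows the kind of exact whole-line match the Execute Command node could then run (grep -F for a fixed string, -x for whole-line, -q for exit-status-only; findstr /x /c:"..." is the Windows equivalent).

```python
# build_lookup.py - one-off preprocessing: CSV export -> one business name per line.
# Assumes a Sheets CSV export named companies_export.csv with a "Company Name" column
# (both names are assumptions - adjust to your data).
import csv

names = set()  # dedupe while we're at it
with open("companies_export.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        name = row["Company Name"].strip()
        if name:
            names.add(name)

with open("companies.txt", "w", encoding="utf-8") as out:
    out.write("\n".join(sorted(names)) + "\n")

# In n8n's Execute Command node, an exact match could then look like
# (in practice the quoted name would come from an expression on the incoming item):
#   grep -Fxq "Acme Corp" /path/to/companies.txt && echo found || echo missing
#   findstr /x /c:"Acme Corp" companies.txt          (Windows equivalent)
```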
Been there with massive data lookups. You’re fighting tools that weren’t built for this scale.
What you need is preprocessing plus caching: transform your 130k records into a lookup-friendly structure once, then put an API endpoint in front of it for fast queries.
Split it into two workflows: the first processes your Google Sheet data once and stores it in that lookup structure; the second handles business name verification by calling the lookup endpoint.
This stops the memory crashes since you’re not loading everything at once. You get persistent data between runs without managing database infrastructure.
I’ve solved similar problems by building preprocessing pipelines that turn clunky data sources into fast APIs. The automation does the heavy lifting upfront, keeping your main workflow lean and responsive.
Latenode makes this pattern straightforward. You can build the data transformation workflow and lookup API in the same platform, then chain them together.
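This isn't Latenode-specific, but to make the shape of the pattern concrete, here is a minimal sketch of what the lookup endpoint boils down to, using only Python's standard library. The companies.json file, the port, and the name query parameter are all made up for illustration; in Latenode (or n8n) the same role would be played by whatever webhook/API building blocks you use there.

```python
# lookup_service.py - illustrative only: a tiny local lookup endpoint.
# Assumes the preprocessing workflow already wrote companies.json,
# a JSON array of business names (file name, port, and parameter are assumptions).
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

# Load once at startup and keep in memory; lowercase for case-insensitive matching.
with open("companies.json", encoding="utf-8") as f:
    COMPANIES = {name.strip().lower() for name in json.load(f)}

class LookupHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Any path works in this sketch; only the ?name= query parameter matters.
        query = parse_qs(urlparse(self.path).query)
        name = query.get("name", [""])[0].strip().lower()
        body = json.dumps({"exists": name in COMPANIES}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), LookupHandler).serve_forever()
```

The verification workflow then just calls something like http://127.0.0.1:8080/?name=Acme%20Corp per job posting and branches on the exists flag, so the 130k records are loaded exactly once, outside the workflow.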
Honestly, just use Elasticsearch - or even simpler, a CSV file searched with grep through the Execute Command node. Either way beats standing up a full database, and the grep route means nothing new to learn. I've done 200k+ lookups this way and it's lightning fast.
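If you do go the Elasticsearch route, the check itself is a single exact term query. A sketch, assuming the names were indexed into a companies index with a name field under Elasticsearch's default dynamic mapping (index name, field name, host, and the sample company are assumptions):

```python
# es_lookup.py - sketch of an exact-match existence check against a local Elasticsearch.
# Uses the third-party "requests" library; n8n's HTTP Request node can send the same body.
import requests

def company_exists(name: str) -> bool:
    # _count returns how many documents match; a term query on the .keyword sub-field
    # gives an exact (non-analyzed) match under the default mapping.
    resp = requests.post(
        "http://localhost:9200/companies/_count",
        json={"query": {"term": {"name.keyword": name}}},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["count"] > 0

print(company_exists("Acme Corp"))
```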
I hit similar issues with large datasets in n8n and solved them by setting up a Redis instance. It works exceptionally well for lookups like yours. Load your 130k records into Redis as key-value pairs, with the business name as the key, and your workflow just does a GET to check whether a company exists. Redis handles millions of operations per second and keeps the data available between executions, so it stays stable. In n8n you can use the built-in Redis node for the GET, or put a REST front end such as Webdis in front of Redis and call it with the HTTP Request node (Redis itself doesn't expose a REST API natively). Either way it's about 10 minutes to set up and it gets rid of the memory issues. Far better than leaning on Google Sheets for a dataset this size.
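The load itself is a few lines with redis-py. A sketch, assuming a local Redis on the default port and a names.txt file with one business name per line (host, file name, and the sample lookup are assumptions):

```python
# redis_load.py - one-off load: each business name becomes a key, so existence checks are O(1).
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

with open("names.txt", encoding="utf-8") as f:
    names = [line.strip() for line in f if line.strip()]

# Pipeline the writes so 130k SETs don't mean 130k round trips.
pipe = r.pipeline()
for i, name in enumerate(names, 1):
    pipe.set(name, 1)          # a key-prefix like "company:" is optional namespacing
    if i % 5000 == 0:
        pipe.execute()
pipe.execute()

print(bool(r.exists("Acme Corp")))  # True if the name was loaded
```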
SQLite's your best bet here. Dump that Google Sheet data into a local SQLite file and query it straight from n8n - either through the Execute Command node calling the sqlite3 CLI or via HTTP requests to a thin API wrapper. SQLite handles 130k records without breaking a sweat and keeps everything on your machine. I switched a similar workflow from Google Sheets to SQLite last year and the performance difference was huge: no more timeouts or crashes. It takes maybe an hour to export your data and set up the database, and then your lookups go from minutes to milliseconds. Plus you dodge all those annoying Google Sheets rate limits.
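A sketch of the export-and-query step, assuming the sheet was exported to companies_export.csv with a "Company Name" column (file name, column name, and the sample lookup are placeholders):

```python
# sheet_to_sqlite.py - one-off: load the CSV export into an indexed SQLite table,
# after which a lookup is a single indexed SELECT.
import csv
import sqlite3

conn = sqlite3.connect("companies.db")
conn.execute("CREATE TABLE IF NOT EXISTS companies (name TEXT PRIMARY KEY)")  # PK doubles as the index

with open("companies_export.csv", newline="", encoding="utf-8") as f:
    rows = [(row["Company Name"].strip(),)
            for row in csv.DictReader(f) if row["Company Name"].strip()]

conn.executemany("INSERT OR IGNORE INTO companies (name) VALUES (?)", rows)
conn.commit()

# The lookup n8n would run (via the sqlite3 CLI in Execute Command, or a small API wrapper):
cur = conn.execute("SELECT 1 FROM companies WHERE name = ? LIMIT 1", ("Acme Corp",))
print("found" if cur.fetchone() else "missing")
conn.close()
```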
Totally feel ya! Using a DB like Postgres or MySQL will help a ton - they handle big datasets way better than Sheets. Plus, n8n has native nodes for both, so wiring it up is fast and way less of a headache. Give it a shot!