Performance Issues with Large Dataset Lookup (130k+ entries) in n8n Local Environment - Looking for Solutions

Hi there! I’m facing a tough challenge with my n8n automation setup (running locally) and would love some guidance from the community.

My Setup:

I’ve got a massive Google Spreadsheet containing 130,105+ business names. My n8n workflow pulls company names from job postings, and I need to check whether each company appears in my master list.

Current Issues:

The Google Sheets node fails when handling this much data. When I use Get Rows, I encounter:

  • Maximum call stack size exceeded errors
  • Complete timeouts with no response

What I’ve Attempted:

  1. Column-specific filtering with Get Rows on the “Company Name” field - same crash issues
  2. Exported the dataset as .json using local Python script
  3. Various n8n import methods:
    • File Read/Write nodes - JSON parsing issues with binary data
    • HTTP Request from GitHub raw file - works but extremely slow parsing, can’t pin due to size (~12MB)
  4. Set node with hardcoded values - browser memory crashes
  5. Code node with workflow static data storage - no persistence between executions
  6. Considered batch processing or sub-workflows - still can’t solve initial data loading
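For context, the export in attempt 2 is roughly the script below (a minimal sketch — it assumes I’ve first downloaded the sheet as a CSV; the file names and column header are placeholders for my actual setup):

```python
import csv
import json

def export_names(csv_path: str, json_path: str, column: str = "Company Name") -> int:
    """Read one column from a CSV export of the sheet and write it
    out as a flat JSON array of non-empty names. Returns the count."""
    names = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            name = (row.get(column) or "").strip()
            if name:
                names.append(name)
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(names, f, ensure_ascii=False)
    return len(names)
```

The resulting ~12MB JSON file is what I then tried (and failed) to get into n8n efficiently.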

What I Need:

A fast, efficient method to:

  • Validate company existence in the large dataset
  • Avoid re-downloading/processing 130k+ rows every execution
  • Stay within n8n memory constraints
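To be clear, the lookup itself is trivial once the data is in memory — the part I can’t solve is loading it efficiently and keeping it cached. What I want is effectively this (a sketch; the normalization rules are just my guess at what matching would need):

```python
def normalize(name: str) -> str:
    # Case- and whitespace-insensitive matching; could be extended
    # with suffix stripping ("Inc.", "GmbH", ...) if needed.
    return " ".join(name.lower().split())

def build_index(names: list[str]) -> set[str]:
    # One-time cost: ~130k short strings fit comfortably in a set.
    return {normalize(n) for n in names}

def is_known(company: str, index: set[str]) -> bool:
    # O(1) membership check per job posting.
    return normalize(company) in index
```

The problem is that rebuilding this index from scratch on every execution is exactly the 130k-row reload I’m trying to avoid.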

Ideas I’m Considering:

  • Better caching strategies in n8n
  • Moving to a simple database solution
  • Alternative file hosting approaches
  • Your experience with large static datasets
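On the “simple database solution” bullet, my current thinking is a one-time load into a SQLite file next to n8n, then a single indexed query per company instead of pulling 130k rows each run. A rough sketch (the schema, column names, and file path are placeholders I made up):

```python
import sqlite3

def create_db(db_path: str, names: list[str]) -> None:
    # One-time setup: load the master list and index the normalized
    # name so point lookups stay fast at 130k+ rows.
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS companies (name TEXT, norm TEXT)")
    con.execute("CREATE INDEX IF NOT EXISTS idx_norm ON companies(norm)")
    con.executemany(
        "INSERT INTO companies (name, norm) VALUES (?, ?)",
        [(n, " ".join(n.lower().split())) for n in names],
    )
    con.commit()
    con.close()

def company_exists(db_path: str, company: str) -> bool:
    # Per-execution lookup: one indexed point query, no bulk load.
    con = sqlite3.connect(db_path)
    norm = " ".join(company.lower().split())
    row = con.execute(
        "SELECT 1 FROM companies WHERE norm = ? LIMIT 1", (norm,)
    ).fetchone()
    con.close()
    return row is not None
```

Does this direction make sense, or is there a more n8n-native way to get the same effect?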

Any suggestions would be amazing! How do you handle similar large dataset lookups?