I’m having trouble with a JavaScript CSV parser. I’m using the csv-parser module to read a CSV file into an array. The file has over 1000 rows, but the parser only reads the first 507. Here’s my code:
const fs = require('fs');
const path = require('path');
const parseCSV = require('csv-parser');

const csvPath = path.join(__dirname, 'translations.csv');
const results = [];

fs.createReadStream(csvPath)
  .pipe(parseCSV(['Key', 'Description', 'EnglishText', 'TranslatedText']))
  .on('data', (data) => results.push(data))
  .on('end', () => {
    console.log(`Total rows: ${results.length}`);
  });
Why isn’t it reading the whole file? Any ideas what could be causing this?
I’ve dealt with this exact problem before, and it’s usually related to memory constraints. When parsing large CSV files, Node.js can sometimes run out of memory, especially if you’re working with a lot of data or on a machine with limited resources.
Here’s what worked for me:
Instead of loading the entire file into memory at once, try processing it in smaller chunks. You can control the chunk size with the highWaterMark option on your read stream:
fs.createReadStream(csvPath, { highWaterMark: 64 * 1024 })
This sets the chunk size to 64KB. You might need to adjust this value depending on your file size and available memory.
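For reference, here's roughly how that slots into your existing pipeline (the 64 KB figure is just a starting point, not a tuned value):

const fs = require('fs');
const path = require('path');
const parseCSV = require('csv-parser');

const csvPath = path.join(__dirname, 'translations.csv');
const results = [];

// Read the file in 64 KB chunks; csv-parser still emits one row per 'data' event.
fs.createReadStream(csvPath, { highWaterMark: 64 * 1024 })
  .pipe(parseCSV(['Key', 'Description', 'EnglishText', 'TranslatedText']))
  .on('data', (row) => results.push(row))
  .on('end', () => console.log(`Total rows: ${results.length}`));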
Also, consider processing the data as you read it, rather than storing everything in the ‘results’ array. This can significantly reduce memory usage:
.on('data', (data) => {
  // Process data here instead of pushing to array
  processRow(data);
})
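To make that concrete, here's one possible shape. processRow is just a placeholder, and the output file name and per-row work are made up for illustration:

const fs = require('fs');
const path = require('path');
const parseCSV = require('csv-parser');

const csvPath = path.join(__dirname, 'translations.csv');
// Hypothetical output file, purely for illustration.
const out = fs.createWriteStream(path.join(__dirname, 'translated-only.csv'));
let rowCount = 0;

// Example per-row work: keep a count and write just the key/translation pair back out.
function processRow(row) {
  rowCount += 1;
  out.write(`${row.Key},${row.TranslatedText}\n`);
}

fs.createReadStream(csvPath)
  .pipe(parseCSV(['Key', 'Description', 'EnglishText', 'TranslatedText']))
  .on('data', processRow)
  .on('end', () => {
    out.end();
    console.log(`Processed ${rowCount} rows`);
  });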
If you absolutely need all the data available later, consider writing it into a database or some other on-disk store instead of holding everything in an array; there's a rough sketch of that idea below. Hope this helps!
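Here I'm using better-sqlite3 purely as an example; any store would do, and the table layout is invented for the sketch:

const fs = require('fs');
const path = require('path');
const parseCSV = require('csv-parser');
const Database = require('better-sqlite3');

const db = new Database(path.join(__dirname, 'translations.db'));
db.exec('CREATE TABLE IF NOT EXISTS translations (key TEXT, description TEXT, english TEXT, translated TEXT)');
const insert = db.prepare('INSERT INTO translations VALUES (?, ?, ?, ?)');

fs.createReadStream(path.join(__dirname, 'translations.csv'))
  .pipe(parseCSV(['Key', 'Description', 'EnglishText', 'TranslatedText']))
  .on('data', (row) => {
    // Each row goes straight to disk instead of accumulating in memory.
    insert.run(row.Key, row.Description, row.EnglishText, row.TranslatedText);
  })
  .on('end', () => console.log('Done importing'));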
I’ve encountered a similar issue before. The problem might be related to the file’s encoding or line endings. Try adding the ‘utf8’ encoding option to your createReadStream call:
fs.createReadStream(csvPath, { encoding: 'utf8' })
Also, check your CSV file's line endings. Files created on different operating systems may use \r\n instead of just \n, and some exports start with a BOM (byte order mark). You can also try csv-parser's skipLines option if there's a junk line at the top of the file that needs skipping:
.pipe(parseCSV({ skipLines: 1, headers: ['Key', 'Description', 'EnglishText', 'TranslatedText'] }))
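If the problem really is a byte order mark rather than an extra line, another option (not a csv-parser feature, just a sketch using Node core) is to strip it before the parser sees it:

const fs = require('fs');
const path = require('path');
const parseCSV = require('csv-parser');
const { Transform } = require('stream');

// Drops a UTF-8 BOM (EF BB BF) from the start of the first chunk, if present.
function stripBom() {
  let checked = false;
  return new Transform({
    transform(chunk, _encoding, callback) {
      if (!checked) {
        checked = true;
        if (chunk[0] === 0xef && chunk[1] === 0xbb && chunk[2] === 0xbf) {
          chunk = chunk.slice(3);
        }
      }
      callback(null, chunk);
    },
  });
}

const results = [];
fs.createReadStream(path.join(__dirname, 'translations.csv'))
  .pipe(stripBom())
  .pipe(parseCSV({ headers: ['Key', 'Description', 'EnglishText', 'TranslatedText'] }))
  .on('data', (row) => results.push(row))
  .on('end', () => console.log(`Total rows: ${results.length}`));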
If none of that works, consider a different CSV parsing library such as 'fast-csv' or 'papaparse'; they sometimes handle edge cases like quoted fields and embedded newlines better (quick example below). Let me know if you need more help!
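For example, with fast-csv the equivalent would look something like this (headers: true assumes the first row of the file is a header row; pass an array of names instead if it isn't):

const fs = require('fs');
const path = require('path');
const csv = require('fast-csv');

const results = [];
fs.createReadStream(path.join(__dirname, 'translations.csv'))
  .pipe(csv.parse({ headers: true }))
  .on('error', (error) => console.error(error))
  .on('data', (row) => results.push(row))
  .on('end', (rowCount) => console.log(`Parsed ${rowCount} rows, kept ${results.length}`));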
hey grace, i had this problem too. try increasing the buffer size:
fs.createReadStream(csvPath, { highWaterMark: 16 * 1024 * 1024 })
this sets it to 16MB, which should help with big files. also check your file isn’t corrupted. good luck!