I have a web application that produces large PDF files, sometimes exceeding 100 pages. The process I follow is:
- Generate the HTML with Nunjucks templates.
- Launch a Puppeteer instance and load the HTML into a page.
- Render the PDF cover page.
- Render the remaining PDF pages.
- Combine the pages into a single document and produce a byte buffer.

The relevant code, with the PDF options and template details elided, looks like this:
```js
import { PDFDocument } from 'pdf-lib';
import puppeteer from 'puppeteer';
import nunjucks from 'nunjucks';

// Render the full document as a single HTML string.
const generatedHtml = await nunjucks.render(...);

// One Chromium instance per request.
const browserInstance = await puppeteer.launch({
  args: [
    '--disable-dev-shm-usage',
    '--no-first-run',
    '--no-sandbox',
    '--no-zygote',
    '--single-process',
  ],
  headless: true,
});

const newPage = await browserInstance.newPage();
await newPage.setContent(generatedHtml, { waitUntil: 'networkidle0' });

// First pass: print only the first page as the cover.
const coverBuffer = await newPage.pdf({
  ...someOptions,
  pageRanges: '1',
});

// Second pass: print everything from page 2 onward, with a footer.
const contentBuffer = await newPage.pdf({
  ...someOptions,
  pageRanges: '2-',
  footerTemplate: ...,
});

// Merge the two PDFs into one document with pdf-lib.
const finalDoc = await PDFDocument.create();

const coverDocument = await PDFDocument.load(coverBuffer);
const [cover] = await finalDoc.copyPages(coverDocument, [0]);
finalDoc.addPage(cover);

const contentDocument = await PDFDocument.load(contentBuffer);
for (let index = 0; index < contentDocument.getPageCount(); index++) {
  const [contentPage] = await finalDoc.copyPages(contentDocument, [index]);
  finalDoc.addPage(contentPage);
}

const finalPdfBytes = Buffer.from(await finalDoc.save());
// Handle the bytes as needed
```
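For context, the bytes are returned from an API endpoint, so the whole render-and-merge happens inside the request. A minimal sketch of that handler, assuming an Express-style route (the route path, port, and `generatePdfBytes` wrapper are placeholders, not the real code):

```js
import express from 'express';
// Placeholder module that wraps the render/merge code shown above
// and resolves with the final Buffer.
import { generatePdfBytes } from './pdf.js';

const app = express();

// Hypothetical endpoint: the request stays open for the entire render + merge,
// which is where the delays show up on large documents.
app.post('/reports/:id/pdf', async (req, res, next) => {
  try {
    const pdfBytes = await generatePdfBytes(req.params.id); // placeholder wrapper
    res
      .set({
        'Content-Type': 'application/pdf',
        'Content-Length': pdfBytes.length,
      })
      .send(pdfBytes);
  } catch (err) {
    next(err);
  }
});

app.listen(3000); // placeholder port
```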
As the PDF size increases, the processing time and memory consumption also rise, causing delays in the API. What strategies can I implement to optimize this process, or are there alternative tools available to prevent API stalls?