Race Condition in PDF Generation with Puppeteer and Front-End Event Dispatching

Emma_Galaxy · December 31, 2024, 12:33am

I’m developing a web application that utilizes Puppeteer for PDF generation in the back-end, based on the content displayed by a Lightning Web Component (LWC) on the front-end. I’m facing a timing dilemma where Puppeteer may attempt to generate the PDF before the front-end’s rendering is complete.

Here’s a brief overview of my back-end setup using Node.js and Puppeteer:

async function createPdf(webpageUrl, footerContent) {
    const browserInstance = await puppeteer.launch({headless: true, args: ['--no-sandbox']});
    const pageInstance = await browserInstance.newPage();
    pageInstance.on('console', message => {
        console.log(message.text());
        message.stackTrace().forEach(trace => {
            console.log(`Line: ${trace.lineNumber}, Column: ${trace.columnNumber}, File: ${trace.url}`);
        });
    });
    pageInstance.on('pageerror', ({message}) => console.log(message));

    await pageInstance.goto(webpageUrl, {waitUntil: 'networkidle0'});

    // Custom event handling
    await pageInstance.evaluate(() => {
        return new Promise(resolve => {
            console.log('Waiting for pdfrendered event');
            window.addEventListener('pdfrendered', resolve, { once: true });
            setTimeout(resolve, 30000);
        });
    });

    const footerHtmlContent = generateFooter(footerContent);
    const pdfStream = await pageInstance.createPDFStream({
        printBackground: true,
        format: 'A3',
        margin: {
            top: '48px',
            bottom: '84px',
            left: '48px',
            right: '48px',
        },
        displayHeaderFooter: true,
        headerTemplate: '<div/>',
        footerTemplate: footerHtmlContent
    });

    return {
        browser: browserInstance,
        pdfStream: pdfStream,
    };
}

On the front-end using LWC:

import { api, LightningElement } from 'lwc';

export default class PdfRelatedList extends LightningElement {
  @api pdfRecords;
  @api headerBgColor;
  @api textColor;
  @api borderVisible;
  pdfHasRendered = false;

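  get headerStyle() {
    const bgColor = this.headerBgColor || this.pdfRecords?.subsectionBackgroundColor;
    const textColor = this.textColor || this.pdfRecords?.subsectionTextColor;
    // Emit only the rules that have a value, so "undefined" never ends up in the CSS
    const rules = [];
    if (bgColor) rules.push(`background-color: ${bgColor} !important;`);
    if (textColor) rules.push(`color: ${textColor} !important;`);
    return rules.join(' ');
  }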

  get cellStyle() {
    return this.textColor ? `color: ${this.textColor} !important;` : '';
  }

  get hasRecords() {
    return this.pdfRecords.relatedRecords && this.pdfRecords.relatedRecords.length > 0;
  }

  renderedCallback() {
    const headerCells = this.template.querySelectorAll('th');
    headerCells.forEach(cell => { cell.style.cssText = this.headerStyle; });

    const dataCells = this.template.querySelectorAll('td');
    dataCells.forEach(cell => { cell.style.cssText = this.cellStyle; });

    if (this.isPdfRelatedList && !this.pdfHasRendered) {
      if (window['pdfjs-dist/build/pdf']) {
        this.processPdf();
      } else {
        this.addEventListener('pdfjsloaded', this.processPdf);
      }
      this.pdfHasRendered = true;
    } else {
      this.dispatchEvent(new CustomEvent('pdfrendered', { bubbles: true, composed: true }));
    }
  }

  get isPdfRelatedList() {
    return this.pdfRecords?.type === 'pdfRelatedList' && this.pdfRecords?.relatedRecords?.some(record => record.pdfDataUrl);
  }

  async processPdf() {
    // PDF processing logic...
    this.dispatchEvent(new CustomEvent('pdfrendered', { bubbles: true, composed: true }));
  }
}

The problem: When this.isPdfRelatedList is false, the front-end dispatches the ‘pdfrendered’ event immediately from renderedCallback(). This can happen before Puppeteer has registered its listener inside pageInstance.evaluate(), which only runs after goto() has resolved with networkidle0.
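The ordering problem follows from how DOM events work: a dispatch is delivered only to listeners that are registered at that moment, and nothing is queued for listeners added later. Below is a minimal sketch of that ordering. It uses a plain EventTarget as a stand-in for window and Event instead of CustomEvent, both assumptions made so the snippet runs in Node as well as in a browser.

```javascript
// Stand-in for window: any EventTarget behaves the same way here.
const target = new EventTarget();
let earlyListenerFired = false;
let lateListenerFired = false;

// A listener registered before the dispatch receives the event.
target.addEventListener('pdfrendered', () => { earlyListenerFired = true; }, { once: true });

// The component dispatches as soon as it has rendered...
target.dispatchEvent(new Event('pdfrendered'));

// ...and only afterwards does the back-end attach its listener.
target.addEventListener('pdfrendered', () => { lateListenerFired = true; }, { once: true });

console.log(earlyListenerFired); // true
console.log(lateListenerFired);  // false
```

The late listener never fires, which is why the evaluate() promise in the back-end code falls through to its 30-second timeout.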

Questions:

How can I guarantee that Puppeteer won’t miss the ‘pdfrendered’ event when it’s fired?
Is there a more effective method to synchronize the completion of front-end rendering with the back-end PDF generation?
Should a different technique be adopted for handling non-PDF content?

Any insights or best practices for handling this front-end/back-end synchronization with Puppeteer would be greatly appreciated.

I originally assumed that Puppeteer would capture the ‘pdfrendered’ event regardless of when it was dispatched. However, when this.isPdfRelatedList is false, the event is missed entirely. I tried extending the timeout in the Puppeteer evaluate function from 30 to 60 seconds to give the event more time to be dispatched and captured, but that did not resolve the issue and only increased the wait for non-PDF content.

Bob_Clever · January 5, 2025, 4:31am

To ensure Puppeteer doesn't miss the 'pdfrendered' event, consider modifying your setup:

Front-End Event Dispatching: Introduce a delay in dispatching the 'pdfrendered' event until you're sure Puppeteer is ready. For instance, use a flag to confirm Puppeteer's readiness before dispatching.
Back-End Synchronization: Create a retry mechanism or a loop in pageInstance.evaluate() that checks for a flag the front-end sets once rendering is done. You can poll with setInterval() or a similar approach until the flag is set.

Modify the front-end dispatch logic:

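if (this.isPdfRelatedList && !this.pdfHasRendered) {
    this.pdfHasRendered = true;
    // checkPuppeteerReady() is a placeholder for your own readiness check and must return a Promise.
    // renderedCallback() is not async, so chain with then() instead of using await.
    checkPuppeteerReady().then(() => this.processPdf());
} else {
    setTimeout(() => {
        // Set the flag the back-end polls for, then dispatch after a short delay
        window.isPdfRendered = true;
        this.dispatchEvent(new CustomEvent('pdfrendered', { bubbles: true, composed: true }));
    }, 500);
}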
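One way to make the dispatch order irrelevant is to pair the event with a flag that stays set. The sketch below is only an illustration of that idea: signalPdfRendered, whenPdfRendered, and the isPdfRendered property are hypothetical names, and the target argument stands in for window so the snippet can run outside a browser.

```javascript
// Hypothetical helpers; not part of LWC or Puppeteer.

// Signalling side: record completion in a flag, then dispatch the event.
function signalPdfRendered(target) {
    target.isPdfRendered = true;
    target.dispatchEvent(new Event('pdfrendered'));
}

// Waiting side: resolve immediately if the flag is already set,
// otherwise wait for the event or give up after timeoutMs.
function whenPdfRendered(target, timeoutMs = 30000) {
    return new Promise(resolve => {
        if (target.isPdfRendered) {
            resolve('flag');
            return;
        }
        const timer = setTimeout(() => {
            target.removeEventListener('pdfrendered', onRendered);
            resolve('timeout');
        }, timeoutMs);
        function onRendered() {
            clearTimeout(timer);
            resolve('event');
        }
        target.addEventListener('pdfrendered', onRendered, { once: true });
    });
}
```

In the page, target would be window, and the body of whenPdfRendered could be inlined into pageInstance.evaluate(). The returned string only reports which path resolved the wait, which can be useful when logging slow renders.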

Implement a check in Puppeteer:

await pageInstance.evaluate(() => {
    return new Promise(resolve => {
        // Poll for the flag set by the front-end; a flag stays set, so it cannot be missed
        const checkEventReady = setInterval(() => {
            if (window.isPdfRendered) {
                clearInterval(checkEventReady);
                clearTimeout(fallback);
                resolve();
            }
        }, 500);

        // Fallback timeout: stop polling and continue anyway
        const fallback = setTimeout(() => {
            clearInterval(checkEventReady);
            resolve();
        }, 30000);
    });
});
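Puppeteer also ships a built-in poller, page.waitForFunction(), which can replace a hand-written setInterval loop. The following is a sketch under the assumption that the front-end sets window.isPdfRendered to true when it is done; waitForRenderFlag is a hypothetical helper name.

```javascript
// Sketch only: `waitForRenderFlag` is a hypothetical helper, and `isPdfRendered`
// is assumed to be set on window by the front-end once rendering is done.
async function waitForRenderFlag(page, timeoutMs = 30000) {
    try {
        // waitForFunction re-evaluates the predicate in the page until it is truthy.
        await page.waitForFunction(
            () => window.isPdfRendered === true,
            { polling: 500, timeout: timeoutMs }
        );
        return true;
    } catch (err) {
        // Puppeteer rejects with a TimeoutError when the flag never appears.
        if (err && err.name === 'TimeoutError') return false;
        throw err;
    }
}
```

Called as `const rendered = await waitForRenderFlag(pageInstance);` after goto(), this returns false on timeout instead of throwing, so the caller can decide whether to generate the PDF anyway.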

This approach helps ensure Puppeteer is synchronized effectively with the front-end by utilizing readiness flags and delayed event dispatching.
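An alternative that needs no change to the component is to register the listener before the page's own scripts run, using page.evaluateOnNewDocument(), and then wait on a flag that the listener sets. The sketch below assumes the event reaches window (it is dispatched with bubbles and composed set to true); armRenderListener, loadAndWait, and __pdfRendered are hypothetical names.

```javascript
// Hypothetical helpers; `__pdfRendered` is an illustrative flag name.
async function armRenderListener(page) {
    // Runs in each new document before the page's own scripts execute,
    // so the listener exists before the component can dispatch.
    await page.evaluateOnNewDocument(() => {
        window.__pdfRendered = false;
        window.addEventListener('pdfrendered', () => {
            window.__pdfRendered = true;
        }, { once: true });
    });
}

// Assumed call order for the navigation part of createPdf.
async function loadAndWait(page, url, timeoutMs = 30000) {
    await armRenderListener(page); // must happen before navigation
    await page.goto(url, { waitUntil: 'networkidle0' });
    // Rejects with a TimeoutError if the event never arrives.
    await page.waitForFunction(() => window.__pdfRendered === true, { timeout: timeoutMs });
}
```

With this ordering the event is recorded even if it fires during the initial load, and the wait returns as soon as the flag is set rather than after a fixed delay.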