How can I generate a PDF with HTML content on separate pages using Puppeteer?

I’m using Puppeteer in Node.js to create PDF reports from HTML content. I have sections of HTML that I want to appear on separate pages in the final PDF. For instance, if my HTML includes

<h1>Title 1 - Page A</h1>
<h1>Title 2 - Page B</h1>
<h1>Title 3 - Page C</h1>

I would like Title 1 - Page A to occupy page one, Title 2 - Page B page two, and Title 3 - Page C page three. Below is the sample code I am using for PDF generation:
// Runs inside an async function; puppeteer and uuidv4 are assumed to be in scope.
const uniqueID = uuidv4();
const filename = 'output.pdf';
const browserInstance = await puppeteer.launch({
  headless: true,
  args: ['--no-sandbox'],
});

const newPage = await browserInstance.newPage();
const sampleHTML = `
  <h1>Title 1 - Page A</h1>
  <h1>Title 2 - Page B</h1>
  <h1>Title 3 - Page C</h1>
`;
await newPage.setContent(sampleHTML, { waitUntil: 'domcontentloaded' });
await newPage.pdf({
  path: filename,
  format: 'A4',
  margin: {
    top: '20px',
    bottom: '20px',
    left: '20px',
    right: '20px',
  },
});
await browserInstance.close();

Add CSS for page breaks to separate each section in your PDF:

const sampleHTML = `
  <style>
    h1 {
      page-break-after: always;
    }
    h1:last-child {
      page-break-after: auto;
    }
  </style>
  <h1>Title 1 - Page A</h1>
  <h1>Title 2 - Page B</h1>
  <h1>Title 3 - Page C</h1>
`;

Apply the changes in your PDF generation code and each title will appear on its own page.

To ensure each section of your HTML content appears on separate pages in the PDF using Puppeteer, you’ll need to modify your HTML and CSS to handle pagination. Specifically, you can apply CSS page-break properties to dictate when a new page should begin.

Here’s how you can adjust your sample code:

const uniqueID = uuidv4();
const filename = `output.pdf`;
const browserInstance = await puppeteer.launch({
  headless: true,
  args: ['--no-sandbox'],
});

const newPage = await browserInstance.newPage();

// Add CSS for page breaks
const sampleHTML = `
  <style>
    h1 {
      page-break-after: always;
    }
    h1:last-child {
      page-break-after: auto;
    }
  </style>
  <h1>Title 1 - Page A</h1>
  <h1>Title 2 - Page B</h1>
  <h1>Title 3 - Page C</h1>
`;

await newPage.setContent(sampleHTML, { waitUntil: 'domcontentloaded' });
await newPage.pdf({
  path: filename,
  format: 'A4',
  margin: {
    top: '20px',
    bottom: '20px',
    left: '20px',
    right: '20px',
  },
});
await browserInstance.close();

Explanation:

  • CSS page-break-after: Setting this property to always forces a page break after each <h1> element, so the next <h1> starts on a new page.
  • Style Adjustments: The h1:last-child rule resets the property to auto, so no break is forced after the last title.

By employing the CSS page-break properties, you effectively control the pagination within the PDF, ensuring each title starts on a new page, as you’ve described.
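
If a section contains more than a heading, putting page-break-after on the <h1> itself would separate the heading from the text that follows it. One way around this, sketched below, is to wrap each section in a container and attach the break to the wrapper. The helper names buildPagedHtml and escapeHtml, the .page class, and the body text are illustrative assumptions, not part of Puppeteer or of the original code.

```javascript
// Hypothetical helper (not part of Puppeteer): escapes text for use inside HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: builds one <section class="page"> per entry.
// The break is attached to the wrapper, so a heading stays on the
// same page as its body text.
function buildPagedHtml(sections) {
  const style = [
    '<style>',
    '  .page { page-break-after: always; }',
    '  .page:last-child { page-break-after: auto; }',
    '</style>',
  ].join('\n');

  const pages = sections.map(
    ({ title, body }) =>
      `<section class="page"><h1>${escapeHtml(title)}</h1><p>${escapeHtml(body)}</p></section>`
  );

  return [style, ...pages].join('\n');
}

// Placeholder content for illustration only.
const sampleHTML = buildPagedHtml([
  { title: 'Title 1 - Page A', body: 'Content for page A' },
  { title: 'Title 2 - Page B', body: 'Content for page B' },
  { title: 'Title 3 - Page C', body: 'Content for page C' },
]);
```

The resulting string can be passed to newPage.setContent() in place of the hand-written template; the rest of the PDF code stays the same.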

To generate a PDF with each section of your HTML content on separate pages using Puppeteer, you can use CSS page-break-after to control pagination:

const browserInstance = await puppeteer.launch({
  headless: true,
  args: ['--no-sandbox'],
});

const newPage = await browserInstance.newPage();

// HTML with CSS for page breaks
const sampleHTML = `
  <style>
    h1 {
      page-break-after: always;
    }
    h1:last-child {
      page-break-after: auto;
    }
  </style>
  <h1>Title 1 - Page A</h1>
  <h1>Title 2 - Page B</h1>
  <h1>Title 3 - Page C</h1>
`;

await newPage.setContent(sampleHTML, { waitUntil: 'domcontentloaded' });
await newPage.pdf({
  path: 'output.pdf',
  format: 'A4',
  margin: { top: '20px', bottom: '20px', left: '20px', right: '20px' },
});
await browserInstance.close();

Explanation:

  • CSS page-break-after: Forces a page break after each <h1> tag, so the following title starts on a new page.
  • h1:last-child: Prevents a page break after the last title.

With this change, each title appears on its own page, as required.

If you want each section of HTML content on its own page with Puppeteer, another option is the CSS property page-break-before, used instead of page-break-after or alongside it, depending on your document's structure.

Consider using the following approach in your Puppeteer script:

const uniqueID = uuidv4();
const filename = `output-${uniqueID}.pdf`;
const browserInstance = await puppeteer.launch({
  headless: true,
  args: ['--no-sandbox'],
});

const newPage = await browserInstance.newPage();

// HTML with CSS for page breaks
const sampleHTML = `
  <style>
    h1 {
      page-break-before: always;
    }
    h1:first-child {
      page-break-before: auto;
    }
  </style>
  <h1>Title 1 - Page A</h1>
  <h1>Title 2 - Page B</h1>
  <h1>Title 3 - Page C</h1>
`;

await newPage.setContent(sampleHTML, { waitUntil: 'domcontentloaded' });
await newPage.pdf({
  path: filename,
  format: 'A4',
  margin: {
    top: '20px',
    bottom: '20px',
    left: '20px',
    right: '20px',
  },
});
await browserInstance.close();

Explanation:

  • CSS page-break-before: This property ensures that each <h1> element starts on a new page, effectively controlling the pagination right before the content you wish to separate.
  • h1:first-child Adjustment: The first heading is excluded from the page break to avoid a blank first page.

By applying page-break-before, the PDF will start each title on a new page, providing a more intuitive control of pagination across different sections of content.
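
The two strategies in this thread mirror each other: page-break-after needs an exemption for the last element, page-break-before for the first. As a rough sketch of that symmetry, the helper below generates the matching <style> block for either choice. The function name pageBreakStyle and its parameters are assumptions made for illustration, not part of Puppeteer.

```javascript
// Hypothetical helper: returns a <style> block that puts every element
// matching `selector` on its own page, breaking either before or after it.
function pageBreakStyle(selector, position = 'before') {
  if (position !== 'before' && position !== 'after') {
    throw new RangeError(`position must be "before" or "after", got "${position}"`);
  }
  // Exempt the edge element so no break is forced at the very start
  // (for "before") or at the very end (for "after") of the document.
  const edge = position === 'before' ? 'first-child' : 'last-child';
  return [
    '<style>',
    `  ${selector} { page-break-${position}: always; }`,
    `  ${selector}:${edge} { page-break-${position}: auto; }`,
    '</style>',
  ].join('\n');
}

// Same three titles as in the question, using the "before" strategy.
const titles = ['Title 1 - Page A', 'Title 2 - Page B', 'Title 3 - Page C'];
const html = [
  pageBreakStyle('h1', 'before'),
  ...titles.map((title) => `<h1>${title}</h1>`),
].join('\n');
```

Here html can be handed to newPage.setContent() just like sampleHTML above. Switching the second argument to 'after' yields the rules used in the earlier answers.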