Trouble generating PDFs with Playwright in Docker container


Hello StackOverflow community,

I'm having trouble generating PDFs with Playwright inside a Docker container for my AdonisJS application. Everything works in local development, but after Dockerizing the project, TailwindCSS styles no longer render correctly in the generated PDFs.

Here's a summary of the key points:

  • I chose AdonisJS for its MVC architecture and seamless integration of TailwindCSS and the Edge HTML templating engine.
  • The local setup worked smoothly, but inside the Docker container the TailwindCSS styles did not render correctly.
  • After extensive troubleshooting, the only way I found to get correctly styled PDFs inside the container was to load TailwindCSS from its CDN in the template's <head>.
  • Additionally, the container crashed after the 20th request, with Playwright throwing a "browsertype.launch: timeout 180000ms exceeded" error. This mirrors an issue I previously encountered with Puppeteer and Express on an AWS-hosted server.

For reference, my Dockerfile, which a community member kindly shared, is in this related question:

Playwright browser launch error in Docker container with AdonisJS project

I'm struggling to understand the inner workings of these PDF libraries and how to optimize them. I've read the documentation and explored possible optimizations, but I still don't fully grasp how they work under the hood or how to get the most out of them.

The error I'm seeing, "Error during navigation: page.setContent: Timeout 30000ms exceeded", indicates a timeout while setting the content of the page. On closer inspection, the error seems to occur while I initialize the browser.

One potential reason for this issue could indeed be related to the contents of the HTML header. Specifically, the inclusion of external resources, such as fonts and stylesheets loaded via CDN, might be contributing to the delay in page rendering.

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <!-- Signature fonts -->
    <link rel="preconnect" href="https://fonts.googleapis.com">
    <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
    <link
        href="https://fonts.googleapis.com/css2?family=Mali&family=Mrs+Saint+Delafield&family=The+Nautigal&display=swap"
        rel="stylesheet">

    <style>
        html {
            -webkit-print-color-adjust: exact;
            print-color-adjust: exact;
        }

        h1 {
            @apply text-3xl font-bold mb-3 mt-6;
        }

        .comment-container {
            width: 100%;
            border: 1px solid #E2E2E2;
            border-radius: 0.25rem;
            margin: 0.5rem 0;
        }

        .comment-header {
            padding: 0.5rem;
            background-color: #F3F4F6;
            font-size: 8px;
        }

        .comment-content {
            padding: 1rem;
        }

        .comment-reason,
        .comment-text {
            color: #545454;
        }

        .comment-text {
            color: #141414;
            font-size: 10px;
            font-weight: bold;
        }
    </style>
    <script
        src="https://cdn.tailwindcss.com?plugins=forms,typography,aspect-ratio,line-clamp"></script>
    <script>
        tailwind.config = {
            theme: {
                extend: {
                    screens: {
                        print: { raw: 'print' },
                        screen: { raw: 'screen' },
                    },
                },
            },
            important: true,
        }
    </script>
</head>

I'd appreciate any insights or guidance on how to address these issues effectively.

Thank you.

Here's my service:

import { inject } from '@adonisjs/core'
import { HttpContext } from '@adonisjs/core/http'
import { chromium } from 'playwright'
import { ConfigPdfInterface } from '../interface/config_pdf_interface.js'
// @ts-ignore
import { JSDOM } from 'jsdom'

@inject()
export default class PlaywrightService {
  constructor(protected ctx: HttpContext) {}

  async generatePdfPlaywright(
    response: HttpContext['response'],
    path: string,
    documents: any,
    config: ConfigPdfInterface,
    isPagedJS: boolean
  ) {
    // The first try/catch handles any errors that occur during browser launch and navigation to the webpage.
    try {
      const browser = await chromium.launch({
        headless: true,
        args: ['--no-sandbox']
      })

      const page = await browser.newPage()
      // try-catch to handle potential errors during navigation and rendering
      try {
        await page.goto('about:blank', { waitUntil: 'domcontentloaded' })
        const html = await this.ctx.view.render(`${path}`, documents)
        await page.setContent(html, {
          waitUntil: 'networkidle',
        })
      } catch (navigationError) {
        console.error('Error during navigation:', navigationError)
      }

      // try-catch to handle potential errors when generating the PDF
      try {
        const pdfBuffer = await page.pdf(config)
        console.log('isPagedJS', isPagedJS)
        response.header('Content-type', 'application/pdf')
        response.header('Content-Length', pdfBuffer.length)
        response.status(200).send(pdfBuffer)
      } catch (pdfError) {
        console.error('Error generating PDF:', pdfError)
      } finally {
        await browser.close()
      }
    } catch (launchError) {
      console.error('Error launching the browser:', launchError)
      // response.status(500).send('Error launching the browser')
    }
  }
}
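
The service above launches a new Chromium process on every request. A commonly suggested alternative is to launch the browser once and open a fresh page per request. The following is a minimal sketch of that idea, not a drop-in fix: `SharedInstance` is a hypothetical helper, and wiring it to `chromium.launch()` and `browser.close()` is an assumption about how it would be used here.

```typescript
// Hypothetical helper (not part of Playwright or AdonisJS): lazily creates
// one shared instance and hands the same one to every caller.
class SharedInstance<T> {
  private pending: Promise<T> | null = null
  private readonly create: () => Promise<T>
  private readonly destroy: (instance: T) => Promise<void>

  constructor(create: () => Promise<T>, destroy: (instance: T) => Promise<void>) {
    this.create = create
    this.destroy = destroy
  }

  // Returns the shared instance, creating it on first use.
  // Concurrent callers share the same in-flight promise.
  get(): Promise<T> {
    if (this.pending === null) {
      this.pending = this.create().catch((error) => {
        // Forget a failed creation so the next call can retry.
        this.pending = null
        throw error
      })
    }
    return this.pending
  }

  // Disposes the instance if one exists, e.g. on application shutdown.
  async dispose(): Promise<void> {
    const current = this.pending
    this.pending = null
    if (current !== null) {
      await this.destroy(await current)
    }
  }
}

// Assumed wiring with Playwright (shown as a comment, untested here):
//   const sharedBrowser = new SharedInstance(
//     () => chromium.launch({ headless: true, args: ['--no-sandbox'] }),
//     (browser) => browser.close()
//   )
//   const page = await (await sharedBrowser.get()).newPage()
```

With this shape, each request would close only its own page in a `finally` block, while the browser itself stays alive until `dispose()` is called.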

Here's my controller and the route:

  async findOne({ request, response, view }: HttpContext) {
    const requisitionModel = request.body()
    const pathTemplate = 'inventory/requisitions/requisition'
    const pathFooterTemplate = 'partials/footer_requisitions'
    let footerHtml = await view.render(`${pathFooterTemplate}`, requisitionModel)
    const optionsPdfConfig = {
      format: 'A4',
      margin: {
        top: '30px',
        right: '25px',
        bottom: '100px',
        left: '25px',
      },
      footerTemplate: footerHtml,
      displayHeaderFooter: true,
      printBackground: true,
    }
    await this.pdfService.generatePdfPlaywright(
      response,
      pathTemplate,
      requisitionModel,
      optionsPdfConfig,
      false
    )
  }

router.post('/api/inventory/test', [RequisitionsController, 'testingfindOne'])

Edit Question

After reading several blog posts about this error, I understand that Playwright could not finish loading the page, including its HTML resources, within the default timeout. Most posts suggest one of two solutions: increase the default timeout, or disable it by setting it to zero. The second option is generally not recommended.

However, those solutions are applied when navigating to a web page, and here I'm not requesting any page. As I understand it, the HTML has already been rendered to a string by the time Playwright receives it: Playwright opens the browser, sets that string as the page content, and prints the PDF. If that is how it works, why does it run into a timeout?
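
One thing worth checking, offered as a sketch rather than a confirmed diagnosis: an HTML string passed to `page.setContent` can still cause network requests when it references external URLs, and `waitUntil: 'networkidle'` waits for that traffic to settle. The helper below lists the absolute `http(s)` URLs a template refers to. `findExternalResources` is a hypothetical name, not a Playwright API, and the sample markup is a shortened version of the `<head>` shown earlier.

```typescript
// Hypothetical helper: collects absolute http(s) URLs found in src/href
// attributes of an HTML string. It is a regex scan, not a full HTML parser,
// and it also reports hints such as rel="preconnect".
function findExternalResources(html: string): string[] {
  const pattern = /\b(?:src|href)\s*=\s*["'](https?:\/\/[^"']+)["']/gi
  const urls: string[] = []
  let match: RegExpExecArray | null
  while ((match = pattern.exec(html)) !== null) {
    // Keep each URL once, in order of first appearance.
    if (urls.indexOf(match[1]) === -1) {
      urls.push(match[1])
    }
  }
  return urls
}

// Shortened sample based on the template header in the question.
const sampleHead = `
  <link rel="preconnect" href="https://fonts.googleapis.com">
  <link
      href="https://fonts.googleapis.com/css2?family=Mali&display=swap"
      rel="stylesheet">
  <script src="https://cdn.tailwindcss.com?plugins=forms,typography"></script>
`

console.log(findExternalResources(sampleHead))
```

If the list is not empty, the browser inside the container needs outbound network access to those hosts before the page can go idle, so slow or blocked access there is one possible source of the delay.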

References:

  • Puppeteer timeout
  • Dealing with timeouts in Puppeteer
  • Constantly getting Navigation Timeout Error #782
  • node-js-puppeteer-how-to-set-navigation-timeout
  • Common Errors and Solutions in Puppeteer
  • Fix: navigation timeout of 30000 ms exceeded in puppeteer?

There are 1 best solutions below

Izlia On

I ran several load tests with the project hosted behind Windows IIS, where a reverse proxy forwards requests to the container. Using the load-testing tool app.k6.io, the tests never got past 20 requests: the container would throw a timeout error, hang, and stop responding.

After checking the server's resources, I moved the project to AWS with more CPU and memory. When I reran the same load tests, all requests succeeded.

I therefore conclude that the main issue was insufficient server resources. I will keep testing to confirm this, but if anyone else runs into the same problem, this may point them toward a solution.
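
If adding CPU and memory is not an option, one way to keep a burst of requests from exhausting the server is to cap how many PDFs are rendered at the same time. This is a minimal sketch under that assumption: `ConcurrencyLimiter` is a hypothetical helper, and the right limit for a given container is something to find by testing.

```typescript
// Hypothetical helper: lets at most `limit` tasks run at once.
// Additional callers wait in a first-in, first-out queue.
class ConcurrencyLimiter {
  private active = 0
  private readonly waiting: Array<() => void> = []
  private readonly limit: number

  constructor(limit: number) {
    this.limit = limit
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.limit) {
      // All slots are taken: wait until a finishing task hands over its slot.
      await new Promise<void>((resolve) => this.waiting.push(resolve))
    } else {
      this.active += 1
    }
    try {
      return await task()
    } finally {
      const next = this.waiting.shift()
      if (next) {
        next() // pass the slot directly to the next waiter
      } else {
        this.active -= 1
      }
    }
  }
}

// Assumed usage in the controller (names from the question, limit is a guess):
//   const pdfLimiter = new ConcurrencyLimiter(4)
//   await pdfLimiter.run(() =>
//     this.pdfService.generatePdfPlaywright(response, pathTemplate, requisitionModel, optionsPdfConfig, false)
//   )
```

Requests beyond the limit wait instead of each starting its own browser, which trades some latency under load for a bounded memory footprint.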