Node.js app not working on Render.com with the puppeteer-cluster package (whereas it works with puppeteer)


I am currently working on a Node.js project that requires the puppeteer package.

I first used the puppeteer package in my app, and after a few difficulties I managed to make it work on my server by dockerizing the app.

Here is my Dockerfile:

FROM ghcr.io/puppeteer/puppeteer:22.0.0

ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true \
    PUPPETEER_EXECUTABLE_PATH=/usr/bin/google-chrome-stable
    
WORKDIR /usr/src/app

COPY package*.json ./
RUN npm ci
COPY . .
CMD [ "node", "server.js"]

However, the process takes too much time on my server, so I wanted to scale my app by processing all my pages in parallel using puppeteer-cluster: https://github.com/thomasdondorf/puppeteer-cluster

I followed the instructions and everything works fine when running locally, but it doesn't work in my app deployed on Render.com.

Here is how I call puppeteer-cluster:

const cluster = await Cluster.launch({
    concurrency: Cluster.CONCURRENCY_PAGE,
    puppeteerOptions: {
        headless: true,
        args: [
            '--no-sandbox',
            '--disable-setuid-sandbox',
        ],
    },
    maxConcurrency: 20,
    skipDuplicateUrls: true,
});

I get the two errors below. Sometimes no error is shown at all, but the pages were still not loaded, because I don't get any results:

/usr/src/app/node_modules/puppeteer-cluster/dist/Worker.js:41
                        throw new Error('Unable to get browser page');
                              ^
Error: Unable to get browser page
    at Worker.<anonymous> (/usr/src/app/node_modules/puppeteer-cluster/dist/Worker.js:41:31)
    at Generator.next (<anonymous>)
    at fulfilled (/usr/src/app/node_modules/puppeteer-cluster/dist/Worker.js:5:58)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
Node.js v20.9.0
Requesting main frame too early!

I tried running my app in debug mode: https://github.com/thomasdondorf/puppeteer-cluster?tab=readme-ov-file#debugging

And I get these errors:

2024-02-28T14:53:53.066Z puppeteer-cluster: Worker Error getting browser page (try: 0), message: Timeout hit: 5000
2024-02-28T14:53:53.066Z puppeteer-cluster: SingleBrowserImpl Repair requested
2024-02-28T14:53:53.066Z puppeteer-cluster: SingleBrowserImpl Starting repair
2024-02-28T14:53:53.166Z puppeteer-cluster: Worker Error getting browser page (try: 0), message: Timeout hit: 5000
2024-02-28T14:53:53.166Z puppeteer-cluster: SingleBrowserImpl Repair requested

For information, here is how I used puppeteer, which was working:

const browser = await puppeteer.launch({
    args: [
        '--disable-gpu',
        '--disable-dev-shm-usage',
        '--disable-setuid-sandbox',
        '--no-first-run',
        '--no-sandbox',
        '--no-zygote',
        '--deterministic-fetch',
        '--disable-features=IsolateOrigins',
        '--disable-site-isolation-trials',
    ],
    headless: true,
    executablePath:
        process.env.NODE_ENV === "production"
            ? process.env.PUPPETEER_EXECUTABLE_PATH
            : puppeteer.executablePath(),
});

I tried keeping the executablePath option with puppeteer-cluster, but that doesn't work.
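To make the comparison concrete, here is a minimal sketch of how the same launch options could be shared between plain puppeteer and puppeteer-cluster. It is only an illustration, not necessarily the exact code I ran: `buildPuppeteerOptions` is a hypothetical helper name of my own, not part of either library, and the sketch assumes that whatever is placed under `puppeteerOptions` is handed to puppeteer's launch call, as the puppeteer-cluster README describes.

```javascript
// Hypothetical helper (not part of puppeteer or puppeteer-cluster):
// builds one launch-options object so both setups use the same Chrome flags.
function buildPuppeteerOptions(env = process.env) {
    const options = {
        headless: true,
        // Same flags as in the working puppeteer.launch() call.
        args: [
            '--disable-gpu',
            '--disable-dev-shm-usage',
            '--disable-setuid-sandbox',
            '--no-first-run',
            '--no-sandbox',
            '--no-zygote',
            '--deterministic-fetch',
            '--disable-features=IsolateOrigins',
            '--disable-site-isolation-trials',
        ],
    };
    // In production the Docker image provides the Chrome path through this variable.
    // Outside production the key is omitted, so puppeteer falls back to its default.
    if (env.NODE_ENV === 'production' && env.PUPPETEER_EXECUTABLE_PATH) {
        options.executablePath = env.PUPPETEER_EXECUTABLE_PATH;
    }
    return options;
}

// Intended usage (commented out so the sketch runs without the package installed):
// const { Cluster } = require('puppeteer-cluster');
// const cluster = await Cluster.launch({
//     concurrency: Cluster.CONCURRENCY_PAGE,
//     maxConcurrency: 20,
//     skipDuplicateUrls: true,
//     puppeteerOptions: buildPuppeteerOptions(),
// });
```

With this shape, the only difference between the two setups is the cluster-level settings (concurrency, maxConcurrency, skipDuplicateUrls), which should help narrow down where the behaviour on Render diverges.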

It's my first time coding a Node.js app, so maybe I missed something, but I can't find what. The puppeteer-cluster package is correctly installed, since everything works perfectly locally; I just can't find a way to make it work on Render. Maybe I should somehow add puppeteer-cluster to my Dockerfile, but I don't know how, and after testing many things I can't find a solution. Any help would be appreciated!
