I'm trying to develop a small crawling/scraping project with Crawlee and Playwright in JavaScript/TypeScript. For each URL I feed the crawler, it tries to scrape some data like this:
productDescriptionContainer = await page.locator(
    'div[class="product-details__product-description"]'
),
region = await productDescriptionContainer
    .locator("p")
    .filter({ hasText: "Región:" })
    .textContent(),
farm = await productDescriptionContainer
    .locator("p")
    .filter({ hasText: "Finca:" })
    .textContent(),
The problem comes when one of the locators is not found on the page. The crawler retries 3 times and then completely stops the scraping process for that specific URL. I would like to set those variables to a default value when the locator is not found and continue with the next one.
I hope you can shed some light on this because I've run out of ideas (catching the error, using ||, initialising the variables...). Thank you in advance.
Here is the solution I found.
Add a .catch to the awaited call so that no error is thrown and the code continues.
In your sample, it would look like this:
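A minimal sketch of the .catch idea, not necessarily the exact code from the original answer. The helper name textOrDefault, the TextSource interface, the "unknown" default and the 2000 ms timeout are all my own assumptions for illustration. TextSource only describes the one method used here, which a Playwright Locator already provides, so a real locator can be passed in directly. The two stand-in objects at the bottom are fakes so the snippet runs without a browser.

```typescript
// Hypothetical minimal shape: anything with a Playwright-style textContent().
// A real Playwright Locator satisfies this structurally.
interface TextSource {
  textContent(options?: { timeout?: number }): Promise<string | null>;
}

// Hypothetical helper: resolve to the element text, or to `fallback` when the
// call rejects (e.g. the locator times out) or yields null/empty text.
async function textOrDefault(
  source: TextSource,
  fallback: string,
  timeout = 2000, // assumed value; shorter than waiting out the default timeout
): Promise<string> {
  // .catch() turns the rejection into a plain value, so nothing is thrown
  // and the request handler keeps going instead of being retried.
  const text = await source.textContent({ timeout }).catch(() => null);
  return text?.trim() || fallback;
}

// Stand-ins (assumptions) that mimic a found and a missing element.
const found: TextSource = {
  textContent: async () => " Región: Huila ",
};
const missing: TextSource = {
  textContent: async () => {
    throw new Error("Timeout 2000ms exceeded");
  },
};

// In the crawler's request handler this would be used roughly as:
//   const container = page.locator('div[class="product-details__product-description"]');
//   const region = await textOrDefault(
//     container.locator("p").filter({ hasText: "Región:" }), "unknown");
//   const farm = await textOrDefault(
//     container.locator("p").filter({ hasText: "Finca:" }), "unknown");
```

Passing a short timeout matters because textContent() waits for the element before it fails, so without it every missing field would hold up the page for the full default timeout before the fallback kicks in.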