Web Scrape Returns Empty HTML Tag

52 Views Asked by toru At 03 November 2023 at 16:06

I can't seem to scrape the artworks from this website. The response contains the HTML tag I'm looking for, but the tag is empty.

I have not used web scraping tools that much and I am unsure what my problem is.

from bs4 import BeautifulSoup
import requests

url = "https://centerforbookarts.org/book-shop"

response = requests.get(url)

soup = BeautifulSoup(response.text, "lxml")
# soup = BeautifulSoup(response.text, "html.parser")
element = soup.find_all("section", {"class": "posts"})
print(element)

I also tried html.parser and Selenium, but I still can't get the data I need. The result is always an empty tag, even though in the browser this tag clearly holds all the information I am looking for.

There is 1 best solution below

Yevhen Kuzmovych On 03 November 2023 at 17:48

The information you are looking for is not initially present in the section tag. requests only downloads the HTML and does not run JavaScript, so the section stays empty. It is populated from <script> var posts = ... </script> (you can find it by searching for "posts" in the HTML of the page).
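As a quick way to check where a page keeps its data, a minimal sketch like the following lists the variables assigned in inline scripts. SAMPLE_HTML and inline_script_variables are hypothetical names used for illustration, and the sample markup only imitates the situation described here; the real page's source may look different.

```python
import re

# Hypothetical stand-in for the downloaded page source (assumption, not the real markup).
SAMPLE_HTML = """
<html><body>
<section class="posts"></section>
<script>
var posts = [{"title": "Book A"}, {"title": "Book B"}];
var other = [1, 2, 3];
</script>
</body></html>
"""


def inline_script_variables(html):
    """Return the names of variables assigned with `var NAME =` in inline <script> tags."""
    # Grab the body of every <script>...</script> element.
    bodies = re.findall(
        r"<script\b[^>]*>(.*?)</script>", html, flags=re.DOTALL | re.IGNORECASE
    )
    names = []
    for body in bodies:
        # Collect each identifier that follows `var` and precedes `=`.
        names.extend(re.findall(r"\bvar\s+([A-Za-z_$][\w$]*)\s*=", body))
    return names


print(inline_script_variables(SAMPLE_HTML))  # prints ['posts', 'other']
```

If a name such as posts shows up here while the matching section is empty, the data is most likely injected by JavaScript after the page loads.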

What you can do is find that script and extract the info from it directly as it is neatly stored in JSON:

from bs4 import BeautifulSoup
import requests
import re
import json
from pprint import pprint

url = "https://centerforbookarts.org/book-shop"

response = requests.get(url)

soup = BeautifulSoup(response.text, "lxml")


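# Find the inline <script> whose text mentions "posts".
script = str(soup.find("script", string=re.compile("posts")))

# Capture the JSON array (from the first "[" to the last "];") and parse it.
posts = json.loads(re.findall(r"(\[.*\]);", script)[0])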

pprint(posts[0])
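A greedy pattern like the one above can over-capture if another array follows on the same line of the script. A possible variant, sketched here under assumptions, locates the assignment and lets json.JSONDecoder.raw_decode parse exactly one JSON value from that position. SAMPLE_HTML and extract_js_array are hypothetical names, and the sample markup is made up to show the idea rather than copied from the site.

```python
import json
import re

# Hypothetical page source with two arrays on one line (assumption, not the real markup).
SAMPLE_HTML = """
<section class="posts"></section>
<script>
var posts = [{"title": "Book A", "tags": ["x", "y"]}, {"title": "Book B", "tags": []}]; var other = [1, 2, 3];
</script>
"""


def extract_js_array(html, var_name):
    """Return the JSON value assigned to `var <var_name> = ...`, or None if not found."""
    # Match up to and including the whitespace after "=", so the match ends at the value.
    match = re.search(r"\bvar\s+" + re.escape(var_name) + r"\s*=\s*", html)
    if match is None:
        return None
    # raw_decode parses a single JSON value starting at the given index and
    # ignores whatever follows it. It raises json.JSONDecodeError if the value
    # is not valid JSON (for example a JavaScript literal with unquoted keys).
    value, _end = json.JSONDecoder().raw_decode(html, match.end())
    return value


posts = extract_js_array(SAMPLE_HTML, "posts")
print(posts[0]["title"])  # prints Book A
```

With the real page, the same function could be called on response.text; whether the site's script stays valid JSON is an assumption worth checking.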
