from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.support.ui import WebDriverWait

elemental_list = []
driver = webdriver.Chrome()
for page in range(1, 21):
    page_url = "https://www.fastexpert.com/top-real-estate-agents/florida/?page=" + str(page)
    driver.get(page_url)
    WebDriverWait(driver, 60).until(expected_conditions.presence_of_all_elements_located((By.CSS_SELECTOR, "section.TOP25AGENTSE10.RTAGENTCOLUMN div.TOPREPT_RRT")))
    for agent in range(1, 25):
        driver.find_element(By.TAG_NAME, "h3").click()  # finding the link on the page
        driver.find_element(By.TAG_NAME, "h1")  # finding the name on the personal page
        driver.find_element(By.ID, "my_map_adress")  # finding the location on the personal page
    agents = BeautifulSoup(driver.page_source, 'lxml').find('section', {'class': 'TOP25AGENTSE10 RTAGENTCOLUMN'}).find_all('div', {'class': 'TOPREPT_RRT'})
    for agent in agents:
        elemental_list.append((agent.find('h1').text.strip(), agent.find({'id': 'my_map_adress'}).text.strip())) if agent.find({'class': 'my_map_adress'}) else elemental_list.append((agent.find('h1').text, ''))
for element in elemental_list:
    print(element)
driver.quit()
I'm trying to scrape data from that website. The goal is to click each link and scrape the name and the address. After clicking all 25 links on page 1, it should loop through all 20 pages and do the same. I think my logic is right, but I'm stuck. What am I not seeing that breaks my code? The error is:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"[id="my_map_adress"]"}
(Session info: chrome=114.0.5735.199); For documentation on this error, please visit: https://www.selenium.dev/documentation/webdriver/troubleshooting/errors#no-such-element-exception
Your code has semantic issues in multiple places.
In the inner loop (for agent in range(1, 25), which by the way runs only 24 times, not 25), you're trying to click the first person's profile over and over, which in turn has multiple issues: an h3 tag isn't clickable, and your code for fetching the details sits outside this loop, so it would have taken the details from the last opened page only.
Issues aside, I noticed that the website is static, so you need not use Selenium unless there's some other constraint, and I rewrote the code with requests. At the line
page.find_all('a', {'class': 'profileLink', 'href': re.compile(r'/agents/')})
I am fetching all the anchors that have the given class and an href matching the given pattern.
You can change the number of pages in the loop definition.
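For reference, a minimal sketch of what the requests-based version could look like. It assumes the profileLink class and /agents/ href pattern from the answer, plus the h1 and my_map_adress selectors from the question (spelled as on the site), all of which may have changed since; the User-Agent header and timeout are my additions, not part of the original answer.

```python
import re

import requests
from bs4 import BeautifulSoup

# Plain requests traffic is often blocked, so send a browser-like User-Agent.
HEADERS = {"User-Agent": "Mozilla/5.0"}


def profile_links(listing_html):
    """Collect profile URLs from a listing page: anchors with the
    'profileLink' class whose href contains '/agents/'."""
    page = BeautifulSoup(listing_html, "html.parser")
    anchors = page.find_all("a", {"class": "profileLink", "href": re.compile(r"/agents/")})
    return [a["href"] for a in anchors]


def name_and_address(profile_html):
    """Pull the agent's name (<h1>) and address (id='my_map_adress')
    from a profile page, falling back to '' if either is missing."""
    page = BeautifulSoup(profile_html, "html.parser")
    name = page.find("h1")
    address = page.find(id="my_map_adress")
    return (name.text.strip() if name else "",
            address.text.strip() if address else "")


def scrape(pages=20):
    """Walk the paginated listing and return (name, address) tuples."""
    results = []
    for page_num in range(1, pages + 1):
        url = ("https://www.fastexpert.com/top-real-estate-agents/"
               f"florida/?page={page_num}")
        listing = requests.get(url, headers=HEADERS, timeout=30)
        for href in profile_links(listing.text):
            profile = requests.get(href, headers=HEADERS, timeout=30)
            results.append(name_and_address(profile.text))
    return results
```

The parsing is kept in separate helpers so it can be checked against static HTML without hitting the network; only scrape() makes requests.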