I'm trying to open the hotel website www.booking.com and extract the name, price, location, and link from the top 50 search results, sorted cheapest first. I'm using Selenium with Python to automate the process. However, some HTML elements are targetable while others are not.
After inspecting the website, I realized that all hotel names have the class name: fcab3ed991 a23c043802
I tried to target all of them and put them into an array, as seen in my code below, but I can't seem to target the elements correctly. What am I doing wrong?
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
PATH = r"C:\Program Files (x86)\chromedriver.exe"
driver=webdriver.Chrome(PATH)
driver.get("https://www.booking.com/searchresults.html?label=gen173nr-1FCAEoggI46AdIM1gEaAKIAQGYATG4ARfIAQzYAQHoAQH4AQKIAgGoAgO4AvqR75YGwAIB0gIkZDQ4MTdjZDctYzIyNC00N2RlLWJhYjItZDU1YTAwMGU2M2Q12AIF4AIB&sid=8005d0cc6b75af8d0d2e74451b73cb8b&aid=304142&sb=1&sb_lp=1&src_elem=sb&error_url=https%3A%2F%2Fwww.booking.com%2Findex.html%3Flabel%3Dgen173nr-1FCAEoggI46AdIM1gEaAKIAQGYATG4ARfIAQzYAQHoAQH4AQKIAgGoAgO4AvqR75YGwAIB0gIkZDQ4MTdjZDctYzIyNC00N2RlLWJhYjItZDU1YTAwMGU2M2Q12AIF4AIB%26sid%3D8005d0cc6b75af8d0d2e74451b73cb8b%26sb_price_type%3Dtotal%26%26&ss=Jumeirah%2C+Dubai%2C+Dubai+Emirate%2C+United+Arab+Emirates&is_ski_area=&checkin_year=2022&checkin_month=8&checkin_monthday=1&checkout_year=2022&checkout_month=8&checkout_monthday=3&group_adults=2&group_children=0&no_rooms=1&map=1&b_h4u_keep_filters=&from_sf=1&ss_raw=jum&ac_position=1&ac_langcode=en&ac_click_type=b&dest_id=941&dest_type=district&place_id_lat=25.205553&place_id_lon=55.239216&search_pageview_id=c0ac477da63f02c2&search_pageview_id=c0ac477da63f02c2&search_selected=true&ac_suggestion_list_length=5&ac_suggestion_theme_list_length=0&order=price#map_closed")
try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CLASS_NAME, "d4924c9e74"))
    )
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CLASS_NAME, "fcab3ed991 a23c043802"))
    )
    names=element.find_elements_by_class_name("fcab3ed991 a23c043802")
except:
    driver.quit()
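By.CLASS_NAME accepts a single class name only, so the compound value fcab3ed991 a23c043802 cannot be located that way. To extract the text of the name and price fields, you can use a list comprehension together with the following locator strategies: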
Code block:
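A minimal sketch of the idea, not a definitive implementation: the space-separated class value is turned into a CSS selector with the classes joined by dots, and the element texts are collected with a list comprehension. The helper names `compound_class_to_css` and `texts_by_classes` and the `FakeDriver` stand-in are hypothetical and exist only so the sketch runs without a browser. The class value is the one from the question and may no longer match the live site.

```python
from types import SimpleNamespace

# Same string value as selenium.webdriver.common.by.By.CSS_SELECTOR.
CSS_SELECTOR = "css selector"


def compound_class_to_css(class_attr):
    """Turn a class attribute such as "a b" into the CSS selector ".a.b"."""
    return "".join("." + name for name in class_attr.split())


def texts_by_classes(driver, class_attr):
    """Return the text of every element that carries all the given classes."""
    selector = compound_class_to_css(class_attr)
    return [el.text for el in driver.find_elements(CSS_SELECTOR, selector)]


class FakeDriver:
    """Hypothetical stand-in for a WebDriver (assumption, not Selenium code)."""

    def find_elements(self, by, value):
        self.last_query = (by, value)  # remember what was asked for
        return [SimpleNamespace(text="Hotel A"), SimpleNamespace(text="Hotel B")]


fake = FakeDriver()
names = texts_by_classes(fake, "fcab3ed991 a23c043802")
print(fake.last_query[1])  # .fcab3ed991.a23c043802
print(names)
```

With a real session, the `driver` from the question would be passed in place of `FakeDriver`, after waiting for the results to load. The price field needs its own class value, which is not given here.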
Note: You have to add the following imports:
Console Output:
PS: Following this solution, you can extract the location and link values in the same way and dump everything in JSON format.
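The JSON step might look like the sketch below, assuming the four fields have already been collected as plain lists of strings. The function name `build_records`, the dictionary keys, and the sample values are all hypothetical placeholders, not data from the site.

```python
import json


def build_records(names, prices, locations, links):
    """Combine parallel lists into one dict per hotel.

    zip() stops at the shortest list, so a missing field on the page
    silently shortens the result; check the lengths first if that matters.
    """
    return [
        {"name": name, "price": price, "location": location, "link": link}
        for name, price, location, link in zip(names, prices, locations, links)
    ]


# Placeholder values standing in for scraped text (assumption).
records = build_records(
    ["Hotel A", "Hotel B"],
    ["AED 300", "AED 450"],
    ["Jumeirah", "Jumeirah"],
    ["https://example.com/a", "https://example.com/b"],
)

# ensure_ascii=False keeps currency symbols and non-Latin names readable.
print(json.dumps(records, indent=2, ensure_ascii=False))
```

For the links, the value would normally come from each anchor element's `get_attribute("href")` rather than from its `.text`.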