10X Repeat - Scraping Google News Search Results Python Selenium

When I run this code to get the titles and links, I get 10 times as many results as I should. Any idea what I am doing wrong? Also, is there a way to stop scraping once the last result on the page has been reached?

Thanks!

while True:
    web = 'https://news.google.com/search?q=weather&hl=en-US&gl=US&ceid=US%3Aen'
    driver.get(web)
    time.sleep(3)
    
    titleContainers = driver.find_elements(by='xpath', value='//*[@class="DY5T1d RZIKme"]')
    linkContainers = driver.find_elements(by='xpath', value='//*[@class="DY5T1d RZIKme"]')
    
    if (len(titleContainers) != 0):
        for i in range(len(titleContainers)):
            counter = counter + 1
            print("Counter: " + str(counter))
            titles.append(titleContainers[i].text)
            links.append(linkContainers[i].get_attribute("href"))
            
    else:
        break

There is 1 best solution below

Barry the Platipus

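You have put yourself in an infinite loop with that 'while True' statement. The condition if (len(titleContainers) != 0): will always evaluate to True once the elements are found on the page (there are 100 of them), so the else: break branch is never reached and every pass reloads the same page and appends the same results again. You have also not posted your full code; I imagine that counter is an integer and that titles and links are lists defined somewhere earlier in your code. You may want to test that counter is less than or equal to the length of titleContainers, and stop once it is not.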
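One possible way to apply this is to drop the 'while True' loop altogether and load the page a single time; the for loop then ends by itself after the last matching element. The sketch below is only an illustration under a few assumptions: the function names scrape_once and extract_titles_and_links and the wait_seconds parameter are made up for this example, driver is assumed to be an already created Selenium WebDriver, and the class name in the XPath is copied from the question and may no longer match the live page.

```python
import time


def extract_titles_and_links(elements):
    """Read the text and href of each element exactly once."""
    titles, links = [], []
    for element in elements:
        titles.append(element.text)
        links.append(element.get_attribute("href"))
    return titles, links


def scrape_once(driver, url, wait_seconds=3):
    """Load the page a single time and collect every matching result.

    `driver` is assumed to be a Selenium WebDriver created by the caller.
    """
    driver.get(url)
    time.sleep(wait_seconds)  # crude wait, as in the question
    # XPath taken from the question; the class name is an assumption
    # about the page markup and may have changed.
    elements = driver.find_elements(by="xpath", value='//*[@class="DY5T1d RZIKme"]')
    # No outer loop: iterating over `elements` stops after the last result.
    return extract_titles_and_links(elements)
```

With a driver created in the usual way (for example driver = webdriver.Chrome()), this would be called as titles, links = scrape_once(driver, web), where web is the search URL from the question. If the outer loop is needed for some other reason, checking counter against len(titleContainers) as suggested above is another way to end it.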