On the page linked below, I am trying to count the total number of games in the 2023-24 Regular Season table.
https://www.basketball-reference.com/players/j/jokicni01/gamelog/2024
I have those elements stored in the variable total_games. My issue is that when I run print(len(total_games)), I get an output of 113.
total_games = driver.find_elements_by_xpath('//tbody/tr[@id and @data-row]')
print(len(total_games))
I have manually inspected the elements on the page and searched for //tbody/tr[@id and @data-row], and the search shows only 66 matches (accurate as of Mar 19, 2024; the count will increase as the season continues but should never exceed 82). Can anyone tell me where all of those extra entries are coming from when I run this in PyCharm?
I have also tried total_games = driver.find_elements(By.XPATH, '//tbody/tr[@id and @data-row]'), but I get the same result. I also tried making the XPath more specific with the following two lines, but with either of them the script returns a length of 0 for total_games. In both cases, searching for the same XPath manually in the browser returns the correct rows.
total_games = driver.find_elements(By.XPATH, '//table[@id="pgl_basic"]/tbody/tr[@id and @data-row]')
and
total_games = driver.find_elements(By.XPATH, '//tbody/tr[contains(@id, "pgl_basic") and @data-row]')
So this was a weird one. The URL was correct, but even though you could see the browser navigate to the correct page, the script was still collecting the elements from the previous page. That would presumably also explain why the XPaths tied to pgl_basic matched nothing: that table does not exist on the previous page. I added a WebDriverWait that waits for a specific element on the page I needed before collecting the elements, and now it works.