I'm trying to loop through a job listing website to grab its job listings for text analysis. I'm using RSelenium for this. The code I'm working on is as follows:
#### REMOTE.COM ####
remDR$navigate('https://remote.com/jobs/all?query=marketing&country=anywhere')
# click on the cookies policy
remDR$findElement(using = 'xpath', '//*[@id="ccc-notify-accept"]')$clickElement()
# print all job listings
num_links <- 20
for(i in 1:num_links){
  remDR$findElement(using = 'xpath',
                    paste('/html/body/div[2]/main/div/div/div[3]/article[',i,']', sep = ''))$clickElement()
  print(remDR$getCurrentUrl())
  remDR$goBack()
}
The problem is that when I run the loop, two issues occur.
First, print(remDR$getCurrentUrl()) returns the original URL (https://remote.com/jobs/all?query=marketing&country=anywhere), not the URL of the listing that was just clicked. Second, when remDR$goBack() executes, it takes me back to the previous blank page, as if no link had been clicked.
To summarize, I think the loop is running ahead of RSelenium: the next commands execute before the element has been found and clicked and the new page has loaded.
EDIT
A solution was found thanks to a recommendation:
for(i in 1:5){
  remDR$findElement(using = 'xpath',
                    paste('/html/body/div[2]/main/div/div/div[3]/article[',i,']', sep = ''))$clickElement()
  Sys.sleep(2) # give the job page time to load
  print(remDR$getCurrentUrl())
  # navigate() works better than goBack() here because it makes the listing page load again
  remDR$navigate('https://remote.com/jobs/all?query=marketing&country=anywhere')
  Sys.sleep(2) # give the listing page time to load before the next click
}
The fix was to give Chrome time to load each page with Sys.sleep(2) and to use $navigate() instead of $goBack(), because $navigate() explicitly loads the page in the browser. Important note: the loop won't work without the final Sys.sleep(2), because the listing page must finish loading before the loop clicks the next item.
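
If the fixed two-second pauses turn out to be too slow or too short, one possible alternative is to poll for a condition instead of sleeping for a set time. The following is only a sketch under assumptions: wait_until() and visit_listings() are hypothetical helpers I made up for illustration (they are not part of RSelenium), and it assumes that clicking a listing changes the URL reported by getCurrentUrl(). The RSelenium calls used are findElements(), findElement(), clickElement(), getCurrentUrl() and navigate().

```r
# Hypothetical helper (not part of RSelenium): call `condition` repeatedly
# until it returns TRUE or `timeout` seconds have passed.
# Returns TRUE on success and FALSE on timeout; errors count as "not yet".
wait_until <- function(condition, timeout = 10, interval = 0.25) {
  deadline <- Sys.time() + timeout
  repeat {
    ok <- tryCatch(isTRUE(condition()), error = function(e) FALSE)
    if (ok) return(TRUE)
    if (Sys.time() >= deadline) return(FALSE)
    Sys.sleep(interval)
  }
}

# Hypothetical wrapper around the loop from the question.
# `remDR` is the remote driver, `list_url` the listing page, `n` the number of items.
visit_listings <- function(remDR, list_url, n) {
  urls <- character(0)
  for (i in seq_len(n)) {
    xpath <- paste0('/html/body/div[2]/main/div/div/div[3]/article[', i, ']')
    # wait for the i-th listing to be present; stop if it never appears
    found <- wait_until(function() {
      length(remDR$findElements(using = 'xpath', value = xpath)) > 0
    })
    if (!found) break
    before <- remDR$getCurrentUrl()[[1]]
    remDR$findElement(using = 'xpath', value = xpath)$clickElement()
    # assumption: the click changes the URL, so wait until it differs
    wait_until(function() remDR$getCurrentUrl()[[1]] != before)
    urls <- c(urls, remDR$getCurrentUrl()[[1]])
    remDR$navigate(list_url)
  }
  urls
}
```

It would be called as visit_listings(remDR, 'https://remote.com/jobs/all?query=marketing&country=anywhere', 5). Compared with Sys.sleep(2), polling returns as soon as the page is ready and gives up after the timeout rather than clicking too early, but I have not verified it against this particular site.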