I'd like to recreate the python script below which uses mechanize and http.cookiejar in R. I thought it would be straight forward using rvest but I was unable to do so. Any insight on which packages to use and apply would be extremely helpful. I realize reticulate may be a possibility but I figure that there has to be a way to do this in R that is straight forward.
import mechanize
import http.cookiejar
b = mechanize.Browser()
b.set_handle_refresh(True)
b.set_debug_redirects(True)
b.set_handle_redirect(True)
b.set_debug_http(True)
cj = http.cookiejar.CookieJar()
b.set_cookiejar(cj)
b.addheaders = [
('User-agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.94 Safari/537.36'),
('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'),
('Host', 'www.fangraphs.com'),
('Referer', 'https://www.fangraphs.com/auctiontool.aspx?type=pit&proj=atc&pos=1,1,1,1,5,1,1,0,0,1,5,5,0,18,0&dollars=400&teams=12&mp=5&msp=5&mrp=5&mb=1&split=&points=c|0,1,2,3,4,5|0,1,2,3,4,5&lg=MLB&rep=0&drp=0&pp=C,SS,2B,3B,OF,1B&players=')
]
b.open("https://www.fangraphs.com/auctiontool.aspx?type=pit&proj=atc&pos=1,1,1,1,5,1,1,0,0,1,5,5,0,18,0&dollars=400&teams=12&mp=5&msp=5&mrp=5&mb=1&split=&points=c|0,1,2,3,4,5|0,1,2,3,4,5&lg=MLB&rep=0&drp=0&pp=C,SS,2B,3B,OF,1B&players=")
def is_form1_form(form):
return "id" in form.attrs and form.attrs['id'] == "form1"
b.select_form(predicate=is_form1_form)
b.form.find_control(name='__EVENTTARGET').readonly = False
b.form.find_control(name='__EVENTARGUMENT').readonly = False
b.form['__EVENTTARGET'] = 'AuctionBoard1$cmdCSV'
b.form['__EVENTARGUMENT'] = ''
print(b.submit().read())
The R code I was using to attempt to recreate this with rvest is below. The comments indicate the main source of my confusion. In particular the needed fields grabbed by the python code were not showing up when I grabbed the form with rvest and when I tried to manually insert them I got a Connection Refused upon submitting.
library(rvest)
atc.pitcher.link = "https://www.fangraphs.com/auctiontool.aspx?type=pit&proj=atc&pos=1,1,1,1,5,1,1,0,0,1,5,5,0,18,0&dollars=400&teams=12&mp=5&msp=5&mrp=5&mb=1&split=&points=c|0,1,2,3,4,5|0,1,2,3,4,5&lg=MLB&rep=0&drp=0&pp=C,SS,2B,3B,OF,1B&players="
proj.data = html_session(atc.pitcher.link)
form.unfilled = proj.data %>% html_node("form") %>% html_form()
# note: I am suprised "__EVENTTARGET" and "__EVENTARGUMENT" are not included as attributes of the unfilled form. I can select them in the posted python script.
# If I try and create them with the appropriate values I get a Connection Refused Error.
form.unfilled[[5]]$`__EVENTTARGET` = form.unfilled[[5]]$`__VIEWSTATE`
form.unfilled[[5]]$`__EVENTARGUMENT`= form.unfilled[[5]]$`__VIEWSTATE`
form.unfilled[[5]]$`__EVENTTARGET`$readonly = FALSE
form.unfilled[[5]]$`__EVENTTARGET`$value = "AuctionBoard1$cmdCSV"
form.unfilled[[5]]$`__EVENTARGUMENT`$value = ""
form.unfilled[[5]]$`__EVENTARGUMENT`$readonly = FALSE
form.filled = form.unfilled
session = submit_form(proj.data, form.filled)
Here is a way to do it using RSelenium and setting chrome to be headless an enabling remote download to your working directory. It automatically brings up a headless browser and then lets the code drive it.
I believe to do the equivalent in rvest you need to write some native phantomjs.