Python mechanize: response.read() after submit() doesn't return the correct page?


Hi all! I am trying to do some data processing on a publicly accessible database (property tax records, I know, not very glamorous!). The code runs without errors, but it doesn't return the right page. When I enter the information into the forms manually and click "submit", I get a lovely page full of relevant data. But when I submit the form via Python mechanize, I do not get the same page. For example:

import mechanize

brwsr = mechanize.Browser()
brwsr.open("https://sdat.dat.maryland.gov/RealProperty/Pages/default.aspx")
brwsr.select_form(nr = 0)
# <option selected="selected" value="02">ANNE ARUNDEL COUNTY</option>
brwsr.form["ctl00$cphMainContentArea$ucSearchType$wzrdRealPropertySearch$ucSearchType$ddlCounty"] = ['02'] 
# STREET ADDRESS
# <option selected="selected" value="01">STREET ADDRESS</option>
brwsr.form["ctl00$cphMainContentArea$ucSearchType$wzrdRealPropertySearch$ucSearchType$ddlSearchType"] = ['01']
brwsr.submit()
# this works wonderfully!

# this is now the second page
brwsr.select_form(nr = 0)
# street number
brwsr.form["ctl00$cphMainContentArea$ucSearchType$wzrdRealPropertySearch$ucEnterData$txtStreenNumber"] = '523' # just an example
# street name
brwsr.form["ctl00$cphMainContentArea$ucSearchType$wzrdRealPropertySearch$ucEnterData$txtStreetName"] = 'SIXTH' # just an example
response = brwsr.submit()
print(response.read())

This raises no errors at all, but part of the HTML returned is the following:

<img src="/RealProperty/egov/img/ajax-loader.gif" id="imgLoader" alt="Loading... Please Wait." title="Loading... Please Wait." />\r\n 

Hmmmmmmm... "Loading"? "Please wait"? The second page seems to be dynamically generated, i.e., maybe it is running scripts that haven't finished by the time the response comes back. I've been reading and poking, and people seem to say that mechanize is the wrong tool for this kind of job. But what is the right tool? I feel like it ought to be straightforward to parse the data on that second page... if I can just get my hands on it!

Thank you for your time!

Best, Susan

Ahmed Mohamed On

Try brwsr.response().read() (note that response is a method on the browser object, so it needs the parentheses):

import mechanize

brwsr = mechanize.Browser()
brwsr.open("https://sdat.dat.maryland.gov/RealProperty/Pages/default.aspx")
brwsr.select_form(nr = 0)
# <option selected="selected" value="02">ANNE ARUNDEL COUNTY</option>
brwsr.form["ctl00$cphMainContentArea$ucSearchType$wzrdRealPropertySearch$ucSearchType$ddlCounty"] = ['02']
# STREET ADDRESS
# <option selected="selected" value="01">STREET ADDRESS</option>
brwsr.form["ctl00$cphMainContentArea$ucSearchType$wzrdRealPropertySearch$ucSearchType$ddlSearchType"] = ['01']
brwsr.submit()
# this works wonderfully!

# this is now the second page
brwsr.select_form(nr = 0)
# street number
brwsr.form["ctl00$cphMainContentArea$ucSearchType$wzrdRealPropertySearch$ucEnterData$txtStreenNumber"] = '523' # just an example
# street name
brwsr.form["ctl00$cphMainContentArea$ucSearchType$wzrdRealPropertySearch$ucEnterData$txtStreetName"] = 'SIXTH' # just an example
brwsr.submit()
# response() returns a copy of the browser's current response
print(brwsr.response().read())
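
To tell whether what comes back is the results page or still the "Loading..." placeholder, it may help to inspect the returned HTML rather than eyeball it. The following is only a rough sketch: diagnose_page is a hypothetical helper (not part of mechanize), and it assumes that the id="imgLoader" image from the question's output is a usable hint. That image may well be present on real results pages too, so also check for text you expect in the results, such as the street name.

```python
def diagnose_page(html, expected_terms=()):
    """Report whether the loader image and any expected result terms appear.

    Hypothetical helper: `html` is whatever read() returned (bytes or str).
    """
    if isinstance(html, bytes):
        # Decode leniently; the exact page encoding is not known here.
        html = html.decode("utf-8", errors="replace")
    lowered = html.lower()
    return {
        # Assumption: the loader <img> carries id="imgLoader", as in the question.
        "has_loader": 'id="imgloader"' in lowered,
        # Case-insensitive check for text expected on the results page.
        "found_terms": [t for t in expected_terms if t.lower() in lowered],
    }
```

Pass it the bytes returned by read() along with a few strings you would expect in the results, for example ["SIXTH"]. If has_loader is true and found_terms is empty, the server most likely did not return the search results in that response.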