I'm doing a small project to help my work go by faster. I currently have a program written in Python 3.2 that does almost all of the manual labour for me, with one exception. I need to log on to the company website (username and password) then choose a month and year and click download. I would like to write a little program to do that for me, so that the whole process is completely done by the program.
I have looked into it and I can only find tools for 2.X. I have looked into urllib and I know that some of the 2.X modules are now in urllib.request.
I have even found some code to start it off, however I'm confused as to how to put it into practice.
Here is what I have found:
import urllib2
theurl = 'http://www.someserver.com/toplevelurl/somepage.htm'
username = 'johnny'
password = 'XXXXXX'
# a great password
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
# this creates a password manager
passman.add_password(None, theurl, username, password)
# because we have put None at the start it will always
# use this username/password combination for urls
# for which `theurl` is a super-url
authhandler = urllib2.HTTPBasicAuthHandler(passman)
# create the AuthHandler
opener = urllib2.build_opener(authhandler)
urllib2.install_opener(opener)
# All calls to urllib2.urlopen will now use our handler
# Make sure not to include the protocol in with the URL, or
# HTTPPasswordMgrWithDefaultRealm will be very confused.
# You must (of course) use it when fetching the page though.
pagehandle = urllib2.urlopen(theurl)
# authentication is now handled automatically for us
All Credit to Michael Foord and his page: Basic Authentication
So I changed the code around a bit and replaced all the 'urllib2' with 'urllib.request'.
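A minimal sketch of what that conversion might look like in Python 3, assuming the same placeholder URL and credentials as the snippet above. The helper name make_basic_auth_opener is made up for illustration; the urllib.request classes are the Python 3 homes of the urllib2 ones.

```python
import urllib.request


def make_basic_auth_opener(top_level_url, username, password):
    """Return an opener that answers HTTP Basic Auth challenges for top_level_url."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # realm=None: use these credentials for any realm under top_level_url
    password_mgr.add_password(None, top_level_url, username, password)
    auth_handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(auth_handler)


# Placeholder values carried over from the original snippet
theurl = 'http://www.someserver.com/toplevelurl/somepage.htm'
opener = make_basic_auth_opener(theurl, 'johnny', 'XXXXXX')

# Fetching is then a single call (commented out because the URL is a placeholder):
# page = opener.open(theurl).read()
```

Note that this only helps if the site actually uses HTTP Basic Auth; a login form on a web page is a different mechanism.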
Then I learned how to open a webpage, figuring the program should open the page using the login and password data; after that I'll learn how to download the files from it.
ie = webbrowser.get('c:\\program files\\internet explorer\\iexplore.exe')
ie.open(theurl)
(I know Explorer is garbage, just using it to test then I'll be using Chrome ;) )
But that doesn't open the page with the login data entered, it simply opens the page as though you had typed in the URL.
How do I get it to open the page with the password handle? I sort of understand how Michael made them, but I'm not sure which to use to actually open the website.
Also, an afterthought: might I need to look into cookies?
Thanks for your time
You are getting things confused here. webbrowser is a wrapper around your actual web browser, and urllib is a library for HTTP- and URL-related stuff. They don't know about each other and serve very different purposes.
In former IE versions, you could encode the HTTP Basic Auth username and password in the URL like so:
http(s)://Username:Password@Server/Resource.ext
I believe Firefox and Chrome still support that; IE killed it: http://support.microsoft.com/kb/834489/EN-US
If you want to emulate a browser, rather than just open a real one, take a look at mechanize: http://wwwsearch.sourceforge.net/mechanize/
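On the cookies afterthought: if the site uses a login form rather than HTTP Basic Auth, the session is usually tracked with a cookie, and the Python 3 standard library may be enough on its own. Here is a minimal sketch under stated assumptions: the URLs, the form field names (username, password, month, year) and the helper names are all hypothetical, so read the real ones from the site's HTML forms.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Hypothetical URLs; replace with the real login and download endpoints
LOGIN_URL = 'https://www.example.com/login'
DOWNLOAD_URL = 'https://www.example.com/report'


def make_session_opener():
    """Return an opener that stores cookies and sends them back on later requests."""
    cookie_jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cookie_jar))


def encode_form(fields):
    """Encode a dict as an application/x-www-form-urlencoded POST body."""
    return urllib.parse.urlencode(fields).encode('utf-8')


def download_report(opener, username, password, month, year, filename):
    """Log in, request one month's report, and save the response to filename."""
    # Passing a data argument makes this a POST; any session cookie
    # the server sets is kept in the opener's cookie jar.
    opener.open(LOGIN_URL, encode_form({'username': username, 'password': password}))
    # The same opener sends the cookie back, so this request is authenticated.
    response = opener.open(DOWNLOAD_URL, encode_form({'month': month, 'year': year}))
    with open(filename, 'wb') as out:
        out.write(response.read())
```

Usage would be something like download_report(make_session_opener(), 'johnny', 'XXXXXX', 3, 2012, 'report.csv'). If the site relies on JavaScript or hidden form tokens, this plain approach will likely not be enough.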