What am I missing re: python/urllib2?
July 26, 2007 12:29 PM

Why can't I get Python's urllib2 library to fetch a page using HTTP basic authentication?

I'm playing around with Python and trying to use the urllib2 library to fetch a page that's password-protected with a simple .htaccess file.

From what I can tell the opener object, when it encounters a 401, is supposed to check the authentication object for a URI match, and if it finds one, retry the URL with the specified username/password.
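
As a rough illustration of that flow (not the setup from this question), the sketch below uses Python 3's urllib.request, the successor to urllib2, against a throwaway local server. ProtectedHandler, fetch_with_basic_auth, the `seen` list, and the 127.0.0.1 address are made up for the example; only the username, password, and realm come from the question.

```python
import base64
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

USER, PASSWORD = "admin", "the-password"
EXPECTED = "Basic " + base64.b64encode(f"{USER}:{PASSWORD}".encode()).decode()
seen = []  # Authorization header of each request, in arrival order


class ProtectedHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the .htaccess-protected directory."""

    def do_GET(self):
        auth = self.headers.get("Authorization")
        seen.append(auth)
        if auth == EXPECTED:
            body = b"secret contents\n"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # The challenge that triggers the auth handler on the client.
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="grrrr"')
            self.send_header("Content-Length", "0")
            self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_with_basic_auth():
    server = HTTPServer(("127.0.0.1", 0), ProtectedHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        base = "http://127.0.0.1:%d/" % server.server_port
        mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
        mgr.add_password(None, base, USER, PASSWORD)
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({}),  # ignore any proxy settings
            urllib.request.HTTPBasicAuthHandler(mgr),
        )
        with opener.open(base + "testbed/test.txt") as f:
            return f.read()
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_with_basic_auth())
    print(seen)
```

When the password manager has a matching entry, `seen` should record two requests for the single `open()` call: one without credentials and one retry carrying the Authorization header. The calling code never handles the 401 itself.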

That doesn't seem to be happening, and I'm not comfortable enough with Python to know the best way to debug this. I'm not even sure I'm approaching this right (am I supposed to handle the 401 errors myself?).

Working code is appreciated, but I'm clearly missing something obvious here, so an explanation of what's going on would be more appreciated.


My Code:

#!/usr/local/bin/python
import urllib2
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, "alanstorm.com/testbed", 'admin','the-password')
handler = urllib2.HTTPBasicAuthHandler(password_mgr)
opener = urllib2.build_opener(handler)
f = opener.open("http://alanstorm.com/testbed/password-test/test.txt")
print(f.read())


The page
http://alanstorm.com/testbed/password-test/test.txt

Username: admin
Password: the-password

The .htaccess

AuthUserFile /path/to/htpass
AuthName grrrr
AuthType Basic
require user admin


The Error Message


Traceback (most recent call last):
  File "./my-script.py", line 34, in ?
    f = opener.open("http://alanstorm.com/testbed/password-test/test.txt")
  File "/usr/local/lib/python2.4/urllib2.py", line 364, in open
    response = meth(req, response)
  File "/usr/local/lib/python2.4/urllib2.py", line 471, in http_response
    response = self.parent.error(
  File "/usr/local/lib/python2.4/urllib2.py", line 402, in error
    return self._call_chain(*args)
  File "/usr/local/lib/python2.4/urllib2.py", line 337, in _call_chain
    result = func(*args)
  File "/usr/local/lib/python2.4/urllib2.py", line 480, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 401: Authorization Required
posted by alana to Computers & Internet (6 answers total)
 
A URI includes the protocol.

Change that to
password_mgr.add_password(None, "http://alanstorm.com/testbed", 'admin','the-password')
posted by cmiller at 1:29 PM on July 26, 2007
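
For anyone who wants to check the URI matching without touching the network, here is a small sketch. It assumes Python 3's urllib.request (the successor to urllib2), so the Python 2.4 install from the question may behave differently. The lookup helper is made up for illustration; the URLs, credentials, and realm are the ones from the thread.

```python
import urllib.request

URL = "http://alanstorm.com/testbed/password-test/test.txt"


def lookup(registered_uri, url=URL):
    """Return the (user, password) the manager would offer for url."""
    mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    mgr.add_password(None, registered_uri, "admin", "the-password")
    # This is the lookup the basic-auth handler makes after a 401.
    # It is a pure in-memory match; no request is sent.
    return mgr.find_user_password("grrrr", url)


if __name__ == "__main__":
    for uri in ("alanstorm.com/testbed", "http://alanstorm.com/testbed"):
        print(uri, "->", lookup(uri))
```

If the lookup comes back as (None, None), the handler has nothing to retry with and the 401 is passed through as an HTTPError.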


Response by poster: No dice there. I tried it both with the http protocol and without. Same results.
posted by alana at 2:10 PM on July 26, 2007


Uh huh.
cmiller@zippy:~ $ python t.py 
i r in your secret dir

peeking at your files

cmiller@zippy:~ $ cat t.py 
#!/usr/local/bin/python
import urllib2
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, "http://alanstorm.com/testbed", 'admin','the-password')
handler = urllib2.HTTPBasicAuthHandler(password_mgr)
opener = urllib2.build_opener(handler)
f = opener.open("http://alanstorm.com/testbed/password-test/test.txt")
print(f.read())

posted by cmiller at 2:24 PM on July 26, 2007


Response by poster: Yeah, that's really weird.

I took your script, put it on my host's server, and still got the same error, which to me says hosed/old Python install.

Thanks for the sanity check!
posted by alana at 3:33 PM on July 26, 2007


Yo. I r in your secret dir, peeking at your files. Here's a working version of your script. See if that works. If it doesn't, show me the output.
posted by evariste at 9:15 PM on July 27, 2007


Hmm. cmiller's script works perfectly for me, too. But we approached things somewhat differently; I added a urllib2.install_opener call. I'd be curious to see if my script works on your host's server or not.
posted by evariste at 9:42 PM on July 27, 2007
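
evariste's script isn't reproduced in the thread, so the following is only a guess at the general shape of an install_opener approach, written against Python 3's urllib.request rather than urllib2. The install_basic_auth helper is a made-up name for the example.

```python
import urllib.request


def install_basic_auth(base_url, user, password):
    """Build a basic-auth opener and make it the process-wide default."""
    mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    mgr.add_password(None, base_url, user, password)
    opener = urllib.request.build_opener(
        urllib.request.HTTPBasicAuthHandler(mgr)
    )
    # After this call, plain urllib.request.urlopen() goes through `opener`.
    urllib.request.install_opener(opener)
    return opener


if __name__ == "__main__":
    install_basic_auth("http://alanstorm.com/testbed", "admin", "the-password")
    # The fetch itself needs network access and the 2007 test page to exist:
    # urllib.request.urlopen("http://alanstorm.com/testbed/password-test/test.txt")
```

The difference from calling opener.open() directly is scope: install_opener changes what urlopen() does everywhere in the process, while opener.open() affects only that one call.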

