Python problem with FancyURLopener, 401, and "Connection: close"
I'm new to Python, so forgive me if I am missing something obvious.
I am using urllib.FancyURLopener to retrieve a web document. It works fine when authentication is disabled on the web server, but fails when authentication is enabled.
My guess is that I need to subclass urllib.FancyURLopener to override the get_user_passwd() and/or prompt_user_passwd() methods. So I did:
    import urllib

    class my_opener(urllib.FancyURLopener):
        # Redefine get_user_passwd() to return fixed credentials
        def get_user_passwd(self, host, realm, clear_cache=0):
            print "get_user_passwd() called; host %s, realm %s" % (host, realm)
            return ('name', 'password')
Then I attempt to open the page:
    try:
        opener = my_opener()
        f = opener.open('http://1.2.3.4/whatever.html')
        content = f.read()
        print "Got it: ", content
    except IOError:
        print "Failed!"
I expect FancyURLopener to handle the 401, call my get_user_passwd(), and retry the request.
It does not; I get the IOError exception when I call "f = opener.open()".
Wireshark tells me that the request is sent, and that the server is sending a "401 Unauthorized" response with two headers of interest:
    WWW-Authenticate: BASIC
    Connection: close
The connection is then closed, I catch my exception, and it's all over.
It fails the same way even if I retry the "f = opener.open()" after IOError.
I have verified that my my_opener class is actually being used by overriding the http_error_401() method with a simple "print 'Got 401 error'". I have also tried overriding the prompt_user_passwd() method, but it is never called either.
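One possible explanation for the IOError, offered as a hedged sketch rather than a confirmed diagnosis: in the CPython versions I am aware of, the stock FancyURLopener.http_error_401() parses the WWW-Authenticate header with a regular expression that requires a realm="..." parameter, and it falls back to http_error_default() (which raises IOError) when the header does not match. The pattern below is copied from that method as I recall it, so treat it as an assumption and check it against the urllib source of your own Python version. parse_challenge is a hypothetical helper name, and "Restricted" is a made-up realm used only for contrast.

```python
import re

# Assumption: this is the pattern used inside FancyURLopener.http_error_401().
# Verify it against the urllib source shipped with your interpreter.
CHALLENGE_RE = re.compile(r'[ \t]*([^ \t]+)[ \t]+realm="([^"]*)"')

def parse_challenge(header_value):
    """Return (scheme, realm) if the header matches the pattern, else None."""
    match = CHALLENGE_RE.match(header_value)
    return match.groups() if match else None

# Header exactly as reported by Wireshark in the question: no realm parameter.
print(parse_challenge('BASIC'))

# A challenge that carries a realm (hypothetical value).
print(parse_challenge('Basic realm="Restricted"'))
```

If this is what is happening, a bare "WWW-Authenticate: BASIC" header would never reach get_user_passwd() or prompt_user_passwd(), which would match the behaviour described above. The code runs on both Python 2 and Python 3.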
I see no way to proactively specify the user name and password.
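On specifying credentials proactively: Basic authentication is just an Authorization header whose value is "Basic " followed by the base64 encoding of "user:password", so the value can be built by hand and sent with the first request. The sketch below only constructs that header value. basic_auth_header is a hypothetical helper name, and 'name' and 'password' are the placeholder credentials from the question.

```python
import base64

def basic_auth_header(user, password):
    """Build the value of an HTTP Basic 'Authorization' header."""
    # Encode "user:password" to bytes, then base64-encode it.
    credentials = ('%s:%s' % (user, password)).encode('utf-8')
    token = base64.b64encode(credentials).decode('ascii')
    return 'Basic ' + token

header_value = basic_auth_header('name', 'password')
print(header_value)
```

In Python 2, urllib.URLopener provides an addheader() method, so calling opener.addheader('Authorization', header_value) before opener.open(...) should attach the header to requests made through that opener. I have not tested this against the server in the question, so treat it as a starting point.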
So how do I get urllib to retry the request?
Thanks.
Comments (1)
I just tried your code against my own web server (nginx) and it works as expected:
The server responds with HTTP/1.1 401 Unauthorized plus headers.
The client tries again with an Authorization header.
The server responds with 200 OK and the content.
So I guess your code is right (I tried it with Python 2.7.1), and maybe the web server you are trying to access is not working as expected. Here is the code, tested against the free HTTP basic auth test site browserspy.dk (they seem to be using Apache, and the code works as expected):