Why does Python's urllib2.urlopen() raise an HTTPError for successful status codes?
According to the urllib2 documentation,
Because the default handlers handle redirects (codes in the 300 range), and codes in the 100-299 range indicate success, you will usually only see error codes in the 400-599 range.
And yet the following code
request = urllib2.Request(url, data, headers)
response = urllib2.urlopen(request)
raises an HTTPError with code 201 (created):
ERROR 2011-08-11 20:40:17,318 __init__.py:463] HTTP Error 201: Created
So why is urllib2 throwing HTTPErrors on this successful request?
It's not too much of a pain; I can easily extend the code to:
try:
    request = urllib2.Request(url, data, headers)
    response = urllib2.urlopen(request)
except urllib2.HTTPError, e:
    if e.code == 201:
        pass  # success! :)
    else:
        pass  # fail! :(
else:
    pass  # when will this happen...?
But this doesn't seem like the intended behavior, based on the documentation and the fact that I can't find similar questions about this odd behavior.
Also, what should the else block be expecting? If successful status codes are all interpreted as HTTPErrors, then when does urllib2.urlopen() just return a normal file-like response object like all the urllib2 documentation refers to?
Comments (3)
You can write a custom Handler class for use with urllib2 to prevent specific error codes from being raised as HTTPError. Here's one I've used before:
Then you can use it like:
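The answer's original snippet is not reproduced here. As a rough sketch of the idea, and not the answerer's actual code, a handler can define http_error_NNN methods that simply return the response. The class name AcceptSuccessCodesHandler and the choice of 201, 202 and 204 are assumptions made for illustration.

```python
try:
    import urllib2  # Python 2, as in the question
except ImportError:
    import urllib.request as urllib2  # same API under Python 3


class AcceptSuccessCodesHandler(urllib2.BaseHandler):
    """Hypothetical handler: hand back 201/202/204 responses instead of raising."""

    def http_error_201(self, request, response, code, msg, hdrs):
        # Returning the response ends the error chain before the default
        # error handler gets a chance to raise HTTPError.
        return response

    # Reuse the same behaviour for other success codes that should not raise.
    http_error_202 = http_error_201
    http_error_204 = http_error_201


# Build an opener that includes the handler, then make urlopen() use it.
opener = urllib2.build_opener(AcceptSuccessCodesHandler)
urllib2.install_opener(opener)
```

After install_opener(), plain urllib2.urlopen() calls go through this opener, so the code from the question should not need to change.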
As the actual library documentation mentions:
http://docs.python.org/library/urllib2.html#httperrorprocessor-objects
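To make the linked section concrete: as far as I know, HTTPErrorProcessor is the piece that decides which status codes get passed to the error handlers. Older urllib2 releases (Python 2.5 and earlier) accepted only 200 and 206, while later releases accept the whole 2xx range. The sketch below paraphrases those two checks. The function names are made up for illustration, and this is not the library source.

```python
def accepted_by_old_check(code):
    # Paraphrase of the stricter test: only 200 and 206 count as success.
    return code in (200, 206)


def accepted_by_2xx_check(code):
    # Paraphrase of the later test: any 2xx code counts as success.
    return 200 <= code < 300


# Compare how each check treats a few status codes.
for code in (200, 201, 204, 404):
    print("%d old=%s 2xx=%s" % (code,
                                accepted_by_old_check(code),
                                accepted_by_2xx_check(code)))
```

Under the stricter check a 201 is handed to the error handlers, which would explain the "HTTP Error 201: Created" message in the question if it was run on such a release.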
I personally think it was a mistake and very nonintuitive for this to be the default behavior.
It's true that non-2XX codes imply a protocol-level error, but turning that into an exception goes too far (in my opinion at least).
In any case, I think the most elegant way to avoid this is:
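The snippet from the original answer is not reproduced here. One way to get that behavior, as a sketch rather than the answerer's code, is to swap in an HTTPErrorProcessor subclass that returns every response untouched. The names PassThroughErrorProcessor and fetch are placeholders. This assumes you are willing to handle redirects yourself: the default redirect handling is triggered through the same error path, so it is bypassed too.

```python
try:
    import urllib2  # Python 2, as in the question
except ImportError:
    import urllib.request as urllib2  # same API under Python 3


class PassThroughErrorProcessor(urllib2.HTTPErrorProcessor):
    """Hypothetical processor: never turn a status code into an exception."""

    def http_response(self, request, response):
        # Skip the status check entirely; the caller inspects the code.
        return response

    https_response = http_response


# Passing a subclass of a default handler makes build_opener() use it
# in place of the stock HTTPErrorProcessor.
opener = urllib2.build_opener(PassThroughErrorProcessor)


def fetch(url, data=None):
    """Return the response object whatever its status code (3xx included)."""
    return opener.open(url, data)
```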
Now you have the response object. You can check its status code, headers, body, etc.
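Another option besides changing the opener, and not part of the original answer, relies on the fact that an HTTPError instance is itself a response-like object with a code, headers and a readable body. A minimal sketch, with open_or_error as a made-up helper name and assuming Python 2.6 or later for the "except ... as" syntax:

```python
try:
    import urllib2  # Python 2, as in the question
except ImportError:
    import urllib.request as urllib2  # same API under Python 3


def open_or_error(request, open_func=urllib2.urlopen):
    """Return the response, or the HTTPError itself if one is raised."""
    try:
        return open_func(request)
    except urllib2.HTTPError as err:
        # HTTPError exposes .code and .read() like a normal response,
        # so the caller can branch on the status code either way.
        return err
```

The caller then checks result.code instead of wrapping every call in its own try/except.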