为什么 urllib.urlopen(url) 失败而 urllib2.urlopen(url) 有效。服务器响应具体是什么导致了这种情况?

发布于 2025-01-08 14:37:26 字数 1436 浏览 0 评论 0原文

我只是想更好地了解这里发生的情况,我当然可以使用 urllib2“解决”这个问题。

import urllib
import urllib2

url = "http://www.crutchfield.com/S-pqvJFyfA8KG/p_15410415/Dynamat-10415-Xtreme-Speaker-Kit.html"

# urllib2 works fine (foo.headers / foo.read() also behave)
foo = urllib2.urlopen(url)

# urllib throws errors though, what specifically is causing this?
bar = urllib.urlopen(url)

http://pae.st/AxDW/ 显示了此代码的运行情况以及异常/堆栈跟踪。 foo.headersfoo.read() 工作正常

[电子邮件受保护] ~ $curl -I “http://www.crutchfield.com/S-pqvJFyfA8KG/p_15410415/Dynamat-10415-Xtreme-Speaker-Kit.html”

HTTP/1.1 302 Object Moved
Cache-Control: private
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8
Location: /S-FSTWJcduy5w/p_15410415/Dynamat-10415-Xtreme-Speaker-Kit.html
Server: Microsoft-IIS/7.5
Set-Cookie: SESSIONID=FSTWJcduy5w; domain=.crutchfield.com; expires=Fri, 22-Feb-2013 22:06:43 GMT; path=/
Set-Cookie: SYSTEMID=0; domain=.crutchfield.com; expires=Fri, 22-Feb-2013 22:06:43 GMT; path=/
Set-Cookie: SESSIONDATE=02/23/2012 17:07:00; domain=.crutchfield.com; expires=Fri, 22-Feb-2013 22:06:43 GMT; path=/
X-AspNet-Version: 4.0.30319
HostName: cws105
Date: Thu, 23 Feb 2012 22:06:43 GMT

谢谢。

I just want a better idea of what's going on here, I can of course "work around" the problem by using urllib2.

import urllib
import urllib2

url = "http://www.crutchfield.com/S-pqvJFyfA8KG/p_15410415/Dynamat-10415-Xtreme-Speaker-Kit.html"

# urllib2 works fine (foo.headers / foo.read() also behave)
foo = urllib2.urlopen(url)

# urllib throws errors though, what specifically is causing this?
bar = urllib.urlopen(url)

http://pae.st/AxDW/ shows this code in action with the exception/stacktrace. foo.headers and foo.read() work fine

[email protected] ~ $: curl -I "http://www.crutchfield.com/S-pqvJFyfA8KG/p_15410415/Dynamat-10415-Xtreme-Speaker-Kit.html"

HTTP/1.1 302 Object Moved
Cache-Control: private
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8
Location: /S-FSTWJcduy5w/p_15410415/Dynamat-10415-Xtreme-Speaker-Kit.html
Server: Microsoft-IIS/7.5
Set-Cookie: SESSIONID=FSTWJcduy5w; domain=.crutchfield.com; expires=Fri, 22-Feb-2013 22:06:43 GMT; path=/
Set-Cookie: SYSTEMID=0; domain=.crutchfield.com; expires=Fri, 22-Feb-2013 22:06:43 GMT; path=/
Set-Cookie: SESSIONDATE=02/23/2012 17:07:00; domain=.crutchfield.com; expires=Fri, 22-Feb-2013 22:06:43 GMT; path=/
X-AspNet-Version: 4.0.30319
HostName: cws105
Date: Thu, 23 Feb 2012 22:06:43 GMT

Thanks.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

樱花落人离去 2025-01-15 14:37:26

该服务器既不确定又对 HTTP 版本敏感。 urllib2 是 HTTP/1.1,urllib 是 HTTP/1.0。您可以通过运行 curl --http1.0 -I "http://www.crutchfield.com/S-pqvJFyfA8KG/p_15410415/Dynamat-10415-Xtreme-Speaker-Kit.html" 来重现此内容
连续几次。您应该偶尔会看到输出 curl: (52) Empty returned from server;这就是 urllib 报告的错误。 (如果您使用 urllib 多次重新发出请求,有时应该会成功。)

This server is both non-deterministic and sensitive to HTTP version. urllib2 is HTTP/1.1, urllib is HTTP/1.0. You can reproduce this by running curl --http1.0 -I "http://www.crutchfield.com/S-pqvJFyfA8KG/p_15410415/Dynamat-10415-Xtreme-Speaker-Kit.html"
a few times in a row. You should see the output curl: (52) Empty reply from server occasionally; that's the error urllib is reporting. (If you re-issue the request a bunch of times with urllib, it should succeed sometimes.)

没企图 2025-01-15 14:37:26

我解决了问题。我现在只是使用 urrlib 而不是 urllib2,一切正常,谢谢大家:)

I solved the Problem. I simply using now the urrlib instead of urllib2 and anything works fine thank you all :)

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文