Skip a URL if it times out
I have a list of URLs,
and I am using the following to retrieve their contents:
import urllib2

for url in url_list:
    req = urllib2.Request(url)
    resp = urllib2.urlopen(req, timeout=5)
    resp_page = resp.read()
    print resp_page
When a request times out, the program crashes. I just want to move on to the next URL (when there is a socket.timeout: timed out). How can I do this?
Thanks
Comments (3)
Although there is already an answer, I'd like to point out that urllib2 might not be solely responsible for this behavior. As has been pointed out (and as the problem description also suggests), the exception may belong to the socket library. In that case, just add another except clause for socket.timeout.
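A minimal sketch of that extra except clause, written so it runs on Python 2 (urllib2, as in the question) or Python 3 (urllib.request). The fetch_all name, its return shape, and the opener parameter are hypothetical and only there for illustration:

```python
import socket

try:  # Python 2, as in the question
    from urllib2 import URLError, urlopen
except ImportError:  # Python 3 equivalent
    from urllib.error import URLError
    from urllib.request import urlopen


def fetch_all(url_list, timeout=5, opener=urlopen):
    """Return (pages, skipped); URLs that fail or time out end up in skipped.

    fetch_all and the opener parameter are hypothetical names for this sketch.
    """
    pages, skipped = {}, []
    for url in url_list:
        try:
            resp = opener(url, timeout=timeout)
            pages[url] = resp.read()
        except URLError as err:
            # Raised by urlopen, e.g. when the connection cannot be made
            skipped.append((url, err))
        except socket.timeout as err:
            # Raised by the socket layer itself, e.g. while waiting for data
            skipped.append((url, err))
    return pages, skipped
```

Calling pages, skipped = fetch_all(url_list) then leaves the loop running past any URL that times out, and skipped records which ones were dropped and why.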
I'm going to go ahead and assume that by "crashes" you mean "raises a URLError", as described in the urllib2.urlopen docs. See the Errors and Exceptions section of the Python Tutorial.
Sounds like you just need to catch the timeout exception. I don't get the socket.timeout message that you do.
Obviously, you need a URL that will actually time out (127.0.0.2 may not do so on your box).
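If no suitable URL is at hand, one way to get a timeout on demand is to point the request at a local socket that queues connections but never replies. This is only a sketch under assumptions: it presumes loopback connections are allowed on the machine, and open_stalled_listener and times_out are hypothetical helper names, not part of urllib2:

```python
import socket

try:  # Python 2, as in the question
    import urllib2 as request_lib
except ImportError:  # Python 3 equivalent
    import urllib.request as request_lib


def open_stalled_listener():
    """Listen on a free local port without ever answering (hypothetical helper)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)               # connections queue up; nothing accepts them
    return listener


def times_out(url, timeout):
    """Return True if fetching url fails with a timeout (hypothetical helper)."""
    # An empty ProxyHandler keeps the request from being routed via a proxy.
    opener = request_lib.build_opener(request_lib.ProxyHandler({}))
    try:
        opener.open(url, timeout=timeout).read()
    except socket.timeout:
        # Timeout raised directly by the socket layer
        return True
    except request_lib.URLError as err:
        # Timeout wrapped by urlopen; any other URLError is not a timeout
        return isinstance(err.reason, socket.timeout)
    return False


listener = open_stalled_listener()
try:
    url = "http://127.0.0.1:%d/" % listener.getsockname()[1]
    timed_out = times_out(url, timeout=0.5)
finally:
    listener.close()
print(timed_out)
```

The same except clauses can then be tried in the original loop against this URL before running it on the real list.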