Skip URL if it times out

Posted on 2024-12-14 08:25:22, 310 characters, 1 view, 0 comments

I have a list of URLs.

I am using the following to retrieve their contents:

for url in url_list:
    req = urllib2.Request(url)
    resp = urllib2.urlopen(req, timeout=5)
    resp_page = resp.read()
    print resp_page

When a request times out, the program crashes with "socket.timeout: timed out". I just want to move on to the next URL when that happens. How can I do this?

Thanks

聽兲甴掵 2024-12-21 08:25:22

Although there is already an answer, I'd like to point out that urllib2 might not be solely responsible for this behavior.

As pointed out elsewhere (and as the problem description also suggests), the exception may come from the socket library.

In that case, just add another except clause:

import socket
import urllib2

try:
    resp = urllib2.urlopen(req, timeout=5)
except urllib2.URLError:
    # failure reported by urllib2 (may wrap a timeout)
    print "Bad URL or timeout"
except socket.timeout:
    # timeout raised directly by the socket layer
    print "socket timeout"
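For readers on Python 3, where urllib2 was split into urllib.request and urllib.error, the same two cases can be folded into one check. This is only a minimal sketch: the helper name is_timeout is made up for illustration and does not appear in the thread, and it assumes a wrapped timeout shows up as the reason attribute of the URLError.

```python
import socket
import urllib.error


def is_timeout(exc):
    """Return True if exc is a timeout, whether raised directly or wrapped.

    is_timeout is a hypothetical helper name, not part of any library.
    """
    # Case 1: the socket layer raised the timeout itself.
    if isinstance(exc, socket.timeout):
        return True
    # Case 2: urllib wrapped the timeout; the original error is in .reason.
    if isinstance(exc, urllib.error.URLError):
        return isinstance(exc.reason, socket.timeout)
    return False
```

Inside an except clause that catches both exception types, calling is_timeout(err) would let you log timeouts separately from other failures such as a bad host name.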

萌能量女王 2024-12-21 08:25:22

I'm going to assume that by "crashes" you mean "raises a URLError", as described in the urllib2.urlopen docs. See the Errors and Exceptions section of the Python Tutorial.

for url in url_list:
    req = urllib2.Request(url)
    try:
        resp = urllib2.urlopen(req, timeout=5)
    except urllib2.URLError:
        print "Bad URL or timeout"
        continue  # skips to the next iteration of the loop
    resp_page = resp.read()
    print resp_page
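A rough Python 3 version of the same loop might look like the sketch below. It is not the answerer's code: fetch_all and its opener parameter are hypothetical names added so the loop can be tried without a network, and it catches socket.timeout alongside URLError because read() can time out as well as urlopen().

```python
import socket
import urllib.error
import urllib.request


def fetch_all(url_list, timeout=5, opener=urllib.request.urlopen):
    """Return {url: body} for every URL that could be read in time.

    fetch_all and the opener parameter are illustrative assumptions;
    opener only needs to accept (url, timeout=...) like urlopen does.
    """
    pages = {}
    for url in url_list:
        try:
            # Keep read() inside the try: it can time out too.
            with opener(url, timeout=timeout) as resp:
                body = resp.read()
        except (urllib.error.URLError, socket.timeout) as err:
            print("Skipping %s: %s" % (url, err))
            continue  # skips to the next iteration of the loop
        pages[url] = body
    return pages
```

With the default opener, pages = fetch_all(url_list) would return only the pages that answered within five seconds.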

意犹 2024-12-21 08:25:22

It sounds like you just need to catch the timeout exception. I don't get the socket.timeout message that you do.

req = urllib2.Request("http://127.0.0.2")
try:
    resp = urllib2.urlopen(req, timeout=5)
except urllib2.URLError:
    print "Timeout!"

Obviously, you need a URL that will actually time out (127.0.0.2 may not time out on your machine).
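If no address times out reliably for you, one way to reproduce the problem is to point the request at a local socket that accepts connections but never replies. The following is a sketch in Python 3 rather than the thread's Python 2; open_silent_server and probe_timeout are made-up names, and which of the two outcomes you see may depend on your Python version and on where in the request the timeout hits.

```python
import socket
import urllib.error
import urllib.request


def open_silent_server():
    """Listen on a free loopback port but never answer (hypothetical helper)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    return server


def probe_timeout(url, timeout):
    """Fetch url and report how a timeout surfaced, or None if none occurred."""
    # An empty ProxyHandler keeps any configured proxy out of the way.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            resp.read()
    except socket.timeout:
        return "raw socket.timeout"
    except urllib.error.URLError as err:
        if isinstance(err.reason, socket.timeout):
            return "URLError wrapping socket.timeout"
        raise  # some other failure, not a timeout
    return None
```

Calling probe_timeout("http://127.0.0.1:%d/" % server.getsockname()[1], 0.5) against a server from open_silent_server() shows which except clause your own code needs, which may explain why different answerers saw different exceptions.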
