Detecting timeout errors in Python's urllib2 urlopen

Posted on 2024-09-14 18:36:40

I'm still relatively new to Python, so if this is an obvious question, I apologize.

My question concerns the urllib2 library and its urlopen function. I'm currently using it to load a large number of pages from another server (they are all on the same remote host), but every now and then the script is killed by a timeout error (I assume the large requests cause this).

Is there a way to keep the script running after a timeout? I'd like to be able to fetch all of the pages, so I want a script that will keep trying until it gets a page, and then moves on.

On a side note, would keeping the connection open to the server help?

Comments (2)

瘫痪情歌 2024-09-21 18:36:40

Next time the error occurs, take note of the error message. The last line of the traceback names the type of exception. For example, it might be a urllib2.HTTPError. Once you know the type of exception raised, you can catch it in a try...except block. For example:

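import socket
import time
import urllib2

# Placeholder list (not in the original): replace with the pages to fetch.
urls = ['http://www.example.com/']

for url in urls:
    while True:
        try:
            sock = urllib2.urlopen(url)
            # read() can time out too, so keep it inside the try block.
            contents = sock.read()
        except (urllib2.URLError, socket.timeout) as err:
            # URLError also covers its subclass HTTPError; a timeout
            # during the read can surface as socket.timeout instead.
            # You may want to count how many times you reach here and
            # do something smarter if you fail too many times.
            # If a site is down, pestering it every 10 seconds may not
            # be very fruitful or polite.
            time.sleep(10)
        else:
            # Success
            # process contents
            break  # break out of the while loop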
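
The comment in the loop above suggests counting failures and doing something smarter after too many of them. One possible way to do that is sketched below: a bounded retry with a delay that doubles after each failure. The helper name fetch_with_retry and its parameters (max_attempts, base_delay, retry_on, sleep) are made up for this illustration and are not part of urllib2. The sketch assumes the timeout-related errors are IOError subclasses, which holds for urllib2.URLError and socket.timeout.

```python
import socket
import time


def fetch_with_retry(fetch, url, max_attempts=5, base_delay=1.0,
                     retry_on=(IOError, socket.timeout), sleep=time.sleep):
    """Call fetch(url) until it succeeds or max_attempts is used up.

    The wait doubles after each failure: base_delay, 2 * base_delay, ...
    After the final failed attempt the exception is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return fetch(url)
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller decide what to do
            sleep(base_delay * (2 ** attempt))

# Hypothetical usage with urllib2 (Python 2), not executed here:
#   import urllib2
#   page = fetch_with_retry(
#       lambda u: urllib2.urlopen(u, timeout=30).read(), url)
```

Passing the fetch function in as an argument keeps the retry logic separate from the network call, so the same helper can be exercised with a fake fetcher, and a page that never loads ends with an exception rather than an endless loop.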