Closing urllib2 connections
I'm using urllib2 to load files from FTP and HTTP servers.
Some of the servers support only one connection per IP. The problem is that urllib2 does not close the connection instantly. Look at the example program.
from urllib2 import urlopen
from time import sleep

url = 'ftp://user:pass@host/big_file.ext'

def load_file(url):
    f = urlopen(url)
    loaded = 0
    while True:
        data = f.read(1024)
        if data == '':
            break
        loaded += len(data)
    f.close()
    #sleep(1)
    print('loaded {0}'.format(loaded))


load_file(url)
load_file(url)
The code loads two files (here the two files are the same) from an FTP server which supports only one connection. This will print the following log:
loaded 463675266
Traceback (most recent call last):
File "conection_test.py", line 20, in <module>
load_file(url)
File "conection_test.py", line 7, in load_file
f = urlopen(url)
File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.6/urllib2.py", line 391, in open
response = self._open(req, data)
File "/usr/lib/python2.6/urllib2.py", line 409, in _open
'_open', req)
File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
result = func(*args)
File "/usr/lib/python2.6/urllib2.py", line 1331, in ftp_open
fw = self.connect_ftp(user, passwd, host, port, dirs, req.timeout)
File "/usr/lib/python2.6/urllib2.py", line 1352, in connect_ftp
fw = ftpwrapper(user, passwd, host, port, dirs, timeout)
File "/usr/lib/python2.6/urllib.py", line 854, in __init__
self.init()
File "/usr/lib/python2.6/urllib.py", line 860, in init
self.ftp.connect(self.host, self.port, self.timeout)
File "/usr/lib/python2.6/ftplib.py", line 134, in connect
self.welcome = self.getresp()
File "/usr/lib/python2.6/ftplib.py", line 216, in getresp
raise error_temp, resp
urllib2.URLError: <urlopen error ftp error: 421 There are too many connections from your internet address.>
So the first file is loaded and the second fails because the first connection was not closed.
But when I use sleep(1) after f.close(), the error does not occur:
loaded 463675266
loaded 463675266
Is there any way to force close the connection so that the second download would not fail?
Comments (4)
The cause is indeed a file descriptor leak. We also found that with Jython the problem is much more obvious than with CPython.
A colleague proposed this solution:
The fix is ugly, but it does the job: no more "too many open connections".
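For illustration only (this is not the colleague's actual code): another way to make the FTP connection close promptly is to bypass urllib2 for FTP downloads and drive ftplib directly, so the control connection is opened and quit explicitly. Host, credentials and path below are placeholders:

from ftplib import FTP

def load_file_ftp(host, user, passwd, path):
    # Open the control connection ourselves, so we also decide when it closes.
    ftp = FTP(host)
    ftp.login(user, passwd)
    loaded = [0]   # a list so the callback can mutate it (Python 2 has no nonlocal)
    def count(chunk):
        loaded[0] += len(chunk)
    try:
        ftp.retrbinary('RETR ' + path, count, blocksize=1024)
    finally:
        ftp.quit()   # close the control connection right away
    print('loaded {0}'.format(loaded[0]))

# load_file_ftp('host', 'user', 'pass', 'big_file.ext')
# load_file_ftp('host', 'user', 'pass', 'big_file.ext')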
Biggie: I think it's because the connection is not shutdown().
You could try something like this before f.close():
(And yes... if that works, it's not Right(tm), but you'll know what the problem is.)
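For illustration, a reconstruction of that idea (not necessarily the exact snippet the comment had in mind): reach the raw socket behind the response and shut it down in both directions before closing it. The attribute chain below is an internal, version-specific detail of urllib2 on Python 2.6/2.7 and applies to HTTP responses; treat it as a debugging hack, inserted in load_file just before f.close():

import socket

# Hypothetical reconstruction: f is the object returned by urlopen().
# f.fp is a socket._fileobject, f.fp._sock the HTTPResponse, and its own
# fp._sock is the real socket -- internals that differ between versions.
f.fp._sock.fp._sock.shutdown(socket.SHUT_RDWR)
f.close()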
As for Python 2.7.1, urllib2 indeed leaks a file descriptor:
https://bugs.pypy.org/issue867
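One quick way to see whether descriptors really stay open (a Linux-only sketch, not taken from the bug report; the URL is a placeholder) is to count the process's open file descriptors before and after close():

import os
from urllib2 import urlopen

def open_fd_count():
    # Linux-specific: every entry in /proc/self/fd is an open descriptor.
    return len(os.listdir('/proc/self/fd'))

before = open_fd_count()
f = urlopen('http://example.com/')   # placeholder URL
f.read()
f.close()
print('fds before: {0}, after close(): {1}'.format(before, open_fd_count()))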
Alex Martelli answered a similar question. Read this: should I call close() after urllib.urlopen()?
In a nutshell:
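Assuming the linked advice is the usual one (always close the handle, and do it deterministically rather than relying on garbage collection), a minimal sketch with contextlib.closing looks like this; the URL is a placeholder:

from contextlib import closing
from urllib2 import urlopen

# closing() guarantees f.close() runs even if the read raises,
# so the underlying connection is released as soon as the block ends.
with closing(urlopen('http://example.com/big_file.ext')) as f:
    data = f.read()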