Need urllib.urlretrieve and urllib2.OpenerDirector to work together
I'm writing a script in Python 2.7 which uses a urllib2.OpenerDirector instance, obtained via urllib2.build_opener(), to take advantage of the urllib2.HTTPCookieProcessor class, because I need to store and re-send the cookies I get:

opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookielib.CookieJar()))
However, after making several requests and moving the cookies around, eventually I need to retrieve a list of URLs. I wanted to use urllib.urlretrieve(), because I read that it downloads the file in chunks, but I cannot: I need to carry my cookies on the request, and urllib.urlretrieve() uses a urllib.URLopener, which doesn't support cookie handlers the way OpenerDirector does.
What's the reason for this strange split of functionality, and how can I achieve my goal?
urlretrieve is an old interface from urllib. It existed long before urllib2 came along, has no session-handling capabilities, and simply downloads files. The newer urllib2 provides a much better way to deal with sessions, passwords, proxies, and so on through its Handler interfaces and the OpenerDirector class. To download the URLs to files, just call open() on the opener you built (or install it with urllib2.install_opener() so that urllib2.urlopen() uses it); the cookie jar attached to the opener will maintain the session.
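As a minimal sketch of that approach: the response object returned by the opener's open() supports read(n), so you can stream the body to disk in chunks yourself, getting urlretrieve-style chunked downloads while the opener's HTTPCookieProcessor sends your stored cookies. The helper name retrieve_with_cookies and the chunk size are my own choices, not part of either library; the try/except import shim just lets the same sketch run under Python 3, where these classes live in urllib.request and http.cookiejar.

```python
try:
    # Python 2.7, as in the question
    from urllib2 import build_opener, HTTPCookieProcessor
    from cookielib import CookieJar
except ImportError:
    # the same classes under their Python 3 names
    from urllib.request import build_opener, HTTPCookieProcessor
    from http.cookiejar import CookieJar


def retrieve_with_cookies(opener, url, filename, chunk_size=8192):
    """Download url to filename in chunks through the given opener,
    so any cookies held by its HTTPCookieProcessor are sent along."""
    response = opener.open(url)
    try:
        with open(filename, 'wb') as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:  # empty read means end of body
                    break
                out.write(chunk)
    finally:
        response.close()


# One opener carries the cookies for both the earlier requests
# and the final downloads.
opener = build_opener(HTTPCookieProcessor(CookieJar()))
# retrieve_with_cookies(opener, 'http://example.com/some/file', 'file.bin')
```

Since the download goes through the same opener as your earlier requests, no separate urllib.URLopener is involved and nothing about the session is lost.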