Python 3 - Pulling a file object from a web server through a proxy (no authentication)

Posted 2024-08-11 18:26:14

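I have a very simple problem, and I'm surprised I haven't found anything specific about it. I'm trying to follow best practice for using Python 3 to copy a file hosted on a web server through a proxy server (one that requires no authentication).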

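I've done similar things with Python 2.5, but I'm really coming up short here. I'm trying to turn this into a function I can reuse in future scripts on this network. Any help would be greatly appreciated.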

I suspect my problem comes from trying to use urllib.request or http.client without any clear documentation on how to bring in a proxy (without authentication).

I've been reading these pages and pulling my hair out:
http://docs.python.org/3.1/library/urllib.request.html#urllib.request.ProxyHandler
http://docs.python.org/3.1/library/http.client.html
http://diveintopython3.org/http-web-services.html

I even found this Stack Overflow post, "Proxy with urllib2", but in Python 3 urllib2 is deprecated...


素衣风尘叹 2024-08-18 18:26:15

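Here is a function that retrieves a file through an HTTP proxy:

import urllib.request

def retrieve(url, filename):
    # Send plain-HTTP requests through the proxy; no authentication is configured.
    # Append ':port' to the address if the proxy does not listen on the default HTTP port.
    proxy = urllib.request.ProxyHandler({'http': '127.0.0.1'})
    opener = urllib.request.build_opener(proxy)
    # The with statement closes the response and the file even if an error occurs.
    with opener.open(url) as remote, open(filename, 'wb') as local:
        # Copy the body in small chunks so the whole file never sits in memory.
        data = remote.read(100)
        while data:
            local.write(data)
            data = remote.read(100)

(Error handling is left as an exercise for the reader.)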
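For anyone who wants a starting point on that exercise, here is one possible sketch rather than a definitive implementation. The name retrieve_safely, the PROXY_URL value (including the port), and the 30-second timeout are all assumptions made up for illustration; substitute your own proxy address. It catches urllib.error.HTTPError (the server answered with an error status) before the more general urllib.error.URLError (the proxy or server could not be reached).

```python
import shutil
import urllib.error
import urllib.request

# Hypothetical proxy address (assumption); replace with your proxy's host and port.
PROXY_URL = "http://127.0.0.1:8080"


def retrieve_safely(url, filename, proxy_url=PROXY_URL, timeout=30):
    """Download url to filename through an HTTP proxy; return True on success."""
    proxy = urllib.request.ProxyHandler({"http": proxy_url})
    opener = urllib.request.build_opener(proxy)
    try:
        # The response is opened first, so no local file is created
        # when the request itself fails.
        with opener.open(url, timeout=timeout) as remote, \
                open(filename, "wb") as local:
            shutil.copyfileobj(remote, local)
    except urllib.error.HTTPError as err:
        # HTTPError is a subclass of URLError, so it must be caught first.
        print(f"server returned HTTP {err.code} for {url}")
        return False
    except urllib.error.URLError as err:
        print(f"could not reach {url}: {err.reason}")
        return False
    return True
```

Note that a failure partway through the copy (a dropped connection, for example) can still leave a partial file behind and raise a different exception; cleaning that up is not handled in this sketch.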

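You can also keep the opener object around for later use, in case you need to retrieve several files. The content is written to the file as-is (raw bytes), so it may need to be decoded if an unusual encoding has been used.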
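Both points can be sketched roughly as follows, under some assumptions: make_opener and fetch_text are hypothetical helper names, and falling back to UTF-8 when the server declares no charset is a guess that may not suit your data. The opener is built once and passed to every call, and the body is decoded using the charset from the Content-Type header when one is present.

```python
import urllib.request


def make_opener(proxy_url):
    """Build one opener that routes http:// requests through proxy_url."""
    proxy = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(proxy)


def fetch_text(opener, url, default_encoding="utf-8"):
    """Fetch url with a shared opener and decode the body to str."""
    with opener.open(url) as remote:
        raw = remote.read()
        # Prefer the charset declared in the Content-Type header, if any;
        # otherwise fall back to default_encoding (an assumption).
        charset = remote.headers.get_content_charset() or default_encoding
    return raw.decode(charset)
```

Usage would look something like opener = make_opener("http://127.0.0.1:8080") (a made-up address), followed by fetch_text(opener, url) for each URL you need. If you would rather have plain urllib.request.urlopen calls go through the proxy everywhere in the script, urllib.request.install_opener(opener) makes the opener the global default.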

