Using urllib2 with a SOCKS proxy
Is it possible to fetch pages with urllib2 through a SOCKS proxy, with one SOCKS server per opener? I've seen the solution that uses the setdefaultproxy method, but I need different SOCKS proxies in different openers.
There is the SocksiPy library, which works great, but it has to be used this way:
import socks
import socket
socket.socket = socks.socksocket
import urllib2
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "x.x.x.x", y)
That is, it sets the same proxy for ALL urllib2 requests. How can I have different proxies for different openers?
Comments (7)
Try with pycurl:
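A minimal sketch of what the pycurl approach might look like. Each Curl handle carries its own proxy settings, so different handles can talk to different SOCKS servers. The function name fetch_via_socks5 and the addresses in the usage comments are hypothetical, and pycurl has to be installed separately.

```python
import io


def fetch_via_socks5(url, proxy_host, proxy_port):
    """Fetch url through one SOCKS5 proxy and return the body as bytes."""
    import pycurl  # third-party; imported here so the module loads without it

    body = io.BytesIO()
    curl = pycurl.Curl()
    try:
        curl.setopt(pycurl.URL, url)
        # Proxy settings live on this handle only, not on the whole process.
        curl.setopt(pycurl.PROXY, proxy_host)
        curl.setopt(pycurl.PROXYPORT, proxy_port)
        curl.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5)
        curl.setopt(pycurl.WRITEDATA, body)
        curl.perform()
    finally:
        curl.close()
    return body.getvalue()


# Hypothetical usage: two requests, two different SOCKS servers.
# page_a = fetch_via_socks5("http://example.com/", "127.0.0.1", 9050)
# page_b = fetch_via_socks5("http://example.com/", "127.0.0.1", 9051)
```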
Yes, you can. I'm repeating my answer to "How can I use a SOCKS 4/5 proxy with urllib2?"
You need to create an opener for every proxy, like you do with an HTTP proxy. The code for adding this feature to SocksiPy is available on GitHub at https://gist.github.com/869791 and using it is as simple as:
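A rough sketch of the one-opener-per-proxy idea, not the gist's code verbatim. It assumes the PySocks package (a SocksiPy fork that ships the handler as sockshandler.SocksiPyHandler) and Python 3, where urllib2 became urllib.request. The helper name make_socks_opener and the hosts and ports are placeholders.

```python
import urllib.request


def make_socks_opener(host, port):
    """Return an opener whose connections go through one SOCKS5 proxy."""
    # Third-party imports (PySocks), done lazily so that this module can be
    # loaded even where PySocks is not installed.
    import socks
    from sockshandler import SocksiPyHandler

    return urllib.request.build_opener(SocksiPyHandler(socks.SOCKS5, host, port))


# Placeholder addresses: one opener per SOCKS server.
# opener_a = make_socks_opener("127.0.0.1", 9050)
# opener_b = make_socks_opener("127.0.0.1", 9051)
# print(opener_a.open("http://example.com/").read())
```

Because the proxy is attached to the handler rather than to socket.socket, nothing global is patched and the openers do not interfere with each other.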
For more information, I've written an example that runs multiple Tor instances so that they behave like a rotating proxy: "Distributed Scraping With Multiple Tor Circuits".
All openers share the same socket.socket, and SOCKS is implemented at the socket level. So you can't.
I suggest using the pycurl library; it is much more flexible.
== EDIT == (the old HTTP proxy example was here)
My fault: urllib2 has no built-in support for SOCKS proxying.
There are some 'hacks' that add SOCKS to urllib2 (or to the socket object in general) here.
But I doubt that this will work with multiple proxies the way you require.
Unless you want to hook or subclass urllib2.ProxyHandler, I would suggest going with pycurl.
You might be able to use threading locks if there aren't too many connections being made at once, and you need to access from multiple threads:
You might also be able to use something like this every time you need to get a connection:
These are obviously not fantastic solutions, but I've put in my 2¢ anyway :-)
A cumbersome but working solution for using a SOCKS proxy is to set up Privoxy with proxy chaining (Privoxy forwards to the SOCKS server) and then point HTTP_PROXY at the HTTP proxy that Privoxy provides, via an environment variable or any other way.
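Applied to the per-opener requirement, this could mean running one Privoxy instance per SOCKS server and giving each opener its own HTTP proxy. A minimal Python 3 sketch (urllib2 became urllib.request); the listen ports 8118 and 8119, the SOCKS address in the comment, and the helper name opener_for are assumptions, not part of the answer.

```python
import urllib.request

# Assumed setup (hypothetical ports): two Privoxy instances, each with a
# config line such as
#   forward-socks5 / 127.0.0.1:9050 .
# pointing at a different SOCKS server.
PRIVOXY_A = "http://127.0.0.1:8118"
PRIVOXY_B = "http://127.0.0.1:8119"


def opener_for(http_proxy):
    """Build an opener that sends HTTP and HTTPS requests through http_proxy."""
    handler = urllib.request.ProxyHandler({"http": http_proxy, "https": http_proxy})
    return urllib.request.build_opener(handler)


opener_a = opener_for(PRIVOXY_A)
opener_b = opener_for(PRIVOXY_B)

# Each opener keeps its own proxy; nothing global is modified.
# page = opener_a.open("http://example.com/").read()
```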
You could do it by setting the environment variable HTTP_PROXY in the following format:
user:pass@proxy:port
or, if you use a bat/cmd file, add this before calling the script:
set HTTP_PROXY=user:pass@proxy:port
I use such a cmd file to make easy_install work behind a proxy.
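To check that Python actually picks the variable up, something like the following could be used. This is a Python 3 sketch (urllib.request plays the role of urllib2); the credentials and host are made up, and the http:// prefix is an addition that is not part of the format shown above. Note that this configures an HTTP proxy for the whole process, not a SOCKS proxy per opener.

```python
import os
import urllib.request

# urllib prefers the lowercase name when both are set, so clear it first.
os.environ.pop("http_proxy", None)

# Hypothetical credentials and address.
os.environ["HTTP_PROXY"] = "http://user:pass@proxy.example:3128"

# getproxies() reads the *_proxy environment variables.
proxies = urllib.request.getproxies()
print(proxies["http"])
```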