Using urllib2 with a SOCKS proxy

Posted 2024-08-26 13:15:29

Is it possible to fetch pages with urllib2 through a SOCKS proxy on a one-SOCKS-server-per-opener basis? I've seen the solution that uses the setdefaultproxy method, but I need different SOCKS proxies in different openers.

There is the SocksiPy library, which works great, but it has to be used this way:

import socks
import socket
socket.socket = socks.socksocket
import urllib2
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "x.x.x.x", y)

That is, it sets the same proxy for ALL urllib2 requests. How can I have different proxies for different openers?

梦在深巷 2024-09-02 13:15:29

Try with pycurl:

import pycurl
c1 = pycurl.Curl()
c1.setopt(pycurl.URL, 'http://www.google.com')
c1.setopt(pycurl.PROXY, 'localhost')
c1.setopt(pycurl.PROXYPORT, 8080)
c1.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5)

c2 = pycurl.Curl()
c2.setopt(pycurl.URL, 'http://www.yahoo.com')
c2.setopt(pycurl.PROXY, 'localhost')
c2.setopt(pycurl.PROXYPORT, 8081)
c2.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5)

c1.perform() 
c2.perform()

明明#如月 2024-09-02 13:15:29

Yes, you can. I'm repeating my answer to "How can I use a SOCKS 4/5 proxy with urllib2?"
You need to create an opener for every proxy, just as you would with an HTTP proxy. The code that adds this feature to SocksiPy is available on GitHub at https://gist.github.com/869791 and using it is as simple as:

opener = urllib2.build_opener(SocksiPyHandler(socks.PROXY_TYPE_SOCKS4, 'localhost', 9999))
print opener.open('http://www.whatismyip.com/automation/n09230945.asp').read()

For more information I've written an example running multiple Tor instances to behave like a rotating proxy: Distributed Scraping With Multiple Tor Circuits
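
For readers who cannot open the gist, here is a minimal sketch of what such a handler might look like. It is a reconstruction under assumptions, not a copy of the linked code: the class name SocksiPyConnection and the constructor signatures are illustrative, the try/except imports only let the same code load on Python 3, and the socks module (SocksiPy or PySocks) is assumed to be installed by the time a connection is actually made. It covers plain http:// URLs only.

```python
try:                                   # Python 2, as in the question
    import httplib
    import urllib2
except ImportError:                    # Python 3 equivalents
    import http.client as httplib
    import urllib.request as urllib2


class SocksiPyConnection(httplib.HTTPConnection):
    """HTTPConnection whose socket is tunnelled through one SOCKS proxy."""

    def __init__(self, proxy_args, host, **kwargs):
        # proxy_args is (proxytype, addr, port); it belongs to this connection only
        self.proxy_args = proxy_args
        httplib.HTTPConnection.__init__(self, host, **kwargs)

    def connect(self):
        import socks                   # SocksiPy / PySocks, needed only here
        sock = socks.socksocket()
        sock.setproxy(*self.proxy_args)
        if isinstance(self.timeout, (int, float)):
            sock.settimeout(self.timeout)
        sock.connect((self.host, self.port))
        self.sock = sock


class SocksiPyHandler(urllib2.HTTPHandler):
    """urllib2 handler that gives its opener a private SOCKS proxy."""

    def __init__(self, *proxy_args):
        urllib2.HTTPHandler.__init__(self)
        self.proxy_args = proxy_args

    def http_open(self, req):
        # do_open() calls build(host, timeout=...) to create the connection
        def build(host, **kwargs):
            return SocksiPyConnection(self.proxy_args, host, **kwargs)
        return self.do_open(build, req)
```

With this in place each opener carries its own proxy, for example urllib2.build_opener(SocksiPyHandler(socks.PROXY_TYPE_SOCKS5, 'localhost', 9050)) next to a second opener on port 9051 (both ports are placeholders). Nothing is patched globally, so other openers are unaffected.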

方觉久 2024-09-02 13:15:29

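You have only one socket for all openers, and SOCKS is implemented at the socket level. So, you can't.
I suggest you use the pycurl library; it is more flexible.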

淡笑忘祈一世凡恋 2024-09-02 13:15:29

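==EDIT== (the old HTTP proxy example was here..)

My bad.. urllib2 has no built-in support for SOCKS proxies..

There are some "hacks" that add SOCKS to urllib2 (or to the socket object in general).
But I doubt this will work with multiple proxies as you require.

As long as you don't want to hook into or subclass urllib2.ProxyHandler, I'd suggest going with pycurl.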

铁轨上的流浪者 2024-09-02 13:15:29

You might be able to use a lock if not too many connections are made at once and you need access from multiple threads:

import socks
import socket
import thread
lock = thread.allocate_lock()
socket.socket = socks.socksocket  # patch the socket class before urllib2 is imported
import urllib2

def GetConn(url, proxy_host, proxy_port):
    # Hold the lock while the global default proxy is set and the connection is opened.
    lock.acquire()
    try:
        socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, proxy_host, proxy_port)
        return urllib2.urlopen(url)
    finally:
        lock.release()  # released even if urlopen raises

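You might also be able to load a private copy of urllib2 every time you need a connection and give it its own socket module:

import imp
# execfile() returns None, so load a separate copy of the module instead
private_urllib2 = imp.load_module('private_urllib2', *imp.find_module('urllib2'))
private_urllib2.socket = dummy_class()  # dummy_class needs the socket module's methods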

These are obviously not fantastic solutions, but I've put in my 2¢ anyway :-)

时光无声 2024-09-02 13:15:29

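A cumbersome but working way to use SOCKS proxies is to set up Privoxy with proxy chaining, and then point HTTP_PROXY at the address Privoxy provides, via a system variable or any other means.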
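
To get one proxy per opener with this approach, one option is to run a separate Privoxy instance for each SOCKS server and hand each opener its own ProxyHandler instead of relying on the shared HTTP_PROXY variable. The sketch below assumes two such instances on 127.0.0.1; 8118 is Privoxy's default port, 8119 is a made-up second one, and opener_via_http_proxy is a hypothetical helper name. The try/except import only lets the code load on Python 3 as well.

```python
try:                                   # Python 2, as in the question
    import urllib2
except ImportError:                    # Python 3 equivalent
    import urllib.request as urllib2


def opener_via_http_proxy(host, port):
    """Return an opener that sends http:// requests through one HTTP proxy."""
    proxy_url = "http://%s:%d" % (host, port)
    return urllib2.build_opener(urllib2.ProxyHandler({"http": proxy_url}))


# Assumed layout: two Privoxy instances, each forwarding to a different SOCKS server.
opener_a = opener_via_http_proxy("127.0.0.1", 8118)
opener_b = opener_via_http_proxy("127.0.0.1", 8119)
```

Each opener's open() call then goes out through its own Privoxy instance, and therefore through its own SOCKS server, without touching the environment or the socket module.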

来日方长 2024-09-02 13:15:29

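You can do this by setting the environment variable HTTP_PROXY in the following format:

user:pass@proxy:port

or, if you use bat/cmd, by adding this before the call to the script:

set HTTP_PROXY=user:pass@proxy:port

I use a cmd file like this to make easy_install work behind a proxy.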
