Controlling the Tor client with Ruby
I am writing a Ruby script which automatically crawls websites for data analysis, and now I have a requirement which is fairly complicated: I have to be able to simulate access from a variety of countries, about 20 different ones. The website will contain different information depending on the IP location, so the only way to get it done is to request it from a server which is actually in that country.
Since I don't want to buy servers in each of those 20 countries, I chose to give Tor a try - as many of you will know, by editing the torrc configuration file it is possible to specify the exit node and hence the country from which the actual request will originate.
When I do this manually, e.g. by editing the torrc file to use an Argentinian server, then disconnecting Tor using Vidalia, reconnecting Vidalia, and then rerunning the request, it works fine. However, I want to automate this process entirely, and do it as efficiently as possible. Tor is written in C, and I'd like to avoid taking apart its entire source code for this. Any idea of what's the easiest way to automate the whole process using only Ruby?
Also, if I'm missing something and there's a simpler alternative to this whole ordeal, let me know.
Thanks!
Comments (1)
Please take a look at the Tor control protocol. You can control circuits over telnet.
http://thesprawl.org/memdump/?entry=8
To switch to a new circuit, which switches to a new endpoint:
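A minimal sketch of one way to do this from Ruby, using a plain TCP socket instead of a telnet client. It assumes the control port is enabled in torrc (`ControlPort 9051`) and reachable on 127.0.0.1, and that either no control authentication is configured or a password is set. The helper names `tor_command` and `switch_endpoint` are made up for illustration; `AUTHENTICATE` and `SIGNAL NEWNYM` are commands from the Tor control protocol.

```ruby
require 'socket'

# Send one control-protocol command and return the reply line.
# Tor answers a successful command with a line starting with "250".
def tor_command(io, command)
  io.write("#{command}\r\n")
  reply = io.gets
  unless reply&.start_with?('250')
    raise "Tor control command failed: #{command} -> #{reply.inspect}"
  end
  reply
end

# Ask Tor to use fresh circuits for new connections.
# Assumption: host, port and password match your own torrc; a password
# containing quotes or backslashes would need escaping first.
def switch_endpoint(host: '127.0.0.1', port: 9051, password: '')
  TCPSocket.open(host, port) do |sock|
    tor_command(sock, %(AUTHENTICATE "#{password}"))
    tor_command(sock, 'SIGNAL NEWNYM')
  end
end
```

Calling `switch_endpoint` before each batch of requests should be enough. It raises if Tor rejects either command, so a failed switch is not silently ignored.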
Be aware that building a new circuit takes time, possibly a couple of seconds, so you'd better add a delay in your code, or check whether your address has changed by calling a remote IP detection site.
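The "check whether your address has changed" option can be sketched as a small polling loop. This is only an illustration: `wait_for_new_exit_ip` is a hypothetical helper, and the block passed to it stands in for whatever code you use to fetch your current public IP through Tor (the IP detection site and the SOCKS-capable HTTP client are left to you).

```ruby
# Poll until the exit IP differs from the previous one.
# The block must return the current public IP as a String (or nil on failure).
# Returns the new IP, or nil if it never changed within the given attempts.
def wait_for_new_exit_ip(previous_ip, attempts: 10, delay: 2)
  attempts.times do
    current_ip = yield
    return current_ip if current_ip && current_ip != previous_ip
    sleep delay # give Tor time to build the new circuit
  end
  nil
end
```

For example, `wait_for_new_exit_ip(old_ip) { fetch_ip_via_tor }`, where `fetch_ip_via_tor` is your own lookup method. A `nil` result means the address did not change, which can also happen when the new circuit exits through the same relay.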