Optimal configuration for disguising scraping identity
I'm running a bunch of scripts that scrape data from a website. For reasons I won't bore you with, I can't run them all on the same host; instead, I need to set up six different hosts. I want to configure my hosting setup to disguise the fact that all six hosts have the same owner.
I have gotten six different shared hosting accounts located in different geographical locations. Is there anything else I need to do? Should I buy a different domain name for each host? If not, what domain should I give each host?
Comments (1)
You could set up multiple instances of Tor, configure each with a separate SOCKS port, control port, and data directory, and run all your scrapes on one computer, each through its own Tor instance. Each instance builds its own circuits, so the requests from each scraper travel through a separate chain of relays and will usually reach the target site from a different exit IP.
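As a rough sketch of the multi-instance setup, the Python below writes one torrc file per instance, giving each its own `SocksPort`, `ControlPort`, and `DataDirectory` (all three are standard torrc options). The port numbering (starting at 9050 and stepping by two), the `tor0`, `tor1`, ... directory layout, and the helper names `torrc_text` and `write_torrc_files` are assumptions made for illustration, not anything the answer specifies.

```python
from pathlib import Path


def torrc_text(index: int, base_dir: Path, first_port: int = 9050) -> str:
    """Return torrc contents for the 0-based instance number `index`.

    Assumed port scheme: instance i gets SocksPort first_port + 2*i
    and ControlPort first_port + 2*i + 1.
    """
    socks_port = first_port + 2 * index
    control_port = socks_port + 1
    # Each Tor instance needs its own data directory to avoid lock conflicts.
    data_dir = base_dir / f"tor{index}" / "data"
    return (
        f"SocksPort {socks_port}\n"
        f"ControlPort {control_port}\n"
        f"DataDirectory {data_dir}\n"
    )


def write_torrc_files(count: int, base_dir: Path) -> list[Path]:
    """Create base_dir/tor<i>/torrc for i in 0..count-1 and return the paths."""
    paths = []
    for i in range(count):
        instance_dir = base_dir / f"tor{i}"
        (instance_dir / "data").mkdir(parents=True, exist_ok=True)
        torrc = instance_dir / "torrc"
        torrc.write_text(torrc_text(i, base_dir))
        paths.append(torrc)
    return paths
```

Each generated file could then be passed to its own process with `tor -f <path to torrc>`, and each scraper pointed at the matching SOCKS port on 127.0.0.1. A control port opened this way has no authentication configured, so a real setup would probably want to add that.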