Yahoo web scraping: what are the limits?
We are using a web scraper that sleeps for a random interval between requests (so the time between scrapes is never the same), but we are still getting blocked by Yahoo after 20-30 requests.
Does anyone know if there is a limit (e.g. 20 requests per minute, 200 per hour)? Right now the average delay between requests is around 3-6 seconds. Thanks for any help.
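For reference, the setup described above might look roughly like the following. This is only a minimal sketch of a randomized delay between requests, not our actual crawler: the names `random_delay` and `crawl` are hypothetical, and `fetch` stands in for whatever function performs the HTTP request.

```python
import random
import time


def random_delay(min_s=3.0, max_s=6.0):
    """Pick a random pause (in seconds) between min_s and max_s."""
    return random.uniform(min_s, max_s)


def crawl(urls, fetch, sleep=time.sleep, min_s=3.0, max_s=6.0):
    """Fetch each URL in order, pausing a random interval between requests.

    `fetch` is a placeholder for the real request function (an assumption);
    `sleep` is injectable so the loop can be exercised without waiting.
    """
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            # No pause before the first request, a random one before the rest.
            sleep(random_delay(min_s, max_s))
        results.append(fetch(url))
    return results
```

The 3-6 second bounds are the ones quoted in the question; everything else here is illustrative.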
Comments (2)
One request every 3-6 seconds is quite low, so there may be another problem with your crawler.
A few ideas:
This will all be easier if you use a higher-level library like Mechanize.
So the answer is 5000 queries. Taken from:
http://forums.digitalpoint.com/showthread.php?t=736784
http://developer.yahoo.com/search/rate.html
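If you want the crawler to stay under a cap like this, one option is to count queries against a budget. The following is a rough sketch, not anything taken from Yahoo's documentation: the class name `QueryBudget` is hypothetical, and the 24-hour window is purely an assumption, since the figure above is quoted without a time period.

```python
import time


class QueryBudget:
    """Fixed-window counter that refuses queries once a limit is reached."""

    def __init__(self, limit=5000, window_s=86400.0, clock=time.monotonic):
        self.limit = limit          # 5000 is the figure quoted above
        self.window_s = window_s    # window length is an assumption (24 h)
        self.clock = clock
        self.window_start = clock()
        self.used = 0

    def try_acquire(self):
        """Return True and count the query if budget remains, else False."""
        now = self.clock()
        if now - self.window_start >= self.window_s:
            # The window has elapsed: start a fresh one with a zero count.
            self.window_start = now
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True
```

Call `try_acquire()` before each request and skip or postpone the request when it returns `False`.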