requests.get(url) hangs after 5 iterations
I am trying to run a web-scraping script on Indeed using BeautifulSoup, looping through the different result pages. However, after 2-6 iterations, requests.get(url) hangs and stops fetching the next page. I have read that this might be caused by the server blocking me, but a block would presumably have stopped the original requests as well, and I have also read online that Indeed allows web scraping. I have also heard that I should set a header, but I am unsure how to do that. I am running the latest version of Safari on macOS 12.4.
Comments (1)
A solution I came up with, though it does not answer the question specifically, is to use a try/except statement and set a timeout value on the request. Once the timeout is reached, execution enters the except branch, sets a boolean flag, and then the loop continues and tries the request again. Code is inserted below.
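A minimal sketch of this retry-on-timeout approach, not the original code, could look like the following. The function name fetch_with_retry, the injectable get parameter, and the default values for timeout, max_attempts and pause are all assumptions made for illustration.

```python
import time

import requests


def fetch_with_retry(url, get=requests.get, timeout=10, max_attempts=5, pause=1.0):
    """Return the response for `url`, retrying whenever the request times out.

    `get` defaults to requests.get; it is a parameter only so that the
    function can be exercised without network access (an assumption of
    this sketch, not something from the original post).
    """
    for attempt in range(1, max_attempts + 1):
        timed_out = False
        try:
            # Without `timeout`, requests can wait indefinitely on a stalled server.
            response = get(url, timeout=timeout)
        except requests.exceptions.Timeout:
            # The request exceeded `timeout` seconds: set the flag and retry.
            timed_out = True
        if not timed_out:
            return response
        time.sleep(pause)  # brief pause before the next attempt
    raise RuntimeError(f"{url} timed out {max_attempts} times in a row")
```

Inside the page loop, each requests.get(url) call would then become fetch_with_retry(url), so a stalled request is abandoned after the timeout and retried instead of blocking the loop forever.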
I am leaving the question unanswered for now, however, as I still don't know the root cause of this issue and this solution is only a workaround.
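On the header mentioned in the question: the sketch below shows one way to send a browser-like User-Agent header with requests, assuming a requests.Session is reused across pages. The helper name make_session is hypothetical and the User-Agent string is only an illustrative Safari-style value. Whether setting a header actually prevents the hang is not established here.

```python
import requests

# Illustrative Safari-style User-Agent string (an assumption, not a required value).
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.5 Safari/605.1.15"
)


def make_session(user_agent=USER_AGENT):
    """Create a Session that sends `user_agent` with every request."""
    session = requests.Session()
    # Headers set on the session are merged into each request made through it.
    session.headers.update({"User-Agent": user_agent})
    return session
```

A session built this way can replace the bare call, for example session.get(url, timeout=10). For a single request, the same dictionary can instead be passed directly as requests.get(url, headers={"User-Agent": USER_AGENT}, timeout=10).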