Is there a way to stop the Nutch spider without losing the data you have already crawled?
If I close the spider in the middle of a crawl session, none of the data shows up in the index; I would have to wait until the indexing has completed by itself. Is there a way I can end the spider early and still be able to search through that data with the Nutch search? If so, how?
Comments (1)
Spider crawls and searching are independent, so once you have produced an index you can make it available for search and start crawling your next segment.
So what is it you are really trying to achieve?
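To make the "index what you have, then keep crawling" idea concrete, here is a minimal Python sketch that only builds the commands for indexing segments that have already finished fetching. It assumes an older Nutch release whose step-by-step tools are `invertlinks` and `index` with a Lucene index directory; argument order differs between versions, so check `bin/nutch` usage output for your install. The `crawl` directory layout, the segment name, and the helper `index_commands` are hypothetical and chosen for illustration.

```python
import shlex


def index_commands(crawl_dir, segments, nutch="bin/nutch"):
    """Build the commands that index already-fetched segments.

    Assumption: an older Nutch layout with <crawl_dir>/crawldb,
    <crawl_dir>/linkdb and <crawl_dir>/indexes. Nothing is executed here;
    the function only returns argument lists.
    """
    if not segments:
        raise ValueError("need at least one completed segment to index")
    crawldb = f"{crawl_dir}/crawldb"
    linkdb = f"{crawl_dir}/linkdb"
    indexes = f"{crawl_dir}/indexes"
    return [
        # Rebuild the link database from the finished segments.
        [nutch, "invertlinks", linkdb, *segments],
        # Index those segments so the searcher can use them.
        [nutch, "index", indexes, crawldb, linkdb, *segments],
    ]


if __name__ == "__main__":
    # Hypothetical segment directory produced by an earlier fetch round.
    done = ["crawl/segments/20240101000000"]
    for cmd in index_commands("crawl", done):
        print(shlex.join(cmd))
```

The point of the sketch is the ordering: a segment whose fetch has completed can be indexed and searched on its own, while the next segment is generated and fetched separately. A fetch that was interrupted halfway may leave its segment incomplete, so only pass segments that finished.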