Is it possible to restore documents in a Lucene.Net index?
I need to delete documents from my Lucene index and be able to re-add them later. It seems that if I mark a document as deleted and then try to add it again, the document remains deleted. How can I "undelete" a document?
This is how I am marking a document as "deleted":
// Match the document by its "id" field.
Term term = new Term("id", Id.Value);
// Mark every matching document as deleted, then close the reader to save the deletions.
IndexSearcher.reader.DeleteDocuments(term);
IndexSearcher.reader.Close();
So if I would like to "activate" this document again, how would I do it?
Thanks!
Comments (2)
I'm not familiar with Lucene.Net, but the Java version has an IndexReader.undeleteAll() method. Lucene's deletions are soft deletions: when documents are deleted, they are only marked for deletion. The marked documents are purged from the index only when segments are merged, for example when the index is optimized. The list of deleted documents is maintained in a .del file in the index directory. The undeleteAll() method clears the contents of that file to make those documents active again. (Do not try to delete this file manually, because the index segment files keep a reference to it.)
You cannot undelete a subset of documents; you have to undelete all of them. You can emulate the required functionality by getting the list of all deleted documents, invoking undeleteAll(), and then deleting them again, except for the one(s) that you wish to keep.
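The emulation described in this answer (collect the deleted documents, undelete everything, then delete again all but the ones to restore) can be sketched roughly as follows. This is a toy in-memory model in plain Java, not the Lucene API: the class SoftDeleteSketch and its undelete helper are hypothetical, and a BitSet merely stands in for the .del file. Only the method names deleteDocument, isDeleted and undeleteAll are chosen to mirror the Java IndexReader methods of the same era.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an index reader; this is NOT Lucene code.
class SoftDeleteSketch {
    private final int maxDoc;                     // number of document slots
    private final BitSet deleted = new BitSet();  // plays the role of the .del file

    SoftDeleteSketch(int maxDoc) { this.maxDoc = maxDoc; }

    // Soft delete: the document is only marked, not removed.
    void deleteDocument(int docId) { deleted.set(docId); }

    boolean isDeleted(int docId) { return deleted.get(docId); }

    // All-or-nothing undelete: clears every deletion mark.
    void undeleteAll() { deleted.clear(); }

    // Emulated selective undelete, following the three steps from the answer.
    void undelete(Set<Integer> toRestore) {
        // 1. Remember which documents are currently marked as deleted.
        List<Integer> wasDeleted = new ArrayList<>();
        for (int id = 0; id < maxDoc; id++) {
            if (isDeleted(id)) wasDeleted.add(id);
        }
        // 2. Clear all deletion marks.
        undeleteAll();
        // 3. Mark everything again except the documents to restore.
        for (int id : wasDeleted) {
            if (!toRestore.contains(id)) deleteDocument(id);
        }
    }

    public static void main(String[] args) {
        SoftDeleteSketch index = new SoftDeleteSketch(5);
        index.deleteDocument(1);
        index.deleteDocument(3);
        index.undelete(Collections.singleton(3));
        // Document 1 stays deleted, document 3 is active again.
        System.out.println(index.isDeleted(1) + " " + index.isDeleted(3)); // prints "true false"
    }
}
```

With a real index the same three steps would run against the reader, and they only work as long as the deleted documents have not yet been purged.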
I think you might be better off not deleting the docs but rather adding a field that marks them as deleted and filtering on that field in your queries. Then, if someone asks for deleted documents too, you can easily show them.
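As a rough illustration of the flag-field idea, here is a minimal sketch in plain Java. It is a toy in-memory model rather than Lucene code: the class DeletedFlagSketch, its Doc type and the includeDeleted parameter are all hypothetical names made up for this example. In a real index, changing the flag would presumably mean re-indexing the document with the new field value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of "mark as deleted" via a field; NOT Lucene code.
class DeletedFlagSketch {
    static final class Doc {
        final String id;
        boolean deleted;   // the extra field that marks the document as deleted
        Doc(String id) { this.id = id; }
    }

    private final List<Doc> docs = new ArrayList<>();

    void add(String id) { docs.add(new Doc(id)); }

    // "Deleting" and "activating" just flip the flag; nothing leaves the index.
    void setDeleted(String id, boolean value) {
        for (Doc d : docs) {
            if (d.id.equals(id)) d.deleted = value;
        }
    }

    // Queries filter on the flag unless the caller asks for deleted documents too.
    List<String> search(boolean includeDeleted) {
        List<String> hits = new ArrayList<>();
        for (Doc d : docs) {
            if (includeDeleted || !d.deleted) hits.add(d.id);
        }
        return hits;
    }

    public static void main(String[] args) {
        DeletedFlagSketch index = new DeletedFlagSketch();
        index.add("a");
        index.add("b");
        index.add("c");
        index.setDeleted("b", true);
        System.out.println(index.search(false)); // prints "[a, c]"
        System.out.println(index.search(true));  // prints "[a, b, c]"
        index.setDeleted("b", false);            // "activate" the document again
        System.out.println(index.search(false)); // prints "[a, b, c]"
    }
}
```

The trade-off in this design is that flagged documents keep occupying space in the index, in exchange for being restorable at any time.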