Lucene.NET - Cannot delete documents using IndexWriter

Posted 2024-11-07 04:44:20

I'm taking over a project, so I'm still learning this. The project uses Lucene.NET, and I have no idea whether this piece of functionality is correct. Anyway, I am instantiating the writer like this:

var writer = new IndexWriter(directory, analyzer, false);

For specific documents, I'm calling:

writer.DeleteDocuments(new Term(...));

In the end, I'm calling the usual writer.Optimize(), writer.Commit(), and writer.Close().

The field in the Term object is a Guid converted to a string (.ToString("D")). It is stored in the document using Field.Store.YES and Field.Index.NO.

However, with these settings, I cannot seem to delete these documents. The goal is to delete, then add the updated versions, so I'm getting duplicates of the same document. I can provide more code/explanation if needed. Any ideas? Thanks.

Comments (2)

清风疏影 2024-11-14 04:44:20

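The field must be indexed. If a field is not indexed, its terms will not show up in enumeration. With Field.Index.NO the Guid is stored but no term is ever written for it, so a delete by Term has nothing to match.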
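
To make that concrete, here is a minimal sketch, assuming Lucene.NET 3.0.3 and an in-memory RAMDirectory. The field name "id" and the class name IndexedIdDeleteSketch are made up for illustration and are not from the original project. It indexes the Guid with Field.Index.NOT_ANALYZED, so the exact string becomes a single term, and then deletes by that term.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;

// Hypothetical helper, not part of the asker's code base.
public static class IndexedIdDeleteSketch
{
    // Returns the number of live documents left after deleting by id.
    public static int Run()
    {
        string id = Guid.NewGuid().ToString("D");
        var directory = new RAMDirectory();
        var analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30);

        using (var writer = new IndexWriter(directory, analyzer, true,
                                            IndexWriter.MaxFieldLength.UNLIMITED))
        {
            var doc = new Document();
            // NOT_ANALYZED: the whole Guid string is indexed as one exact term.
            // With Field.Index.NO there would be no term for the delete to match.
            doc.Add(new Field("id", id, Field.Store.YES, Field.Index.NOT_ANALYZED));
            writer.AddDocument(doc);
            writer.Commit();

            // DeleteDocuments(Term) matches the raw term; no analyzer is applied.
            writer.DeleteDocuments(new Term("id", id));
            writer.Commit();
        }

        using (var reader = IndexReader.Open(directory, true))
        {
            return reader.NumDocs();
        }
    }
}
```

NOT_ANALYZED is used here rather than ANALYZED because the delete compares the term text exactly, and an analyzer may change the value before it is indexed.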

只是一片海 2024-11-14 04:44:20

I don't think there is anything wrong with how you are handling the writer.

It sounds as if the term you are passing to DeleteDocuments is not returning any documents. Have you tried to do a query using the same term to see if it returns any results?
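
One way to run that check is sketched below, assuming Lucene.NET 3.0.3. The class TermLookupSketch, the method CountHits and the field name "id" are hypothetical. It indexes a single document and then runs a TermQuery for the same term you would hand to DeleteDocuments.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

// Hypothetical helper for illustration only.
public static class TermLookupSketch
{
    // Indexes one document whose "id" field uses the given Field.Index mode,
    // then returns how many documents a TermQuery on that id matches.
    public static int CountHits(Field.Index indexMode)
    {
        string id = Guid.NewGuid().ToString("D");
        var directory = new RAMDirectory();
        var analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30);

        using (var writer = new IndexWriter(directory, analyzer, true,
                                            IndexWriter.MaxFieldLength.UNLIMITED))
        {
            var doc = new Document();
            doc.Add(new Field("id", id, Field.Store.YES, indexMode));
            writer.AddDocument(doc);
            writer.Commit();
        }

        using (var searcher = new IndexSearcher(directory, true))
        {
            // Same term that would be passed to DeleteDocuments.
            TopDocs hits = searcher.Search(new TermQuery(new Term("id", id)), 1);
            return hits.TotalHits;
        }
    }
}
```

If CountHits(Field.Index.NO) comes back as 0 while CountHits(Field.Index.NOT_ANALYZED) comes back as 1, the delete is failing because the term was never indexed, not because of how the writer is used.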

Also, if your goal is simply to re-create the document, you can call UpdateDocument:

//     Updates a document by first deleting the document(s) containing term and
//     then adding the new document. The delete and then add are atomic as seen
//     by a reader on the same index (flush may happen only after the add).  NOTE:
//     if this method hits an OutOfMemoryError you should immediately close the
//     writer. See above for details.
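
A rough sketch of that call, again assuming Lucene.NET 3.0.3. The class name, the BuildDoc helper and the "id" and "body" fields are invented for the example. The id field still has to be indexed for the delete half of UpdateDocument to find the old copy.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;

// Hypothetical helper for illustration only.
public static class UpdateDocumentSketch
{
    // Adds a document, replaces it via UpdateDocument, and returns the
    // number of live documents afterwards.
    public static int Run()
    {
        string id = Guid.NewGuid().ToString("D");
        var directory = new RAMDirectory();
        var analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30);

        using (var writer = new IndexWriter(directory, analyzer, true,
                                            IndexWriter.MaxFieldLength.UNLIMITED))
        {
            writer.AddDocument(BuildDoc(id, "first version"));
            // Deletes every document containing the term, then adds the new one.
            writer.UpdateDocument(new Term("id", id), BuildDoc(id, "second version"));
            writer.Commit();
        }

        using (var reader = IndexReader.Open(directory, true))
        {
            return reader.NumDocs();
        }
    }

    private static Document BuildDoc(string id, string body)
    {
        var doc = new Document();
        // The id must be indexed (here as one exact term) to be deletable.
        doc.Add(new Field("id", id, Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.Add(new Field("body", body, Field.Store.YES, Field.Index.ANALYZED));
        return doc;
    }
}
```

With the id indexed, this should leave a single live document instead of a duplicate.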

You may also want to check out SimpleLucene (http://simplelucene.codeplex.com) - it makes it a bit easier to do basic Lucene tasks.

[Update]
Not sure how I missed it, but @Shashikant Kore is correct: you need to make sure the field is indexed, otherwise your term query will not return anything.
