Querying a comma-separated list of IDs using Examine and Lucene.Net?

Posted on 2024-10-19 10:34:29

I am using Examine for Umbraco (which is built on top of Lucene.Net) to do my search. I am quite sure my problem is Lucene-related.

One of my fields contains a list of comma-separated IDs. How do I query this field in the right way?

For example, I have a field with the value "64,65". I have tried MultipleCharacterWildcard, which only returns a result if I query for ID 64, but not for ID 65. SingleCharacterWildcard does not return anything, and Fuzzy only returns something if there is only one ID in the field. Any ideas on how to do a proper search? I guess what I am looking for is a "Contains" query.

Also, is this the right way to handle fields with comma-separated lists, or would it be better to split the list up into individual fields?

Comments (2)

╰ゝ天使的微笑 2024-10-26 10:34:29

I would certainly split the list up into separate fields. You can have multiple values for the same field name in a document, which is a fairly natural way to represent a set of values:

venue_id: 12345
treatment_id_set: 1234
treatment_id_set: 2345
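
As a rough illustration of the layout above, here is a minimal sketch that indexes one such document in memory and runs an exact-term query against the repeated field. It assumes the Lucene.Net 3.0.x API (`RAMDirectory`, `IndexWriter`, `TermQuery`); the class and method names are hypothetical, and a real Examine setup would write to its own index rather than an in-memory one.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

// Hypothetical demo class; assumes Lucene.Net 3.0.x.
public static class MultiValuedFieldDemo
{
    // Indexes a single venue whose treatments are "1234,2345" and
    // returns how many documents match the given treatment ID.
    public static int CountVenuesWithTreatment(string treatmentId)
    {
        var directory = new RAMDirectory();

        // The analyzer is required by IndexWriter but is not applied to
        // NOT_ANALYZED fields, which are indexed as a single exact term.
        using (var writer = new IndexWriter(directory, new KeywordAnalyzer(),
                                            IndexWriter.MaxFieldLength.UNLIMITED))
        {
            var doc = new Document();
            doc.Add(new Field("venue_id", "12345",
                              Field.Store.YES, Field.Index.NOT_ANALYZED));

            // One field instance per ID, all sharing the same field name.
            foreach (var id in "1234,2345".Split(','))
            {
                doc.Add(new Field("treatment_id_set", id.Trim(),
                                  Field.Store.NO, Field.Index.NOT_ANALYZED));
            }
            writer.AddDocument(doc);
        }

        using (var searcher = new IndexSearcher(directory, true))
        {
            // Exact match on one member of the set; no wildcards needed.
            var query = new TermQuery(new Term("treatment_id_set", treatmentId));
            return searcher.Search(query, 10).TotalHits;
        }
    }
}
```

Because every ID is its own term, a lookup for either member of the set matches the document, which is the "Contains" behaviour the question asks for.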

With documents like this, I can simply query for "treatment_id_set:1234" to find all the venues supporting that treatment. Of course, the ordering of the treatments is lost. If you need to recover it, store the comma-separated value while indexing the individual members:

# stored, indexed
venue_id: 12345
# stored, not indexed
treatment_id_list: 1234,2345
# not stored, indexed
treatment_id_set: 1234
treatment_id_set: 2345
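
The stored/indexed combinations listed above map onto the `Field.Store` and `Field.Index` flags. The following is only a sketch under the same assumptions (Lucene.Net 3.0.x, an in-memory index, hypothetical class and method names): it searches on the indexed set and reads the original ordering back from the stored list.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

// Hypothetical demo class; assumes Lucene.Net 3.0.x.
public static class StoredListDemo
{
    // Returns the stored, ordered list of the first venue offering
    // treatmentId, or null when no venue matches.
    public static string FindOrderedTreatments(string treatmentId)
    {
        const string rawList = "1234,2345";
        var directory = new RAMDirectory();

        using (var writer = new IndexWriter(directory, new KeywordAnalyzer(),
                                            IndexWriter.MaxFieldLength.UNLIMITED))
        {
            var doc = new Document();
            // stored, indexed
            doc.Add(new Field("venue_id", "12345",
                              Field.Store.YES, Field.Index.NOT_ANALYZED));
            // stored, not indexed: keeps the original order for display
            doc.Add(new Field("treatment_id_list", rawList,
                              Field.Store.YES, Field.Index.NO));
            // not stored, indexed: one searchable term per ID
            foreach (var id in rawList.Split(','))
            {
                doc.Add(new Field("treatment_id_set", id,
                                  Field.Store.NO, Field.Index.NOT_ANALYZED));
            }
            writer.AddDocument(doc);
        }

        using (var searcher = new IndexSearcher(directory, true))
        {
            var query = new TermQuery(new Term("treatment_id_set", treatmentId));
            var hits = searcher.Search(query, 1);
            if (hits.TotalHits == 0) return null;

            // Only stored fields can be read back from a hit.
            return searcher.Doc(hits.ScoreDocs[0].Doc).Get("treatment_id_list");
        }
    }
}
```
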
青瓷清茶倾城歌 2024-10-26 10:34:29

To add several fields with the same name to a Lucene document through Umbraco Examine, you need to hook into the DocumentWriting event.

_index.DocumentWriting += _index_DocumentWriting;

This will then expose the underlying Lucene document.

Fields can then be added like this:

foreach (var item in someList)
{
    e.Document.Add(new Field("fieldName", item, Field.Store.YES, Field.Index.NOT_ANALYZED));
}
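
Putting the two snippets together, a handler might look roughly like the sketch below. It assumes the older Examine API in which `DocumentWritingEventArgs` (namespace `Examine.LuceneEngine`) exposes `Document` and a `Fields` dictionary of the values about to be indexed; check this against your Examine version. The field names `treatmentIds` and `treatmentIdSet` and the class name are hypothetical.

```csharp
using System;
using System.Linq;
using Examine.LuceneEngine;
using Lucene.Net.Documents;

// Hypothetical helper; field names are placeholders for your own.
public static class TreatmentIndexing
{
    private const string SourceField = "treatmentIds";   // holds "64,65"
    private const string SetField = "treatmentIdSet";    // one field per ID

    // Splits "64, 65" into ["64", "65"], ignoring blanks and whitespace.
    public static string[] SplitIds(string csv)
    {
        return (csv ?? string.Empty)
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(id => id.Trim())
            .Where(id => id.Length > 0)
            .ToArray();
    }

    // Attach with: _index.DocumentWriting += TreatmentIndexing.OnDocumentWriting;
    public static void OnDocumentWriting(object sender, DocumentWritingEventArgs e)
    {
        string csv;
        // Assumption: e.Fields maps field names to their raw string values.
        if (!e.Fields.TryGetValue(SourceField, out csv)) return;

        foreach (var id in SplitIds(csv))
        {
            e.Document.Add(new Field(SetField, id,
                                     Field.Store.YES, Field.Index.NOT_ANALYZED));
        }
    }
}
```

Since the added fields are NOT_ANALYZED, each ID is indexed as a single exact term, so a query for `treatmentIdSet:65` should match without any wildcard.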