Exact word search in Solr

Posted on 2024-11-16 09:29:00

I have a question which closely relates to this question.

In my schema I have a field

<field name="text" type="textgen" indexed="true" stored="true" required="true"/>

This gives an exact match, i.e. stemming is disabled:

eat = eat

Is it possible, while keeping the field configured as textgen, to search for other variants of the word?

e.g. eat = eat, eats, eating

A fuzzy query such as eat~0 returns similarly spelled words such as meat, beat, etc., but this is not what I want.

I'm starting to think that the only way to achieve this is to add another field with a type other than textgen, but if there is a simpler way I am very interested to hear it.


Comments (2)

南街九尾狐 2024-11-23 09:29:00

Using a copyField directive is the normal approach in Solr: copy the text field into a second field whose type applies stemming. Since stemming is exactly what you are asking for, this is what I recommend. If you are worried about index size, you can set stored=false on the copy.
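
As a rough illustration of the copyField approach, the schema fragment below keeps the exact-match field from the question and adds a stemmed copy next to it. The names text_stemmed and text_stem_en are hypothetical, and the analyzer chain is only one plausible choice (a standard tokenizer, lowercasing, and the Porter stemmer), not something the original post specifies. Depending on the Solr version, the fieldType and field elements may need to sit inside the schema's types and fields sections.

```xml
<!-- Sketch only: "text_stemmed" and "text_stem_en" are hypothetical names -->
<fieldType name="text_stem_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Reduces inflected forms such as "eats" and "eating" to "eat" -->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Existing exact-match field from the question -->
<field name="text" type="textgen" indexed="true" stored="true" required="true"/>
<!-- Stemmed copy; stored="false" keeps it searchable without storing the value twice -->
<field name="text_stemmed" type="text_stem_en" indexed="true" stored="false"/>

<!-- Populate the stemmed field from the original at index time -->
<copyField source="text" dest="text_stemmed"/>
```

With this in place, a query on text:eat would stay exact, while text_stemmed:eat would also match documents containing eats or eating. Existing documents would need to be reindexed before the new field returns results.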

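You might also expand the query with inflected forms, which works in the opposite direction from stemming: instead of reducing words to a common root, you add all of a word's inflected forms. (Strictly speaking, lemmatisation reduces a word to its dictionary form; what is described here is query expansion.) This is typically performed on the search query, expanding e.g. eat to eat, eats, eating, etc.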
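
One way to approximate this kind of query-side expansion in Solr is a synonym filter applied only in the query analyzer. The fragment below is a minimal sketch under stated assumptions: textgen_expand is a hypothetical field type name, the analyzer chain is simplified rather than copied from the real textgen definition, and synonyms.txt is assumed to list the inflected forms by hand. Older Solr releases provide solr.SynonymFilterFactory in the same role.

```xml
<!-- Sketch only: "textgen_expand" is a hypothetical field type name -->
<fieldType name="textgen_expand" class="solr.TextField" positionIncrementGap="100">
  <!-- Index side: no stemming and no expansion, so indexed terms stay exact -->
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <!-- Query side: expand each term to the variants listed in synonyms.txt -->
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- synonyms.txt is assumed to contain a line such as: eat, eats, eating -->
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The trade-off is maintenance: every word whose variants should match needs its own line in the synonyms file, whereas a stemmer handles regular inflections automatically.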

A third alternative might be to use a wildcard search, although I wouldn't encourage it, not least because wildcard terms largely bypass the analysis filters configured in the schema for the target field.

怀念你的温柔 2024-11-23 09:29:00

If you use text as the field type, then eat, eats and eating will all be indexed as eat, and a search for FieldName:eat will find all of them (irregular forms such as eaten may or may not be reduced, depending on the stemmer). If you change the field type to textgen, then a search for FieldName:eat will only find "eat", not eats, eaten or eating.
