Lucene fuzzy match on a phrase instead of a single word

Posted on 2024-08-28 06:46:09

I'm trying to do a fuzzy match on the phrase "Grand Prarie" (deliberately misspelled) using Apache Lucene. Part of my issue is that the ~ operator only does fuzzy matching on single-word terms; applied to a phrase, it acts as a proximity match instead.

Is there a way to do a fuzzy match on a phrase with Lucene?

Comments (3)

半葬歌 2024-09-04 06:46:09

Lucene 3.0 has ComplexPhraseQueryParser, which supports fuzzy terms inside a phrase query. It is in the contrib package.
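
As a rough illustration of the query text such a parser accepts, the sketch below puts ~ on each word of a phrase and wraps the result in quotes. It is a stdlib-only sketch, not Lucene code: the class name FuzzyPhraseString and its build method are made up for this example. The hand-off to ComplexPhraseQueryParser appears only in a comment, on the assumption that you already have a parser instance for your Lucene version.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Lucene): builds a quoted phrase in which
// every word carries the fuzzy operator, e.g. "Grand~ Prarie~".
final class FuzzyPhraseString {

    /** Appends "~" to each whitespace-separated word and quotes the whole phrase. */
    static String build(String phrase) {
        List<String> parts = new ArrayList<>();
        for (String word : phrase.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                // No escaping of query-syntax characters is done here;
                // a real application would need to handle that.
                parts.add(word + "~");
            }
        }
        return "\"" + String.join(" ", parts) + "\"";
    }

    public static void main(String[] args) {
        String queryText = build("Grand Prarie");
        System.out.println(queryText);
        // Assumed hand-off, given an existing ComplexPhraseQueryParser `parser`:
        //   Query q = parser.parse(queryText);
    }
}
```

The resulting string is meant to ask for both words as a phrase while letting each one match fuzzily, which is the combination the plain ~ operator cannot express.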

雪落纷纷 2024-09-04 06:46:09

I came across this through Google and felt the solutions were not what I was after.
In my case, the solution was simply to repeat the fuzzy search for each term against the Solr API.
For example, to require title_t to match both "dog~" and "cat~", I added some code that generates the query as:

((title_t:dog~) AND (title_t:cat~))

This may be just what the queries above are about, but the links seem to be dead.
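
The manual query generation described in this comment could look roughly like the sketch below. The class name FuzzyAndQuery and its build method are invented for illustration, and the operator is written as upper-case AND because the standard Lucene and Solr query syntax only recognises boolean operators in capitals.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns a field name and a list of words into
// one fuzzy clause per word, all joined with AND.
final class FuzzyAndQuery {

    static String build(String field, String... words) {
        if (words.length == 0) {
            throw new IllegalArgumentException("at least one word is required");
        }
        List<String> clauses = new ArrayList<>();
        for (String word : words) {
            // Each clause is a single-term fuzzy query on the same field.
            clauses.add("(" + field + ":" + word + "~)");
        }
        return "(" + String.join(" AND ", clauses) + ")";
    }

    public static void main(String[] args) {
        System.out.println(build("title_t", "dog", "cat"));
    }
}
```

Note that this requires every word to match fuzzily somewhere in the field, but it does not require the words to be adjacent or in order, so it is looser than a true phrase match.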

栖竹 2024-09-04 06:46:09

There's no direct support for a fuzzy phrase, but you can simulate it by explicitly enumerating the fuzzy terms and then adding them to a MultiPhraseQuery. The resulting query would look like:

<MultiPhraseQuery: "grand (prarie prairie)">
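
The enumeration step can be sketched without Lucene as follows. This is only an approximation of the idea: the vocabulary list stands in for the index's term dictionary, the class FuzzyPhraseExpansion and its methods are hypothetical, and similarity is plain Levenshtein distance rather than whatever measure your Lucene version uses. In real code, each inner list would become the Term[] passed to MultiPhraseQuery's add method for that position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for "enumerate the fuzzy terms per phrase position".
final class FuzzyPhraseExpansion {

    /** Classic dynamic-programming Levenshtein edit distance. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Vocabulary terms within maxEdits of the given word, in vocabulary order. */
    static List<String> expand(String word, List<String> vocabulary, int maxEdits) {
        List<String> matches = new ArrayList<>();
        for (String term : vocabulary) {
            if (editDistance(word, term) <= maxEdits) {
                matches.add(term);
            }
        }
        return matches;
    }

    /** One list of alternative terms per position in the phrase. */
    static List<List<String>> expandPhrase(String phrase, List<String> vocabulary, int maxEdits) {
        List<List<String>> positions = new ArrayList<>();
        for (String word : phrase.toLowerCase().trim().split("\\s+")) {
            positions.add(expand(word, vocabulary, maxEdits));
        }
        return positions;
    }

    public static void main(String[] args) {
        // Assumed sample vocabulary; a real one would come from the index.
        List<String> vocabulary = Arrays.asList("grand", "grant", "prairie", "rapids");
        System.out.println(expandPhrase("Grand Prarie", vocabulary, 1));
    }
}
```

With the sample vocabulary and one allowed edit, the first position expands to grand and grant and the second to prairie, so the misspelled "Prarie" still lines up with the correctly spelled indexed term.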