Configuring Lucene.Net to recognise homophones

Posted on 2024-11-24 20:27:59

We have a site where the user can enter the name of a city. Lucene.net 2.1.0.3 is the search engine used to look up cities that have already been created. As configured, Lucene does not recognise that Saint Jerome is the same as St. Jerome, or that Lake Phillip is the same as Lac Phillip.

Any tips on widening the search strategy for Lucene.Net?

Comments (1)

回梦 2024-12-01 20:27:59

I've read a bit about synonyms and "sounds like" matching (read: "I currently have no hands-on experience with this"). To me these look like two different problems: abbreviation "synonyms" and "sounds like" matching.

Sounds Like

Soundex is an older algorithm that was designed for misspellings of "American" names. An improved algorithm called 'Double Metaphone' addresses some of the complaints about Soundex. This library looks promising:
http://sourceforge.net/projects/phonetixnet/
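
To make the "sounds like" idea concrete, here is a minimal sketch of classic American Soundex in plain Python. It is only an illustration of the technique, not the phonetixnet library or any Lucene.Net API, and the function name `soundex` is my own. The idea is that you would index the phonetic code alongside the city name and compare codes at query time.

```python
def soundex(name: str) -> str:
    """Return the 4-character American Soundex code for a name (illustrative sketch)."""
    # Map each consonant to its Soundex digit; vowels, Y, H and W have no digit.
    codes = {}
    for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in group:
            codes[ch] = digit

    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""

    result = [letters[0]]                # the first letter is kept as-is
    prev = codes.get(letters[0], "")     # remember its digit to suppress repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue                     # H and W do not separate equal digits
        code = codes.get(ch, "")
        if code and code != prev:
            result.append(code)
        prev = code                      # a vowel resets prev, so repeats count again

    # Truncate or zero-pad to one letter plus three digits.
    return "".join(result)[:4].ljust(4, "0")


print(soundex("Lake"), soundex("Lac"))    # both L200
print(soundex("Saint"), soundex("St"))    # S530 vs S300
```

Note what this shows for the examples in the question: "Lake" and "Lac" get the same code, but "Saint" and "St" do not, which is why the abbreviation case looks like a separate problem.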

Abbreviation Synonyms

A generic synonym system might exist, but I would expect it to turn "Garden City" into something like "Plot Town" or "Patch burg". My guess is that you'll achieve better results with your own domain-specific synonyms.

It seems that words like 'Saint' ('St.') and 'Mount' ('Mt') would be best handled as synonyms. Here is an article that proposes a fairly simple solution to custom synonyming: http://www.codeproject.com/KB/cs/lucene_custom_analyzer.aspx
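
As a rough sketch of the domain-specific synonym idea (plain Python, not the code from the linked article and not a Lucene.Net API): normalise each token through a small hand-maintained table before indexing and again before searching. The table contents and the names `CITY_SYNONYMS` and `normalize_city` are hypothetical; treating "lac" as "lake" in particular is an assumption about your data.

```python
import re

# Hypothetical, hand-maintained table: abbreviation or variant -> canonical form.
CITY_SYNONYMS = {
    "st": "saint",
    "mt": "mount",
    "lac": "lake",   # assumption: French and English forms should match
}


def normalize_city(name: str) -> str:
    """Lower-case the name, drop punctuation, and replace known variants."""
    tokens = re.findall(r"\w+", name.lower())   # "St." becomes the token "st"
    return " ".join(CITY_SYNONYMS.get(tok, tok) for tok in tokens)


print(normalize_city("St. Jerome"))    # saint jerome
print(normalize_city("Saint Jerome"))  # saint jerome
print(normalize_city("Lac Phillip"))   # lake phillip
```

The same normalisation has to run on both the indexed city names and the user's query, otherwise the two sides will not agree. In Lucene terms that logic would presumably live in a custom analyzer or token filter, which is what the article above describes.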
