Full-text noise words: the logic behind them

Posted on 2024-12-10 19:59:13

As the title says, what is the logic behind applying noise words (stop words) in full-text search so that those words are never searched? What happens if someone searches for "to be or not to be"? Are no results shown? I'd really appreciate an explanation of the reasoning, since I'm about to disable ft_stopword_file.


Comments (3)

优雅的叶子 2024-12-17 19:59:13

Stop words exist so that the full-text index doesn't become bloated, which helps both performance and storage. If you index every stop word (that is, disable the stop word list), full-text search performance will degrade to some extent.
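To make the bloat argument concrete, here is a minimal sketch of a toy inverted index built with and without a stop word list. It is not how MySQL stores its full-text index; the `STOP_WORDS` set, the sample `DOCS`, and the helper names are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical stop word list for illustration; real engines ship much longer ones.
STOP_WORDS = frozenset({"to", "be", "or", "not", "the", "is", "a", "of"})


def build_index(docs, stop_words=frozenset()):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            if term not in stop_words:
                index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def posting_count(index):
    """Total number of (term, document) entries stored in the index."""
    return sum(len(ids) for ids in index.values())


# Sample documents, invented for this sketch.
DOCS = [
    "to be or not to be",
    "the cat is on the mat",
    "a tale of the sea",
    "not a word of it is true",
]

full_index = build_index(DOCS)
filtered_index = build_index(DOCS, STOP_WORDS)

print("postings without filtering:", posting_count(full_index))
print("postings with stop words removed:", posting_count(filtered_index))
```

In this toy corpus most of the stored entries belong to stop words, and the first document disappears from the filtered index entirely, which is exactly the "to be or not to be" problem raised in the question.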

妄想挽回 2024-12-17 19:59:13

If you disable stop words, performance can drop sharply. As a workaround, either check in your PHP code whether the search query is made up of stop words and fall back to a 'LIKE' search for those queries, or simply use Sphinx as the search engine. The point of stop words is to exclude words like 'is', 'are', 'be', 'there', 'not' and so on from searching.
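The fallback idea can be sketched roughly as follows. The answer mentions PHP; this sketch uses Python only to show the decision logic, and it reads "stop words are common in the query" as "every term is a stop word". The table `articles`, the column `body`, the `STOP_WORDS` set, and the `%s` placeholder style are assumptions, not something taken from the original post.

```python
# Hypothetical stop word list; in practice it should mirror the server's list.
STOP_WORDS = frozenset({"is", "are", "be", "there", "not", "to", "or"})


def escape_like(text):
    """Escape LIKE wildcards so user input is matched literally."""
    return text.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def plan_search(query, stop_words=STOP_WORDS):
    """Return (sql, params): LIKE if the query is all stop words, else full-text."""
    terms = query.lower().split()
    if terms and all(term in stop_words for term in terms):
        # Full-text search would ignore every term, so fall back to a substring scan.
        sql = "SELECT id FROM articles WHERE body LIKE %s"
        return sql, ["%" + escape_like(query) + "%"]
    sql = "SELECT id FROM articles WHERE MATCH(body) AGAINST (%s)"
    return sql, [query]


print(plan_search("to be or not to be"))
print(plan_search("hamlet soliloquy"))
```

The LIKE branch cannot use the full-text index and scans the table, so it only makes sense as an exception path for the few queries that would otherwise return nothing.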

请远离我 2024-12-17 19:59:13

The logic is that these words are so common that they create very large index nodes and slow the system down, while being useless to users, since words like "to" and "be" are so common and carry no context on their own.

A better indexing method would be n-grams, which can find quoted phrases like "to be", but this kind of indexing is pretty rare.
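One way to read the n-gram suggestion is an index keyed on word pairs (bigrams) instead of single words, so that "to be" becomes one comparatively rare key. The sketch below assumes that reading; the function names and sample documents are invented, and it is not a description of any particular engine.

```python
from collections import defaultdict


def bigrams(text):
    """Return the consecutive word pairs in a piece of text."""
    words = text.lower().split()
    return list(zip(words, words[1:]))


def build_bigram_index(docs):
    """Map each word pair to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for pair in bigrams(text):
            index[pair].add(doc_id)
    return index


def phrase_candidates(index, phrase):
    """Ids of documents containing every bigram of the phrase.

    These are candidates only: a full phrase match still needs a final
    check that the pairs appear consecutively in the document.
    """
    pairs = bigrams(phrase)
    if not pairs:
        return set()
    result = set(index.get(pairs[0], set()))
    for pair in pairs[1:]:
        result &= index.get(pair, set())
    return result


# Sample documents, invented for this sketch.
DOCS = [
    "to be or not to be that is the question",
    "it is not to be",
    "to sleep perchance to dream",
]

index = build_bigram_index(DOCS)
print(phrase_candidates(index, "to be"))
print(phrase_candidates(index, "to be or not to be"))
```

The trade-off is index size: there are far more distinct word pairs than distinct words, which may be part of why this kind of indexing is uncommon.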
