SQL Server 2005: querying a full-text index to help find noise words in content

Posted 2024-08-26 06:22:54

Is there a way to query a full-text index to help determine additional noise words? I would like to add some custom noise words, and I wondered whether there is a way to analyse the index to come up with suggestions.


Comments (2)

ˉ厌 2024-09-02 06:22:54

Adding them is as simple as described here:

http://arcanecode.com/2008/05/29/creating-and-customizing-noise-words-in-sql-server-2005-full-text-search/

That post explains how to do it. Coming up with suitable noise words, though, is the hard part.

我也只是我 2024-09-02 06:22:54

I decided to look into Lucene.NET because I wasn't happy with the relevance calculations in SQL Server full-text indexing.

I managed to figure out how to index all the content fairly quickly, and then used Luke to find noise words. I have since edited the SQL Server noise files based on that analysis. I now have a search solution that works reasonably well on SQL Server full-text indexing, but I plan to move to Lucene.NET in the future.
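For readers without Luke to hand, the same kind of analysis can be roughly approximated by counting how many documents each term appears in. The sketch below is only an illustration, not what Luke or SQL Server does internally: it assumes the indexed text has already been exported into a list of strings, uses a crude regex tokenizer rather than SQL Server's word breaker, and the function name `suggest_noise_words` and its parameters are hypothetical.

```python
import re
from collections import Counter


def suggest_noise_words(documents, min_doc_ratio=0.5, top_n=20):
    """Return (term, document_frequency) pairs for terms that occur in at
    least min_doc_ratio of the documents, most frequent first.

    Hypothetical helper: terms that appear in most documents carry little
    search value and are candidates for the noise word list.
    """
    doc_freq = Counter()
    for text in documents:
        # Crude tokenizer (assumption): lowercase ASCII words and digits only.
        # Using a set counts each term once per document.
        terms = set(re.findall(r"[a-z0-9']+", text.lower()))
        doc_freq.update(terms)

    threshold = min_doc_ratio * len(documents)
    candidates = [(term, n) for term, n in doc_freq.most_common() if n >= threshold]
    return candidates[:top_n]


if __name__ == "__main__":
    # Made-up sample data, standing in for the exported column contents.
    sample = [
        "The product manual for the pump",
        "The service manual for the valve",
        "Ordering the spare parts",
    ]
    for term, n in suggest_noise_words(sample, min_doc_ratio=0.6):
        print(term, n)
```

The output is only a list of candidates. Domain words that appear almost everywhere (a product name, say) may still matter to users, so the list needs reviewing by hand before anything goes into the noise files.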

Using SQL Server full-text indexing as a base, I developed a domain-centric approach to finding relevant content with tools I understood. After some serious thinking and testing, I ended up using many other measures to determine the relevance of a search result beyond what analysing text content for term frequency and word distance provides. SQL Server full-text indexing gave me a great start, and I now have a strategy that I can express in Lucene and that should work very well.

Starting with Lucene would have taken me a whole lot longer, since I would have had to understand Lucene and develop a search strategy at the same time. If anyone out there is still reading this: use full-text indexing to test your ideas, then move to Lucene once you have a strategy you know will work for your domain.
