SQL Server 2008 - How to avoid the character "/" being treated as a stop word?
Let's assume we have a large full-text indexed table containing strings like these (in the full-text indexed column, of course):
123.456.789/14
111.222.22222.2/5111
Those strings are numbers that only make sense (for my application) when they are queried exactly as written.
When I perform a query like this:
WHERE CONTAINS(field, '5111');
it returns the row that contains the second string. I expected it to return no results: although the string contains 5111, a partial match is meaningless to me (only the entire number makes sense, not part of it).
Is there any way to avoid matching parts of strings like the ones I mentioned? I guess SQL Server is treating "/" and "." as stop words, am I right?
Comments (1)
Your problem is actually with the word breaker, not stop words.
"/" and "." are being considered word separators by the (I assume English) word breaker you are using.
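One way to check this is to ask SQL Server how it tokenizes one of the strings. The query below is a minimal sketch using the `sys.dm_fts_parser` function, which is available in SQL Server 2008. It assumes the column is indexed with the US English word breaker (LCID 1033) and that you have the elevated permissions the function requires. Passing NULL as the stoplist means no stop words are applied, so any splitting you see in the result comes from the word breaker alone.

```sql
-- Assumptions: LCID 1033 (US English) matches the language of the
-- full-text indexed column; adjust it if your index uses another language.
SELECT occurrence, special_term, display_term
FROM sys.dm_fts_parser(
    N'"111.222.22222.2/5111"',  -- query string, written in CONTAINS syntax
    1033,                       -- LCID of the word breaker to test
    NULL,                       -- stoplist_id: NULL = apply no stoplist
    0                           -- accent sensitivity off
)
ORDER BY occurrence;
```

If `display_term` lists fragments such as 5111 as separate rows, the word breaker is what splits the value, and changing the stoplist will not change that.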
It is possible to install custom word breakers, but I'm not sure this would actually resolve your problem, since you want "/" treated as a word separator when it appears between words but not between numbers.
It is theoretically possible to enable Custom Dictionary support to allow specifying phrases that contain word separators that are considered words, but this may not deliver what you want.
From your example you could define "789/14" and "2/5111" with a Custom Dictionary. This would mean that these rows would not be returned for searches for "789", "14", "2" or "5111" but they would be returned for searches on "789/14" or "2/5111".
The following blog entry describes setting up Custom Dictionary support in SQL Server 2008; however, I have not been able to make it work:
Creating Custom Dictionaries for special terms to be indexed 'as-is' in SQL Server 2008 Full-Text Indexes