SQL Server 2008 R2 full-text search with FORMSOF and accent insensitivity

Posted on 2024-10-31 23:56:20

I'm using MS SQL Server 2008 R2 with Full Text Search for searching text data stored in different languages.

I'm a bit confused about how the CONTAINS predicate handles accents.

When I use the following predicate

CONTAINS([Text], @keywords, LANGUAGE @language)

on a catalog with ACCENT_SENSITIVITY = OFF, the search results are the same for e.g. 'Lächeln' and 'lacheln' when German is specified as the language.

But if I change the predicate to look like

CONTAINS([Text], FORMSOF(INFLECTIONAL, @keywords), LANGUAGE @language)

then the results differ, and it seems to me that accent insensitivity doesn't apply to FORMSOF.

I've tried to find an answer on MSDN and Google but didn't find anything useful.

Does anybody know why the results are different?

Thanks!

Comments (2)

书信已泛黄 2024-11-07 23:56:20

My understanding is that these serve two separate purposes in finding matches for a full-text search. With an accent-insensitive catalog, terms are matched by a simple character comparison that ignores diacritics, so that eñya = enya because 'n' is considered the accent-insensitive equivalent of 'ñ'.
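Before comparing results, it may help to confirm how the catalog is actually configured. The following is a minimal sketch, not a definitive diagnostic: `MyCatalog` is a hypothetical catalog name that would need to be replaced with your own.

```sql
-- List every full-text catalog in the current database with its accent setting.
-- is_accent_sensitivity_on: 1 = accent-sensitive, 0 = accent-insensitive.
SELECT name, is_accent_sensitivity_on
FROM sys.fulltext_catalogs;

-- Same check for a single catalog (MyCatalog is a placeholder name).
-- Returns 1 for accent-sensitive, 0 for accent-insensitive.
SELECT FULLTEXTCATALOGPROPERTY(N'MyCatalog', 'AccentSensitivity') AS accent_sensitive;
```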

With FORMSOF you're requesting that the search perform a stemming operation on the terms, so that other verb and noun forms are searched as additional terms. For example, searching for 'foot' would include 'feet', and 'run' would include 'ran'.
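To see which forms the stemmer generates for a term, sys.dm_fts_parser can be queried directly. This is a sketch under the assumption that the English (LCID 1033) word breaker and stemmer are installed; the rows returned depend on the installed language components, and the function requires membership in the sysadmin role.

```sql
-- Show the terms the full-text parser produces for an inflectional search.
-- Arguments: query string, 1033 = English LCID, 0 = system stoplist,
-- 0 = accent-insensitive parsing.
SELECT display_term, expansion_type, source_term
FROM sys.dm_fts_parser(N'FORMSOF(INFLECTIONAL, "foot")', 1033, 0, 0);
```

Rows whose expansion_type is 2 are the inflectional expansions of the source term.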

If FORMSOF seems to be fundamentally not working for your values, you may want to make sure that the appropriate language support is installed for full-text search.
SELECT * FROM sys.fulltext_languages

If you haven't had a chance to review MSDN, the SQL Word Breakers documentation may shed some light on the observed behavior: http://msdn.microsoft.com/en-us/library/ms142509.aspx

岁月蹉跎了容颜 2024-11-07 23:56:20

FORMSOF strips diacritics from your word:

SELECT * FROM sys.dm_fts_parser(N'FORMSOF(INFLECTIONAL, "Lächeln")', 1031, 0, 1)

Check the display_term column.
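The last argument of sys.dm_fts_parser is the accent_sensitivity flag (0 = accent-insensitive, 1 = accent-sensitive), so running the same query with both values side by side may show where the two modes diverge. This is only a sketch for comparison: the `mode` column is a label added here for readability, and the output depends on the German (LCID 1031) components installed on your server.

```sql
-- Compare the parser output for the same FORMSOF query in both accent modes.
-- Arguments: query string, 1031 = German LCID, 0 = system stoplist,
-- then 0 = accent-insensitive or 1 = accent-sensitive.
SELECT 'insensitive' AS mode, display_term, expansion_type
FROM sys.dm_fts_parser(N'FORMSOF(INFLECTIONAL, "Lächeln")', 1031, 0, 0)
UNION ALL
SELECT 'sensitive' AS mode, display_term, expansion_type
FROM sys.dm_fts_parser(N'FORMSOF(INFLECTIONAL, "Lächeln")', 1031, 0, 1);
```

Repeating the query with "lacheln" as the search term gives the other half of the comparison described in the question.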
