SQL Server 2008 Containstable 使用weighted_term 生成负排名
我有一个在标题列上启用全文搜索的表。我尝试使用包含表进行加权搜索,但我得到了排名值的算术溢出。查询如下
SELECT ID, CAST(Res_Tbl.RANK AS Decimal) AS Relevancy , Title
FROM table1 AS INNER JOIN
CONTAINSTABLE(table1,Title,'ISABOUT("pétoncle" weight (.8), "pétoncle" weight (.8), "PÉTONCLE" weight (.8))',LANGUAGE 1036 ) AS Res_Tbl
ON ID = Res_Tbl.[KEY]
当我执行此查询时,我得到: int 类型的算术溢出错误,值 = -83886083125.000076。
如果我删除两个“;”之一在 ISABOUT 函数中,查询成功完成。
请注意,如果没有结果则查询成功完成,您需要有一些结果。
有人知道如何解决这个问题吗?
这个问题也在 dba.stackexchange.com
I have a table with full text search enabled on Title column. I try to make a weighted search with a containstable but i get an Arithmetic overflow for the Rank value. The query is as follow
SELECT ID, CAST(Res_Tbl.RANK AS Decimal) AS Relevancy , Title
FROM table1 AS INNER JOIN
CONTAINSTABLE(table1,Title,'ISABOUT("pétoncle" weight (.8), "pétoncle" weight (.8), "PÉTONCLE" weight (.8))',LANGUAGE 1036 ) AS Res_Tbl
ON ID = Res_Tbl.[KEY]
When I execute this query I get : Arithmetic overflow error for type int, value = -83886083125.000076.
If I remove one of the two ';' in the ISABOUT function the query complete successfully.
Note you need to have some results if there is no result the query complete successfully.
Does anybody know how to solve this ?
This question is also on dba.stackexchange.com
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
限定符:由于我无法重新创建此问题,因此我无法确定这是否可以解决问题。然而,这些是我所看到的一些事情。
首先,与号、井号和分号是断字字符。这意味着,您实际上搜索的是“p”、“233”和“toncle”,而不是搜索字符串“pétoncle”。显然,这不是你的意图。
我必须假设您的数据集中某处有文本“pétoncle”。这意味着您需要完整的整个字符串。
您可以做一些事情。
1) 一起关闭停用词。您可以通过更改全文索引将其关闭来做到这一点。
请注意,您必须将数据库设置为 SQL Server 2008 兼容性,这样才不会生成语法错误:
2) 创建新的非索引字表
如果创建空的非索引字表,您也许能够添加所需的非索引字或复制系统非索引字表并删除您不需要的非索引字。 (我建议采用第二种方法)。
话虽如此,我无法找到 &或 # 在系统非索引字表中,因此它们可能是硬编码的。您可能只需关闭非索引字表即可。
3) 更改搜索以忽略“pétoncle”大小写。
如果您从 ISABOUT 中删除“pétoncle”并将其更改为“p toncle”,它可能会起作用:
这些只是一些想法。就像我说的,如果无法访问系统或重新创建场景,我们将无法提供太多帮助。
为您的研究乐趣提供更多信息:
Qualifier: Since I can't recreate this, I'm unable to know for sure if this will fix the problem. However, these are some things that I'm seeing.
First off, the ampersand, pound sign, and semicolon are word-break characters. That means, that instead of searching for the string "pétoncle", what you're actually searching for is "p", "233", and "toncle". Clearly, that's not your intent.
I have to presume that you have the text "pétoncle" somewhere in your dataset. That means you need that entire string to be complete.
There are a few things you can do.
1) Turn off Stopwords all together. You can do that by altering the full text index to turn it off.
Note that you have to have your database set to SQL Server 2008 compatability for this to not generate a syntax error:
2) Create a new stoplist
If you create an empty StopList, you might be able to add the stopwords that you want or copy the system stoplist and remove the stopwords that you don't want. (I would advise the second approach).
Having said that, I wasn't able to find the & or # in the system stoplist, so they may be hard coded. You may have to simply turn the stoplist off.
3) Change your search to ignore the "pétoncle" case.
If you drop the "pétoncle" from the ISABOUT and change them to "p toncle", it might work:
Those are just some ideas. Like I said, without being able to access the system or recreate the scenario, we won't be able to help much.
Some more information for your researching pleasure:
对于进入此页面搜索 SQL Server 返回的负排名结果的人(如我所做的那样),事实证明,如果某些匹配项太长(超出某些字符限制),则可能会发生这种情况。 SQL Server实际上不会在查询时抱怨或产生错误,相反,排名大部分都是垃圾,对某些权重选择产生负排名(在我的例子中,特别是在超长术语上具有较低的权重值)。限制标记/字长并避免此问题(可能是 SQL Server 2008 全文搜索内部的一个错误)。
For people who got to this page searching for negative rank results returned by SQL Server, as I did, it turns out that can happen if some of your match terms are too long (beyond some character limit). SQL Server will not actually complain or produce an error at query time, instead, the ranking will be mostly garbage, producing negative rank for some choices of weights (in my case, esp. with low weight values on the overlong terms). Limit token/word length and avoid this problem (probably a bug deep inside SQL Server 2008 fulltext search).