Using SQL Server full-text search to find a single character followed by a hyphen followed by more text
We have data that might look something like:
A-10001
A-10002
A-10003
B-10001
B-10002
B-10003
We'd like to find all of the matches containing "A-100". Using full-text search, we are doing something like:
where contains(e.*, '"A-100*"')
However, because "-" is treated as a word-breaking boundary and because "A" is a noise word (since it's a single character), the query does not return any results. I am wondering if there is any (easy) way to return the expected results.
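On SQL Server 2008 and later, one way to check how the search string is actually tokenized is the sys.dm_fts_parser function. This is only a minimal sketch: the English word breaker (LCID 1033) and the system stoplist (stoplist_id 0) are assumptions, since the question does not say which language or stoplist the full-text index uses.

```sql
-- SQL Server 2008+ only; sys.dm_fts_parser does not exist in SQL Server 2005.
-- Arguments: query string, LCID, stoplist_id, accent sensitivity.
-- 1033 (English) and 0 (system stoplist) are assumed values, not taken from the question.
SELECT keyword,
       display_term,   -- each token the word breaker produced
       special_term,   -- how the token is classified, e.g. as a noise word
       occurrence,
       source_term
FROM sys.dm_fts_parser(N'"A-100*"', 1033, 0, 0);
```

The special_term column shows which of the resulting tokens are treated as noise words and which are searchable terms, which should confirm (or rule out) the explanation above for a given configuration.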
Note, I understand that in this simplified scenario I could use like instead of contains, something like:
where
(e.myColumn1 like '%A-100%'
or e.myColumn2 like '%A-100%'
or e.myColumn3 like '%A-100%'
-- ...and so on for the remaining columns
)
But for my situation that is not going to work (mainly for performance reasons and because there are lots of columns that I would need to look at).
I also understand I could break the search string into two words, and do something like:
where
contains(e.*, '"100*"')
and (e.myColumn1 like '%A-100%'
or e.myColumn2 like '%A-100%'
or e.myColumn3 like '%A-100%'
-- ...and so on for the remaining columns
)
But again, since there are lots of columns, this is not an ideal solution.
I am interested in a solution for SQL Server 2005 and later, but if there's a SQL Server 2008 specific solution, I'd be interested in that too.
Thanks for any help,
Eric
Comments (1)
I don't think you're going to find a very clean solution for this. I may be oversimplifying the problem, but my advice would be a derived (computed) column holding the first 2 characters. If your priority is read performance, put a nonclustered index on the computed column and include the full column. This would enable a seek for 'A-', whereas a like with a leading wildcard will always require a scan. Even if you have several columns to evaluate, this should still perform better than a like predicate for each column.
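A rough sketch of that suggestion in T-SQL, for one column. The table name dbo.Entries, the computed column name, and the index name are hypothetical; only myColumn1 comes from the question. It also assumes the value sits at the start of the column (as in the sample data), so it does not cover a match in the middle of a longer string.

```sql
-- Hypothetical names: dbo.Entries, myColumn1Prefix, IX_Entries_myColumn1Prefix.
-- Computed column holding the first 2 characters of the full value.
ALTER TABLE dbo.Entries
    ADD myColumn1Prefix AS LEFT(myColumn1, 2) PERSISTED;

-- Nonclustered index on the prefix, carrying the full column as an included column
-- (INCLUDE is available from SQL Server 2005 onward).
CREATE NONCLUSTERED INDEX IX_Entries_myColumn1Prefix
    ON dbo.Entries (myColumn1Prefix)
    INCLUDE (myColumn1);

-- The equality on the prefix can be answered with an index seek;
-- the LIKE then only filters the rows found for that prefix.
SELECT e.myColumn1
FROM dbo.Entries AS e
WHERE e.myColumn1Prefix = 'A-'
  AND e.myColumn1 LIKE 'A-100%';
```

With several columns to search, each one would need its own computed column and index, and the predicates would be combined with or, so it is worth checking the actual execution plan before committing to this.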