如何在 SQL 中排序,忽略冠词(“the”、“a”、“an”等)
这个问题出现了很多次,我可以看到它出现在 StackOverflow 上的 XSLT、Ruby 和Drupal 但我没有看到它专门用于 SQL。
所以问题是,当标题以“The”、“A”或“An”开头时,如何正确排序?
一种方法是简单地 TRIM() 这些字符串:
ORDER BY TRIM(
LEADING 'a ' FROM
TRIM(
LEADING 'an ' FROM
TRIM(
LEADING 'the ' FROM LOWER( title )
)
)
)
这是 不久前在 AskMeFi 上建议(它需要 LOWER()
函数吗?)。
我知道我也见过某种 Case/Switch 实现,但在 Google 上搜索有点困难。
显然,有多种可能的解决方案。如果 SQL 专家能权衡一下对性能的影响,那就太好了。
This comes up a lot, and I can see it's come up on StackOverflow for XSLT, Ruby and Drupal but I don't see it specifically for SQL.
So the question is, how do you sort titles correctly when they begin with "The", "A", or "An"?
One way is simply to TRIM() those strings:
ORDER BY TRIM(
LEADING 'a ' FROM
TRIM(
LEADING 'an ' FROM
TRIM(
LEADING 'the ' FROM LOWER( title )
)
)
)
which was suggested on AskMeFi a while back (does it need that LOWER()
function?).
I know I've also seen some kind of Case/Switch implementation of this but it's a little hard to Google for.
Obviously there are a number of possible solutions. What would be good is SQL gurus weighing in on which have performance implications.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
我见过的一种方法是有两列 - 一列用于显示,另一列用于排序:
我没有进行任何实际测试,但这具有能够使用索引并且不需要字符串操作的好处每次您想按说明订购时。除非您的数据库支持物化视图(MySQL 不支持),否则将逻辑实现为视图中的计算列不会带来任何好处,因为您无法对计算列建立索引。
One approach I've seen was to have two columns - one for display and the other for sorting:
I haven't done any real world testing, but this has the benefit of being able to use an index and doesn't require string manipulation every time you want to order by the description. Unless your database supports materialized views (which MySQL doesn't), implementing the logic as a computed column in a view wouldn't provide any benefit because you can't index the computed column.
我已经使用这个多年了,但不记得在哪里找到它
:
I've been using this for years, but can't remember where I found it:
Yielding:
特别是对于 Postgres,您可以使用 regexp_replace 来为您完成这项工作:
For Postgres specifically, you can use regexp_replace to do the work for you:
我只能代表 SQL Server:您在 CASE 语句中使用 LTRIM。不需要 LOWER 函数,因为默认情况下选择不区分大小写。但是,如果您想忽略文章,那么我建议您使用干扰词词典并设置全文索引目录。我不确定其他实现是否 SQL 支持这一点。
I can only speak for SQL Server: you use LTRIM within CASE statements. No LOWER function is needed because selections are not case sensitive by default. However, if you want to ignore articles then I would suggest you use a noise word dictionary and set up a full text indexing catalog. I am unsure if other implementations are SQL support this.
需要
LOWER
。虽然SELECT
不区分大小写,但ORDER BY
却区分大小写。LOWER
is needed. WhileSELECT
is not case-sensitive,ORDER BY
is.尝试以下操作:
ORDER BY
替换(替换(替换(YOURCOLUMN,'THE',''),'a\'',''),'an','')
未测试!
Try the following:
ORDER BY
replace(replace(replace(YOURCOLUMN,'THE',''),'a\'',''),'an','')
Not tested!