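Sorting values using a specific collation in Ruby/Rails
Is it possible to sort an array of values using a specific collation in Ruby? I need to sort according to the da_DK collation.
Given the array %w(Aarhus Aalborg Assens)
I would like ['Assens', 'Aalborg', 'Aarhus'] to be returned,
which is the correct order in Danish.
The standard sort method
%w(Aarhus Aalborg Assens).sort
returns what looks like ASCII order (at least it is not Danish order):
["Aalborg", "Aarhus", "Assens"]
The environment is Snow Leopard and Linux, running Ruby 1.9.2 and Rails 3.0.5.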
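According to Wikipedia, Danish treats the digraph "Aa" as equivalent to "Å", which sorts as the last letter of the alphabet.
This would throw off sorting.
To fix the problem, sort by a key in which 'Aa' is replaced with 'Å', and do something similar for the other letters that have compound character combinations, converting each to its single character.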
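A minimal sketch of that substitution, assuming only the capitalised digraph 'Aa' matters for this data and that the source file is UTF-8 encoded. It relies on Ruby comparing strings byte by byte, so 'Å' lands after every ASCII letter:

```ruby
# encoding: utf-8

names = %w(Aarhus Aalborg Assens)

# Assumption: only "Aa" needs handling here. The block's return value is
# used purely as the sort key; the original strings are left untouched.
sorted = names.sort_by { |name| name.gsub('Aa', 'Å') }

p sorted  # => ["Assens", "Aalborg", "Aarhus"]
```

This is not a full Danish collation: plain codepoint order puts Å before Æ and Ø, whereas the Danish alphabet ends Æ, Ø, Å, so names containing those letters would need further key mapping.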
The reason this works is that sort_by does a Schwartzian transform, so it's actually sorting by the value returned from the block, which, in this case, is the name with 'Aa' replaced with 'Å'. The replacement is temporary and is discarded once the array is sorted.
sort_by is very powerful, but it does have some overhead. For a simple sort you should use sort because it's faster. For sorts where you're comparing two simple values at the top level of an object, it becomes a wash whether you use sort or sort_by. If you have to do more complex calculations or dig around in an object, then sort_by can prove to be faster. There isn't a hard-and-fast way to know which is better, so I strongly recommend testing with a benchmark if you have to sort large arrays or deal with objects, because the difference can be large, and sometimes sort can be the better choice.
EDIT:
Ruby, by itself, isn't going to do what you want, because it has no knowledge of the sort order of every character set out there. There's a discussion regarding incorporating IBM's ICU that explains why that is. If you want ICU's abilities, you could look into ICU4R. I haven't played with it, but it sounds like your only real solution in Ruby.
You might be able to do something with a database like Postgres. They support various collating options but usually force you to declare the collation when you create the database... or maybe it's when the table is created... it's been a while since I created a new table. Anyway, that'd be an option, though it would be a pain.
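Returning to the sort versus sort_by trade-off discussed earlier in this answer, a benchmark along these lines is one way to check it on your own data. The helper name danish_key and the list of city names are made up for this sketch; only Benchmark from the standard library is used, and no particular timing is promised.

```ruby
# encoding: utf-8
require 'benchmark'

# Hypothetical helper: the sort key from the 'Aa' -> 'Å' substitution approach.
def danish_key(name)
  name.gsub('Aa', 'Å')
end

# Synthetic data, invented for this sketch; shuffled with a fixed seed.
names = %w(Aarhus Aalborg Assens Odense Esbjerg Randers) * 2_000
names.shuffle!(random: Random.new(42))

with_sort    = nil
with_sort_by = nil

Benchmark.bm(8) do |x|
  # sort recomputes the key for both operands on every comparison.
  x.report('sort')    { with_sort    = names.sort { |a, b| danish_key(a) <=> danish_key(b) } }
  # sort_by computes the key once per element, then sorts on those keys.
  x.report('sort_by') { with_sort_by = names.sort_by { |n| danish_key(n) } }
end
```

Both calls should produce the same ordering; the timings printed by Benchmark.bm show which one suits a given key computation and array size.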
I found ffi-locale on GitHub, and it solves my problem as far as I can see.
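It exposes the C library's locale-aware strcoll comparison to Ruby, so the array can be sorted under the da_DK locale,
which returns the correct result: ["Assens", "Aalborg", "Aarhus"].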
I haven't investigated performance yet, but it calls out to native code, so it ought to be faster than Ruby character-replacement code...
Update
It is not perfect :( It does not work properly on Snow Leopard: it seems that the strcoll function is broken on OS X and has been for some time. It is annoying to me, but the main platform for deployment is Linux, where it works, so it is currently my preferred solution.