Sorting strings with accented characters in Python

Posted on 2024-10-07 04:13:21

Possible Duplicate:
Python not sorting unicode properly. Strcoll doesn't help.

I'm trying to sort some words in alphabetical order. Here is how I do it:

#!/opt/local/bin/python2.7
# -*- coding: utf-8 -*-

import locale

# Make sure the locale is in french
locale.setlocale(locale.LC_ALL, "fr_FR.UTF-8")
print "locale: " + str(locale.getlocale())

# The words are in alphabetical order
words = ["liche", "lichée", "lichen", "lichénoïde", "licher", "lichoter"]

for word in sorted(words, cmp=locale.strcoll):
    print word.decode("string-escape")

I expect the words to be printed in the same order as they are defined, but here is what I get:

locale: ('fr_FR', 'UTF8')
liche
lichen
licher
lichoter
lichée
lichénoïde

The é character is treated as if it's greater than z.

It seems I'm misunderstanding how locale.strcoll compares strings. What comparator function should I use to sort the words alphabetically?
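
For what it's worth, the order shown above is the same order a plain code point comparison produces. The sketch below is written for Python 3 (an assumption; the question uses Python 2.7) and does not call locale at all. It only shows that é (U+00E9) has a higher code point than z, so an unaware comparison puts the accented words last. It does not prove that strcoll falls back to this on any particular system.

```python
# Python 3 sketch: sort by raw code points, with no locale involved.
words = ["liche", "lichée", "lichen", "lichénoïde", "licher", "lichoter"]

# Default string comparison in Python 3 compares code points.
codepoint_sorted = sorted(words)

# 'z' is U+007A and 'é' is U+00E9, so 'é' compares greater than 'z'.
print(ord("z"), ord("\u00e9"))

for word in codepoint_sorted:
    print(word)
```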

Comments (1)

妖妓 2024-10-14 04:13:21

I finally chose to strip diacritics and compare the stripped version of the strings so that I don't have to add the PyICU dependency.
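
A minimal sketch of that approach, assuming Python 3 and only the standard library. The helper name strip_diacritics is made up for illustration and is not necessarily the exact code the answer used. It decomposes each string with NFD and drops the combining marks, then sorts on the stripped form, using the original string as a tie-breaker.

```python
import unicodedata


def strip_diacritics(text):
    """Return text with combining marks removed, e.g. 'é' becomes 'e'.

    Hypothetical helper for illustration only.
    """
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() is non-zero for combining marks, so skip those.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


words = ["lichoter", "lichénoïde", "licher", "lichen", "lichée", "liche"]

# Sort on the stripped form; fall back to the original string on ties.
sorted_words = sorted(words, key=lambda w: (strip_diacritics(w), w))

for word in sorted_words:
    print(word)
```

This only approximates alphabetical order: it ignores case and does not implement the full French collation rules, which is the trade-off for avoiding the PyICU dependency.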
