Unicode case conversion
I am given either a single character or a string, and am using Python.
How do I find out if a specific character has a lowercase equivalent according to the standards (standard and special case mappings) proposed by Unicode?
And how do I find out if a string has one or more characters that have a lowercase equivalent according to the standards (standard and special case mappings) proposed by Unicode?
Comments (2)
This will only work correctly inasmuch as the Python version you're using has correctly implemented the .lower() method per the Unicode standard, of course. Also, I'm assuming that you don't consider, e.g., u'a' to "have a lowercase equivalent" (it has an uppercase one, of course). If you mean something different, consider a variant of the check (I've renamed the argument to uc to avoid excessive line length ;-). If that's what you want, I recommend not naming the function in terms of "lowercase equivalent", as that would be sure to confuse readers/maintainers of your code! :-)
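A minimal sketch of the kind of check this answer describes, assuming Python 3 (where every str is Unicode) and assuming that "has a lowercase equivalent" means "changes when .lower() is applied". All three function names are hypothetical. The last function is only a guess at the "something different" reading, in which a character such as 'a' also counts because it has an uppercase form.

```python
def has_lowercase_equivalent(ch):
    """True if ch changes under str.lower() (assumed meaning of the question)."""
    return ch.lower() != ch


def has_any_lowercase_equivalent(text):
    """True if at least one character of text changes under str.lower()."""
    return any(has_lowercase_equivalent(ch) for ch in text)


def has_other_case_form(uc):
    """Broader, guessed reading: uc changes under .lower() or under .upper()."""
    return uc != uc.lower() or uc != uc.upper()


if __name__ == "__main__":
    for sample in ("A", "a", "1", "\u0130", "\u00df"):
        print(repr(sample),
              has_lowercase_equivalent(sample),
              has_other_case_form(sample))
```

The results are only as good as the case mappings in the Unicode database bundled with the interpreter, which is the caveat the answer opens with.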
@Albert, you appear to be overly concerned with the minutiae of case conversion when you haven't yet sorted out (nor explained to answerers) what you really want to do.
=== Your previous attempt at explanation (in comment on my answer to this question) ===
@John: Well, I'm actually making an API for my web service. My web service accepts a key that maps to a specific record in my database. The key is case-sensitive, and the key can be composed of any Unicode character. So in order to normalize all input, I will convert all key queries into lowercase (if they have uppercase equivalents). A consequence of that is when I create the record keys (which my users can customize), I cannot accept any uppercase character that can be converted to a lowercase equivalent by the toLower() function. So I'm trying to make a filter for that. Any suggestions?
=== and my replying comment ===
@Albert: If your keys are case sensitive, why are you normalising them??? "record keys which users can customize" means what??? "any unicode char" vs "cannot accept any uppercase char" ??? To answer your question literally: Looks like you can't accept a character c when c.lower() != c which means that you can't accept any key if key.lower() != key. I think that you should start a NEW QUESTION, explaining exactly what you are trying to do, with examples.
... and you've certainly asked a new question (in fact 2 of them) but you haven't explained anything. This "new" question is so new that @Alex Martelli's answer is essentially the same as my comment highlighted above.
I think that you should start a NEW QUESTION, with new content, explaining exactly what you are trying to do, with examples.
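The literal reading given in the reply above (a key is unacceptable if key.lower() != key) could be sketched as follows. This assumes Python 3 and a setup like the one Albert describes; is_acceptable_key and normalize_query are hypothetical names, not part of any real API.

```python
def is_acceptable_key(key):
    """Accept a new record key only if lowercasing leaves it unchanged."""
    return key.lower() == key


def normalize_query(key_query):
    """Lowercase an incoming key query before looking it up (assumed policy)."""
    return key_query.lower()


if __name__ == "__main__":
    for candidate in ("abc-123", "Abc", "\u00c0b"):
        print(repr(candidate), is_acceptable_key(candidate))
```

Whether this filter is actually what the web service needs depends on the unanswered questions in the reply, so treat it as an illustration of the literal answer only.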