string.title() treats an apostrophe as the start of a new word. Why?
>>> myStr="madam. i'm adam! i also tried c,o,m,m,a"
>>> myStr.title()
"Madam. I'M Adam! I Also Tried C,O,M,M,A"
This is certainly incorrect. Why would an apostrophe be considered the start of a new word? Is this a gotcha, or am I assuming something wrong about the concept of a title?
Comments (4)
Because the implementation works by looking at the previous character: if it is a letter, the current character is lower-cased; otherwise it is upper-cased. The apostrophe is not a letter, so the m that follows it gets capitalized. That is to say, the algorithm is relatively simple and easy to reproduce in pure Python.
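The rule described above can be sketched in a few lines. This is only an illustration, not the CPython source; the name simple_title is hypothetical, and the sketch assumes plain ASCII text.

```python
def simple_title(text):
    """Hypothetical helper: mimic the 'look at the previous character' rule."""
    result = []
    prev = " "  # treat the start of the string as following a non-letter
    for ch in text:
        # A character after a letter is lower-cased; anything else starts a "word".
        result.append(ch.lower() if prev.isalpha() else ch.upper())
        prev = ch
    return "".join(result)


print(simple_title("madam. i'm adam! i also tried c,o,m,m,a"))
# Madam. I'M Adam! I Also Tried C,O,M,M,A
```

Because the apostrophe and the commas are not letters, each one resets the "start of word" state, which is exactly the behavior seen in the question.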
You could use:
string.capwords()
And, since title() is locale-dependent (for 8-bit strings in Python 2), check your locale to see if this is intentional: http://en.wikibooks.org/wiki/Python_Programming/Strings#title.2C_upper.2C_lower.2C_swapcase.2C_capitalize
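For comparison, here is roughly what string.capwords() does with the string from the question. It splits on whitespace only and capitalizes each piece, so the apostrophe is left alone, but so are the letters after the commas. The variable names are just for illustration.

```python
import string

my_str = "madam. i'm adam! i also tried c,o,m,m,a"

# capwords() splits on whitespace, capitalizes each chunk, and rejoins with
# single spaces, so punctuation inside a chunk never starts a new word.
capitalized = string.capwords(my_str)
print(capitalized)
# Madam. I'm Adam! I Also Tried C,o,m,m,a
```

Note that with the default separator, capwords() also collapses runs of whitespace into a single space, which may or may not be what you want.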
http://en.wikibooks.org/wiki/Python_Programming/Strings#title.2C_upper.2C_lower.2C_swapcase.2C_capitalize
Basically, it is working as intended. Since the apostrophe is indeed non-alphabetic, you get the "unexpected behavior" outlined above.
A bit of googling shows that other people feel this is not exactly the best behavior, and alternative implementations have been written. See: http://muffinresearch.co.uk/archives/2008/05/27/titlecasepy-titlecase-in-python/
The problem here is that "title case" is a very culturally dependent concept. Even in English, there are too many corner cases to handle them all. (See also http://bugs.python.org/issue7008)
If you want something better, you need to decide what kinds of text you want to handle (which means handling other kinds incorrectly), and write your own function.
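As one example of such a function, here is a sketch along the lines of the regular-expression workaround suggested in the Python documentation for str.title(). The name titlecase_words and the ASCII-only pattern are assumptions made for this illustration; it handles simple English contractions and possessives and nothing more.

```python
import re


def titlecase_words(text):
    """Hypothetical helper: capitalize words, keeping one apostrophe inside a word."""
    # A "word" is a run of ASCII letters, optionally followed by an
    # apostrophe and more letters (e.g. "i'm", "bill's").
    pattern = r"[A-Za-z]+('[A-Za-z]+)?"
    return re.sub(pattern, lambda match: match.group(0).capitalize(), text)


print(titlecase_words("madam. i'm adam! i also tried c,o,m,m,a"))
# Madam. I'm Adam! I Also Tried C,O,M,M,A
```

This keeps "I'm" intact while still capitalizing after commas. Whether that is right depends on the text you expect, which is the point made above.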