Python: how do I get Cyrillic characters in the output?
How do I get Cyrillic instead of u'...
The code is like this:
    import codecs

    def openfile(filename):
        with codecs.open(filename, encoding="utf-8") as F:
            raw = F.read()
        # do stuff...
        print some_text
It prints:
    >>>[u'.', u',', u':', u'\u0432', u'<', u'>', u'(', u')', u'\u0437', u'\u0456']
Comments (3)
It looks like some_text is a list of unicode objects. When you print such a list, Python prints the reprs of the elements inside the list, which is why you see u'\u0432' rather than в. So instead, join the elements into a single string and print that. The join method concatenates the elements of some_text with the empty string, u'', in between them. The result is one unicode object, and printing it shows the characters themselves.
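A minimal sketch of the join approach, assuming some_text holds the list shown in the question. The helper name join_chars is made up for illustration, and the __future__ import is only there so the same snippet also runs on Python 3:

```python
# -*- coding: utf-8 -*-
from __future__ import print_function  # only needed on Python 2


def join_chars(chars):
    """Concatenate a list of unicode strings into one unicode string.

    join_chars is a hypothetical helper, not something from the question.
    """
    return u''.join(chars)


# Assumed contents of some_text, copied from the output in the question.
some_text = [u'.', u',', u':', u'\u0432', u'<', u'>', u'(', u')', u'\u0437', u'\u0456']

if __name__ == "__main__":
    # Printing one string shows the characters themselves rather than the
    # repr of each list element (the terminal must be able to display them).
    print(join_chars(some_text))
```

On a terminal that can display Cyrillic, this should print the ten characters as one line of text instead of a list of u'...' literals.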
It's not clear to me where some_text comes from (you cut out that bit of your code), so I have no idea why it prints as a list of characters rather than a string.
But you should be aware that Python 2 has to encode unicode strings when you print them to the terminal, and it falls back to ASCII when it cannot determine the output encoding (for example, when the output is piped). If you want them encoded in some other encoding, you can do that explicitly by calling encode with the encoding's name before printing.
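A rough sketch of what explicit encoding could look like. The helper names encode_for_output and write_encoded are hypothetical, "utf-8" is only an assumed default, and the getattr lookup is there so the snippet runs on both Python 2 and Python 3:

```python
# -*- coding: utf-8 -*-
import sys


def encode_for_output(text, encoding="utf-8"):
    """Encode a unicode string to bytes with an explicitly chosen encoding.

    encode_for_output is a hypothetical helper; "utf-8" is an assumed default.
    """
    return text.encode(encoding)


def write_encoded(text, encoding="utf-8"):
    """Write text plus a newline to stdout as explicitly encoded bytes."""
    data = encode_for_output(text + u"\n", encoding)
    # Python 3 exposes the underlying byte stream as sys.stdout.buffer;
    # on Python 2, sys.stdout itself accepts byte strings.
    stream = getattr(sys.stdout, "buffer", sys.stdout)
    stream.write(data)


if __name__ == "__main__":
    write_encoded(u'\u0432\u0437\u0456')
```

Passing "ascii" as the encoding raises UnicodeEncodeError for Cyrillic characters, which is the failure you would otherwise hit when Python falls back to ASCII.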
u'\uNNNN' is the ASCII-safe version of a string literal such as u'з': the escape u'\u0437' and the literal u'з' denote the same one-character string, and printing either one outputs з.
However, this will only display correctly if your console supports the character you are trying to print. Trying that on the console of a Western European Windows install fails.
Because getting the Windows console to output Unicode is tricky, Python 2's repr function always opts for the ASCII-safe literal version.
Your print statement is outputting the repr version and not printing the characters directly because you've got them inside a list of characters instead of a string. If you did print on each of the members of the list, you'd get the characters output directly and not represented as u'...' string literals.
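To make the difference concrete, here is a small sketch. ascii_safe and print_each are hypothetical helpers: the first approximates the escaping that Python 2's repr applies (without the surrounding u'...' quotes), and the second prints the members one at a time. The __future__ import only matters on Python 2:

```python
# -*- coding: utf-8 -*-
from __future__ import print_function  # only needed on Python 2

escaped = u'\u0437'  # ASCII-safe spelling
literal = u'з'       # same character typed directly (needs the coding line on Python 2)


def ascii_safe(text):
    """Return text with non-ASCII characters written as backslash escapes.

    ascii_safe is a hypothetical helper that mimics the escaping Python 2's
    repr applies, without the surrounding u'...' quotes.
    """
    return text.encode('unicode_escape').decode('ascii')


def print_each(chars):
    """Print every element on its own line instead of printing the list."""
    for ch in chars:
        print(ch)


if __name__ == "__main__":
    print(ascii_safe(literal))                      # escaped, ASCII-only form
    print_each([u'\u0432', u'\u0437', u'\u0456'])   # the characters themselves
```

The two spellings compare equal, so the escape is purely a matter of how the string is displayed, not of what it contains.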