Encoding in Python: what type is the variable?
Python file
# -*- coding: UTF-8 -*-
a = 'Köppler'
print a
print a.__class__.__name__
mydict = {}
mydict['name'] = a
print mydict
print mydict['name']
Output:
Köppler
str
{'name': 'K\xc3\xb6ppler'}
Köppler
The name itself seems to stay the same, but when I print the whole dictionary I get this strange escaped string. What am I looking at? Is that the UTF-8 representation?
Comments (4)
The reason for that behavior is that the __repr__ function in Python 2 escapes non-ASCII characters. As the link shows, this is fixed in Python 3.

Yes, that's the UTF-8 representation of ö (U+00F6 LATIN SMALL LETTER O WITH DIAERESIS). It consists of a 0xC3 octet followed by a 0xB6 octet. UTF-8 is a very elegant encoding, I think, and worth reading up on. The history of its design (on a placemat in a diner) is described here by Rob Pike.

As far as I know, Python has two ways of displaying an object: str() and repr(). print uses str() internally, but a dict's str() apparently uses repr() for its keys and values.
As already mentioned, repr() escapes non-ASCII characters.
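The str()/repr() distinction described in these answers can be seen in a small sketch. It is written for Python 3 (an assumption, since the question uses Python 2), where a bytes object plays roughly the role of the Python 2 str. The class name Shout and the variable names are made up for illustration.

```python
# A minimal sketch, assuming Python 3. "Shout" is a hypothetical class
# used only to show which method a container calls on its items.
class Shout(object):
    def __str__(self):
        return 'str() was called'

    def __repr__(self):
        return 'repr() was called'


item = Shout()

# Printing the object directly goes through str().
as_item = str(item)

# Converting a dict to a string uses repr() on its keys and values.
as_container = str({'name': item})

# The UTF-8 bytes of the name; bytes stand in for the Python 2 str here.
# \u00f6 is the precomposed character "ö".
name_bytes = 'K\u00f6ppler'.encode('utf-8')

# repr() shows non-ASCII bytes as \x escapes, as in the dict output above.
name_repr = repr(name_bytes)

print(as_item)
print(as_container)
print(name_repr)
```

The escaped form is only a display choice made by repr(); the bytes stored in the dictionary are the same ones that print correctly when the value is printed on its own.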
It seems you are using Python 2.x, where you have to say explicitly that the object is a unicode string rather than a plain byte string. You declared the source file as UTF-8, so you actually typed 2 bytes for your ö, and because it is a regular str, you got 2 escaped bytes.
Try a unicode literal:
a = u'Köppler'
You may need to encode it before printing, depending on your console encoding:
print a.encode('utf-8')
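The "2 bytes for one character" point can be checked with a short sketch. It assumes Python 3.3 or later, where the u'' prefix is still accepted, and it spells ö as the escape \u00f6 so the result does not depend on how the source file is saved. The variable names are illustrative only.

```python
# A minimal sketch, assuming Python 3.3+ (the u'' prefix is optional there).
text = u'K\u00f6ppler'          # text string: 7 characters

raw = text.encode('utf-8')      # byte string: ö becomes two octets

char_count = len(text)          # counts characters
byte_count = len(raw)           # counts bytes

# The two octets that encode ö sit at byte positions 1 and 2.
o_umlaut_bytes = raw[1:3]

# Decoding with the same encoding restores the original text.
round_trip = raw.decode('utf-8')

print(char_count, byte_count)
print(o_umlaut_bytes)
print(round_trip == text)
```

Keeping text as unicode inside the program and encoding only at the boundary (console, file, network) avoids mixing the two representations.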