Converting UTF-8 bytes to another encoding in Python
I need to do this in Python 2.4 (yes, 2.4 :-( ).
I've got a plain string object, which represents some text encoded with UTF-8. It comes from an external library, which can't be modified.
So I think what I need to do is create a Unicode object from the bytes of that source object, and then convert it to some other encoding (iso-8859-2, actually).
The plain string object is x, and unicode() does not seem to work:
>>> x
'Sk\xc5\x82odowski'
>>> str(unicode(x, encoding='iso-8859-2'))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode characters in position 2-3: ordinal not in range(128)
>>> unicode(x, encoding='iso-8859-2')
u'Sk\u0139\x82odowski'
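
One likely explanation for the output above: the encoding argument of unicode() names the encoding the bytes are already in, not the target. Passing 'iso-8859-2' makes Python read the two UTF-8 bytes \xc5\x82 as two separate Latin-2 characters (\u0139 and \x82), and the outer str() then fails because it falls back to the ASCII codec. A minimal sketch of the two-step conversion described in the question, decoding as UTF-8 first and encoding to iso-8859-2 second, could look like this. The function name utf8_to_latin2 is hypothetical, and the sample input is rebuilt from a Unicode literal only so that the snippet also runs on newer interpreters:

```python
def utf8_to_latin2(data):
    # Hypothetical helper, not part of any library.
    # Step 1: interpret the incoming bytes as UTF-8 to get a Unicode object.
    text = data.decode('utf-8')
    # Step 2: encode that Unicode object into the target encoding.
    return text.encode('iso-8859-2')

# Same bytes as the x in the question: 'Sk\xc5\x82odowski'
x = u'Sk\u0142odowski'.encode('utf-8')

converted = utf8_to_latin2(x)
# The two-byte UTF-8 sequence \xc5\x82 becomes the single Latin-2 byte \xb3.
```

With the original plain string from the external library, the equivalent one-liner would be unicode(x, 'utf-8').encode('iso-8859-2'). Text containing characters that iso-8859-2 cannot represent would raise UnicodeEncodeError unless an error handler such as 'replace' is passed to encode().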