Python: converting a string from UTF-8 to Latin-1
I feel stuck here trying to change encodings with Python 2.5.
I have an XML response, which I encode to UTF-8: response.encode('utf-8'). That is fine, but the program that consumes this data doesn't like this encoding, and I have to convert it to another code page. A real example: I use the ghostscript Python module to embed pdfmark data in a PDF file, and the end result shows the wrong characters in Acrobat.
I've tried numerous combinations of .encode() and .decode() between 'utf-8' and 'latin-1', and it drives me crazy that I can't get the correct output.
If I write the string to a file with .encode('utf-8'), then convert that file from UTF-8 to CP1252 (more or less latin-1) with e.g. iconv.exe and embed the data, everything is fine.
Basically, can someone help me convert e.g. the character á, which is UTF-8 encoded as hex C3 A1, to latin-1 as hex E1?
Comments (4)
Instead of .encode('utf-8'), use .encode('latin-1'). That should do it.
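A minimal sketch of the conversion asked about, written for Python 3 (on Python 2 the same decode/encode calls apply to str and unicode objects). The helper name utf8_to_latin1 is hypothetical, and it assumes the data is already a UTF-8 encoded byte string; if you still hold the unicode string, calling .encode('latin-1') on it directly is enough.

```python
def utf8_to_latin1(data):
    """Re-encode UTF-8 bytes as Latin-1 bytes (hypothetical helper)."""
    # Decode first to recover the text, then encode it in the target code page.
    text = data.decode("utf-8")
    return text.encode("latin-1")


utf8_bytes = b"\xc3\xa1"                  # 'á' encoded as UTF-8 (hex C3 A1)
latin1_bytes = utf8_to_latin1(utf8_bytes)
print(latin1_bytes.hex())                 # prints: e1
```

Calling .encode() on a string that is already encoded is the usual source of confusion here: the bytes have to be decoded back to text before they can be encoded again.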
Can you provide more details about what you are trying to do? In general, if you have a unicode string, you can use encode to convert it into a byte string with the appropriate encoding. E.g.:
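As a rough illustration of that encode step, the sketch below encodes one unicode string into several code pages. The sample text is made up, and whether the consuming program accepts XML character references for characters the code page lacks is an assumption you would need to check.

```python
text = u"caf\u00e9 \u20ac"   # 'café €' (hypothetical sample text)

utf8 = text.encode("utf-8")
cp1252 = text.encode("cp1252")   # CP1252 has the euro sign at byte 0x80

# Latin-1 (ISO-8859-1) has no euro sign, so a strict encode raises.
try:
    text.encode("latin-1")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# One option for XML payloads: emit a character reference instead of failing.
latin1_xml = text.encode("latin-1", "xmlcharrefreplace")
print(latin1_xml)
```

This also shows that CP1252 and Latin-1 are not quite the same code page: they agree on characters such as á and é but differ in the 0x80 to 0x9F range.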
If the previous answers do not solve your problem, check the source of the data that won't print/convert properly.
In my case, I was using json.load on data incorrectly read from a file that was opened without encoding="utf-8". Trying to de-/encode the resulting string to latin-1 just does not help...
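A small sketch of that pitfall, for Python 3. The file name and contents are made up, and latin-1 stands in for whatever the platform default encoding happens to be when encoding is omitted.

```python
import json
import os
import tempfile

# Write a small UTF-8 JSON file to work with (hypothetical sample data).
path = os.path.join(tempfile.mkdtemp(), "sample.json")
with open(path, "w", encoding="utf-8") as f:
    f.write('{"name": "\u00e1"}')

# Wrong encoding: the two UTF-8 bytes of 'á' are read as two separate characters.
with open(path, encoding="latin-1") as f:
    wrong = json.load(f)["name"]

# Right encoding: the single character comes back intact.
with open(path, encoding="utf-8") as f:
    right = json.load(f)["name"]

print(len(wrong), len(right))   # prints: 2 1
```

Passing encoding="utf-8" when opening the file fixes the problem at its source, so no later re-encoding is needed.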