How do I make print() output UTF-8 in Python 3.0?
I'm working in WinXP 5.1.2600, writing a Python application involving Chinese pinyin, which has involved me in endless Unicode problems. Switching to Python 3.0 has solved many of them. But the print() function for console output is not Unicode-aware for some odd reason. Here's a teeny program.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
print('sys.stdout encoding is "' + sys.stdout.encoding + '"')
str1 = 'lüelā'
print(str1)
Output is (changing angle brackets to square brackets for readability):
sys.stdout encoding is "cp1252"
Traceback (most recent call last):
  File "TestPrintEncoding.py", line 22, in [module]
    print(str1)
  File "C:\Python30\lib\io.py", line 1491, in write
    b = encoder.encode(s)
  File "C:\Python30\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0101' in position 4: character maps to [undefined]
Note that ü = '\xfc' = 252 gives no problem since it's upper ASCII. But ā = '\u0101' is beyond 8 bits.
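The difference between the two characters can be reproduced without a console at all. The following is a minimal sketch, assuming the cp1252 encoding reported in the output above; the helper name try_encode is made up for illustration.

```python
# Compare how the two characters behave under cp1252 and UTF-8.
samples = {"u-umlaut": "\u00fc", "a-macron": "\u0101"}


def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec has no mapping."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# For each character: (bytes under cp1252, bytes under UTF-8).
results = {
    name: (try_encode(ch, "cp1252"), try_encode(ch, "utf-8"))
    for name, ch in samples.items()
}

for name, (cp1252_bytes, utf8_bytes) in results.items():
    print(name, cp1252_bytes, utf8_bytes)
```

cp1252 has a slot for ü (byte 0xFC) but none for ā, so only the second lookup fails, while UTF-8 can encode both.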
Anyone have an idea how to change the encoding of sys.stdout to 'utf-8'? Bear in mind that Python 3.0 no longer uses the codecs module, if I understand the documentation right.
(Note that the coding specified by the "coding:" line is the coding of the source code, not of the console output. But thank you for your thoughts!)
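One way the re-encoding asked about here is often done is to wrap the byte stream underneath sys.stdout in a new text layer. This is only a sketch under assumptions: it assumes sys.stdout exposes its byte stream as .buffer, and the helper name utf8_writer is hypothetical. It changes the bytes Python emits, not what the console is able to render.

```python
import io
import sys


def utf8_writer(binary_stream, errors="strict"):
    """Wrap a binary stream in a text layer that always encodes as UTF-8."""
    return io.TextIOWrapper(
        binary_stream,
        encoding="utf-8",
        errors=errors,
        line_buffering=True,  # flush on every newline, like an interactive stdout
    )


if __name__ == "__main__":
    # Assumption: sys.stdout is a text stream with a .buffer attribute.
    sys.stdout = utf8_writer(sys.stdout.buffer)
    print("l\u00fcel\u0101")
```

On Python 3.7 and later, sys.stdout.reconfigure(encoding="utf-8") achieves a similar effect without replacing the stream object.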
Comments (5)
The Windows command prompt (cmd.exe) cannot display the Unicode characters you are using, even though Python is handling it in a correct manner internally. You need to use IDLE, Cygwin, or another program that can display Unicode correctly.
See this thread for a full explanation:
http://www.nabble.com/unable-to-print-Unicode-characters-in-Python-3-td21670662.html
You may want to try setting the environment variable "PYTHONIOENCODING" to "utf_8". I have written a page on my ordeal with this problem.
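The effect of that variable can be checked from Python itself. The sketch below assumes a modern Python 3 (subprocess.run does not exist in 3.0), and the helper name child_stdout_encoding is made up for illustration. It starts a child interpreter with PYTHONIOENCODING set and asks it which encoding its sys.stdout ended up with.

```python
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Run a child interpreter with PYTHONIOENCODING set and
    return the sys.stdout.encoding that the child reports."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("ascii").strip()


if __name__ == "__main__":
    print(child_stdout_encoding("utf-8"))
```

The variable has to be set before the interpreter starts, which is why the sketch launches a child process rather than changing os.environ in the running one.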
Check out the question and answer here; I think they have some valuable clues. Specifically, note the setdefaultencoding in the sys module, but also the fact that you probably shouldn't use it.
Here's a dirty hack:
However, everything breaks it:
simply muting the first line already breaks it:
checking for the OS type breaks it:
it doesn't even work under an if block:
But one can print with cmd's echo:
and here's a simple way to make this cross-platform:
but the trailing empty line from Windows' echo can't be suppressed.
The problem of displaying Unicode characters in Python on Windows is known. There is no official solution yet. The right thing to do is to use the WinAPI function WriteConsoleW. It is nontrivial to build a working solution, as there are other related issues. However, I have developed a package which tries to fix Python regarding this issue. See https://github.com/Drekin/win-unicode-console. You can also read a deeper explanation of the problem there. The package is also on PyPI (https://pypi.python.org/pypi/win_unicode_console) and can be installed using pip.
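A guarded way to switch the package on might look like the sketch below. It assumes the package's enable() entry point as I recall it from the project README, and the wrapper name enable_unicode_console is hypothetical. The function does nothing on non-Windows platforms or when the package is not installed.

```python
import sys


def enable_unicode_console():
    """Try to install Unicode-aware console streams on Windows.

    Returns True if win_unicode_console was enabled, False otherwise.
    """
    if sys.platform != "win32":
        return False  # only the Windows console needs this workaround
    try:
        # Third-party package: pip install win_unicode_console
        import win_unicode_console
    except ImportError:
        return False
    win_unicode_console.enable()  # assumed entry point, per the project README
    return True


if __name__ == "__main__":
    print("win_unicode_console enabled:", enable_unicode_console())
```

Calling the function once at program start, before any print() of non-ASCII text, is the intended usage in this sketch.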