Unicode not printing correctly to cp850 (cp437), playing card suits
To summarize: how do I print Unicode in a system-independent way to produce playing card symbols?
What am I doing wrong? I consider myself quite fluent in Python, yet I cannot seem to print correctly!
# coding: utf-8
from __future__ import print_function
from __future__ import unicode_literals
import sys
symbols = ('♥','♦','♠','♣')
# red suits to stderr for IDLE
print(' '.join(symbols[:2]), file=sys.stderr)
print(' '.join(symbols[2:]))
sys.stdout.write(symbols) # also correct in IDLE
print(' '.join(symbols))
Printing to the console, which is the main concern for a console application, fails miserably though:
J:\test>chcp
Aktiivinen koodisivu: 850
J:\test>symbol2
Traceback (most recent call last):
File "J:\test\symbol2.py", line 9, in <module>
print(''.join(symbols))
File "J:\Python26\lib\encodings\cp850.py", line 12, in encode
return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-3: character maps to <undefined>
J:\test>chcp 437
Aktiivinen koodisivu: 437
J:\test>d:\Python27\python.exe symbol2.py
Traceback (most recent call last):
File "symbol2.py", line 6, in <module>
print(' '.join(symbols))
File "d:\Python27\lib\encodings\cp437.py", line 12, in encode
return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u2660' in position 0: character maps to <undefined>
J:\test>
So, all in all, I have a console application that works as long as you run it in IDLE rather than in the console.
I can of course generate the symbols myself with chr:
# correct symbols for cp850
print(''.join(chr(n) for n in range(3,3+4)))
But this looks like a very stupid way to do it. And I do not want programs that run only on Windows or that have many special cases (like conditional compiling). I want readable code.
I do not mind which characters it outputs, as long as it looks correct whether it runs on a Nokia phone, Windows or Linux. Unicode should do it, but it does not print correctly to the console.
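One way to get "looks correct everywhere" without platform-specific branches is to ask whether the output encoding can represent the suits at all, and fall back to plain letters when it cannot. The following is only a minimal sketch of that idea: the function name printable_suits and the H/D/S/C fallback letters are assumptions made for illustration, not something from the question.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals
import sys

SUITS = ('♥', '♦', '♠', '♣')
# Hypothetical ASCII stand-ins: Hearts, Diamonds, Spades, Clubs.
ASCII_FALLBACK = ('H', 'D', 'S', 'C')


def printable_suits(encoding):
    """Return the suit symbols if `encoding` can represent them, else letters."""
    text = ' '.join(SUITS)
    try:
        text.encode(encoding)
    except (UnicodeEncodeError, LookupError):
        # Either the code page lacks the symbols or the encoding name is unknown.
        return ' '.join(ASCII_FALLBACK)
    return text


if __name__ == '__main__':
    # sys.stdout.encoding can be None (for example when output is piped).
    encoding = getattr(sys.stdout, 'encoding', None) or 'ascii'
    print(printable_suits(encoding))
```

With the cp850 or cp437 codecs from the tracebacks above, the encode attempt fails and the letters are printed; with a UTF-8 terminal the real symbols go through.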
Comments (4)
Whenever I need to output UTF-8 characters, I use the following approach:
This saves me an encode('utf-8') every time something needs to be sent to stdout/stderr.
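A minimal sketch of what such an approach could look like, assuming it wraps the output stream in a UTF-8 writer from the codecs module. The helper name utf8_writer is hypothetical and this is not the answerer's original code; the demonstration writes to an in-memory buffer, where a real program would pass sys.stdout or sys.stderr.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import codecs
import io


def utf8_writer(stream):
    """Wrap a byte stream so that text written to it is encoded as UTF-8."""
    # Python 3 text streams expose their underlying byte stream as .buffer;
    # plain byte streams (and Python 2's sys.stdout) are used directly.
    raw = getattr(stream, 'buffer', stream)
    return codecs.getwriter('utf-8')(raw)


# In-memory demonstration of the wrapper.
buf = io.BytesIO()
out = utf8_writer(buf)
out.write('♥ ♦ ♠ ♣\n')  # no explicit .encode('utf-8') at the call site
encoded = buf.getvalue()
```

Note that this always emits UTF-8 bytes regardless of the console's code page, so a console set to cp850 or cp437 would display those bytes as unrelated characters rather than as suits.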
Use Unicode strings and the codecs module:
Either:
or:
No need to re-implement print.
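The two snippets this answer refers to are not shown here. As a rough sketch of the general idea, assuming the two alternatives are codecs.open and io.open (an assumption, not confirmed by the answer), both return file objects that encode text on write, so the ordinary print function can target them directly. The function names are illustrative, and the sketch is written for Python 3, with the __future__ imports kept only to stay close to the question's style.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals
import codecs
import io
import os
import tempfile

symbols = ('♠', '♥', '♦', '♣')


def write_with_codecs(path):
    # Either: codecs.open returns a file object that encodes text on write.
    with codecs.open(path, 'w', encoding='utf-8') as f:
        print(' '.join(symbols), file=f)


def write_with_io(path):
    # Or: io.open does the same; on Python 3 it is the built-in open().
    with io.open(path, 'w', encoding='utf-8') as f:
        print(' '.join(symbols), file=f)


def read_back(path):
    with io.open(path, 'r', encoding='utf-8') as f:
        return f.read()


if __name__ == '__main__':
    folder = tempfile.mkdtemp()
    for writer in (write_with_codecs, write_with_io):
        path = os.path.join(folder, writer.__name__ + '.txt')
        writer(path)
        # Print a plain boolean so the demo itself never hits an encoding error.
        print(writer.__name__, read_back(path) == ' '.join(symbols) + '\n')
```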
In response to the updated question
Since all you want to do is print UTF-8 characters in CMD, you're out of luck: CMD does not support UTF-8. See:
Is there a Windows command shell that will display Unicode characters?
Old Answer
It's not totally clear what you're trying to do here; my best guess is that you want to write UTF-8 encoded text to a file.
Your problems are:
1. symbols = ('♠','♥', '♦','♣'): while your file encoding may be UTF-8, unless you're using Python 3 your strings won't be Unicode by default. You need to prefix them with a small u: symbols = (u'♠', u'♥', u'♦', u'♣')
2. Your str(arg) converts the unicode string back into a normal one. Just leave it out, or use unicode(arg) to convert to a unicode string.
3. The naming of .decode() may be confusing: it turns bytes into a unicode string, but what you need is the reverse, turning a unicode string into UTF-8 bytes, so use .encode().
4. You're not writing to the file in binary mode. Instead of open('test.txt', 'w') you need to use open('test.txt', 'wb') (notice the wb). This opens the file in binary mode, which is important on Windows.
If we put all of this together we get:
That happily writes the UTF-8 encoded bytes to the file (at least on my Ubuntu box here).
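The combined snippet is not shown here. A minimal sketch of the points listed in this answer put together might look like the following, assuming the goal is simply to write the four suits to a file as UTF-8; the function name write_symbols and the temporary file location are illustrative choices, not the answerer's original code.

```python
# -*- coding: utf-8 -*-
import os
import tempfile

# u'' literals are unicode strings on Python 2 and ordinary str on Python 3.
symbols = (u'♠', u'♥', u'♦', u'♣')


def write_symbols(path):
    # Join as unicode first, then encode once into UTF-8 bytes.
    data = u' '.join(symbols).encode('utf-8')
    # Binary mode so the bytes reach the file untouched, including on Windows.
    with open(path, 'wb') as f:
        f.write(data)


if __name__ == '__main__':
    target = os.path.join(tempfile.mkdtemp(), 'test.txt')
    write_symbols(target)
    with open(target, 'rb') as f:
        # Four 3-byte symbols plus three separating spaces.
        print(len(f.read()))
```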
UTF-8 in the Windows console is a long and painful story.
You can read issue 1602 and issue 6058 and have something that works, more or less, but it's fragile.
Let me summarise:
- add 'cp65001' as an alias for 'utf8' in Lib/encodings/aliases.py
- select Lucida Console or Consolas as your console font
- run chcp 65001