Unicode not printing correctly to cp850 (cp437), playing card suits

Posted on 2024-10-04 05:06:45

To summarize: how do I print Unicode in a system-independent way to produce playing card symbols?

What am I doing wrong? I consider myself quite fluent in Python, yet I can't seem to print correctly!

# coding: utf-8
from __future__ import print_function
from __future__ import unicode_literals
import sys

symbols = ('♥','♦','♠','♣')
# red suits to stderr for IDLE
print(' '.join(symbols[:2]), file=sys.stderr)
print(' '.join(symbols[2:]))

sys.stdout.write(symbols) # also correct in IDLE
print(' '.join(symbols))

Printing to the console, which is the main concern for a console application, fails miserably though:

J:\test>chcp
Aktiivinen koodisivu: 850


J:\test>symbol2
Traceback (most recent call last):
  File "J:\test\symbol2.py", line 9, in <module>
    print(''.join(symbols))
  File "J:\Python26\lib\encodings\cp850.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-3: character maps to <undefined>
J:\test>chcp 437
Aktiivinen koodisivu: 437

J:\test>d:\Python27\python.exe symbol2.py
Traceback (most recent call last):
  File "symbol2.py", line 6, in <module>
    print(' '.join(symbols))
  File "d:\Python27\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u2660' in position 0: character maps to <undefined>

J:\test>

So, all in all, I have a console application that works only as long as you run it in IDLE rather than in the console.

I can of course generate the symbols myself with chr:

# correct symbols for cp850
print(''.join(chr(n) for n in range(3,3+4)))

But this looks like a very stupid way to do it. I do not want to write programs that only run on Windows or that have many special cases (like conditional compiling). I want readable code.

I do not mind which characters it outputs, as long as the result looks correct whether it runs on a Nokia phone, Windows or Linux. Unicode should do it, but it does not print correctly to the console.


Comments (4)

嘦怹 2024-10-11 05:06:45

Whenever I need to output utf-8 characters, I use the following approach:

import codecs
import sys

# Python 2: wrap stdout so unicode strings are encoded as UTF-8 on write
out = codecs.getwriter('utf-8')(sys.stdout)

text = u'♠'

out.write(u"%s\n" % text)

This saves me an encode('utf-8') every time something needs to be sent to stdout/stderr.
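The same wrapping idea can be checked without a console. The following is only a sketch, assuming Python 3 and an in-memory buffer standing in for the output stream; the helper name utf8_writer is made up for illustration and is not part of the answer above.

```python
import codecs
import io


def utf8_writer(binary_stream):
    """Wrap a binary stream so that text is encoded as UTF-8 on write."""
    return codecs.getwriter('utf-8')(binary_stream)


# Use an in-memory buffer so the encoded bytes can be inspected afterwards.
buffer = io.BytesIO()
out = utf8_writer(buffer)
out.write(u'\u2660\n')  # BLACK SPADE SUIT followed by a newline
encoded = buffer.getvalue()

# On Python 3 the console stream would be wrapped through its byte layer:
#     import sys
#     out = utf8_writer(sys.stdout.buffer)
```

Whether the console then shows a spade still depends on the console interpreting the bytes as UTF-8, which is exactly what cp850 and cp437 do not do.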

如此安好 2024-10-11 05:06:45

Use Unicode strings and the codecs module:

Either:

# coding: utf-8
from __future__ import print_function
import sys
import codecs

symbols = (u'♠',u'♥',u'♦',u'♣')

print(u' '.join(symbols))
print(*symbols)
with codecs.open('test.txt','w','utf-8') as testfile:
    print(*symbols, file=testfile)

or:

# coding: utf-8
from __future__ import print_function
from __future__ import unicode_literals
import sys
import codecs

symbols = ('♠','♥','♦','♣')

print(' '.join(symbols))
print(*symbols)
with codecs.open('test.txt','w','utf-8') as testfile:
    print(*symbols, file=testfile)

No need to re-implement print.
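Whether print succeeds still depends on the encoding of the stream it writes to. As a hedged sketch (Python 3, with an in-memory buffer standing in for a cp437 console; lenient_text_stream is a hypothetical helper name), an error handler can make print degrade to '?' instead of raising UnicodeEncodeError:

```python
import io

SUITS = (u'\u2660', u'\u2665', u'\u2666', u'\u2663')  # spade, heart, diamond, club


def lenient_text_stream(binary_stream, encoding):
    """Return a text stream that writes '?' for characters the encoding lacks."""
    return io.TextIOWrapper(binary_stream, encoding=encoding,
                            errors='replace', newline='\n')


raw = io.BytesIO()
stream = lenient_text_stream(raw, 'cp437')
print(*SUITS, file=stream)  # no exception, unencodable characters are replaced
stream.flush()
written = raw.getvalue()
```

This avoids the traceback but not the underlying limitation: the suits are lost, so it only helps when a placeholder is acceptable.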

灼痛 2024-10-11 05:06:45

In response to the updated question

Since all you want to do is print UTF-8 characters in CMD, you're out of luck: CMD does not support UTF-8. See:
Is there a Windows command shell that will display Unicode characters?
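Since the question says any characters are acceptable as long as the output looks right, one possible workaround is to test whether the target encoding can represent the suits and fall back to plain letters otherwise. This is a minimal sketch, not something CMD or Python prescribes; suits_for and the H/D/S/C labels are assumptions made for illustration.

```python
import sys

UNICODE_SUITS = (u'\u2665', u'\u2666', u'\u2660', u'\u2663')  # heart, diamond, spade, club
ASCII_SUITS = (u'H', u'D', u'S', u'C')  # hypothetical fallback labels


def suits_for(encoding):
    """Return the Unicode suits if `encoding` can represent them, else ASCII."""
    try:
        u''.join(UNICODE_SUITS).encode(encoding or 'ascii')
    except (UnicodeEncodeError, LookupError):
        return ASCII_SUITS
    return UNICODE_SUITS


# sys.stdout.encoding can be None when output is redirected, hence the getattr.
symbols = suits_for(getattr(sys.stdout, 'encoding', None))
print(u' '.join(symbols))
```

The check keeps the special case in one function, so the rest of the program can print the chosen symbols without caring about the platform.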

Old Answer

It's not totally clear what you're trying to do here; my best guess is that you want to write the encoded UTF-8 to a file.

Your problems are:

  1. symbols = ('♠','♥', '♦','♣'): your file encoding may be UTF-8, but unless you're using Python 3 your string literals won't be unicode strings by default; you need to prefix them with a small u:
    symbols = (u'♠', u'♥', u'♦', u'♣')

  2. Your str(arg) converts the unicode string back into a byte string; just leave it out, or use unicode(arg) to convert to a unicode string.

  3. The naming of .decode() may be confusing: it decodes bytes into a unicode string, but what you need is to encode the unicode string into UTF-8 bytes, so use .encode().

  4. You're not writing to the file in binary mode: instead of open('test.txt', 'w') you need to use open('test.txt', 'wb') (notice the wb). This opens the file in binary mode, which is important on Windows.
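Points 3 and 4 can be illustrated with a short round trip. This is only a sketch that runs on Python 2 or 3; the temporary file name suits.bin is made up for the example.

```python
import os
import tempfile

text = u'\u2660'                     # unicode string: BLACK SPADE SUIT
data = text.encode('utf-8')          # encode: text -> bytes
assert data == b'\xe2\x99\xa0'
assert data.decode('utf-8') == text  # decode: bytes -> text

# Binary mode writes the bytes exactly as given, with no newline translation.
path = os.path.join(tempfile.mkdtemp(), 'suits.bin')  # hypothetical file name
with open(path, 'wb') as handle:
    handle.write(data + b'\n')
with open(path, 'rb') as handle:
    stored = handle.read()
```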

If we put all of this together we get:

# -*- coding: utf-8 -*-
from __future__ import print_function
import sys

symbols = (u'♠',u'♥', u'♦',u'♣')

print(' '.join(symbols))
print('Failure!')

def print(*args, **kwargs):
    # Python 2 only: encode every argument as UTF-8 bytes before writing
    end = kwargs.get('end', '\n')
    sep = kwargs.get('sep', ' ')
    stdout = kwargs.get('file', sys.stdout)
    stdout.write(sep.join(unicode(arg).encode('utf-8') for arg in args))
    stdout.write(end)

print(*symbols)
print('Success!')
with open('test.txt', 'wb') as testfile:
    print(*symbols, file=testfile)

That happily writes the byte encoded UTF-8 to the file (at least on my Ubuntu box here).

潇烟暮雨 2024-10-11 05:06:45

UTF-8 in the Windows console is a long and painful story.

You can read issue 1602 and issue 6058 and have something that works, more or less, but it's fragile.

Let me summarise:

  • add 'cp65001' as an alias for 'utf8' in Lib/encodings/aliases.py
  • select Lucida Console or Consolas as your console font
  • run chcp 65001
  • run python
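The first step above edits a file inside the Python installation. A possible alternative, offered here only as an assumption and not taken from the linked issues, is to register a codec search function at program start so that the name 'cp65001' resolves to the UTF-8 codec; the function name _cp65001_as_utf8 is hypothetical. Newer Python versions may already resolve this name on their own, in which case the registration is harmless.

```python
import codecs


def _cp65001_as_utf8(name):
    """Codec search function: resolve 'cp65001' to the UTF-8 codec."""
    if name == 'cp65001':
        return codecs.lookup('utf-8')
    return None  # let other search functions handle every other name


codecs.register(_cp65001_as_utf8)

encoded = u'\u2660'.encode('cp65001')
```

The font and chcp 65001 steps are still needed; this only spares the edit to Lib/encodings/aliases.py.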