Best output type and encoding practices for __repr__() functions?
Lately, I've had lots of trouble with __repr__(), format(), and encodings. Should the output of __repr__() be encoded or be a unicode string? Is there a best encoding for the result of __repr__() in Python? What I want to output does have non-ASCII characters.

I use Python 2.x, and want to write code that can easily be adapted to Python 3. The program thus uses

# -*- coding: utf-8 -*-
from __future__ import unicode_literals, print_function  # The 'Hello' literal represents a Unicode object

Here are some additional problems that have been bothering me, and I'm looking for a solution that solves them:

- Printing to a UTF-8 terminal should work (I have sys.stdout.encoding set to UTF-8, but it would be best if other cases worked too).
- Piping the output to a file (encoded in UTF-8) should work (in this case, sys.stdout.encoding is None).
- My code for many __repr__() functions currently has many return ….encode('utf-8'), and that's heavy. Is there anything robust and lighter?
- In some cases, I even have ugly beasts like return ('<{}>'.format(repr(x).decode('utf-8'))).encode('utf-8'), i.e., the representation of objects is decoded, put into a formatting string, and then re-encoded. I would like to avoid such convoluted transformations.

What would you recommend to do in order to write simple __repr__() functions that behave nicely with respect to these encoding questions?
Comments (3)
In Python 2, __repr__ (and __str__) must return a string object, not a unicode object. In Python 3, the situation is reversed: __repr__ and __str__ must return unicode objects, not byte (née string) objects.
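The rule can be checked with a small sketch. The class names TextRepr and BytesRepr and the helper try_repr are made up for illustration. The snippet is written to run on both Python 2 and Python 3, and it only reports what the running interpreter does.

```python
from __future__ import unicode_literals


class TextRepr(object):
    """Hypothetical class whose __repr__ returns text (unicode)."""
    def __repr__(self):
        return '<TextRepr \N{WHITE SMILING FACE}>'


class BytesRepr(object):
    """Hypothetical class whose __repr__ returns UTF-8 encoded bytes."""
    def __repr__(self):
        return '<BytesRepr \N{WHITE SMILING FACE}>'.encode('utf-8')


def try_repr(obj):
    """Return (True, result type name) if repr() works, else (False, exception name)."""
    try:
        return True, type(repr(obj)).__name__
    except Exception as exc:
        return False, type(exc).__name__


text_result = try_repr(TextRepr())
bytes_result = try_repr(BytesRepr())
print(text_result, bytes_result)
```

On Python 3, the text version succeeds and the bytes version fails with a TypeError. On Python 2 the roles should swap: the bytes version succeeds, while the text version is implicitly encoded with the default ASCII codec and fails on the non-ASCII character.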
In Python 2, you don't really have a choice. You have to pick an encoding for the return value of __repr__.

By the way, have you read the PrintFails wiki? It may not directly answer your other questions, but I did find it helpful in illuminating why certain errors occur.
When using from __future__ import unicode_literals, the decode/format/encode expression from the question can be written more simply, assuming str encodes to utf-8 on your system. Without from __future__ import unicode_literals, the expression can be simplified as well.
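For the expression quoted in the question, one way to drop the decode/encode round trip can be sketched as follows. It assumes every __repr__ involved returns the interpreter's native str type (bytes on Python 2, text on Python 3). The Inner and Wrapper classes are hypothetical, and this is not necessarily the exact form this answer had in mind.

```python
from __future__ import unicode_literals
import sys


class Inner(object):
    """Hypothetical class whose __repr__ returns the native str type."""
    def __repr__(self):
        text = 'Inner(\N{WHITE SMILING FACE})'
        if sys.version_info[0] == 2:
            return text.encode('utf-8')  # Python 2: __repr__ must return bytes
        return text                      # Python 3: __repr__ must return text


class Wrapper(object):
    """Hypothetical container that embeds another object's repr."""
    def __init__(self, inner):
        self.inner = inner

    def __repr__(self):
        # str('<{}>') is a native string on both versions: bytes on Python 2
        # (even with unicode_literals active), text on Python 3. Formatting a
        # native template with a native repr() needs no decode/encode step.
        return str('<{}>').format(repr(self.inner))


wrapped = repr(Wrapper(Inner()))
print(wrapped)
```

The point of the design is that the template and the embedded repr() have the same type, so no implicit ASCII conversion is triggered on Python 2.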
I think a decorator can manage __repr__ incompatibilities in a sane way; that is the approach I use.
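A minimal sketch of such a decorator, not the answerer's actual code: the name native_repr, the Greeting class, and the UTF-8 fallback are all assumptions made for illustration. The idea is to write __repr__ so that it returns text everywhere and let the decorator encode the result only on Python 2.

```python
from __future__ import unicode_literals
import functools
import sys


def native_repr(func):
    """Hypothetical decorator: let __repr__ be written to return text.

    On Python 3 the function is returned unchanged. On Python 2 the text
    is encoded to bytes, which is what __repr__ must return there.
    """
    if sys.version_info[0] >= 3:
        return func

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # sys.stdout.encoding is None when output is piped, so fall back
        # to UTF-8 (an assumption; choose the default that suits the program).
        encoding = getattr(sys.stdout, 'encoding', None) or 'utf-8'
        return func(*args, **kwargs).encode(encoding)
    return wrapper


class Greeting(object):
    """Hypothetical example class."""
    @native_repr
    def __repr__(self):
        return '<Greeting \N{WHITE SMILING FACE}>'


shown = repr(Greeting())
print(shown)
```

With this in place, each __repr__ body stays free of explicit encode() calls, and the version check lives in one spot. On Python 2, encoding to a terminal encoding that cannot represent a character would still raise an error, so an error handler may be worth passing to encode().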
I use a small helper function for this, and my __repr__ functions are then written in terms of it.
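A minimal sketch of that pattern, under stated assumptions: the helper name to_native, its default argument, and the Temperature class are hypothetical and stand in for code that is not shown here.

```python
from __future__ import unicode_literals
import sys


def to_native(text, default='utf-8'):
    """Hypothetical helper: convert text to what __repr__ must return.

    Python 3: the text is returned as is. Python 2: the text is encoded
    with the encoding of sys.stdout, or with `default` when that encoding
    is None (for instance when the output is piped to a file).
    """
    if sys.version_info[0] >= 3:
        return text
    encoding = getattr(sys.stdout, 'encoding', None) or default
    return text.encode(encoding)


class Temperature(object):
    """Hypothetical example class."""
    def __init__(self, degrees):
        self.degrees = degrees

    def __repr__(self):
        # Build the representation as text, then convert once at the end.
        return to_native('<Temperature {}\N{DEGREE SIGN}C>'.format(self.degrees))


reading = repr(Temperature(21))
print(reading)
```

Compared with a decorator, the helper keeps the conversion visible at the return statement of each __repr__, at the cost of one extra call per method.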