__str__ versus __unicode__
Is there a Python convention for when you should implement __str__() versus __unicode__()? I've seen classes override __unicode__() more frequently than __str__(), but it doesn't appear to be consistent. Are there specific rules for when it is better to implement one versus the other? Is it necessary or good practice to implement both?
__str__() is the old method; it returns bytes. __unicode__() is the new, preferred method; it returns characters. The names are a bit confusing, but in 2.x we're stuck with them for compatibility reasons. Generally, you should put all your string formatting in __unicode__(), and create a stub __str__() method.
In 3.0, str contains characters, so the same methods are named __bytes__() and __str__(). These behave as expected.
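A stub of the kind described above might look like the following sketch. The class name Greeting, the PY2 flag, and the choice of UTF-8 as the byte encoding are assumptions made for illustration; the answer itself does not specify them. The version check is only there so the same file also loads on Python 3, where __unicode__() is never called and __str__() can simply reuse it.

```python
import sys

# Assumption: a simple major-version check is enough for this illustration.
PY2 = sys.version_info[0] == 2


class Greeting(object):
    """Hypothetical class that keeps all formatting in __unicode__()."""

    def __init__(self, name):
        self.name = name

    def __unicode__(self):
        # All string formatting lives here and yields text (characters).
        return u"Hello, %s" % self.name

    if PY2:
        def __str__(self):
            # Python 2 stub: delegate to __unicode__() and encode to bytes.
            # UTF-8 is an assumed encoding, not something Python picks for you.
            return unicode(self).encode("utf-8")
    else:
        # Python 3: str already holds characters, so reuse the same method.
        __str__ = __unicode__


greeting = Greeting(u"Zo\xeb")
text = str(greeting)  # bytes on Python 2, text on Python 3
```

Keeping the formatting in one method means the stub never needs to change when the output format does.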
If I didn't especially care about micro-optimizing stringification for a given class, I'd always implement __unicode__ only, as it's more general. When I do care about such minute performance issues (which is the exception, not the rule), having __str__ only (when I can prove there will never be non-ASCII characters in the stringified output) or both (when both are possible) might help.
These, I think, are solid principles, but in practice it's very common to KNOW there will be nothing but ASCII characters without making the effort to prove it (e.g. the stringified form only has digits, punctuation, and maybe a short ASCII name ;-) in which case it's quite typical to move on directly to the "just __str__" approach (but if a programming team I worked with proposed a local guideline to avoid that, I'd be +1 on the proposal, as it's easy to err in these matters AND "premature optimization is the root of all evil in programming" ;-).
With the world getting smaller, chances are that any string you encounter will eventually contain Unicode. So for any new apps, you should at least provide __unicode__(). Whether you also override __str__() is then just a matter of taste.
If you are working in both Python 2 and Python 3 in Django, I recommend the python_2_unicode_compatible decorator:
As noted in earlier comments to another answer, some versions of future.utils also support this decorator. On my system, I needed to install a newer future module for python2 and install future for python3. After that, here is a functional example:
Here is example output (where venv2/venv3 are virtualenv instances):
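As a rough illustration of what a decorator like python_2_unicode_compatible does, the sketch below uses a hand-written stand-in rather than the Django or future.utils implementation. The names py2_unicode_compatible, PY2, and Article are hypothetical, and UTF-8 is an assumed encoding. The idea is that the class defines only a text-returning __str__(); on Python 2 the decorator moves it to __unicode__() and installs a byte-returning __str__(), while on Python 3 it leaves the class untouched.

```python
import sys

# Assumption: a simple major-version check is enough for this illustration.
PY2 = sys.version_info[0] == 2


def py2_unicode_compatible(cls):
    """Hypothetical stand-in for a python_2_unicode_compatible decorator."""
    if PY2:
        if "__str__" not in cls.__dict__:
            raise ValueError("%s must define __str__()" % cls.__name__)
        # The text-returning method becomes __unicode__() on Python 2 ...
        cls.__unicode__ = cls.__str__
        # ... and __str__() is replaced by a version that returns UTF-8 bytes.
        cls.__str__ = lambda self: self.__unicode__().encode("utf-8")
    return cls


@py2_unicode_compatible
class Article(object):
    def __init__(self, title):
        self.title = title

    def __str__(self):
        # Written once, always returning text.
        return u"Article: %s" % self.title


article = Article(u"Caf\xe9 culture")
text = str(article)  # bytes on Python 2, text on Python 3
```

In a real project the decorator would be imported from the library in use instead of being defined by hand.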
Python 2: Implement __str__() only, and return a unicode.
When __unicode__() is omitted and someone calls unicode(o) or u"%s" % o, Python calls o.__str__() and converts to unicode using the system encoding. (See the documentation of __unicode__().)
The opposite is not true. If you implement __unicode__() but not __str__(), then when someone calls str(o) or "%s" % o, Python returns repr(o).
Rationale
Why would it work to return a unicode from __str__()?
If __str__() returns a unicode, Python automatically converts it to str using the system encoding.
What's the benefit?
① It frees you from worrying about what the system encoding is (i.e., locale.getpreferredencoding(…)). Personally, I find that messy, and I think it's something the system should take care of anyway.
② If you are careful, your code may come out cross-compatible with Python 3, in which __str__() returns unicode.
Isn't it deceptive to return a unicode from a function called __str__()?
A little. However, you might already be doing it. If you have from __future__ import unicode_literals at the top of your file, there's a good chance you're returning a unicode without even knowing it.
What about Python 3?
Python 3 does not use __unicode__(). However, if you implement __str__() so that it returns unicode under either Python 2 or Python 3, then that part of your code will be cross-compatible.
What if I want unicode(o) to be substantively different from str()?
Implement both __str__() (possibly returning str) and __unicode__(). I imagine this would be rare, but you might want substantively different output (e.g., ASCII versions of special characters, like ":)" for u"☺").
I realize some may find this controversial.
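The two behaviors described in this answer can be sketched in a few lines. The class names Temperature and OnlyUnicode are hypothetical and exist only for this illustration. The first class defines only __str__() and returns text from it; the second defines only __unicode__(), so str() has nothing of its own to call and falls back to repr().

```python
class Temperature(object):
    """Defines only __str__(), which returns text (unicode on Python 2)."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        # An ASCII-only unicode literal keeps the implicit conversion safe.
        return u"%d degrees" % self.degrees


class OnlyUnicode(object):
    """Defines only __unicode__(); str() falls back to the default repr()."""

    def __unicode__(self):
        return u"only unicode"


t = Temperature(21)
o = OnlyUnicode()

print(str(t))               # uses __str__() on Python 2 and Python 3
print(str(o) == repr(o))    # __unicode__() is not consulted by str()
```

One caveat to check before relying on this approach: on a stock Python 2 the implicit conversion uses the default encoding, which is usually ASCII, so a __str__() that returns non-ASCII text may raise UnicodeEncodeError when str(o) is called.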
It's worth pointing out, for those unfamiliar with the __unicode__ function, some of the default behaviors surrounding it back in Python 2.x, especially when it is defined side by side with __str__.
yields the following console output...
Now when I uncomment the __str__ method