__str__ vs. __unicode__

Posted on 2024-08-02 06:05:20

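Is there a Python convention for when you should implement __str__() versus __unicode__()? I have seen classes override __unicode__() more often than __str__(), but the practice does not appear to be consistent. Are there specific rules for when it is better to implement one rather than the other? Is it necessary, or good practice, to implement both?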

奈何桥上唱咆哮 2024-08-09 06:05:21

__str__() is the old method -- it returns bytes. __unicode__() is the new, preferred method -- it returns characters. The names are a bit confusing, but in 2.x we're stuck with them for compatibility reasons. Generally, you should put all your string formatting in __unicode__(), and create a stub __str__() method:

def __str__(self):
    return unicode(self).encode('utf-8')

In 3.0, str contains characters, so the same methods are named __bytes__() and __str__(). These behave as expected.
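
To make the Python 3 naming concrete, here is a minimal sketch. The Greeting class and its name attribute are made up for illustration, and UTF-8 is an assumed encoding choice rather than anything the answer prescribes.

```python
class Greeting(object):
    """Hypothetical example class, not part of the original answer."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        # Python 3: __str__ returns text (characters).
        return "Hello, {}".format(self.name)

    def __bytes__(self):
        # Python 3: __bytes__ returns the encoded form.
        # UTF-8 is an assumption made for this sketch.
        return str(self).encode("utf-8")


g = Greeting("Zo\u00eb")
text = str(g)    # calls __str__
raw = bytes(g)   # calls __bytes__
```

Under Python 3, str(g) yields text and bytes(g) yields the encoded bytes, which mirrors the 2.x pairing of __unicode__() and __str__().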

最丧也最甜 2024-08-09 06:05:21

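If I didn't especially care about micro-optimizing stringification for a given class, I would always implement only __unicode__, because it is more general. When I do care about such minute performance issues (which is the exception, not the rule), it might help to have only __str__ (when I can prove there will never be non-ASCII characters in the stringified output) or both (when both are possible).

I think these are solid principles, but in practice it is very common to simply know that the output will contain nothing but ASCII characters without making any effort to prove it (for example, the stringified form has only digits, punctuation, and maybe a short ASCII name ;-). In that case it is quite typical to move directly to the "just __str__" approach. That said, if a programming team I worked with proposed a local guideline to avoid doing so, I would be +1 on the proposal, because it is easy to err in these matters and "premature optimization is the root of all evil in programming" ;-).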

如日中天 2024-08-09 06:05:21

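With the world getting smaller, chances are that any string you encounter will eventually contain Unicode. So for any new application, you should at least provide __unicode__(). Whether you also override __str__() is then just a matter of taste.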

伤感在游骋 2024-08-09 06:05:21

If you are working in both python2 and python3 in Django, I recommend the python_2_unicode_compatible decorator:

Django provides a simple way to define __str__() and __unicode__() methods that work on Python 2 and 3: you must define a __str__() method that returns text and apply the python_2_unicode_compatible() decorator.

As noted in earlier comments on another answer, some versions of future.utils also support this decorator. On my system, I needed to install a newer future module for python2 and install future for python3. After that, here is a working example:

#! /usr/bin/env python

from future.utils import python_2_unicode_compatible
from sys import version_info

@python_2_unicode_compatible
class SomeClass():
    def __str__(self):
        return "Called __str__"


if __name__ == "__main__":
    some_inst = SomeClass()
    print(some_inst)
    if (version_info > (3,0)):
        print("Python 3 does not support unicode()")
    else:
        print(unicode(some_inst))

Here is example output (where venv2/venv3 are virtualenv instances):

~/tmp$ ./venv3/bin/python3 demo_python_2_unicode_compatible.py 
Called __str__
Python 3 does not support unicode()

~/tmp$ ./venv2/bin/python2 demo_python_2_unicode_compatible.py 
Called __str__
Called __str__
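
For readers wondering what such a decorator does, the following is a simplified sketch of the general idea. It is not the actual source of Django or future; the name unicode_compatible_sketch and the Label class are hypothetical, and UTF-8 is an assumed encoding.

```python
import sys

PY2 = sys.version_info[0] == 2


def unicode_compatible_sketch(cls):
    """Hypothetical, simplified stand-in for python_2_unicode_compatible."""
    if PY2:
        # On Python 2, the text-returning __str__ is exposed as __unicode__ ...
        cls.__unicode__ = cls.__str__
        # ... and __str__ is replaced by a version returning UTF-8 bytes.
        cls.__str__ = lambda self: self.__unicode__().encode("utf-8")
    # On Python 3 the class is returned unchanged.
    return cls


@unicode_compatible_sketch
class Label(object):
    def __init__(self, text):
        self.text = text

    def __str__(self):
        # Always return text; the decorator handles the Python 2 split.
        return u"Label: " + self.text


label = Label(u"caf\u00e9")
rendered = str(label)
```

The design choice is that the class author writes one text-returning __str__() and the decorator derives the byte-returning variant only where Python 2 needs it.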

老娘不死你永远是小三 2024-08-09 06:05:21

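Python 2: implement only __str__(), and return a unicode.

When __unicode__() is omitted and someone calls unicode(o) or u"%s" % o, Python calls o.__str__() and converts the result to unicode using the system default encoding. (See the documentation of __unicode__().)

The opposite is not true. If you implement __unicode__() but not __str__(), then when someone calls str(o) or "%s" % o, Python returns repr(o).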

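Rationale

Why would it work to return a unicode from __str__()?
If __str__() returns a unicode, Python automatically converts it to str using the system default encoding.

What is the benefit?
① It frees you from worrying about what the system encoding is (i.e., locale.getpreferredencoding(…)). Personally, I find that messy, and I think it is something the system should take care of anyway. ② If you are careful, your code may come out cross-compatible with Python 3, in which __str__() returns unicode.

Isn't it deceptive to return a unicode from a function called __str__()?
A little. However, you might already be doing it. If you have from __future__ import unicode_literals at the top of your file, there is a good chance you are returning a unicode without even knowing it.

What about Python 3?
Python 3 does not use __unicode__(). However, if you implement __str__() so that it returns unicode under either Python 2 or Python 3, then that part of your code will be cross-compatible.

What if I want unicode(o) to be substantively different from str()?
Implement both __str__() (possibly returning str) and __unicode__(). I imagine this would be rare, but you might want substantively different output (e.g., ASCII versions of special characters, such as ":)" for u"☺").

I realize some may find this controversial.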
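
A minimal sketch of the "only __str__(), returning text" approach might look like the following. The Point class is hypothetical, and the u prefix is used here instead of unicode_literals purely to keep the snippet self-contained.

```python
class Point(object):
    """Hypothetical example class; the name and fields are illustrative."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        # The u prefix makes this text on Python 2 and is accepted
        # (with no effect) on Python 3.3 and later.
        return u"Point({}, {})".format(self.x, self.y)


p = Point(1, 2)
as_text = u"{}".format(p)
as_str = str(p)
```

One caveat worth keeping in mind: on Python 2, str(p) encodes the returned text with the default encoding, which is ASCII unless it has been changed, so output containing non-ASCII characters would raise UnicodeEncodeError there.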

只是偏爱你 2024-08-09 06:05:21

It's worth pointing out to those unfamiliar with the __unicode__ function some of the default behaviors surrounding it back in Python 2.x, especially when defined side by side with __str__.

class A :
    def __init__(self) :
        self.x = 123
        self.y = 23.3

    #def __str__(self) :
    #    return "STR      {}      {}".format( self.x , self.y)
    def __unicode__(self) :
        return u"UNICODE  {}      {}".format( self.x , self.y)

a1 = A()
a2 = A()

print( "__repr__ checks")
print( a1 )
print( a2 )

print( "\n__str__ vs __unicode__ checks")
print( str( a1 ))
print( unicode(a1))
print( "{}".format( a1 ))
print( u"{}".format( a1 ))

yields the following console output...

__repr__ checks
<__main__.A instance at 0x103f063f8>
<__main__.A instance at 0x103f06440>

__str__ vs __unicode__ checks
<__main__.A instance at 0x103f063f8>
UNICODE  123      23.3
<__main__.A instance at 0x103f063f8>
UNICODE  123      23.3

Now, when I uncomment the __str__ method:

__repr__ checks
STR      123      23.3
STR      123      23.3

__str__ vs __unicode__ checks
STR      123      23.3
UNICODE  123      23.3
STR      123      23.3
UNICODE  123      23.3
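
For comparison, here is a small sketch of what happens to a class that defines only __unicode__ when it is a new-style class (one that inherits from object, which every Python 3 class does). This variant of A is an adaptation made for illustration, not part of the answer above.

```python
class A(object):
    def __init__(self):
        self.x = 123
        self.y = 23.3

    def __unicode__(self):
        # str() and print() never call this method, so on Python 3 it is
        # just an ordinary method that must be invoked by name.
        return u"UNICODE  {}      {}".format(self.x, self.y)


a1 = A()
shown = str(a1)              # falls back to the default object repr
explicit = a1.__unicode__()  # still callable explicitly
```

Here str(a1) produces the default "<... A object at 0x...>" form, so code ported to Python 3 needs a __str__ method if it relied on __unicode__ alone.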
