Python: how to override str.join?

Posted on 2024-12-13 03:20:10

We have a subclass of str (call it MyStr), and I need to be able to control how str.join interacts with my subclass.

At minimum, joining only MyStr instances should produce another MyStr, and joining a mix of MyStr and "plain" str should raise a TypeError.

Currently, this is what happens (MyStr subclasses unicode):

>>> m = MyStr(':')

>>> m.join( [MyStr('A'), MyStr('B')] )
u'A:B'

>>> ':'.join( [MyStr('A'), 'B', u'C'] )
u'A:B:C'

很糊涂小朋友 2024-12-20 03:20:11

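join() is a str method. If you want to end up with a MyStr object, use a MyStr object to do the joining.

If you want a TypeError, you will have to stop inheriting from str and provide all of str's methods yourself (at least the ones you need). That may well make the objects mostly useless for regular string operations, though.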
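For what it's worth, here is a minimal sketch of the "don't inherit from str" route. It assumes Python 3, where str plays the role that unicode plays in the question, and the wrapper layout (a private _value attribute, the handful of methods shown) is purely illustrative rather than anything the question prescribes.

```python
class MyStr:
    """Hypothetical wrapper that holds a str rather than subclassing it."""

    def __init__(self, value):
        if not isinstance(value, str):
            raise TypeError("MyStr wraps a plain str, got %s" % type(value).__name__)
        self._value = value

    def __str__(self):
        return self._value

    def __repr__(self):
        return "MyStr(%r)" % self._value

    def __eq__(self, other):
        return isinstance(other, MyStr) and self._value == other._value

    def __hash__(self):
        return hash(self._value)

    def join(self, items):
        """Join MyStr items only; anything else raises TypeError."""
        parts = []
        for index, item in enumerate(items):
            if not isinstance(item, MyStr):
                raise TypeError(
                    "sequence item %d: expected MyStr, %s found"
                    % (index, type(item).__name__)
                )
            parts.append(item._value)
        return MyStr(self._value.join(parts))


joined = MyStr(':').join([MyStr('A'), MyStr('B')])
print(repr(joined))  # MyStr('A:B')

# Because MyStr is no longer a str, the built-in join rejects it.
try:
    ':'.join([MyStr('A'), 'B'])
except TypeError as exc:
    print("plain join refused:", exc)
```

The cost is the one mentioned above: every other string method (split, upper, slicing, and so on) would have to be written by hand before the wrapper is usable as a string.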

安稳善良 2024-12-20 03:20:10

Couldn't your class just override join:

class MyStr(unicode):
    def join(self, strs):
        # your code here
        pass

This will at least cover the case of MyStr(...).join(...)
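One way the body could be filled in, as a rough sketch: it assumes Python 3 (so the base class is str instead of unicode), and the strict type check and error message are my own choices, not something the question specifies.

```python
class MyStr(str):
    """Hypothetical str subclass whose join is strict about item types."""

    def join(self, items):
        items = list(items)  # allow generators, since we iterate twice
        for index, item in enumerate(items):
            if not isinstance(item, MyStr):
                raise TypeError(
                    "sequence item %d: expected MyStr, %s found"
                    % (index, type(item).__name__)
                )
        # str.join returns a plain str, so re-wrap the result.
        return type(self)(super().join(items))


result = MyStr(':').join([MyStr('A'), MyStr('B')])
print(type(result).__name__, result)  # MyStr A:B
```

This only takes effect when the separator itself is a MyStr. A call such as ':'.join([...]) with a plain separator never reaches this method.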

After @bukzor's comment, I looked up how this works, and it looks like join is a C function that always returns a unicode object when called with a unicode separator.

The code can be seen here. Take a look at the PyUnicode_Join function, especially this line:

res = _PyUnicode_New(res_alloc);

So, the result of PyUnicode_Join will always be an instance of PyUnicode.
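The same effect can be observed from Python without reading the C source. A quick check, assuming Python 3 (str standing in for unicode) and a subclass that overrides nothing:

```python
class MyStr(str):
    """Hypothetical subclass that overrides nothing."""


sep = MyStr(':')
result = sep.join([MyStr('A'), MyStr('B')])

print(type(sep).__name__)     # MyStr
print(type(result).__name__)  # str
```

Even though the separator and every item are MyStr, the inherited join hands back the plain base type.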

The only error case I can see is if the input isn't unicode:

/* Convert item to Unicode. */
if (! PyUnicode_Check(item) && ! PyString_Check(item)) {
    PyErr_Format(PyExc_TypeError,
                 "sequence item %zd: expected string or Unicode,"
                 " %.80s found",
                 i, Py_TYPE(item)->tp_name);
    goto onError;
}
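In Python terms, that check behaves roughly as sketched below. This assumes Python 3, where the equivalent check accepts str and its subclasses; try_join is just a hypothetical helper for the demonstration.

```python
class MyStr(str):
    """Hypothetical subclass, standing in for the one in the question."""


def try_join(sep, items):
    """Return the joined string, or the TypeError message if join refuses."""
    try:
        return sep.join(items)
    except TypeError as exc:
        return "TypeError: %s" % exc


ok = try_join(':', [MyStr('A'), 'B'])  # subclass instances pass the type check
bad = try_join(':', ['A', 42])         # an int does not

print(ok)
print(bad)
```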

So I don't think it's possible to make this case fail (at least, not while your object extends from unicode):

':'.join( [MyStr('A'), 'B', u'C'] )