Python: The mechanism behind list comprehension

Posted on 2024-10-15 03:48:08

When using a list comprehension or the in keyword in a for loop context, i.e.:

for o in X:
    do_something_with(o)

or

l=[o for o in X]
  • How does the mechanism behind in work?
  • Which functions/methods of X does it call?
  • If X can comply with more than one method, what is the precedence?
  • How do I write an efficient X, so that the list comprehension is quick?


Comments (6)

淡紫姑娘! 2024-10-22 03:48:08

The complete and correct answer, as far as I know.

for, both in for loops and list comprehensions, calls iter() on X. iter() returns an iterator if X has either an __iter__ method or a __getitem__ method. If X implements both, __iter__ is used. If X only has __getitem__, that method is called with 0, 1, 2, ... until it raises IndexError. If X has neither, you get TypeError: 'Nothing' object is not iterable.
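
A minimal sketch of the precedence rule, written for Python 3. The class names Both, OnlyGetItem and Neither are made up for illustration and are not part of the original answer.

```python
class Both:
    """Defines both protocols; iter() is expected to prefer __iter__."""
    def __iter__(self):
        return iter(["from __iter__"])

    def __getitem__(self, index):
        if index >= 1:
            raise IndexError(index)
        return "from __getitem__"


class OnlyGetItem:
    """Only the sequence protocol: indexed from 0 until IndexError."""
    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * 10


class Neither:
    """No iteration support at all."""


both_result = [x for x in Both()]
getitem_result = [x for x in OnlyGetItem()]

try:
    iter(Neither())
    neither_error = None
except TypeError as exc:
    neither_error = str(exc)

print(both_result)      # the __iter__ version wins
print(getitem_result)   # values produced through __getitem__
print(neither_error)    # message of the TypeError
```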

This implements a __getitem__:

class GetItem(object):
    def __init__(self, data):
        self.data = data

    def __getitem__(self, x):
        return self.data[x]

Usage:

>>> data = range(10)
>>> [x*x for x in GetItem(data)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

This is an example of implementing __iter__:

class TheIterator(object):
    def __init__(self, data):
        self.data = data
        self.index = -1

    # Python 3 calls this method __next__; Python 2 calls it next.
    def __next__(self):
        self.index += 1
        try:
            return self.data[self.index]
        except IndexError:
            raise StopIteration

    next = __next__  # alias so the class also works on Python 2

    def __iter__(self):
        return self

class Iter(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        return TheIterator(self.data)

Usage:

>>> data = range(10)
>>> [x*x for x in Iter(data)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

As you can see, you need to implement both an iterator and an __iter__ method that returns that iterator.

You can combine them:

class CombinedIter(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        self.index = -1
        return self

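    # Python 3 calls this method __next__; Python 2 calls it next.
    def __next__(self):
        self.index += 1
        try:
            return self.data[self.index]
        except IndexError:
            raise StopIteration

    next = __next__  # alias so the class also works on Python 2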

Usage is the same as in the previous examples.

But then you can only have one iterator going at once, because the current position is stored on the object itself.
In this particular case you could simply do this:

class CheatIter(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        return iter(self.data)

But that's cheating because you are just reusing the __iter__ method of list.
An easier way is to use yield, and make __iter__ into a generator:

class Generator(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        for x in self.data:
            yield x

This last approach is the one I would recommend. It is easy and efficient.
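
To illustrate the "only one iterator at once" limitation mentioned above, here is a small sketch in Python 3 spelling (so __next__ rather than next). The helper pairs() is a hypothetical name introduced only for this comparison; the two classes mirror CombinedIter and Generator from this answer.

```python
class CombinedIter:
    """Iterable that is its own iterator."""
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        self.index = -1   # every call to iter() resets the single shared position
        return self

    def __next__(self):
        self.index += 1
        try:
            return self.data[self.index]
        except IndexError:
            raise StopIteration


class Generator:
    """Iterable whose __iter__ is a generator; each call gets fresh state."""
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        for x in self.data:
            yield x


def pairs(iterable):
    # Nested loops over the same object need two independent iterators.
    return [(a, b) for a in iterable for b in iterable]


shared = pairs(CombinedIter([1, 2]))
independent = pairs(Generator([1, 2]))
print(shared)        # the inner loop exhausts the shared position, so pairs are missing
print(independent)   # all four combinations
```

With the generator version every call to __iter__ creates a new generator object with its own state, which is why nested loops over the same object behave as expected.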

亚希 2024-10-22 03:48:08

X must be iterable. It must implement __iter__(), which returns an iterator object; the iterator object must implement next() (spelled __next__() in Python 3), which returns the next item every time it is called, or raises StopIteration if there is no next item.

Lists, tuples and generators are all iterable.

Note that the plain for statement uses the same mechanism.
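
The protocol described in this answer can be sketched by hand. This is only an approximation of what the for statement does, written for Python 3; manual_for is a hypothetical helper name, not something from the standard library.

```python
def manual_for(iterable, body):
    """Roughly what `for o in iterable: body(o)` does."""
    iterator = iter(iterable)          # asks the iterable for an iterator
    while True:
        try:
            item = next(iterator)      # calls iterator.__next__()
        except StopIteration:
            break                      # exhaustion ends the loop quietly
        body(item)


collected = []
manual_for((1, 2, 3), collected.append)

squares = []
manual_for((x * x for x in range(3)), squares.append)

print(collected)
print(squares)
```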

Smile简单爱 2024-10-22 03:48:08

In response to the comments on the question: reading the source is not the best idea in this case. The code responsible for executing compiled bytecode (ceval.c) is not very self-explanatory for someone who is seeing the Python sources for the first time. Here is the snippet that implements the iteration step of a for loop:

   TARGET(FOR_ITER)
        /* before: [iter]; after: [iter, iter()] *or* [] */
        v = TOP();

        /*
          Here tp_iternext corresponds to next() in Python
        */
        x = (*v->ob_type->tp_iternext)(v); 
        if (x != NULL) {
            PUSH(x);
            PREDICT(STORE_FAST);
            PREDICT(UNPACK_SEQUENCE);
            DISPATCH();
        }
        if (PyErr_Occurred()) {
            if (!PyErr_ExceptionMatches(
                            PyExc_StopIteration))
                break;
            PyErr_Clear();
        }
        /* iterator ended normally */
        x = v = POP();
        Py_DECREF(v);
        JUMPBY(oparg);
        DISPATCH();

To find out what actually happens here you need to dive into a bunch of other files whose code is not much more readable. So I think that in cases like this, documentation and sites like SO are the first place to go, while the source should be consulted only for implementation details that are not covered elsewhere.
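
A gentler way to see the same opcode than reading ceval.c is to disassemble a small function with the standard dis module. This sketch assumes CPython 3; the function name loop is made up, and the exact list of opcodes differs between CPython versions, so no full listing is shown here.

```python
import dis


def loop(X):
    for o in X:
        pass


# Collect the opcode names of the compiled function.
opnames = [ins.opname for ins in dis.get_instructions(loop)]
print(opnames)

# GET_ITER is the bytecode counterpart of iter(X);
# FOR_ITER is the opcode handled by the C snippet above.
print("GET_ITER" in opnames, "FOR_ITER" in opnames)
```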

风柔一江水 2024-10-22 03:48:08

X must be an iterable object, meaning it needs to have an __iter__() method.

So, to start a for..in loop or a list comprehension, X's __iter__() method is first called to obtain an iterator object; then that object's next() method (__next__() in Python 3) is called for each iteration until StopIteration is raised, at which point the iteration stops.

I'm not sure what your third question means, or how to give a meaningful answer to your fourth question, except that your iterator should not construct the entire list in memory at once.
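
As a rough illustration of the last point, here is a sketch of an iterable that produces its items on demand instead of building a list first. The class Squares is hypothetical and exists only for this example (Python 3).

```python
class Squares:
    """Iterable that yields n squares one at a time."""
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        for i in range(self.n):
            yield i * i   # computed on demand, nothing is stored


big = Squares(10**9)      # cheap to create: no values exist yet
it = iter(big)
first_three = [next(it) for _ in range(3)]
print(first_three)
```

Only the three requested values are ever computed; a version that built a list of 10**9 squares in __init__ would pay the full cost in time and memory up front.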

忘年祭陌 2024-10-22 03:48:08

Maybe this helps (tutorial http://docs.python.org/tutorial/classes.html Section 9.9):

Behind the scenes, the for statement
calls iter() on the container object.
The function returns an iterator
object that defines the method next()
which accesses elements in the
container one at a time. When there
are no more elements, next() raises a
StopIteration exception which tells
the for loop to terminate.

回眸一遍 2024-10-22 03:48:08

To answer your questions:

How does the mechanism behind in work?

It is the exact same mechanism as used for ordinary for loops, as others have already noted.

Which functions\methods within X does it call?

As noted in a comment below, it calls iter(X) to get an iterator. If X has a method function __iter__() defined, this will be called to return an iterator; otherwise, if X defines __getitem__(), this will be called repeatedly to iterate over X. See the Python documentation for iter() here: http://docs.python.org/library/functions.html#iter
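
The __getitem__ fallback can be observed with a small sketch like the following (Python 3). Traced is a hypothetical class that simply records every index it is asked for.

```python
class Traced:
    """Sequence-like class that logs each index passed to __getitem__."""
    def __init__(self, data):
        self.data = data
        self.calls = []

    def __getitem__(self, index):
        self.calls.append(index)
        return self.data[index]   # the list raises IndexError past the end


t = Traced(["a", "b", "c"])
items = [x for x in t]
print(items)
print(t.calls)   # one extra call at the end, the one that raised IndexError
```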

If X can comply with more than one method, what is the precedence?

I'm not sure what your question is here, exactly, but Python has standard rules for how it resolves method names, and they are followed here. Here is a discussion of this:

Method Resolution Order (MRO) in new style Python classes

How do I write an efficient X, so that the list comprehension is quick?

I suggest you read up more on iterators and generators in Python. One easy way to make any class support iteration is to write its __iter__() method as a generator function. Here is a discussion of generators:

http://linuxgazette.net/100/pramode.html
