Python的List是如何实现的?
是链表还是数组?我环顾四周,只发现人们在猜测。我的C知识还不够好,无法看源代码。
Is it a linked list, an array? I searched around and only found people guessing. My C knowledge isn't good enough to look at the source code.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(9)
实际上,C 代码非常简单。扩展一个宏,修剪掉一些不相关的注释,基本结构在 < code>listobject.h,它将列表定义为:
PyObject_HEAD
包含引用计数和类型标识符。所以,它是一个过度分配的向量/数组。当数组已满时调整数组大小的代码位于listobject.c
。它实际上并不使数组加倍,而是通过每次分配容量来增长,其中
newsize
是请求的大小(不一定是allocated + 1
因为您可以>扩展
任意数量的元素,而不是逐个追加
它们)。另请参阅 Python 常见问题解答 。
The C code is pretty simple, actually. Expanding one macro and pruning some irrelevant comments, the basic structure is in
listobject.h
, which defines a list as:PyObject_HEAD
contains a reference count and a type identifier. So, it's a vector/array that overallocates. The code for resizing such an array when it's full is inlistobject.c
. It doesn't actually double the array, but grows by allocatingto the capacity each time, where
newsize
is the requested size (not necessarilyallocated + 1
because you canextend
by an arbitrary number of elements instead ofappend
'ing them one by one).See also the Python FAQ.
我建议Laurent Luce 的文章“Python 列表实现”。对我来说真的很有用,因为作者解释了如何在 CPython 中实现该列表,并为此使用了优秀的图表。
...
...
...
...
I would suggest Laurent Luce's article "Python list implementation". Was really useful for me because the author explains how the list is implemented in CPython and uses excellent diagrams for this purpose.
...
...
...
...
这是一个动态数组。实际证明:无论索引如何,索引都需要相同的时间(当然差异极小(0.0013 微秒!)):
如果 IronPython 或 Jython 使用链表,我会感到惊讶 - 它们会破坏许多广泛使用的库的性能假设列表是动态数组。
It's a dynamic array. Practical proof: Indexing takes (of course with extremely small differences (0.0013 µsecs!)) the same time regardless of index:
I would be astounded if IronPython or Jython used linked lists - they would ruin the performance of many many widely-used libraries built on the assumption that lists are dynamic arrays.
这取决于实现,但 IIRC:
因此它们都具有 O(1) 随机访问。
This is implementation dependent, but IIRC:
ArrayList
Thus they all have O(1) random access.
在 CPython 中,列表是指针数组。 Python 的其他实现可能会选择以不同的方式存储它们。
In CPython, lists are arrays of pointers. Other implementations of Python may choose to store them in different ways.
根据文档,
According to the documentation,
正如其他人上面所说,列表(当相当大时)是通过分配固定量的空间来实现的,并且,如果该空间应该填满,则分配更大的空间并复制元素。
为了理解为什么该方法是 O(1) 摊销,不失一般性,假设我们已经插入了 a = 2^n 个元素,并且现在必须将表的大小加倍到 2^(n+1) 。这意味着我们当前正在执行 2^(n+1) 次操作。最后一个副本,我们做了 2^n 次操作。在此之前我们做了 2^(n-1)...一直到 8,4,2,1。现在,如果我们将它们相加,我们会得到 1 + 2 + 4 + 8 + ... + 2^(n+1) = 2^(n+2) - 1 < 4*2^n = O(2^n) = O(a) 总插入次数(即 O(1) 摊销时间)。另外,应该注意的是,如果表允许删除,则必须以不同的因子(例如 3x)进行表收缩
As others have stated above, the lists (when appreciably large) are implemented by allocating a fixed amount of space, and, if that space should fill, allocating a larger amount of space and copying over the elements.
To understand why the method is O(1) amortized, without loss of generality, assume we have inserted a = 2^n elements, and we now have to double our table to 2^(n+1) size. That means we're currently doing 2^(n+1) operations. Last copy, we did 2^n operations. Before that we did 2^(n-1)... all the way down to 8,4,2,1. Now, if we add these up, we get 1 + 2 + 4 + 8 + ... + 2^(n+1) = 2^(n+2) - 1 < 4*2^n = O(2^n) = O(a) total insertions (i.e. O(1) amortized time). Also, it should be noted that if the table allows deletions the table shrinking has to be done at a different factor (e.g 3x)
Python 中的列表类似于数组,可以在其中存储多个值。列表是可变的,这意味着您可以更改它。你应该知道的更重要的事情是,当我们创建一个列表时,Python 会自动为该列表变量创建一个reference_id。如果您通过分配其他变量来更改它,则主列表将会更改。让我们尝试举个例子:
我们追加了
my_list
但我们的主列表已更改。这意味着列表没有指定为复制列表指定为其参考。A list in Python is something like an array, where you can store multiple values. List is mutable that means you can change it. The more important thing you should know, when we create a list, Python automatically creates a reference_id for that list variable. If you change it by assigning others variable the main list will be change. Let's try with a example:
We append
my_list
but our main list has changed. That mean's list didn't assign as a copy list assign as its reference.我发现这篇文章对于理解如何使用Python代码实现列表非常有帮助。
I found this article really helpful to understand how lists are implemented using python code.