Python slice assignment memory usage



I read in a comment here on Stack Overflow that it is more memory efficient to do slice assignment when changing lists. For example,

a[:] = [i + 6 for i in a]

should be more memory efficient than

a = [i + 6 for i in a]

because the former replaces elements in the existing list, while the latter creates a new list and rebinds a to that new list, leaving the old a in memory until it can be garbage collected. Benchmarking the two for speed, the latter is slightly quicker:

$ python -mtimeit -s 'a = [1, 2, 3]' 'a[:] = [i + 6 for i in a]'
1000000 loops, best of 3: 1.53 usec per loop
$ python -mtimeit -s 'a = [1, 2, 3]' 'a = [i + 6 for i in a]'
1000000 loops, best of 3: 1.37 usec per loop

That is what I'd expect, as rebinding a variable should be faster than replacing elements in a list. However, I can't find any official documentation which supports the memory usage claim, and I'm not sure how to benchmark that.
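
The best I have come up with is something like the sketch below, which uses the standard-library tracemalloc module (Python 3.4+) to compare peak allocations; the peak_memory, slice_assign and rebind helpers are just names I made up for the comparison, and I am not sure this measures the right thing:

import tracemalloc

def peak_memory(fn):
    # report the peak number of bytes allocated while fn() runs
    tracemalloc.start()
    fn()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak

def slice_assign():
    a = list(range(100_000))
    a[:] = [i + 6 for i in a]

def rebind():
    a = list(range(100_000))
    a = [i + 6 for i in a]

print("slice assignment peak bytes:", peak_memory(slice_assign))
print("rebinding peak bytes:       ", peak_memory(rebind))

If the memory claim is right, the slice-assignment version's peak should come out noticeably lower, but by my reasoning below I would expect the two peaks to be about the same.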

On the face of it, the memory usage claim makes sense to me. However, giving it some more thought, I would expect that in the former method, the interpreter would create a new list from the list comprehension and then copy the values from that list to a, leaving the anonymous list floating around until it is garbage collected. If that's the case, then the former method would use the same amount of memory while also being slower.

Can anyone show definitively (with a benchmark or official documentation) which of the two methods is more memory efficient/which is the preferred method?

Thanks in advance.


心在旅行 2024-10-23 15:39:24


The line

a[:] = [i + 6 for i in a]

would not save any memory. Python does evaluate the right hand side first, as stated in the language documentation:

An assignment statement evaluates the expression list (remember that this can be a single expression or a comma-separated list, the latter yielding a tuple) and assigns the single resulting object to each of the target lists, from left to right.

In the case at hand, the single resulting object would be a new list, and the single target in the target list would be a[:].
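
A quick way to illustrate the difference between the two targets (a small check added here for clarity, not part of the original benchmarks) is to compare object identities:

a = [1, 2, 3]
original_id = id(a)

a[:] = [i + 6 for i in a]    # a new list is built, then its items are copied in
print(id(a) == original_id)  # True: the same list object was mutated in place

a = [i + 6 for i in a]       # another new list is built and the name is rebound
print(id(a) == original_id)  # False: a now refers to a different object

In both statements a complete new list comes into existence on the right-hand side; the only difference is what happens to it afterwards.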

We could replace the list comprehension by a generator expression:

a[:] = (i + 6 for i in a)

Now, the right hand side evaluates to a generator instead of a list. Benchmarking shows that this is still slower than the naive

a = [i + 6 for i in a]

So does the generator expression actually save any memory? At first glance, you might think it does. But delving into the CPython source of the function list_ass_slice() shows that it does not. The line

v_as_SF = PySequence_Fast(v, "can only assign an iterable");

uses PySequence_Fast() to materialize the iterable (in this case the generator) into a temporary sequence first, which is then copied into the old list. That temporary sequence takes up about as much memory as the list a comprehension would have built, so using a generator expression is basically the same as using a list comprehension in this case. The existing list object is reused during the final copy, but only after the whole temporary sequence has been allocated.
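
A rough Python-level analogue of that behaviour (an illustration under my reading of the C code, not the actual implementation, and slice_assign_like is a made-up name) would be:

def slice_assign_like(target, iterable):
    # the whole right-hand side is materialized first, as PySequence_Fast does
    temp = list(iterable)
    # only then are the references copied into the existing list object,
    # growing or shrinking it as needed
    for i, item in enumerate(temp):
        if i < len(target):
            target[i] = item
        else:
            target.append(item)
    del target[len(temp):]

a = [1, 2, 3]
slice_assign_like(a, (i + 6 for i in a))  # a is now [7, 8, 9]

Either way, the complete set of new items sits in memory alongside the old list before any of the old items can be released.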

The moral seems to be that in this case the simplest approach is best in every regard.
