Does Python not reuse memory here? What does tracemalloc's output mean?

Posted 2025-01-13 15:01:04

I create a list of a million int objects, then replace each with its negated value. tracemalloc reports 28 MB extra memory (28 bytes per new int object). Why? Does Python not reuse the memory of the garbage-collected int objects for the new ones? Or am I misinterpreting the tracemalloc results? Why does it say those numbers, what do they really mean here?

import tracemalloc

xs = list(range(10**6))
tracemalloc.start()
for i, x in enumerate(xs):
    xs[i] = -x
print(tracemalloc.get_traced_memory())

Output:

(27999860, 27999972)

If I replace xs[i] = -x with x = -x (so the new object rather than the original object gets garbage-collected), the output is a mere (56, 196). How does it make any difference which of the two objects I keep/lose?

And if I do the loop twice, it still only reports (27992860, 27999972). Why not 56 MB? How is the second run any different for this than the first?

忆沫 2025-01-20 15:01:04

Short Answer

tracemalloc was started too late to track the initial block of memory, so it
didn't realize the memory was being reused. In the example you gave, you free
27999860 bytes and allocate 27999860 bytes, but tracemalloc can't 'see' the
frees. Consider the following, slightly modified example:

import tracemalloc

tracemalloc.start()

xs = list(range(10**6))
print(tracemalloc.get_traced_memory())
for i, x in enumerate(xs):
    xs[i] = -x
print(tracemalloc.get_traced_memory())

On my machine (Python 3.10, but same allocator), this displays:

(35993436, 35993436)
(36000576, 36000716)

After we allocate xs, tracemalloc reports 35993436 bytes in use, and after we
run the loop the net total is 36000576 bytes. This shows that the memory usage
isn't actually increasing by 28 MB.
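
To see both orderings side by side, here is a minimal sketch. The helper name traced_growth and the smaller N are my own choices for illustration, not something from the question, and the exact byte counts will vary with Python version and platform.

```python
import tracemalloc

N = 10**5  # smaller than the question's 10**6 so it runs quickly


def traced_growth(start_before_alloc):
    """Return how much traced memory grows while negating a list in place.

    Hypothetical helper, for illustration only.
    """
    if start_before_alloc:
        tracemalloc.start()
        xs = list(range(N))
    else:
        xs = list(range(N))
        tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    for i, x in enumerate(xs):
        xs[i] = -x
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


late = traced_growth(start_before_alloc=False)
early = traced_growth(start_before_alloc=True)
print("started after allocating xs: ", late)
print("started before allocating xs:", early)
```

With tracing started late, the growth should be roughly 28 bytes per element. With tracing started early, it should be close to zero by comparison.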

Why does it behave this way?

tracemalloc works by replacing the standard internal allocation functions with
its own wrappers: tracemalloc_alloc, plus similar free and realloc functions.
Taking a peek at the source:

static void*
tracemalloc_alloc(int use_calloc, void *ctx, size_t nelem, size_t elsize)
{
    PyMemAllocatorEx *alloc = (PyMemAllocatorEx *)ctx;
    void *ptr;

    assert(elsize == 0 || nelem <= SIZE_MAX / elsize);

    if (use_calloc)
        ptr = alloc->calloc(alloc->ctx, nelem, elsize);
    else
        ptr = alloc->malloc(alloc->ctx, nelem * elsize);
    if (ptr == NULL)
        return NULL;

    TABLES_LOCK();
    if (ADD_TRACE(ptr, nelem * elsize) < 0) {
        /* Failed to allocate a trace for the new memory block */
        TABLES_UNLOCK();
        alloc->free(alloc->ctx, ptr);
        return NULL;
    }
    TABLES_UNLOCK();
    return ptr;
}

We see that the new allocator does two things:

1.) Call out to the "old" allocator to get memory

2.) Add a trace to a special table, so we can track this memory

If we look at the associated free function, it's very similar:

1.) Free the memory

2.) Remove the trace from the table

In your example, you allocated xs before you called tracemalloc.start(), so
the trace records for this allocation are never put in the memory tracking
table. Therefore, when the original int objects are freed, there is no trace to remove and the traced total doesn't go down, while every new int allocated in the loop does add a trace. That is the weird allocation behavior you're seeing.
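
The bookkeeping described above can be mimicked with a toy model. This is only a sketch of the idea, not tracemalloc's real code: ToyTracer, the integer "addresses", and the free list are all made up for illustration, and every block is assumed to be 28 bytes.

```python
class ToyTracer:
    """Hypothetical stand-in for the trace table (not the real tracemalloc API)."""

    def __init__(self):
        self.traces = {}   # address -> size of each block we saw being allocated
        self.current = 0   # running total, like get_traced_memory()[0]

    def on_alloc(self, addr, size):
        self.traces[addr] = size
        self.current += size

    def on_free(self, addr):
        # A block allocated before tracing started has no trace: nothing to subtract.
        size = self.traces.pop(addr, None)
        if size is not None:
            self.current -= size


def negate_loop(tracer, live, free_list, next_addr):
    """Replace every live block with a new one, reusing freed addresses."""
    for i in range(len(live)):
        if free_list:
            addr = free_list.pop()   # the allocator hands back freed memory
        else:
            addr = next_addr
            next_addr += 1
        tracer.on_alloc(addr, 28)    # the new int is created first...
        old, live[i] = live[i], addr
        tracer.on_free(old)          # ...then the old int is released
        free_list.append(old)
    return next_addr


n = 1000
live = list(range(n))   # blocks that exist before tracing "starts"
tracer = ToyTracer()    # tracing starts here, too late to see them
free_list = []

next_addr = negate_loop(tracer, live, free_list, n)
after_first = tracer.current
negate_loop(tracer, live, free_list, next_addr)
after_second = tracer.current
print(after_first, after_second)  # 28000 28000
```

In the first pass every free is invisible, so the total climbs by 28 bytes per element even though the memory is being reused. In the second pass the freed blocks all have traces, so the total stays flat.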

Why is the total memory usage about 36000000 bytes and not 28000000?

Lists in Python are weird. They're actually a list of pointers to individually
allocated objects. Internally, they look like this:

typedef struct {
    PyObject_HEAD
    Py_ssize_t ob_size;

    /* Vector of pointers to list elements.  list[0] is ob_item[0], etc. */
    PyObject **ob_item;

    /* ob_item contains space for 'allocated' elements.  The number
     * currently in use is ob_size.
     * Invariants:
     *     0 <= ob_size <= allocated
     *     len(list) == ob_size
     *     ob_item == NULL implies ob_size == allocated == 0
     */
    Py_ssize_t allocated;
} PyListObject;

PyObject_HEAD is a macro that expands to some header information all Python
objects have. It is just 16 bytes on a 64-bit build, and contains a reference
count and a pointer to the type object.

Importantly, a list of integers is actually a list of pointers to PyObjects
that happen to be ints. On the line xs = list(range(10**6)), we expect to
allocate:

1 PyListObject with internal size 1000000 -- true size:

sizeof(PyObject_HEAD) + sizeof(PyObject *) * 1000000 + sizeof(Py_ssize_t)
(     16 bytes      ) + (    8 bytes     ) * 1000000 + (     8 bytes    )
8000024 bytes

1000000 PyObject ints (a PyLongObject in the underlying implementation)

1000000 * sizeof(PyLongObject)
1000000 * (     28 bytes     )
28000000 bytes

For a grand total of 36000024 bytes. That number looks pretty familiar: it is close to the 35993436 bytes tracemalloc reported above.
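
The arithmetic can be checked with a few lines of Python. This is just a sketch: the constants are the estimates used above and assume a 64-bit CPython build, and the variable names are mine. The last line prints what the running interpreter itself reports, which may differ slightly from the estimate.

```python
import sys

N = 10**6

# Sizes taken from the estimate above (64-bit CPython assumed).
HEADER_BYTES = 16   # PyObject_HEAD
POINTER_BYTES = 8   # one PyObject * per list slot
SSIZE_BYTES = 8     # one Py_ssize_t
INT_BYTES = 28      # one small PyLongObject

list_bytes = HEADER_BYTES + POINTER_BYTES * N + SSIZE_BYTES
int_bytes = INT_BYTES * N
total = list_bytes + int_bytes

print(list_bytes)  # 8000024
print(int_bytes)   # 28000000
print(total)       # 36000024

# For comparison: the sizes this interpreter reports for one int and for a
# list with N slots. These are not asserted, since they depend on the build.
print(sys.getsizeof(12345), sys.getsizeof([None] * N))
```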

When you overwrite a value in the list, you're just freeing the old value and updating the pointer in PyListObject->ob_item. This means the list structure is allocated once, takes up 8000024 bytes, and lives to the end of the program. Additionally, 1000000 int objects are individually allocated, and references to them are put in the list. They take up the 28000000 bytes. One by one, they are deallocated, and their memory is then reused for the new objects created in the loop. This is why running the loop multiple times doesn't increase the amount of memory.
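
Here is a small sketch of the "loop twice" case from the question, with tracing started late as it was there. N is reduced so it runs quickly, and the exact numbers depend on the Python version. The point is only that the second pass frees ints that were allocated while tracing was on, so their traces are removed as the new ones are added.

```python
import tracemalloc

N = 10**5  # smaller than the question's 10**6 so it runs quickly

xs = list(range(N))
tracemalloc.start()  # started late, as in the question

# First pass: the ints being freed were never traced, so the total grows.
for i, x in enumerate(xs):
    xs[i] = -x
after_first, _peak = tracemalloc.get_traced_memory()

# Second pass: the ints being freed were traced, so the total stays put.
for i, x in enumerate(xs):
    xs[i] = -x
after_second, _peak = tracemalloc.get_traced_memory()

tracemalloc.stop()
print(after_first, after_second)
```

Both numbers should come out at roughly 28 bytes per element, rather than the second being double the first.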
