Memory profiler for numpy

Posted 2024-11-07 09:00:29

I have a numpy script that -- according to top -- is using about 5GB of RAM:

  PID USER   PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
16994 aix    25   0 5813m 5.2g 5.1g S  0.0 22.1  52:19.66 ipython

Is there a memory profiler that would enable me to get some idea about the objects that are taking most of that memory?

I've tried heapy, but guppy.hpy().heap() is giving me this:

Partition of a set of 90956 objects. Total size = 12511160 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  42464  47  4853112  39   4853112  39 str
     1  22147  24  1928768  15   6781880  54 tuple
     2    287   0  1093352   9   7875232  63 dict of module
     3   5734   6   733952   6   8609184  69 types.CodeType
     4    498   1   713904   6   9323088  75 dict (no owner)
     5   5431   6   651720   5   9974808  80 function
     6    489   1   512856   4  10487664  84 dict of type
     7    489   1   437704   3  10925368  87 type
     8    261   0   281208   2  11206576  90 dict of class
     9   1629   2   130320   1  11336896  91 __builtin__.wrapper_descriptor
<285 more rows. Type e.g. '_.more' to view.>

For some reason, it's only accounting for 12MB of the 5GB (the bulk of the memory is almost certainly used by numpy arrays).

Any suggestions as to what I might be doing wrong with heapy or what other tools I should try (other than those already mentioned in this thread)?
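One quick way to see where the array memory actually lives, as a minimal sketch (it assumes the large arrays are bound to names in a namespace you can reach, e.g. globals() in the IPython session; the report_arrays helper below is hypothetical): heapy only accounts for Python objects, but every ndarray reports the size of its own data buffer through its .nbytes attribute, so summing those gives a rough per-array breakdown. Arrays reachable only through containers or created inside C extensions will not show up this way.

import numpy as np

def report_arrays(namespace):
    # Collect the ndarray objects bound in the given namespace dict and
    # sort them by the size of their data buffers, largest first.
    arrays = [(name, obj) for name, obj in namespace.items()
              if isinstance(obj, np.ndarray)]
    arrays.sort(key=lambda item: item[1].nbytes, reverse=True)
    for name, arr in arrays:
        print("%-20s %10.1f MB  shape=%s dtype=%s"
              % (name, arr.nbytes / 1e6, arr.shape, arr.dtype))
    print("total: %.1f MB" % (sum(a.nbytes for _, a in arrays) / 1e6))

# In the session that owns the arrays, e.g.:
#   report_arrays(globals())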

Comments (1)

◇流星雨 2024-11-14 09:00:29

Numpy (and its library bindings, more on that in a minute) uses C malloc to allocate space, which is why memory used by big numpy allocations doesn't show up in the profiling of tools like heapy and never gets cleaned up by the garbage collector.

The usual suspects for big leaks are actually scipy or numpy library bindings, rather than the Python code itself. I got burned badly last year by the default scipy.linalg interface to umfpack, which leaked memory at a rate of about 10 MB per call. You might want to try something like valgrind to profile the code. It can often give some hints as to where there might be leaks.
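Along the same lines, here is a minimal sketch using the third-party memory_profiler package (an assumption on my part; it is not mentioned above and needs pip install memory_profiler). Because it samples the resident set size of the whole process line by line, allocations that go through C malloc, including numpy buffers, do show up, unlike in heapy.

import numpy as np
from memory_profiler import profile   # pip install memory_profiler

@profile            # prints a line-by-line report of process RSS when the function runs
def build_arrays():
    a = np.zeros((1024, 1024))   # ~8 MB buffer, should appear as an RSS increment
    b = np.ones((4096, 4096))    # ~128 MB buffer
    del a                        # dropping the last reference frees the 8 MB buffer
    return b

if __name__ == "__main__":
    build_arrays()

Because memory_profiler measures the whole-process resident set size rather than walking Python objects, it cannot say which array owns the memory, but it does show which lines of the script grow it.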
