guppy reports memory usage different from the ps command



I am profiling my Twisted server. It uses much more memory than I expected, and its memory usage grows over time.

 ps -o pid,rss,vsz,sz,size,command
  PID   RSS    VSZ    SZ    SZ COMMAND
 7697 70856 102176 25544 88320 twistd -y broadcast.tac

As you can see, it takes 102176 KB (VSZ), i.e. about 99.78 MB. I use guppy from a Twisted manhole to watch the memory usage profile.

>>> hp.heap()
Partition of a set of 120537 objects. Total size = 10096636 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  61145  51  5309736  53   5309736  53 str
     1  27139  23  1031596  10   6341332  63 tuple
     2   2138   2   541328   5   6882660  68 dict (no owner)
     3   7190   6   488920   5   7371580  73 types.CodeType
     4    325   0   436264   4   7807844  77 dict of module
     5   7272   6   407232   4   8215076  81 function
     6    574   0   305776   3   8520852  84 dict of class
     7    605   1   263432   3   8784284  87 type
     8    602   0   237200   2   9021484  89 dict of type
     9    303   0   157560   2   9179044  91 dict of zope.interface.interface.Method
<384 more rows. Type e.g. '_.more' to view.>

Hmm... it seems something is wrong. Guppy shows that the total memory usage is 10096636 bytes, i.e. about 9860 KB or 9.63 MB.

That's a huge difference. What is behind this strange result? What am I doing wrong?
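(For reference, the heapy session above was obtained roughly like this from the manhole; a minimal sketch, with the manhole/.tac wiring omitted:)

 >>> from guppy import hpy
 >>> hp = hpy()     # heapy entry point
 >>> hp.heap()      # snapshot of the Python objects heapy can see

hp.setrelheap() can also be called first so that later hp.heap() calls only count objects allocated after that point.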

Update:
I wrote a monitoring script last night. It records the memory usage and the number of on-line users. It is a radio server, so you can see the number of radios and the total number of listeners. Here is the figure I generated with matplotlib.
[figure: memory usage, number of radios and total listeners over time]
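The monitoring loop behind that figure could look roughly like this (a sketch only; get_radio_and_listener_counts() is a hypothetical hook into the broadcast server, and the real column order may differ):

 import logging
 import time

 logging.basicConfig(level=logging.INFO,
                     format="%(asctime)s %(levelname)s %(message)s")
 log = logging.getLogger("monitor")

 SERVER_PID = 7697   # pid of the twistd process (from the ps output above)

 def read_mem_kb(pid):
     """Return (VmSize, VmRSS) in KB for the given pid, from /proc (Linux only)."""
     sizes = {}
     with open("/proc/%d/status" % pid) as f:
         for line in f:
             if line.startswith(("VmSize:", "VmRSS:")):
                 key, value = line.split(":", 1)
                 sizes[key] = int(value.split()[0])   # value looks like "  12345 kB"
     return sizes.get("VmSize", 0), sizes.get("VmRSS", 0)

 while True:
     radios, listeners = get_radio_and_listener_counts()   # hypothetical
     vsz, rss = read_mem_kb(SERVER_PID)
     log.info("%d %d %d %d", radios, listeners, rss, vsz)
     time.sleep(60)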

Something strange is going on. Sometimes the memory usage printed by ps is very low, like this:

2010-01-15 00:46:05,139 INFO 4 4 17904 36732 9183 25944
2010-01-15 00:47:03,967 INFO 4 4 17916 36732 9183 25944
2010-01-15 00:48:04,373 INFO 4 4 17916 36732 9183 25944
2010-01-15 00:49:04,379 INFO 4 4 17916 36732 9183 25944
2010-01-15 00:50:02,989 INFO 4 4 3700 5256 1314 2260

What is the reason for these super low memory usage values? What's more, even when there are no on-line radios and no listeners, the memory usage is still high.


Comments (3)

请别遗忘我 2024-08-25 01:40:06


Possibly due to swapping/memory reservation, based on ps's definitions:

RSS: resident set size, the non-swapped physical memory
     that a task has used (in kiloBytes).

VSZ: virtual memory usage of entire process.
     vm_lib + vm_exe + vm_data + vm_stack

It can be a bit confusing; four different size metrics can be seen with:

# ps -eo pid,vsz,rss,sz,size,cmd|egrep python

PID    VSZ   RSS   SZ    SZ    CMD
23801  4920  2896  1230  1100  python

The virtual size includes memory that the process reserved but never used, the size of all shared libraries that were loaded, pages that have been swapped out, and blocks that were already freed by your process, so it can be much larger than the size of all live objects in Python.
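For comparison, the same VSZ/RSS numbers that ps reports can be read directly from /proc (a Linux-only sketch; field meanings follow proc(5)):

 import os

 PAGE_KB = os.sysconf("SC_PAGE_SIZE") // 1024    # page size in KB

 with open("/proc/self/statm") as f:
     size, resident = [int(x) for x in f.read().split()[:2]]    # values are in pages

 print("VSZ ~ %d KB, RSS ~ %d KB" % (size * PAGE_KB, resident * PAGE_KB))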

Some additional tools can help investigate memory behaviour. A good guide on tracking down memory leaks in Python using pdb and objgraph:

http://www.lshift.net/blog/2008/11/14/tracing-python-memory-leaks
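As a quick illustration of the objgraph approach from that guide (a sketch; assumes the objgraph package is installed and is run inside the server, e.g. from the manhole):

 import objgraph

 objgraph.show_most_common_types(limit=10)   # counts of the most numerous object types
 objgraph.show_growth(limit=10)              # types whose counts grew since the previous call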

猥︴琐丶欲为 2024-08-25 01:40:06


As pointed out above the RSS size is what you're most interested in here. The "Virtual" size includes mapped libraries, which you probably don't want to count.

It's been a while since I used heapy, but I am pretty sure the statistics it prints do not include overhead added by heapy itself. This overhead can be pretty significant (I've seen a 100MB RSS process grow another dozen or so MB, see http://www.pkgcore.org/trac/pkgcore/doc/dev-notes/heapy.rst ).

But in your case I suspect the problem is that you are using some C library that either leaks or uses memory in a way that heapy does not track. Heapy is aware of memory used directly by python objects, but if those objects wrap C objects that are separately allocated heapy is not normally aware of that memory at all. You may be able to add heapy support to your bindings (but if you do not control the bindings you use that is obviously a hassle, and even if you do control the bindings you may not be able to do this depending on what you are wrapping).

If there are leaks at the C level heapy will also lose track of that memory (RSS size will go up but heapy's reported size will stay the same). Valgrind is probably your best bet to track these down, just as it is in other C applications.

Finally: memory fragmentation will often cause your memory usage (as seen in top) to go up but not come back down (much). This is usually not a big problem for daemons, since the process will reuse this memory; it is just not released back to the OS, so the values in top do not go back down. If memory usage (as seen by top) goes up more or less linearly with the number of users (connections), does not go back down, but also does not keep growing forever until you hit a new maximum number of users, fragmentation is probably to blame.
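One rough way to watch the memory heapy cannot see is to track the gap between RSS and heapy's reported total over time (a sketch; Linux-only, and the gap also includes interpreter overhead and fragmentation, not just C allocations):

 from guppy import hpy

 def rss_kb():
     # current resident set size in KB, read from /proc (Linux only)
     with open("/proc/self/status") as f:
         for line in f:
             if line.startswith("VmRSS:"):
                 return int(line.split()[1])
     return 0

 hp = hpy()
 heap_kb = hp.heap().size // 1024    # heapy's total, bytes -> KB
 print("RSS %d KB, heapy sees %d KB, unaccounted ~ %d KB"
       % (rss_kb(), heap_kb, rss_kb() - heap_kb))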

我一直都在从未离去 2024-08-25 01:40:06


This isn't a complete answer, but from your manhole I'd also suggest manually running gc.collect() prior to looking with ps or top. guppy will show the allocated heap, but doesn't do anything to proactively free objects that are no longer referenced.
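In the manhole that is as simple as the following (then compare against ps/top again):

 >>> import gc
 >>> gc.collect()   # force a full collection; returns the number of unreachable objects found
 >>> hp.heap()      # re-check heapy's view afterwards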
