Measuring memory usage of a process on Linux
I am trying to measure the memory usage of a process (a Java program) on Linux and have two questions related to that:
I tried using the script ps_mem.py (which sums values from /proc/$PID/smaps), and the peak total memory usage was about 135MB (private plus shared memory). The amount of shared memory is less than 1MB. Running Valgrind with the massif tool
valgrind --tool=massif --trace-children=yes --stacks=yes java myProgram
yields about 10MB at the peak of memory usage.
1. As I understand it, the heap is where the variables of my program are stored. Does that mean the difference between the two methods is the space taken by the code itself (including the JVM)?
2. Does the same program use a different amount of memory on different machines if they have different amounts of RAM and/or use different processors (ARM or x86)?
Comments (4)
Many of the shared memory mappings in smaps are directly backed by libraries/binaries on disk. While the footprint of these does matter, it is less important, because the system can drop these pages at any time and reload them from disk when they are needed again.
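To make the distinction between disk-backed and other mappings concrete, here is a minimal Python sketch that splits the resident size (the Rss field) of each smaps mapping into "file-backed" and "anonymous" totals. The function name rss_by_backing, the dictionary keys, and the sample text are made up for illustration. Treating any mapping whose path starts with "/" as file-backed is a simplifying assumption; it misclassifies some cases, such as deleted files or named shared-memory segments.

```python
import re

# A mapping header looks like: "start-end perms offset dev inode [path]".
HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s")


def rss_by_backing(smaps_text):
    """Sum the Rss (in kB) of file-backed vs. anonymous mappings.

    Assumption: a mapping counts as file-backed when its path starts with "/".
    """
    totals = {"file_backed_kb": 0, "anonymous_kb": 0}
    current = None
    for line in smaps_text.splitlines():
        if HEADER.match(line):
            fields = line.split(None, 5)
            path = fields[5].strip() if len(fields) > 5 else ""
            current = "file_backed_kb" if path.startswith("/") else "anonymous_kb"
        elif line.startswith("Rss:") and current is not None:
            # Field lines look like "Rss:      40 kB".
            totals[current] += int(line.split()[1])
    return totals


# Hypothetical, heavily shortened smaps content used only for demonstration.
SAMPLE_SMAPS = """\
00400000-0040b000 r-xp 00000000 08:01 1234    /usr/lib/jvm/bin/java
Size:                 44 kB
Rss:                  40 kB
7f0000000000-7f0000100000 rw-p 00000000 00:00 0
Size:               1024 kB
Rss:                 512 kB
7ffd00000000-7ffd00021000 rw-p 00000000 00:00 0    [stack]
Size:                132 kB
Rss:                  16 kB
"""

print(rss_by_backing(SAMPLE_SMAPS))
```

For a live process, the same function could be fed the contents of /proc/<pid>/smaps, which usually requires being the process owner or root.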
There is a similar question elsewhere; I am giving the same answer here to let people know that the VM info Linux reports through proc is currently not accurate.
Valgrind can show detailed information, but it slows the target application down significantly, and most of the time it changes the behavior of the app.
I assume that what everyone wants to know with regard to "memory usage" is the following...
In Linux, the amount of physical memory a single process might use can be roughly divided into the following categories.
I would prefer to get the numbers as follows, to get the real numbers with the least overhead.
You have to sum these up in order to break down what ps shows as RSS into more accurate, less confusing numbers.
/proc/(pid)/status tries to show these numbers, but it falls short.
So instead of trying to label each mapping correctly as [anon], [stack], and so on, my wish is
that the Linux kernel people will mainline proc entry code that sums and shows these M.a.p.d, M.a.p.c, M.n.p.d, ... numbers.
Embedded Linux people would be really happy, IMHO.
M.a.p.d:
M.a.p.c:
M.n.p.d:... and so on
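The summing this answer asks for can be approximated in user space. The sketch below adds up four per-mapping counters that smaps exposes (Private_Clean, Private_Dirty, Shared_Clean, Shared_Dirty) next to Rss. How these counters map onto the M.a.p.d style categories is not spelled out in the text above, so this is only an assumed, rough analogue; the function name sum_smaps_fields and the sample data are hypothetical.

```python
# smaps counters to total; all of them are reported in kB per mapping.
FIELDS = ("Rss", "Private_Clean", "Private_Dirty", "Shared_Clean", "Shared_Dirty")


def sum_smaps_fields(smaps_text, fields=FIELDS):
    """Return the per-field totals (in kB) over every mapping in smaps_text."""
    totals = {name: 0 for name in fields}
    for line in smaps_text.splitlines():
        key, sep, rest = line.partition(":")
        # Header lines also contain ":" (in the device number) but never
        # yield a key that matches one of the field names.
        if sep and key in totals:
            totals[key] += int(rest.split()[0])
    return totals


# Hypothetical two-mapping sample used only for demonstration.
SAMPLE = """\
00400000-0040b000 r-xp 00000000 08:01 1234    /usr/lib/libexample.so
Rss:                  40 kB
Shared_Clean:         40 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
7f0000000000-7f0000100000 rw-p 00000000 00:00 0
Rss:                 512 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:        12 kB
Private_Dirty:       500 kB
"""

totals = sum_smaps_fields(SAMPLE)
for name in FIELDS:
    print(f"{name:<14} {totals[name]:>6} kB")
```

In this sample the four private/shared counters add up to the Rss total, which is what makes them usable as a breakdown of the single RSS number that ps prints.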
For #1, shared memory is memory (potentially) used by more than one process. This basically happens when you run the same binary in multiple processes, or when different processes use the same shared library. The heap is where allocated memory is stored (when you use new in Java). Since Java has its VM, it allocates a lot of memory at the process level that you don't see in your Java code. So I think that yes, the majority of that 135 MB is from the JVM code/data itself. However, there is also the memory taken up by the stack (when you make a function call and have local variables).
For #2, a different amount of RAM would not affect how much "memory" is used, if we let memory equal RAM + swap space. However, different processors (especially if we're talking about 32-bit vs. 64-bit) may use different amounts of memory. Also, the way a process is compiled may change the amount of memory used, because you can instruct a compiler to optimize for memory footprint over speed, or disable some or all optimizations altogether.
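One reason the 32-bit vs. 64-bit point matters is pointer width: every stored address costs 4 bytes on a typical 32-bit build and 8 bytes on a typical 64-bit build. The following is a small Python sketch of that arithmetic, with helper names (pointer_size_bytes, pointer_table_bytes) invented for illustration. It reports the pointer size of the interpreter running it and says nothing about how any particular JVM lays out its objects.

```python
import struct


def pointer_size_bytes():
    """Size in bytes of a native pointer ("P") for the running interpreter."""
    return struct.calcsize("P")


def pointer_table_bytes(count, pointer_bytes):
    """Bytes needed to store `count` raw pointers of the given width."""
    return count * pointer_bytes


print("native pointer size:", pointer_size_bytes(), "bytes")
for width in (4, 8):
    size = pointer_table_bytes(1_000_000, width)
    print(f"1,000,000 pointers at {width} bytes each: {size} bytes")
```

The same pointer-heavy data structure can therefore take up to twice the space on a 64-bit machine, before any other architecture-specific differences such as alignment are counted.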
You might want to take a look at JConsole. Things can be tricky depending on the purpose of your measurement. If you want to know the memory usage of your Java program, then tools that measure the memory usage of a process will be inaccurate, because they show the memory used by the JVM as well as by your program.
As for the massif tool, you should know that parts of the JVM will be stored on the stack, and the Java code itself might be on the heap (since it is a variable of the JVM). I don't know enough about the JVM to say for sure.