
Understanding Linux /proc/pid/maps or /proc/self/maps

Posted on 2024-08-04 11:33:29

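I am trying to understand the memory usage of my embedded Linux application. The /proc/pid/maps utility/file seems to be a good resource for seeing the details. Unfortunately, I don't understand all of the columns and entries.

What do the anonymous inode 0 entries mean? These seem to be some of the larger memory segments.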

微暖i 2024-08-11 11:33:29

Each row in /proc/$PID/maps describes a region of contiguous virtual memory in a process or thread. Each row has the following fields:

address           perms offset  dev   inode   pathname
08048000-08056000 r-xp 00000000 03:0c 64593   /usr/sbin/gpm

address - This is the starting and ending address of the region in the process's address space
permissions - This describes how pages in the region can be accessed. There are four different permissions: read, write, execute, and shared. If read/write/execute are disabled, a - will appear instead of the r/w/x. If a region is not shared, it is private, so a p will appear instead of an s. If the process attempts to access memory in a way that is not permitted, a segmentation fault is generated. Permissions can be changed using the mprotect system call.
offset - If the region was mapped from a file (using mmap), this is the offset in the file where the mapping begins. If the memory was not mapped from a file, it's just 0.
device - If the region was mapped from a file, this is the major and minor device number (in hex) where the file lives.
inode - If the region was mapped from a file, this is the file's inode number.
pathname - If the region was mapped from a file, this is the name of the file. This field is blank for anonymous mapped regions. There are also special regions with names like [heap], [stack], or [vdso]. [vdso] stands for virtual dynamic shared object. It's a small shared object that the kernel maps into each process and that user-space code uses to make system calls efficiently. Here's a good article about it: "What is linux-gate.so.1?"
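
To make the field list above concrete, here is a minimal Python sketch that splits one maps line into those six fields. The names MapsEntry and parse_maps_line are made up for this illustration and are not part of any library; the sample line is the one shown above.

```python
from dataclasses import dataclass


@dataclass
class MapsEntry:
    start: int      # first address of the region
    end: int        # address one past the last byte of the region
    perms: str      # e.g. "r-xp"
    offset: int     # offset into the backing file, 0 if there is none
    dev: str        # "major:minor" in hex
    inode: int      # 0 when no file backs the region
    pathname: str   # "" for anonymous regions

    @property
    def size(self) -> int:
        return self.end - self.start


def parse_maps_line(line: str) -> MapsEntry:
    # The first five fields never contain spaces; the pathname may, so
    # split at most five times and keep the remainder intact.
    parts = line.split(None, 5)
    addr, perms, offset, dev, inode = parts[:5]
    pathname = parts[5].strip() if len(parts) > 5 else ""
    start, end = (int(x, 16) for x in addr.split("-"))
    return MapsEntry(start, end, perms, int(offset, 16), dev, int(inode), pathname)


entry = parse_maps_line("08048000-08056000 r-xp 00000000 03:0c 64593   /usr/sbin/gpm")
print(entry.pathname, entry.perms, entry.size)
```

Addresses and the offset are hexadecimal, while the inode is decimal, which is why they are converted differently.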

You might notice a lot of anonymous regions. These are usually created by mmap but are not attached to any file. They are used for a lot of miscellaneous things like shared memory or buffers not allocated on the heap. For instance, I think the pthread library uses anonymous mapped regions as stacks for new threads.
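
As a rough way to see how much space those anonymous regions take, the sketch below picks out the lines with inode 0 and an empty pathname and reports their sizes. The function name anonymous_regions is an assumption of this example, and every line in SAMPLE except the first is invented for illustration.

```python
import os

# First line comes from the man page example; the rest are made-up samples.
SAMPLE = """\
00400000-00452000 r-xp 00000000 08:02 173521      /usr/bin/dbus-daemon
00e03000-00e24000 rw-p 00000000 00:00 0           [heap]
7f2c3c000000-7f2c3c021000 rw-p 00000000 00:00 0
7ffd5a1e0000-7ffd5a201000 rw-p 00000000 00:00 0   [stack]
"""


def anonymous_regions(lines):
    """Yield (size_in_bytes, address_range) for unnamed inode-0 regions."""
    for line in lines:
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        inode = fields[4]
        pathname = fields[5].strip() if len(fields) > 5 else ""
        if inode == "0" and pathname == "":
            start, end = (int(x, 16) for x in fields[0].split("-"))
            yield end - start, fields[0]


if __name__ == "__main__":
    # Use the live file on Linux, otherwise fall back to the sample text.
    if os.path.exists("/proc/self/maps"):
        with open("/proc/self/maps") as maps:
            regions = sorted(anonymous_regions(maps), reverse=True)
    else:
        regions = sorted(anonymous_regions(SAMPLE.splitlines()), reverse=True)
    for size, address_range in regions[:5]:
        print(f"{size // 1024:>10} KiB  {address_range}")
```

To inspect another process, the same function can be fed the lines of /proc/<pid>/maps, provided you have permission to read that file.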

離殇 2024-08-11 11:33:29

Please check: http://man7.org/linux/man-pages/man5/proc.5.html

address           perms offset  dev   inode       pathname
00400000-00452000 r-xp 00000000 08:02 173521      /usr/bin/dbus-daemon

The address field is the address space in the process that the
mapping occupies.

The perms field is a set of permissions:

 r = read
 w = write
 x = execute
 s = shared
 p = private (copy on write)

The offset field is the offset into the file/whatever;

dev is the device (major:minor);

inode is the inode on that device. 0 indicates that no inode is associated with the memory region, as would be the case with BSS (uninitialized data).

The pathname field will usually be the file that is backing
the mapping. For ELF files, you can easily coordinate with
the offset field by looking at the Offset field in the ELF
program headers (readelf -l).

Under Linux 2.0, there is no field giving pathname.

坏尐絯 2024-08-11 11:33:29

Although the question specifically mentions embedded systems, the title only mentions /proc/<pid>/maps, which is also very useful for understanding "normal" programs. In this broader context, it's important to realize that memory allocated by malloc() can end up either in the heap or in any number of anonymous memory segments. Big blocks of anonymous memory, therefore, are likely to have come from malloc().

What /proc/<pid>/maps refers to as [heap] is more precisely a contiguous region between the memory allocated for static variables (called the BSS segment) and an address called the "program break" (see diagram below). Initially, this region is empty and there is no heap. When malloc() is called, it can create/expand the heap by asking the kernel—via the brk() syscall—to move the program break. Likewise, free() can shrink the heap if all the addresses adjacent to the program break are no longer in use.

However, moving the program break is not the only way that malloc() can make more room for itself. It can also ask the kernel—via the mmap() syscall—to reserve a block of space somewhere between the stack and the heap (see diagram below). Memory allocated in this way appears in /proc/<pid>/maps as the "anonymous inode 0 entries" mentioned in the question.
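
One way to check this on a Linux machine is to create a private anonymous mapping yourself and look it up in /proc/self/maps. The sketch below does that from Python; find_region and map_and_lookup are names invented for this example, and it calls mmap directly rather than going through malloc(), so it only illustrates what such a mapping looks like in the file.

```python
import ctypes
import mmap
import os


def find_region(address, maps_path="/proc/self/maps"):
    """Return the fields of the maps line whose range contains address."""
    with open(maps_path) as maps:
        for line in maps:
            fields = line.split(None, 5)
            start, end = (int(x, 16) for x in fields[0].split("-"))
            if start <= address < end:
                pathname = fields[5].strip() if len(fields) > 5 else ""
                return {"range": fields[0], "perms": fields[1],
                        "inode": fields[4], "pathname": pathname}
    return None


def map_and_lookup(size=1 << 20):
    # MAP_PRIVATE | MAP_ANONYMOUS: memory that is not backed by any file.
    region = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        view = ctypes.c_char.from_buffer(region)
        address = ctypes.addressof(view)   # where the mapping starts
        del view  # release the buffer export so close() succeeds
        return find_region(address)
    finally:
        region.close()


if __name__ == "__main__":
    if os.path.exists("/proc/self/maps"):
        print(map_and_lookup())
    else:
        print("/proc/self/maps is not available on this system")
```

On Linux the reported line should have inode 0 and a private (p) flag. The kernel may merge the new mapping with a neighbouring anonymous region, so the range can be larger than the size requested.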

[Diagram: layout of a process's virtual address space (image not available)]

It's worth elaborating on the mmap() syscall a bit. There are four kinds of memory maps that mmap() can create, and they are each used for very different purposes. First, the memory can either be tied to the contents of a certain file, or not. The latter is called an "anonymous" map. Second, the memory can either be "private" or "shared". Private means that changes made by one process will not be visible to any others; this is usually implemented in a lazy and efficient manner called "copy-on-write". Shared means that each process will get access to the same underlying physical memory. Below are the uses that I am aware of for each kind of memory map:

Private files: Executables, dynamic libraries, efficient copies of large data structures
Private anonymous maps: malloc(), BSS segments for dynamic libraries, stack space for threads
Shared files: Sharing memory between unrelated processes
Shared anonymous maps: Sharing memory between related processes

Going back to /proc/<pid>/maps, you can figure out which kind of memory map each line describes by looking at the "pathname" and "perms" columns. (These column names come from the kernel docs.) For file maps, the "pathname" column will hold an actual path to the file being mapped. For anonymous maps, the "pathname" column will be empty. There are also some special path names like [heap] and [stack]. For private and shared maps, the "perms" column will include the p or s flag, respectively.
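
The rule in the previous paragraph can be written down as a small Python sketch. The function name classify is made up for this example, the anonymous and [heap] sample lines are invented, and labelling bracketed names as "special" is a choice of this sketch rather than kernel terminology.

```python
def classify(line: str) -> str:
    """Label one maps line using only the perms and pathname columns."""
    fields = line.split(None, 5)
    perms = fields[1]
    pathname = fields[5].strip() if len(fields) > 5 else ""
    # The fourth perms character is "s" for shared, "p" for private.
    sharing = "shared" if perms[3] == "s" else "private"
    if pathname == "":
        backing = "anonymous"
    elif pathname.startswith("["):
        backing = "special " + pathname   # e.g. [heap], [stack], [vdso]
    else:
        backing = "file"
    return f"{sharing} {backing}"


lines = [
    "00400000-00452000 r-xp 00000000 08:02 173521      /usr/bin/dbus-daemon",
    "7f2c3c000000-7f2c3c021000 rw-p 00000000 00:00 0",
    "00e03000-00e24000 rw-p 00000000 00:00 0           [heap]",
]
for line in lines:
    print(classify(line), "<-", line.split()[0])
```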

Current implementations of malloc(), glibc's for example, use brk() for small allocations and mmap() for large ones. It makes sense to allocate small amounts of memory on the heap, because it is very often possible to find the necessary space without having to make an expensive syscall (e.g. by reusing previously freed space). However, large allocations on the heap run the risk of never being released back to the operating system. Consider what would happen if you were to make a big allocation on the heap followed by a bunch of small ones. Even after the big allocation is freed, the program break couldn't be moved back until all the small allocations were also freed. This simple example assumes that the allocations go onto the heap in order, which is a naive approach, but it illustrates how the heap makes it much harder to free memory back to the operating system.

Here's the relevant section from man malloc:

Normally, malloc() allocates memory from the heap, and adjusts the size of the heap as required, using sbrk(2). When allocating blocks of memory larger than MMAP_THRESHOLD bytes, the glibc malloc() implementation allocates the memory as a private anonymous mapping using mmap(2). MMAP_THRESHOLD is 128 kB by default, but is adjustable using mallopt(3). Prior to Linux 4.7 allocations performed using mmap(2) were unaffected by the RLIMIT_DATA resource limit; since Linux 4.7, this limit is also enforced for allocations performed using mmap(2).
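
If you want to watch this behaviour, the following Python sketch calls the C library's malloc() through ctypes and reports which maps region the returned pointer falls in. The helper names region_name and where_does_malloc_put are invented for this example, and the outcome depends on the C library in use; with glibc, a request far above MMAP_THRESHOLD would be expected to land in an unnamed anonymous region rather than in [heap].

```python
import ctypes
import os


def region_name(address, maps_path="/proc/self/maps"):
    """Return the pathname column of the region containing address."""
    with open(maps_path) as maps:
        for line in maps:
            fields = line.split(None, 5)
            start, end = (int(x, 16) for x in fields[0].split("-"))
            if start <= address < end:
                return fields[5].strip() if len(fields) > 5 else ""
    return None


def where_does_malloc_put(nbytes):
    libc = ctypes.CDLL(None)  # the C library already loaded into this process
    libc.malloc.restype = ctypes.c_void_p
    libc.malloc.argtypes = [ctypes.c_size_t]
    libc.free.restype = None
    libc.free.argtypes = [ctypes.c_void_p]
    pointer = libc.malloc(nbytes)
    if not pointer:
        raise MemoryError(nbytes)
    try:
        return region_name(pointer)
    finally:
        libc.free(pointer)


if __name__ == "__main__":
    if os.path.exists("/proc/self/maps"):
        for nbytes in (64, 64 << 20):   # a tiny request and a 64 MiB request
            print(nbytes, "bytes ->", repr(where_does_malloc_put(nbytes)))
    else:
        print("/proc/self/maps is not available on this system")
```

An empty string in the output means the pointer lies in an anonymous region with no pathname.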

In summary, if your program uses malloc(), then malloc() is likely responsible for many of the large, anonymous segments that get mapped into virtual memory and reported by /proc/<pid>/maps.

Caveat emptor: Pretty much everything I wrote here I just learned today, so take it with a grain of salt. That said, here are links to resources I found very helpful for understanding all of this: