How to programmatically get the address of the heap on Linux

Posted on 2024-09-15 18:29:15

I can get the address of the end of the heap with sbrk(0), but is there any way to programmatically get the address of the start of the heap, other than by parsing the contents of /proc/self/maps?

Comments (2)

睫毛上残留的泪 2024-09-22 18:29:15

I think parsing /proc/self/maps is the only reliable way to find the heap segment on Linux. Also keep in mind that some allocators (including the one on my SLES system) use mmap() for large blocks, so that memory is not part of the heap at all and can be located anywhere.

Otherwise, ld normally adds a symbol named _end that marks the end of all segments in the ELF image. For example:

extern void *_end;
printf( "%p\n", &_end );

It matches the end of .bss, traditionally the last section of the ELF image. The heap normally follows that address, after some alignment. Stacks and mmap() regions (including the shared libraries) sit at higher addresses in the address space.

I'm not sure how portable this is, but apparently it works the same way on Solaris 10. On HP-UX 11 the map looks different and the heap appears to be merged with the data segment, but allocations do happen after _end. On AIX, procmap doesn't show the heap/data segment at all, but allocations there also get addresses past the _end symbol. So for now it seems quite portable.

All things considered, though, I'm not sure how useful that is.

P.S. The test program:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h> /* for sleep() */

char *ppp1 = "hello world";
char ppp0[] = "hello world";
extern void *_end; /* any type would do, only its address is important */

int main(void)
{
    void *p = calloc(10000,1);
    /* %p expects void *, so cast the non-void pointers explicitly */
    printf( "end:%p heap:%p rodata:%p data:%p\n", (void *)&_end, p, (void *)ppp1, (void *)ppp0 );
    sleep(10000); /* sleep to give chance to look at the process memory map */
    return 0;
}
指尖微凉心微凉 2024-09-22 18:29:15

You may call sbrk(0) to get the start of the heap, but you have to make sure no memory has been allocated yet.

The best way to do this is to store the return value at the very beginning of main(). Note that many functions allocate memory under the hood, so a call to sbrk(0) made after a printf, after a memory utility like mtrace, or even after a call to putenv will already return an address that has moved.

Although most sources say that the heap is right next to .bss, I am not sure what lies between end and the first break. Reading from that region seems to result in a segmentation fault.
The gap between the first break and the first address returned by malloc contains, probably among other things:

  • the head of the memory double-linked-list, including the next free block
  • a structure prefixed to the malloced block, including:
    • the length of this block
    • the address of the previous free block
    • the address of the next free block
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>


void print_heap_line(void);

int main(int argc, char const *argv[])
{
    char* startbreak = sbrk(0);

    printf("pid: %d\n", getpid()); // printf is allocating memory
    char* lastbreak = sbrk(0);
    printf("heap: [%p - %p]\n", startbreak, lastbreak);

    long pagesize = sysconf(_SC_PAGESIZE);
    long diff = lastbreak - startbreak;
    printf("diff: %ld (%ld pages of %ld bytes)\n", diff, diff/pagesize, pagesize);

    print_heap_line();

    printf("\n\npress a key to finish...");
    getchar(); // gives you a chance to inspect /proc/pid/maps yourself
    return 0;
}

void print_heap_line(void) {
    int mapsfd = open("/proc/self/maps", O_RDONLY);
    if(mapsfd == -1) {
        fprintf(stderr, "open() failed: %s.\n", strerror(errno));
        exit(1);
    }
    char maps[BUFSIZ] = "";
    /* read at most BUFSIZ - 1 bytes so the buffer stays NUL-terminated */
    if(read(mapsfd, maps, BUFSIZ - 1) == -1){
        fprintf(stderr, "read() failed: %s.\n", strerror(errno));
        exit(1);
    }
    if(close(mapsfd) == -1){
        fprintf(stderr, "close() failed: %s.\n", strerror(errno));
        exit(1);
    }

    /* walk every line, including the first one */
    for(char* line = strtok(maps, "\n"); line != NULL; line = strtok(NULL, "\n")) {
        if(strstr(line, "[heap]") != NULL) {
            printf("\n\nfrom /proc/self/maps:\n%s\n", line);
            return;
        }
    }
}
pid: 29825
heap: [0x55fe05739000 - 0x55fe0575a000]
diff: 135168 (33 pages of 4096 bytes)


from /proc/self/maps:
55fe05739000-55fe0575a000 rw-p 00000000 00:00 0                          [heap]


press a key to finish...