c dynamic-linking ld dlopen

dlopen from memory?

Posted 2024-10-18 08:22:51

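I'm looking for a way to load generated object code directly from memory.

I understand that if I write it to a file, I can call dlopen to dynamically load its symbols and link them. However, this seems roundabout, considering that the code starts off in memory, is written to disk, and is then reloaded into memory by dlopen. I'm wondering whether there is some way to dynamically link object code that exists in memory. From what I can tell, there might be a few different ways to do this:

1. Trick dlopen into thinking that your memory location is a file, even though it never leaves memory.
2. Find some other system call that does what I'm looking for (I don't think this exists).
3. Find a dynamic linking library that can link code directly in memory. This one is hard to search for, because "dynamic linking library" turns up information on how to dynamically link libraries, not on libraries that perform the task of dynamic linking.
4. Extract an API from a linker and build a new library out of its codebase. (This is the least desirable option for me.)

So which of these are possible? Which are feasible? Could you point me to any of the things I hypothesized exist? Is there another way I haven't thought of?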

夜血缘 2024-10-25 08:22:51

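I needed a solution to this because I have a scriptable system with no filesystem (it uses blobs from a database) that needs to load binary plugins to support some scripts. This is the solution I came up with; it works on FreeBSD but may not be portable.

#include <dlfcn.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

void *dlblob(const void *blob, size_t len) {
    /* Create an anonymous shared-memory file descriptor (FreeBSD: SHM_ANON) */
    int fd = shm_open(SHM_ANON, O_RDWR, 0);
    ftruncate(fd, len);
    /* Map the file descriptor and copy the blob into it */
    void *mem = mmap(NULL, len, PROT_WRITE, MAP_SHARED, fd, 0);
    memcpy(mem, blob, len);
    munmap(mem, len);
    /* Open the dynamic library from the SHM file descriptor (FreeBSD: fdlopen) */
    void *so = fdlopen(fd, RTLD_LAZY);
    close(fd);
    return so;
}

Obviously the code lacks any kind of error checking, but this is the core functionality.

ETA: My initial assumption that fdlopen is POSIX was wrong; it appears to be a FreeBSD-ism.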

素手挽清风 2024-10-25 08:22:51

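I don't see why you'd be considering dlopen, since that will require a lot more nonportable code to generate the right object format (e.g. ELF) on disk for loading. If you already know how to generate machine code for your architecture, just mmap memory with PROT_READ|PROT_WRITE|PROT_EXEC, put your code there, then assign the address to a function pointer and call it. Very simple.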

那支青花 2024-10-25 08:22:51

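There is no standard way to do it other than writing out the file and then loading it again with dlopen().

You may find some alternative method on your current specific platform. It is up to you to decide whether that is better than the "standard and (relatively) portable" approach.

Since generating the object code in the first place is rather platform-specific, additional platform-specific techniques may not matter to you. But it is a judgement call, and in any case it depends on a non-standard technique existing, which is relatively improbable.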
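
For reference, the write-then-dlopen() route can be sketched roughly as follows. The helper name dlopen_via_tempfile and the /tmp location are assumptions made for illustration; the blob is expected to be a complete shared object already.

```c
#include <dlfcn.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: writes an in-memory shared object to a temporary
   file, loads it with dlopen(), and removes the file again.
   Returns the dlopen() handle, or NULL on failure. */
void *dlopen_via_tempfile(const void *blob, size_t len) {
    char path[] = "/tmp/dlblob-XXXXXX";  /* assumed writable location */
    int fd = mkstemp(path);
    if (fd == -1)
        return NULL;

    /* write() may store fewer bytes than requested, so loop. */
    const char *p = blob;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n <= 0)
            break;
        p += n;
        left -= (size_t)n;
    }
    close(fd);

    /* The path contains a '/', so dlopen() treats it as a file name
       instead of searching the library path. */
    void *handle = (left == 0) ? dlopen(path, RTLD_NOW | RTLD_LOCAL) : NULL;

    /* Once loaded, the mapping stays valid after the name is removed. */
    unlink(path);
    return handle;
}
```

The returned handle is used with dlsym() and dlclose() as usual; on failure, dlerror() describes what went wrong. If the temporary directory is mounted noexec, the dlopen() call is likely to fail, so a different directory may be needed.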

夜还是长夜 2024-10-25 08:22:51

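We implemented a way to do this at Google. Unfortunately, upstream glibc did not see the need, so it was never accepted. The feature request with patches has stalled. It is known as dlopen_from_offset.

The dlopen_with_offset glibc code is available in the glibc google/grte* branches. But nobody should enjoy modifying their own glibc.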

烟若柳尘 2024-10-25 08:22:51

Here's how you can do it entirely in memory on Linux (no writing to /tmp/xxx), using a memory fd created with memfd_create:

user@system $ ./main < example-library.so
add(1, 2) = 3

// example-library.c
int add(int a, int b) { return a + b; }

#include <cstdio>
#include <dlfcn.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <vector>

// Compile and then invoke as:
// $ ./main < my-shared-lib.so
int main() {
  // Read the shared library contents from stdin
  std::vector<char> library_contents;
  char buffer[1024];
  ssize_t bytes_read;
  while ((bytes_read = read(STDIN_FILENO, buffer, sizeof(buffer))) > 0) {
    library_contents.insert(library_contents.end(), buffer,
                            buffer + bytes_read);
  }

  // Create a memory file descriptor using memfd_create
  int fd = memfd_create("shared_library", 0);
  if (fd == -1) {
    perror("memfd_create failed");
    return 1;
  }

  // Write the shared library contents to the file descriptor
  if (write(fd, library_contents.data(), library_contents.size()) !=
      static_cast<ssize_t>(library_contents.size())) {
    perror("write failed");
    return 1;
  }

  // Create a path to the file descriptor using /proc/self/fd
  // https://sourceware.org/bugzilla/show_bug.cgi?id=30100#c33
  char path[100]; // > 35 == strlen("/proc/self/fd/") + log10(pow(2, 64)) + 1
  snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);

  // Use dlopen to dynamically load the shared library
  void *handle = dlopen(path, RTLD_NOW);
  if (handle == NULL) {
    fprintf(stderr, "dlopen failed: %s\n", dlerror());
    return 1;
  }

  // Use the shared library...
  // Get a pointer to the function "int add(int, int)"
  int (*add)(int, int) =
      reinterpret_cast<int (*)(int, int)>(dlsym(handle, "add"));

  if (add == NULL) {
    fprintf(stderr, "dlsym failed: %s\n", dlerror());
    return 1;
  }

  // Call the function "int add(int, int)"
  printf("add(1, 2) = %d\n", add(1, 2));

  // Cleanup
  dlclose(handle);
  close(fd);
  return 0;
}

羁〃客ぐ 2024-10-25 08:22:51

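There is an inherent limitation to loading a solib from memory: the DT_NEEDED dependencies of a solib cannot refer to a memory buffer. This means, among other things, that you can't easily load a solib whose dependencies also live in memory buffers. I am afraid that unless the ELF spec is extended to allow DT_NEEDED to refer to objects other than file names, there will be no standard API for loading a solib from a memory buffer.

I think you need to use POSIX shm_open(), then mmap the shared memory, generate your solib there, and then use plain dlopen() via the /dev/shm mount point. This way the dependencies can be handled too: they can refer to regular files or to the /dev/shm objects that hold your generated solibs.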
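
A rough sketch of that shm_open() route, assuming Linux, where POSIX shared-memory objects show up as files under /dev/shm. The helper name dlopen_via_shm is hypothetical, and the blob is assumed to be a complete shared object.

```c
#include <dlfcn.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper. name must start with '/', as shm_open() expects,
   for example "/myplugin.so". The object is left in place so that other
   libraries can name it as a dependency; call shm_unlink(name) once it
   is no longer needed. Returns the dlopen() handle or NULL. */
void *dlopen_via_shm(const char *name, const void *blob, size_t len) {
    if (len == 0)
        return NULL;

    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0700);
    if (fd == -1)
        return NULL;

    void *handle = NULL;
    if (ftruncate(fd, (off_t)len) == 0) {
        void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd, 0);
        if (mem != MAP_FAILED) {
            /* Place the shared object image into the shared memory. */
            memcpy(mem, blob, len);
            munmap(mem, len);

            /* Assumption: the object is visible as /dev/shm<name>. */
            char path[256];
            int n = snprintf(path, sizeof path, "/dev/shm%s", name);
            if (n > 0 && (size_t)n < sizeof path)
                handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
        }
    }
    close(fd);
    return handle;
}
```

Two caveats apply to this sketch. Some systems and container runtimes mount /dev/shm with noexec, in which case dlopen() will refuse to map the object. On glibc versions before 2.34, shm_open() and dlopen() live in librt and libdl, so the program has to be linked with -lrt -ldl.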

违心° 2024-10-25 08:22:51

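You don't need to load code generated in memory, since it is already in memory!

However, you can, in a non-portable way, generate machine code in memory (provided it sits in a memory segment mmap-ed with the PROT_EXEC flag).

(In that case, no "linking" or relocation step is required, since you generate machine code with definitive absolute or relative addresses, in particular for calls to external functions.)

Some libraries exist that do this. On GNU/Linux under x86 or x86-64, I know of GNU Lightning (which quickly generates machine code that runs slowly), DotGNU LibJIT (which generates medium-quality code), and LLVM and GCCJIT (which can generate quite optimized code in memory, but take time to emit it). LuaJIT has a similar facility too. Since 2015, GCC 5 has included a gccjit library.

And of course, you can still generate C code in a file, fork a compiler to compile it into a shared object, and dlopen that shared object file. I'm doing that in GCC MELT, a domain-specific language to extend GCC. It works quite well in practice.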
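
The generate-C-then-compile route from the last paragraph can be sketched along these lines. This is not how GCC MELT does it; the helper name compile_and_load, the /tmp location, and the presence of a compiler called cc on PATH that accepts -shared and -fPIC are all assumptions.

```c
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: writes C source to a temporary directory, forks
   the system compiler to build a shared object from it, and dlopen()s
   the result. Returns the handle, or NULL on any failure. */
void *compile_and_load(const char *source) {
    char dir[] = "/tmp/jitc-XXXXXX";  /* assumed writable location */
    if (mkdtemp(dir) == NULL)
        return NULL;

    char src[64], lib[64];
    snprintf(src, sizeof src, "%s/gen.c", dir);
    snprintf(lib, sizeof lib, "%s/gen.so", dir);

    void *handle = NULL;
    FILE *f = fopen(src, "w");
    if (f != NULL) {
        int ok = fputs(source, f) >= 0;
        ok = (fclose(f) == 0) && ok;
        if (ok) {
            pid_t pid = fork();
            if (pid == 0) {
                /* Child: replace this process with the compiler. */
                execlp("cc", "cc", "-shared", "-fPIC",
                       "-o", lib, src, (char *)NULL);
                _exit(127);  /* only reached if exec failed */
            }
            int status = 0;
            if (pid > 0 && waitpid(pid, &status, 0) == pid &&
                WIFEXITED(status) && WEXITSTATUS(status) == 0)
                handle = dlopen(lib, RTLD_NOW | RTLD_LOCAL);
        }
    }

    /* The loaded object stays mapped after its files are removed. */
    unlink(src);
    unlink(lib);
    rmdir(dir);
    return handle;
}
```

A caller would pass source such as "int add(int a, int b) { return a + b; }" and then look up add with dlsym() on the returned handle. Compared with emitting machine code directly, this costs a compiler launch per module but yields ordinary, fully linked shared objects.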