Custom allocation and Boehm GC

Posted 2024-11-27 09:52:17

In my on-again-off-again compiler project, I've implemented closures as allocated memory with an executable prefix. So a closure is allocated like this:

c = make_closure(code_ptr, env_size, env_data);

c is a pointer to a block of allocated memory, which looks like this:

movl $closure_call, %eax
call *%eax
.align 4
; size of environment
; environment data
; pointer to closure code

closure_call is a helper function that looks at the address most recently placed on the stack and uses it to find the closure data and code pointer. Boehm GC is used for general memory management, and when the closure is no longer referenced it can be deallocated by the GC.
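
To make the layout concrete, here is a rough sketch of how such a block could be filled in for 32-bit x86. It only writes bytes into a buffer and does not execute anything. The function name `encode_closure`, its parameters, and the choice of an `int3` byte (0xCC) as alignment padding are assumptions for illustration, not the actual `make_closure` from the project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 5 bytes (movl) + 2 bytes (call) + 1 pad byte to reach 4-byte alignment. */
enum { CLOSURE_PREFIX_SIZE = 8 };

/* Store a 32-bit value in little-endian order, as x86 expects. */
static void put_le32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v & 0xFF);
    dst[1] = (uint8_t)((v >> 8) & 0xFF);
    dst[2] = (uint8_t)((v >> 16) & 0xFF);
    dst[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Hypothetical helper: writes prefix, env size, env data, and code pointer
   into buf. Returns the number of bytes written, or 0 if buf is too small. */
size_t encode_closure(uint8_t *buf, size_t buf_size,
                      uint32_t closure_call_addr, uint32_t code_ptr,
                      const void *env_data, uint32_t env_size)
{
    size_t env_padded = ((size_t)env_size + 3u) & ~(size_t)3u;
    size_t total = CLOSURE_PREFIX_SIZE + 4 + env_padded + 4;
    size_t off = 0;

    if (buf_size < total)
        return 0;

    buf[off++] = 0xB8;                      /* movl $imm32, %eax */
    put_le32(buf + off, closure_call_addr);
    off += 4;
    buf[off++] = 0xFF;                      /* call *%eax */
    buf[off++] = 0xD0;
    buf[off++] = 0xCC;                      /* .align 4 padding (assumed int3) */
    assert(off == CLOSURE_PREFIX_SIZE);

    put_le32(buf + off, env_size);          /* size of environment */
    off += 4;
    if (env_size > 0)
        memcpy(buf + off, env_data, env_size);  /* environment data */
    memset(buf + off + env_size, 0, env_padded - env_size);
    off += env_padded;

    put_le32(buf + off, code_ptr);          /* pointer to closure code */
    off += 4;
    return off;
}
```

The `call` pushes the address of the byte right after it (offset 7), so a helper like closure_call would round that address up to the next multiple of 4 to find the environment size field.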

Anyway, this allocated memory needs to be marked as executable; in fact, every page it spans gets marked. As closures are created and deallocated, more and more of the process's heap memory ends up executable.

For defensive programming reasons I'd prefer to minimise the amount of executable heap. My plan is to try to keep all closures together on the same page(s), and to allocate and deallocate executable pages as needed; i.e. to implement a custom allocator for closures. (This is easier if all closures are the same size; so the first step is moving the environment data into a separate non-executable allocation that can be managed normally. It also makes defensive programming sense.)
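
A minimal sketch of the fixed-size allocator idea, assuming a POSIX system with `mmap`. The names `closure_pool`, `pool_init`, `pool_alloc`, `pool_free`, `pool_destroy`, and the 16-byte `SLOT_SIZE` are all hypothetical. It manages a single page and takes the protection flags as a parameter, so the same code can be tried with a plain read/write mapping before `PROT_EXEC` is added.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#define SLOT_SIZE 16u   /* assumed fixed closure size, at least sizeof(void *) */

typedef struct closure_pool {
    uint8_t *page;       /* start of the mapped page */
    size_t   page_size;
    size_t   slot_count;
    size_t   live;       /* slots currently handed out */
    void    *free_list;  /* singly linked list threaded through free slots */
} closure_pool;

/* Map one page with the given protection (must include PROT_WRITE) and
   put every slot on the free list. Returns 0 on success, -1 on failure. */
int pool_init(closure_pool *p, int prot)
{
    long ps = sysconf(_SC_PAGESIZE);
    void *mem;
    size_t i;

    if (ps <= 0)
        return -1;
    mem = mmap(NULL, (size_t)ps, prot, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    p->page = mem;
    p->page_size = (size_t)ps;
    p->slot_count = p->page_size / SLOT_SIZE;
    p->live = 0;
    p->free_list = NULL;
    for (i = p->slot_count; i-- > 0; ) {
        void *slot = p->page + i * SLOT_SIZE;
        *(void **)slot = p->free_list;
        p->free_list = slot;
    }
    return 0;
}

/* Pop a slot from the free list, or return NULL if the page is full. */
void *pool_alloc(closure_pool *p)
{
    void *slot = p->free_list;
    if (slot == NULL)
        return NULL;
    p->free_list = *(void **)slot;
    p->live++;
    return slot;
}

/* Push a slot back. When p->live reaches 0 the page could be unmapped. */
void pool_free(closure_pool *p, void *slot)
{
    assert(p->live > 0);
    *(void **)slot = p->free_list;
    p->free_list = slot;
    p->live--;
}

void pool_destroy(closure_pool *p)
{
    munmap(p->page, p->page_size);
    p->page = NULL;
    p->free_list = NULL;
}
```

Passing `PROT_READ | PROT_WRITE | PROT_EXEC` to `pool_init` would give the executable page described above, though some hardened systems refuse writable and executable mappings. A fuller version would keep a list of such pages and release a page once its `live` count drops to zero.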

But the remaining issue is GC. Boehm already does this! What I want is to somehow tell Boehm about my custom allocations, and get Boehm to tell me when they're able to be GC'd, but to leave it up to me to deallocate them.

So my question is, are there hooks in Boehm that provide for custom allocations like this?

Comments (1)

小ぇ时光︴ 2024-12-04 09:52:17

You may be able to do what you want with a finalizer - Boehm GC would still deallocate it, but you would have an opportunity beforehand to memset the closure with breakpoint ops (0xCC on x86) and mark its page non-executable if possible.
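
As a rough illustration of the finalizer approach, the callback below has the shape Boehm expects for a finalization procedure (`void (*)(void *obj, void *client_data)`) and overwrites the closure with 0xCC. The name `poison_closure` and the convention of passing the closure size through `client_data` are assumptions. The registration call is shown only in a comment so the sketch compiles without the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical finalizer: fill the dying closure with x86 breakpoint
   opcodes so stale pointers into it trap instead of running old code.
   client_data carries the closure size in bytes (an assumed convention). */
void poison_closure(void *obj, void *client_data)
{
    size_t size = (size_t)(uintptr_t)client_data;

    assert(obj != NULL);
    memset(obj, 0xCC, size);
    /* A real version might also call mprotect() here to drop PROT_EXEC
       once every closure on the page is dead. */
}

/* With Boehm GC linked in, registration would look roughly like:
 *
 *     #include <gc.h>
 *     GC_register_finalizer(c, poison_closure,
 *                           (void *)(uintptr_t)closure_size, NULL, NULL);
 *
 * where c is the closure pointer and closure_size is its size in bytes.
 */
```

Boehm runs the finalizer once the object is found unreachable and reclaims the memory afterwards, so the poisoning happens before the block can be reused.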

However, finalizers have a performance cost, so they should not be used lightly. Boehm GC is based on a mark-sweep algorithm, which first identifies all chunks that should not be freed (mark.c), then frees everything else all at once (reclaim.c). In your case, it makes sense to modify the reclamation process to also fill all free space in your executable region with breakpoint ops, and to mark pages non-executable as they become completely empty. This avoids finalizers, at the expense of forking the library (I couldn't find any extensibility mechanism for this).

Finally, note that execution prevention is a defense-in-depth measure, and should not be your only security protection. Return-oriented programming can be used to execute arbitrary code using non-modifiable executable regions.
