Writing a custom malloc that stores information in the pointer
I have recently been reading about a family of automatic memory management techniques that rely on storing information in the pointer returned by the allocator, i.e. a few header bits used, for example, to differentiate between kinds of pointers or to store thread-related information (note that I'm not talking about limited-field reference counting here, only immutable information).
I'd like to toy with these techniques. To implement them, I need my allocator to return pointers with a specific shape. I suppose I could use the least significant bits, but this would require padding that looks extremely memory-consuming, so I believe I should use the most significant bits. However, I have no good idea how to do this. Is there a way for me to call malloc or malloc_create_zone or some related function and request a pointer that always starts with the given bits?
Thanks everyone!
Comments (1)
The amount of information you can actually store in a pointer is pretty limited (typically one or two bits per pointer), and every attempt to dereference the pointer first has to mask out the magic information. The technique is often called tagging, BTW.
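To make the masking step concrete, here is a minimal sketch of low-bit tagging. It assumes every tagged pointer is at least 8-byte aligned, which leaves three low bits free; the names tag_ptr, ptr_tag and untag_ptr are made up for illustration and do not come from any library.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: tagged pointers are at least 8-byte aligned,
   so the 3 lowest bits of the address are always zero. */
#define TAG_BITS 3u
#define TAG_MASK ((uintptr_t)((1u << TAG_BITS) - 1u))

/* Store a small tag in the low bits of an aligned pointer. */
static inline void *tag_ptr(void *p, unsigned tag) {
    uintptr_t bits = (uintptr_t)p;
    assert((bits & TAG_MASK) == 0);   /* pointer must be aligned */
    assert(tag <= TAG_MASK);          /* tag must fit in the free bits */
    return (void *)(bits | tag);
}

/* Read the tag back without touching the pointed-to object. */
static inline unsigned ptr_tag(const void *p) {
    return (unsigned)((uintptr_t)p & TAG_MASK);
}

/* Mask the tag out; only the result may be dereferenced. */
static inline void *untag_ptr(const void *p) {
    return (void *)((uintptr_t)p & ~TAG_MASK);
}
```

A tagged pointer must never be dereferenced directly: go through untag_ptr first, e.g. *(long *)untag_ptr(t).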
One advantage of this technique is that you can infer the type of the object directly from the pointer itself. You don't need to dereference it (say, in order to read a special type field or similar). Many language implementations using this scheme also have a special tag combination for "immediate" numbers and other small values, which can be represented directly in the "pointer". The disadvantage is that the amount of information that can be stored is pretty limited. Also, you have to be aware of the tagging on every access to the object and need to "untag" the pointer before actually using it.
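As a rough illustration of the "immediate" idea, the sketch below uses the lowest bit to distinguish small integers from real pointers. The value_t type and all function names are hypothetical, and the decoding relies on right shifts of negative signed integers being arithmetic, which is implementation-defined in C but holds on mainstream compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical value representation: one machine word that is either
   an immediate integer (low bit 1) or an aligned pointer (low bit 0). */
typedef uintptr_t value_t;

#define FIXNUM_TAG 1u

static inline int is_fixnum(value_t v) {
    return (v & 1u) == FIXNUM_TAG;
}

/* Encode a small integer directly in the word; one bit of range is lost. */
static inline value_t make_fixnum(intptr_t n) {
    return ((uintptr_t)n << 1) | FIXNUM_TAG;
}

/* Decode; assumes >> on a negative intptr_t is an arithmetic shift. */
static inline intptr_t fixnum_value(value_t v) {
    assert(is_fixnum(v));
    return (intptr_t)v >> 1;
}

/* Pointers are stored unchanged; their low bit must already be zero. */
static inline value_t make_pointer(void *p) {
    assert(((uintptr_t)p & 1u) == 0);
    return (uintptr_t)p;
}

static inline void *pointer_value(value_t v) {
    assert(!is_fixnum(v));
    return (void *)v;
}
```

With this layout the type check is a single bit test on the word itself, and no memory access is needed to tell an integer from an object.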
The use of the least significant bits for tagging stems from the observation that, on most platforms, all pointers to malloc'ed memory are actually aligned on a multi-byte boundary (usually 8 bytes), so the least significant bits are always zero.
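If you want to check this on your own platform, something like the following sketch may help. It derives the number of low bits you can expect to be zero from the alignment of max_align_t, and counts the zero bits of an actual pointer. Both helper names are made up, and the guarantee only applies to allocations at least as large as max_align_t; it does not hold for pointers into the middle of a block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* log2 of the strictest fundamental alignment: the number of low bits
   expected to be zero in a sufficiently large malloc result. */
static unsigned guaranteed_free_bits(void) {
    unsigned bits = 0;
    for (size_t a = _Alignof(max_align_t); a > 1; a >>= 1)
        bits++;
    return bits;
}

/* Count the zero low bits of one particular pointer. */
static unsigned low_zero_bits(const void *p) {
    uintptr_t v = (uintptr_t)p;
    unsigned n = 0;
    while (v != 0 && (v & 1u) == 0) {
        v >>= 1;
        n++;
    }
    return n;
}
```

Comparing low_zero_bits(malloc(64)) with guaranteed_free_bits() shows how many tag bits come for free, without any extra padding on top of what the allocator already does.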