Finding a safe magic number for in-memory data structures
I'm implementing a heap allocator (malloc), and I need to choose a magic number to check whether a given pointer points to a data structure I allocated. It seems obvious to me that no magic number can be considered completely safe (that is, one where a successful check guarantees that the pointer refers to one of my data structures), but maybe I've missed something. If someone can help and bring me the number of my dreams, I'd really appreciate it.
Thanks in advance.
It depends on what you're doing this for. If you're doing it to catch programming mistakes (e.g., you want to make sure you don't accidentally mix up my_malloc/my_free and malloc/free), then just pick a random value. Sure, it will sometimes fail to detect such a case, but that really doesn't matter. It shouldn't ever happen.
If correctness depends on this, then you really ought to do it another way: for example, by keeping track of which addresses you've allocated in a hash table or tree or, in special cases, a bitmap.
If you're actually implementing malloc/free (e.g., writing your own C library), keep in mind that freeing something that wasn't malloc'ed (except NULL) is undefined behavior according to the standard, so your code doesn't need to worry about what happens.
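The suggestion to track allocated addresses in a hash could look roughly like the sketch below. It is only an illustration under stated assumptions: the fixed table size, the mixing function, the TOMBSTONE sentinel, and the helper is_tracked are all made up for this example, and my_malloc/my_free simply wrap the system malloc/free rather than being a real allocator.

```c
#include <stdint.h>
#include <stdlib.h>

#define TABLE_SIZE 1024u          /* hypothetical fixed capacity, power of two */
#define TOMBSTONE ((void *)1)     /* marks a deleted slot; malloc never returns this */

static void *table[TABLE_SIZE];   /* NULL means the slot was never used */

/* Mix the pointer bits so that aligned addresses spread over the table. */
static size_t slot_for(const void *p) {
    uintptr_t x = (uintptr_t)p;
    x ^= x >> 16;
    x *= 0x9E3779B1u;
    x ^= x >> 13;
    return (size_t)(x & (TABLE_SIZE - 1));
}

/* Returns 1 if p was handed out by my_malloc and not yet freed. */
int is_tracked(const void *p) {
    if (p == NULL || p == TOMBSTONE) return 0;
    size_t i = slot_for(p);
    for (size_t n = 0; n < TABLE_SIZE; n++, i = (i + 1) & (TABLE_SIZE - 1)) {
        if (table[i] == NULL) return 0;   /* end of the probe chain */
        if (table[i] == p) return 1;
    }
    return 0;
}

void *my_malloc(size_t size) {
    void *p = malloc(size);
    if (p == NULL) return NULL;
    size_t i = slot_for(p);
    for (size_t n = 0; n < TABLE_SIZE; n++, i = (i + 1) & (TABLE_SIZE - 1)) {
        if (table[i] == NULL || table[i] == TOMBSTONE) {
            table[i] = p;                 /* record the address */
            return p;
        }
    }
    free(p);                              /* table full: refuse to hand out an untracked block */
    return NULL;
}

/* Returns 0 on success, -1 if p did not come from my_malloc. */
int my_free(void *p) {
    if (p == NULL) return 0;
    if (p == TOMBSTONE) return -1;
    size_t i = slot_for(p);
    for (size_t n = 0; n < TABLE_SIZE; n++, i = (i + 1) & (TABLE_SIZE - 1)) {
        if (table[i] == NULL) break;
        if (table[i] == p) {
            table[i] = TOMBSTONE;         /* keep probe chains intact */
            free(p);
            return 0;
        }
    }
    return -1;
}
```

Unlike a magic number, this check never reads memory near the pointer being tested, so it cannot be fooled by a value that happens to sit in front of a foreign block. The cost is extra memory and a lookup on every free.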
Rather than picking a single magic number, use a random number (preferably with at least one of the lower 8 bits set; you can force this by ORing in 1, for instance) or some constant of your choice, and then XOR it (^) with an address (e.g., the address you are checking). That approach will dramatically reduce the odds of an accidental collision.
For example, when you write the object header (or page header, depending on the kind of allocator you are writing), store MAGIC ^ addr. Now when you want to check whether addr is valid, just see whether value == (addr ^ MAGIC), with appropriate casts, of course. Note the parentheses: in C, == binds more tightly than ^.
By the way, before embarking on creating your own custom memory allocator, please read this paper from OOPSLA 2002: Reconsidering Custom Memory Allocation, by Berger, Zorn and McKinley.
http://www.cs.umass.edu/~emery/pubs/berger-oopsla2002.pdf
Abstract:
Programmers hoping to achieve performance improvements often use custom memory allocators. This in-depth study examines eight applications that use custom allocators. Surprisingly, for six of these applications, a state-of-the-art general-purpose allocator (the Lea allocator) performs as well as or better than the custom allocators. The two exceptions use regions, which deliver higher performance (improvements of up to 44%). Regions also reduce programmer burden and eliminate a source of memory leaks. However, we show that the inability of programmers to free individual objects within regions can lead to a substantial increase in memory consumption. Worse, this limitation precludes the use of regions for common programming idioms, reducing their usefulness.
We present a generalization of general-purpose and region-based allocators that we call reaps. Reaps are a combination of regions and heaps, providing a full range of region semantics with the addition of individual object deletion. We show that our implementation of reaps provides high performance, outperforming other allocators with region-like semantics. We then use a case study to demonstrate the space advantages and software engineering benefits of reaps in practice. Our results indicate that programmers needing fast regions should use reaps, and that most programmers considering custom allocators should instead use the Lea allocator.
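The MAGIC ^ addr scheme from this answer might be sketched as follows. Everything beyond the XOR idea itself is an assumption made for illustration: the MAGIC constant is arbitrary, the header layout and the names header_t and is_mine are hypothetical, and the blocks come from the system malloc rather than a real heap implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Arbitrary constant chosen for this sketch; the low bit is forced on. */
#define MAGIC ((uintptr_t)0x5bd1e995u | 1u)

/* Hypothetical header placed directly in front of each user block.
 * The alignment request keeps the user data suitably aligned. */
typedef struct {
    _Alignas(max_align_t) uintptr_t tag;  /* MAGIC ^ address of this header */
    size_t size;                          /* size requested by the caller */
} header_t;

void *my_malloc(size_t size) {
    if (size > SIZE_MAX - sizeof(header_t)) return NULL;
    header_t *hdr = malloc(sizeof(header_t) + size);
    if (hdr == NULL) return NULL;
    hdr->tag = MAGIC ^ (uintptr_t)hdr;    /* tag depends on where the header lives */
    hdr->size = size;
    return hdr + 1;                       /* user data starts right after the header */
}

/* Returns 1 if the header in front of ptr carries the expected tag.
 * Caution: this reads the bytes just before ptr, which is only safe
 * when that memory is readable. */
int is_mine(const void *ptr) {
    if (ptr == NULL) return 0;
    const header_t *hdr = (const header_t *)ptr - 1;
    return hdr->tag == (MAGIC ^ (uintptr_t)hdr);
}

/* Returns 0 on success, -1 if the tag does not match. */
int my_free(void *ptr) {
    if (ptr == NULL) return 0;
    if (!is_mine(ptr)) return -1;
    header_t *hdr = (header_t *)ptr - 1;
    hdr->tag = 0;                         /* a stale pointer no longer validates */
    free(hdr);
    return 0;
}
```

Because the expected tag differs for every address, a header copied to another location no longer validates. With the low bit of MAGIC set and headers aligned to more than one byte, the expected tag is also never zero, so zero-filled memory cannot pass the check. The check remains probabilistic: arbitrary bytes in front of a foreign pointer can still match by chance.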
I haven't done such a thing (I've worked with heaps, but I haven't implemented an allocator), and I'm not certain what you're trying to do, but maybe this is a case where you should use hashing.
Depending on what exactly you're doing, that means hashing the address of the memory chunk, the data it contains (which implies recalculating the hash every time you change something), or some kind of ID for the memory operation.
Again, I'm not certain what you're trying to achieve, so those are just my 2 cents.
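One possible reading of the hashing idea is a check value stored in each chunk header and computed over the header's address and contents. The sketch below assumes a hypothetical chunk_header layout and uses 32-bit FNV-1a purely as an example hash; the names header_seal and header_ok are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical chunk header: the check field covers the address and size. */
typedef struct {
    size_t size;
    uint32_t check;
} chunk_header;

/* 32-bit FNV-1a over a byte range, continuing from state h. */
static uint32_t fnv1a(const void *data, size_t len, uint32_t h) {
    const unsigned char *p = data;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;                   /* FNV prime */
    }
    return h;
}

/* Hash the header's own address together with its size field. */
static uint32_t header_hash(const chunk_header *hdr) {
    uintptr_t addr = (uintptr_t)hdr;
    uint32_t h = 2166136261u;             /* FNV offset basis */
    h = fnv1a(&addr, sizeof addr, h);
    h = fnv1a(&hdr->size, sizeof hdr->size, h);
    return h;
}

/* Call whenever the header is written or its size changes. */
void header_seal(chunk_header *hdr, size_t size) {
    hdr->size = size;
    hdr->check = header_hash(hdr);
}

/* Returns 1 if the stored check value matches the current contents. */
int header_ok(const chunk_header *hdr) {
    return hdr->check == header_hash(hdr);
}
```

As the answer notes, any change to the hashed data means the check value has to be recomputed, which is why the sketch funnels updates through header_seal. A mismatch can indicate either a foreign pointer or a corrupted header.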
TALLOC_MAGIC 0xe814ec70
This is from the file talloc.c in the talloc source code. Of course, you will have to look at why talloc chose this magic number, but it's a start.