0xDEADBEEF vs. NULL

Posted on 2024-11-05 13:26:10

In various codebases, I have seen memory allocated in debug builds being filled with NULL...

memset(ptr,NULL,size);

Or with 0xDEADBEEF...

memset(ptr,0xDEADBEEF,size);
  1. What are the advantages of using each one, and what is the generally preferred way to achieve this in C/C++?
  2. If a pointer were assigned a value of 0xDEADBEEF, couldn't it still dereference to valid data?


Comments (10)

兮子 2024-11-12 13:26:10
  1. Using either memset(ptr, NULL, size) or memset(ptr, 0xDEADBEEF, size) clearly indicates that the author did not understand what they were doing.

    Firstly, memset(ptr, NULL, size) will indeed zero-out a memory block in C and C++ if NULL is defined as an integral zero.

    However, using NULL to represent a zero value in this context is not acceptable practice. NULL is a macro introduced specifically for pointer contexts. The second parameter of memset is an integer, not a pointer. The proper way to zero out a memory block is memset(ptr, 0, size). Note: 0, not NULL. I'd say that even memset(ptr, '\0', size) looks better than memset(ptr, NULL, size).

    Moreover, the most recent (at the moment) C++ standard, C++11, allows NULL to be defined as nullptr. A nullptr value is not implicitly convertible to type int, which means that the above code is not guaranteed to compile in C++11 and later.

    In C (and your question is tagged C as well), the macro NULL can expand to (void *) 0. Even in C, (void *) 0 is not implicitly convertible to type int, which means that in the general case memset(ptr, NULL, size) is simply invalid code in C.

    Secondly, even though the second parameter of memset has type int, the function converts it to an unsigned char value. This means that only the lowest byte of the value is used to fill the destination memory block. For this reason memset(ptr, 0xDEADBEEF, size) will compile, but will not fill the target memory region with 0xDEADBEEF values, as the author of the code probably naively hoped. memset(ptr, 0xDEADBEEF, size) is equivalent to memset(ptr, 0xEF, size) (assuming 8-bit chars). While this is probably good enough to fill some memory region with intentional "garbage", things like memset(ptr, NULL, size) or memset(ptr, 0xDEADBEEF, size) still betray a major lack of professionalism on the author's part.

    Again, as other answers have already noted, the idea here is to fill the unused memory with a "garbage" value. Zero is certainly not a good idea in this case, since it is not "garbagy" enough. When using memset you are limited to one-byte values, like 0xAB or 0xEF. If this is good enough for your purposes, use memset. If you want a more expressive and unique garbage value, like 0xDEDABEEF or 0xBAADF00D, you won't be able to use memset with it. You'll have to write a dedicated function that can fill a memory region with a 4-byte pattern.

  2. A pointer in C and C++ cannot be assigned an arbitrary integer value (other than a Null Pointer Constant, i.e. zero). Such assignment can only be achieved by forcing the integral value into the pointer with an explicit cast. Formally speaking, the result of such a cast is implementation defined. The resultant value can certainly point to valid data.
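
A minimal sketch of the kind of dedicated fill function mentioned in point 1, assuming a 32-bit pattern. The name fill_pattern32 is hypothetical and does not come from the answer; it is only one of several ways such a helper could be written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the answer): fill `size` bytes at `dst`
 * with a repeating 4-byte pattern. The pattern bytes are written in the
 * host's in-memory order for uint32_t, so a 32-bit read at any offset
 * that is a multiple of 4 from `dst` yields `pattern` again. */
static void fill_pattern32(void *dst, uint32_t pattern, size_t size)
{
    unsigned char *p = dst;
    unsigned char bytes[sizeof pattern];
    size_t i;

    assert(dst != NULL || size == 0);
    memcpy(bytes, &pattern, sizeof pattern);

    /* Byte-wise copy: works for sizes that are not a multiple of 4
     * and does not require `dst` to be 4-byte aligned. */
    for (i = 0; i < size; ++i)
        p[i] = bytes[i % sizeof pattern];
}
```

A call such as fill_pattern32(buf, 0xDEADBEEFu, sizeof buf) would then do what memset(buf, 0xDEADBEEF, sizeof buf) only appears to do. The byte-wise loop favors simplicity over speed, which seems acceptable for a debug-only fill.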

ゞ记忆︶ㄣ 2024-11-12 13:26:10

Writing 0xDEADBEEF or another non-zero bit pattern is a good way to catch both write-after-delete and read-after-delete bugs.

1) Write after delete

By writing a specific pattern you can check whether a block that has already been deallocated was later overwritten by buggy code. In our debug memory manager we keep a free list of blocks, and before recycling a memory block we check that our custom pattern is still written all over the block. Of course it's sort of "late" by the time we discover the problem, but still much earlier than it would be discovered without the check.
We also have a special function that is called periodically, and can also be called on demand, that just walks the list of all freed memory blocks and checks their consistency, so we can call this function often when chasing a bug. Using 0x00000000 as the value wouldn't be as effective, because zero may be exactly the value that buggy code wants to write into the already deallocated block, e.g. when zeroing a field or setting a pointer to NULL (it is much less likely that the buggy code wants to write 0xDEADBEEF).
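
The check described above could look roughly like the following sketch. The one-byte fill value and the function names are assumptions made for illustration; the answer does not show its memory manager's actual code or patterns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed one-byte fill value for freed blocks (illustrative only). */
#define FREED_BYTE 0xEFu

/* Mark a block as freed by filling it with the pattern byte. */
static void poison_freed_block(void *block, size_t size)
{
    assert(block != NULL || size == 0);
    memset(block, FREED_BYTE, size);
}

/* Return the offset of the first byte that no longer holds the pattern,
 * or `size` if the whole block is still intact. */
static size_t first_corrupt_offset(const void *block, size_t size)
{
    const unsigned char *p = block;
    size_t i;

    for (i = 0; i < size; ++i)
        if (p[i] != FREED_BYTE)
            return i;
    return size;
}
```

A debug allocator along these lines would call poison_freed_block when a block goes onto the free list and first_corrupt_offset before handing it out again; a result smaller than the block size points at a write after delete.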

2) Read after delete

Leaving the content of a deallocated block untouched, or even writing just zeros, increases the chance that code reading a dead memory block will still find values that look reasonable and compatible with its invariants (e.g. a NULL pointer, since on many architectures NULL is just binary zeros, or the integer 0, the ASCII NUL char, or the double value 0.0).
If you write "strange" patterns like 0xDEADBEEF instead, most code that reads those bytes will probably find strange, unreasonable values (e.g. the integer -559038737 or a double with value -1.1885959257070704e+148), hopefully triggering some other self-consistency check or assertion.
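
As a rough illustration of those "unreasonable values", the sketch below reinterprets the pattern's bits as a signed integer and as a double. The helper names are hypothetical, and the double case assumes a 64-bit IEEE 754 double with the same byte order as 64-bit integers, which holds on mainstream platforms but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a signed 32-bit integer. */
static int32_t bits_as_int32(uint32_t bits)
{
    int32_t value;
    memcpy(&value, &bits, sizeof value);  /* copies the bytes, no pointer cast */
    return value;
}

/* Reinterpret a 64-bit pattern as a double
 * (assumes double is a 64-bit IEEE 754 type). */
static double bits_as_double(uint64_t bits)
{
    double value;
    assert(sizeof value == sizeof bits);
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

With these helpers, bits_as_int32(0xDEADBEEFu) gives -559038737, and bits_as_double(0xDEADBEEFDEADBEEFull) gives a negative value on the order of 1e148, in line with the numbers quoted above.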

Of course, there is nothing special about the bit pattern 0xDEADBEEF itself. We actually use different patterns for freed blocks, the before-block area, and the after-block area, and our memory manager also writes another (address-dependent) bit pattern to the content part of every memory block before handing it to the application (this helps find uses of uninitialized memory).

眼前雾蒙蒙 2024-11-12 13:26:10

I would definitely recommend 0xDEADBEEF. It clearly identifies uninitialized variables, and accesses to uninitialized pointers.

Because the value is odd, dereferencing a 0xdeadbeef pointer will definitely crash on the PowerPC architecture when loading a word, and will very likely crash on other architectures, since the address is likely to be outside the process's address space.

Zeroing out memory is a convenience since many structures/classes have member variables that use 0 as their initial value, but I would very much recommend initializing each member in the constructor rather than using the default memory fill. You will really want to be on top of whether or not you properly initialized your variables.

虫児飞 2024-11-12 13:26:10

http://en.wikipedia.org/wiki/Hexspeak

These "magic" numbers are a debugging aid for identifying bad pointers, uninitialized memory, etc. You want a value that is unlikely to occur during normal execution and that stands out when doing memory dumps or inspecting variables. Initializing to zero is less useful in this regard. I would guess that when you see people initialize to zero, it is because they need that value to be zero. A pointer with a value of 0xDEADBEEF could point to a valid memory location, so it's a bad idea to use it as an alternative to NULL.

江南烟雨〆相思醉 2024-11-12 13:26:10

One reason to null the buffer or set it to a special value is that you can easily tell in the debugger whether the buffer contents are valid.

Dereferencing a pointer with the value "0xDEADBEEF" is almost always dangerous (it will probably crash your program/system), because in most cases you have no idea what is stored there.

小猫一只 2024-11-12 13:26:10

DEADBEEF is an example of Hexspeak. With it, you as a programmer intentionally convey an error condition.

享受孤独 2024-11-12 13:26:10

I would personally recommend using NULL (or 0x0), as it represents null as expected and comes in handy in comparisons. Imagine you are using a char * and somewhere in between it ends up on DEADBEEF for some reason (don't know why); with NULL, at least your debugger will very handily tell you that it's 0x0.

栩栩如生 2024-11-12 13:26:10

I would go for NULL because it's much easier to zero out memory en masse than to go through later and set all the pointers to 0xDEADBEEF. In addition, there's nothing at all stopping 0xDEADBEEF from being a valid memory address on x86; admittedly, it would be unusual, but far from impossible. NULL is more reliable.

Ultimately, look: NULL is the language convention. 0xDEADBEEF just looks pretty and that's it. You gain nothing from it. Libraries check for NULL pointers; they don't check for 0xDEADBEEF pointers. In C++, the idea of the null pointer isn't even tied to a zero value, just indicated with the literal zero, and in C++0x there is a nullptr and a nullptr_t.

给我一枪 2024-11-12 13:26:10

Vote me down if this is too opinion-y for StackOverflow, but I think this whole discussion is a symptom of a glaring hole in the toolchain we use to make software.

Detecting uninitialized variables by initializing memory with "garbage-y" values detects only some kinds of errors in some kinds of data.

And detecting uninitialized variables in debug builds but not in release builds is like following safety procedures only when testing an aircraft and telling the flying public to be satisfied with "well, it tested OK".

WE NEED HARDWARE SUPPORT for detecting uninitialized variables. Something like an "invalid" bit that accompanies every addressable entity of memory (= a byte on most of our machines), which is set by the OS in every byte that VirtualAlloc() (et al., or the equivalents on other OSes) hands over to applications, which is automatically cleared when the byte is written to, and which causes an exception if the byte is read first.

Memory is cheap enough for this and processors are fast enough for this. This would end the reliance on "funny" patterns and keep us all honest to boot.

谎言月老 2024-11-12 13:26:10

Note that the second argument to memset is supposed to be a byte; that is, it is implicitly converted to an unsigned char. On most platforms 0xDEADBEEF would therefore be converted to 0xEF (and to something else on some odd platforms).
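
A small sketch that makes this truncation visible. The helper name is hypothetical, and the expected 0xEF result assumes 8-bit chars and the usual wrap-around conversion of 0xDEADBEEF to int.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fill `buf` with memset and return the byte value that was actually
 * stored. Every byte of the buffer ends up with that same value. */
static unsigned char fill_and_get_byte(unsigned char *buf, size_t size, int value)
{
    size_t i;

    assert(buf != NULL && size > 0);
    memset(buf, value, size);   /* `value` is converted to unsigned char */

    for (i = 1; i < size; ++i)
        assert(buf[i] == buf[0]);
    return buf[0];
}
```

On a typical platform, fill_and_get_byte(buf, sizeof buf, (int)0xDEADBEEFu) returns 0xEF: only the low byte of the pattern survives.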

Also note that the second argument is formally supposed to be an int, which NULL is not guaranteed to be.

Now for the advantages of doing this kind of initialization. First, of course, the behavior is more likely to be deterministic (even if we end up in undefined behavior this way, the behavior will in practice be consistent).

Deterministic behavior makes debugging easier: once you have found a bug, you "only" have to provide the same input and the fault will manifest itself.

When you select which value to use, you should pick one that will most likely result in bad behavior, so that the use of uninitialized data is more likely to produce an observable fault. This means you have to use some knowledge of the platform in question (although many of them behave quite similarly).

If the memory is used to hold pointers, then clearing the memory does mean that you get a NULL pointer, and normally dereferencing it results in a segmentation fault (which will be observed as a fault). However, if you use the memory in another way, for example as an arithmetic type, you will get 0, and for many applications that is not an unusual number.

If you instead use 0xDEADBEEF you will get a quite large integer, and when the data is interpreted as floating point it will also be a quite large number (IIRC). If it is interpreted as text, the string will be very long and contain non-ASCII characters, and if you use UTF-8 encoding it will likely be invalid. If it is used as a pointer, on some platforms it will fail the alignment requirements of some types, and on some platforms that region of memory might be unmapped anyway (note that on x86_64 the value of the pointer would be 0xDEADBEEFDEADBEEF, which is out of range for an address).

Note that while filling with 0xEF has pretty much the same properties, if you want to fill the memory with 0xDEADBEEF you need a custom function, since memset doesn't do the trick.
