What exactly is the danger in using magic debug values (such as 0xDEADBEEF) as literals?
It goes without saying that using hard-coded, hex literal pointers is a disaster:
int *i = 0xDEADBEEF;
// god knows if that location is available
However, what exactly is the danger in using hex literals as variable values?
int i = 0xDEADBEEF;
// what can go wrong?
If these values are indeed "dangerous" due to their use in various debugging scenarios, then this means that even if I do not use these literals, any program that during runtime happens to stumble upon one of these values might crash.
Anyone care to explain the real dangers of using hex literals?
Edit: just to clarify, I am not referring to the general use of constants in source code. I am specifically talking about debug-scenario issues that might come up due to the use of hex values, with the specific example of 0xDEADBEEF.
Comments (10)
There's no more danger in using a hex literal than any other kind of literal.
If your debugging session ends up executing data as code without you intending it to, you're in a world of pain anyway.
Of course, there's the normal "magic value" vs "well-named constant" code smell/cleanliness issue, but that's not really the sort of danger I think you're talking about.
With few exceptions, nothing is "constant".
We prefer to call them "slow variables" -- their value changes so slowly that we don't mind recompiling to change them.
However, we don't want to have many instances of 0x07 all through an application or a test script, where each instance has a different meaning.
We want to put a label on each constant that makes it totally unambiguous what it means.
What does "7" mean in the above statement? Is it the same thing as
Is that the same meaning of "7"?
Test Cases are a slightly different problem. We don't need extensive, careful management of each instance of a numeric literal. Instead, we need documentation.
We can -- to an extent -- explain where "7" comes from by including a tiny bit of a hint in the code.
A "constant" should be stated -- and named -- exactly once.
A "result" in a unit test isn't the same thing as a constant, and requires a little care in explaining where it came from.
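To make the distinction concrete, here is a minimal sketch in C. The names (DAYS_PER_WEEK, MAX_LOGIN_RETRIES, weeks_needed, may_retry) are hypothetical and chosen only for illustration. Each meaning of "7" is stated and named once, while the literal in the test carries a short comment saying where it came from.

```c
#include <assert.h>

/* Hypothetical names: each "7" gets one named definition with one meaning. */
enum { DAYS_PER_WEEK = 7 };
enum { MAX_LOGIN_RETRIES = 7 };

/* Whole weeks needed to cover a number of days, rounding up. */
int weeks_needed(int days)
{
    return (days + DAYS_PER_WEEK - 1) / DAYS_PER_WEEK;
}

/* A retry is allowed only while the attempt count stays below the limit. */
int may_retry(int attempts_so_far)
{
    return attempts_so_far < MAX_LOGIN_RETRIES;
}

/* In a test the literal is a result, so a comment explains its origin. */
void test_weeks_needed(void)
{
    assert(weeks_needed(43) == 7); /* 43 days = 6 full weeks + 1 day, rounded up */
}
```

If the retry limit later changes to 5, only MAX_LOGIN_RETRIES changes, and nobody has to work out which of the sevens in the code were retry limits.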
A hex literal is no different than a decimal literal like 1. Any special significance of a value is due to the context of a particular program.
I believe the concern raised in the IP address formatting question earlier today was not related to the use of hex literals in general, but the specific use of 0xDEADBEEF. At least, that's the way I read it.
There is a concern with using 0xDEADBEEF in particular, though in my opinion it is a small one. The problem is that many debuggers and runtime systems have already co-opted this particular value as a marker value to indicate unallocated heap, bad pointers on the stack, etc.
I don't recall off the top of my head just which debugging and runtime systems use this particular value, but I have seen it used this way several times over the years. If you are debugging in one of these environments, the existence of the 0xDEADBEEF constant in your code will be indistinguishable from the values in unallocated RAM or whatever, so at best you will not have as useful RAM dumps, and at worst you will get warnings from the debugger.
Anyhow, that's what I think the original commenter meant when he told you it was bad for "use in various debugging scenarios."
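A rough sketch of that ambiguity, assuming a debug helper that fills released memory with 0xDEADBEEF. The names FREED_FILL, debug_poison and looks_poisoned are hypothetical; real debuggers and runtimes pick their own fill patterns and detection logic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FREED_FILL 0xDEADBEEFu  /* assumed marker; real tools choose their own */

/* Hypothetical debug helper: overwrite a released block with the marker. */
void debug_poison(uint32_t *block, size_t words)
{
    for (size_t i = 0; i < words; i++)
        block[i] = FREED_FILL;
}

/* Heuristic a dump reader might apply: does this word look poisoned? */
int looks_poisoned(uint32_t word)
{
    return word == FREED_FILL;
}

/* Demonstration: a live value and poisoned memory are indistinguishable. */
void demo_ambiguity(void)
{
    uint32_t freed[2] = {1, 2};
    uint32_t live_value = 0xDEADBEEFu;   /* legitimate program data */

    debug_poison(freed, 2);
    assert(looks_poisoned(freed[0]));
    assert(looks_poisoned(live_value));  /* false positive */
}
```

The check cannot tell the live variable from the poisoned block, which is why a memory dump becomes less useful once the program itself stores the marker value.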
There's no reason why you shouldn't assign 0xdeadbeef to a variable.
But woe betide the programmer who tries to assign decimal 3735928559, or octal 33653337357, or worst of all: binary 11011110101011011011111011101111.
Big Endian or Little Endian?
One danger is when constants are assigned to an array or structure with different sized members; the endian-ness of the compiler or machine (including JVM vs CLR) will affect the ordering of the bytes.
This issue is true of non-constant values, too, of course.
Here's an, admittedly contrived, example. What is the value of buffer[0] after the last line?
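A sketch of the kind of situation meant here, using hypothetical helper names (store_native, store_big_endian) and a 4-byte buffer as assumptions. Copying the constant as it sits in memory gives a host-dependent buffer[0], while storing it byte by byte with shifts does not.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy a 32-bit constant into a byte buffer exactly as it sits in memory. */
void store_native(uint8_t buffer[4], uint32_t value)
{
    memcpy(buffer, &value, sizeof value);
}

/* Store the most significant byte first, independent of host byte order. */
void store_big_endian(uint8_t buffer[4], uint32_t value)
{
    buffer[0] = (uint8_t)(value >> 24);
    buffer[1] = (uint8_t)(value >> 16);
    buffer[2] = (uint8_t)(value >> 8);
    buffer[3] = (uint8_t)value;
}

void demo_byte_order(void)
{
    uint8_t buffer[4];

    store_native(buffer, 0xDEADBEEFu);
    /* 0xEF on a little-endian host, 0xDE on a big-endian host */
    assert(buffer[0] == 0xEF || buffer[0] == 0xDE);

    store_big_endian(buffer, 0xDEADBEEFu);
    assert(buffer[0] == 0xDE);  /* the same on every host */
}
```

The explicit shifts are the usual way to pin the byte order down when the buffer is going to a file or over a network.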
I don't see any problem with using it as a value. It's just a number after all.
There's no danger in using a hard-coded hex value for a pointer (like your first example) in the right context. In particular, when doing very low-level hardware development, this is the way you access memory-mapped registers. (Though it's best to give them names with a #define, for example.) But at the application level you shouldn't ever need to do an assignment like that.
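As a sketch of the naming advice, assuming a made-up GPIO peripheral: the register names, offsets and the commented-out address are all hypothetical. On real hardware the base would be a fixed address from the datasheet; here a static array stands in for the register block so the example can run on an ordinary host.

```c
#include <assert.h>
#include <stdint.h>

/* On a real target the base would be a literal address, for example:
 *   #define GPIO_BASE ((uintptr_t)0x40020000u)    (hypothetical address)
 * A static array stands in here so the sketch runs on a host machine. */
static volatile uint32_t fake_gpio_block[2];
#define GPIO_BASE ((uintptr_t)fake_gpio_block)

/* Named registers: the address arithmetic lives in exactly one place. */
#define GPIO_DIR (*(volatile uint32_t *)(GPIO_BASE + 0x00u))
#define GPIO_OUT (*(volatile uint32_t *)(GPIO_BASE + 0x04u))

/* Configure a pin as an output and drive it high. */
void led_on(unsigned pin)
{
    GPIO_DIR |= (uint32_t)1u << pin;
    GPIO_OUT |= (uint32_t)1u << pin;
}

void demo_led(void)
{
    led_on(3);
    assert(GPIO_OUT & ((uint32_t)1u << 3));
}
```

Code that calls led_on never sees a raw address, so moving to a different chip means editing the definitions at the top and nothing else.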
I use CAFEBABE
I haven't seen it used by any debuggers before.
The danger that I see is the same in both cases: you've created a flag value that has no immediate context. There's nothing about i in either case that will let me know, 100, 1000 or 10000 lines later, that there is a potentially critical flag value associated with it. What you've planted is a landmine bug: if I don't remember to check for it in every possible use, I could be faced with a terrible debugging problem. Every use of i will now have to look like this:
Repeat the above for all of the 7000 instances where you need to use i in your code.
Now, why is the above worse than this?
At a minimum, I can spot several critical issues:
In short, it's just as easy to write the code you really need as it is to create a mysterious magic value. The code-maintainer of the future (who quite likely will be you) will thank you.
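A minimal sketch of the contrast being drawn, with hypothetical names (NOT_READY, struct reading, doubled_sentinel, doubled_explicit); it is an illustration of the idea, not the code this answer originally showed. The first style hides a flag inside the value, the second carries the state explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Style 1: a magic value doubles as a flag, so every use must check it. */
#define NOT_READY 0xDEADBEEFu   /* hypothetical sentinel */

uint32_t doubled_sentinel(uint32_t i)
{
    if (i == NOT_READY)          /* forget this check once and the bug is planted */
        return 0;
    return i * 2u;
}

/* Style 2: the state travels explicitly, next to the value. */
struct reading {
    bool     ready;
    uint32_t value;
};

uint32_t doubled_explicit(struct reading r)
{
    return r.ready ? r.value * 2u : 0;
}

void demo_styles(void)
{
    /* A legitimate reading equal to the sentinel is silently dropped in style 1... */
    assert(doubled_sentinel(0xDEADBEEFu) == 0);

    /* ...but processed normally in style 2 (unsigned arithmetic wraps modulo 2^32). */
    struct reading r = { true, 0xDEADBEEFu };
    assert(doubled_explicit(r) == 0xBD5B7DDEu);
}
```

With the explicit field, the check is visible in the type, and no real value has to be sacrificed to serve as the flag.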