How many bits should I ignore when checking for NULL?
The following crashes with a seg-V:
// my code
int* ipt;
bool set = false;
void Set(int* i) {
    ASSERT(i);  // passes: i is small but not zero
    ipt = i;
    set = true;
}
int Get() {
    return set ? *ipt : 0;  // the crash happens here, far from the buggy caller
}
// code that I don't control.
struct S { int I; int J; };
int main() {
    S* ip = NULL;
    // code that, as a bug, forgets to set ip...
    Set(&ip->J);
    // gobs of code
    return Get();
}
This is because while i is not NULL, it still isn't valid. The same problem can happen if the calling code takes the address of an array index operation from a NULL pointer.
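To make the failure concrete, here is a minimal sketch of where the small non-null value comes from. It assumes a flat address space in which a null pointer is address zero; address_seen_by_set is a name made up for this illustration, and it uses offsetof so that no null pointer is actually touched.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t

// Same layout as the struct in the question.
struct S { int I; int J; };

// The address that &ip->J typically yields when ip is null on a
// flat-address platform: simply the byte offset of J inside S.
constexpr std::size_t address_seen_by_set() {
    return offsetof(S, J);
}
```

The value is nonzero, so a plain ASSERT(i) passes, yet it is far too small to be the address of a real object.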
One solution to this is to trim the low order bits:
void Set(int* i) {
    ASSERT(reinterpret_cast<size_t>(i) >> 10);
    ipt = i;
    set = true;
}
But how many bits should/can I get rid of?
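For reference, the shift-based test can be written as a small helper with the bit count as a parameter. This is only a sketch: near_null is a hypothetical name, and it assumes that converting a pointer to std::uintptr_t gives its numeric address, which holds on mainstream platforms but is implementation-defined.

```cpp
#include <cassert>
#include <cstdint>

// True when p lies in [0, 2^bits), the range the shift-based check rejects.
// bits must be smaller than the bit width of std::uintptr_t.
inline bool near_null(const void* p, unsigned bits) {
    return (reinterpret_cast<std::uintptr_t>(p) >> bits) == 0;
}
```

With bits = 10 this rejects the first 1 KiB of addresses; 12 bits would cover one 4 KiB page, if that happens to be the page size on the system in question.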
Edit: I'm not worried about undefined behavior, as I'll be aborting (but more cleanly than a seg-v) in that case anyway.
FWIW: this is a semi-hypothetical situation. The bug that caused me to think of this was fixed before I posted, but I've run into it before and am thinking of how to work with it in the future.
Things that can be assumed for the sake of argument:
- If Set is called with something that will seg-v, that's a bug
- Set may be called by code that isn't my job to fix. (E.g. I file a bug)
- Set may be called by code I'm trying to fix. (E.g. I'm adding sanity checks as part of my debugging work.)
- Get may be called in a way that provides no information about where Set was called. (I.e. allowing Get to seg-v isn't an effective way to debug anything.)
- The code needn't be portable or catch 100% of bad pointers. It need only work on my current system often enough to let me find where things are going wrong.
Comments (8)
There is no portable way to test for any invalid pointer except NULL. Evaluating &ip[3] gives undefined behaviour before you do anything with it; the only solution is to test for NULL before doing any arithmetic on the pointer.
If you don't need portability, and don't need to guarantee that you catch all errors, then on most mainstream platforms you could check whether the address is within the first page of memory; it's common to define NULL to be address zero, and to reserve the first page to trap most null pointer dereferences. On a POSIX platform, this would look something like a comparison of the address against the system page size.
But this isn't a complete solution. The only real solution is to fix whatever is abusing null pointers in the first place.
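A minimal sketch of the first-page check described above, for a POSIX system. The helper name in_first_page is made up for illustration, and the whole thing rests on the assumption that the first page is never mapped; sysconf(_SC_PAGESIZE) is the standard POSIX way to query the page size.

```cpp
#include <cassert>
#include <cstdint>
#include <unistd.h>   // sysconf, _SC_PAGESIZE (POSIX)

// True when p falls inside the first page of the address space.
inline bool in_first_page(const void* p) {
    const long page = sysconf(_SC_PAGESIZE);   // returns -1 on error
    if (page <= 0) return p == nullptr;        // fall back to a plain null check
    return reinterpret_cast<std::uintptr_t>(p) < static_cast<std::uintptr_t>(page);
}
```

In the question's Set, this would be used as ASSERT(!in_first_page(i)). It catches small offsets from a null pointer, but not large ones and not pointers that are invalid for any other reason.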
You shouldn't be doing pointer arithmetic (including array indexing) off of a null pointer at all.
And you should use 0, not NULL, in C++. NULL is a feature of C, still supported but not idiomatic in C++.
In regards to BCS's many comments and the edit: that changes the question from the rather naive one on the surface to a much deeper one. But it is not going to be easy, in a language as permissive as C++, to protect yourself against people doing stupid things before calling your code.
Trying to work around undefined behavior will always be very dependent on your platform, compiler, version, etc., if it is possible at all.
Common *nixes never map the first page of the address space, precisely to catch null pointer access, so you might get away with checking whether the pointer value is between 0 and 4096 (or whatever page size your system uses).
But don't do this. You can't guard against everything that can go wrong; focus instead on getting the code right. If someone passes you an invalid pointer, chances are there's something gravely wrong anyway that a pointer validation check can't fix.
Is there any way you can exert some influence to get that bad code corrected? There is no possible way this can turn out well. Legally, just creating an invalid pointer is undefined behavior.
If Set is always going to be passed a small offset from ip, and ip will always be initialized to NULL, you are probably going to be OK with what you are doing. Most modern systems do represent the null pointer as all bits zero, and most will do the natural thing. There is of course absolutely no guarantee that it will work on any given system with any given compiler and any given compiler options, and changing any of those might cause it to fail.
Since any use of bad pointers can cause program failure, you should consider what happens when the code triggers a memory violation.
Also, I don't know what your ASSERT macro does, but assert, in most implementations, is only activated in debug mode. If you want to push this piece of junk into production, or run in optimized mode, you might want to make sure it will still fail more gently.
If you don't mind a really bad hack, you can force a memory access with volatile (n.b. volatile is evil). According to the GCC docs, volatile accesses must be ordered across sequence points, so you can do something like this:
I don't think = is a sequence point, but the following might also work:
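A minimal sketch of the volatile idea, and not the answerer's original snippet: Set reads through the pointer as a volatile access, so a pointer into an unmapped page faults inside Set instead of later in Get. The name is_set replaces the question's set, and the sketch assumes the pointee is readable whenever the pointer is valid.

```cpp
#include <cassert>

int* ipt = nullptr;
bool is_set = false;

// Reads *i through a volatile lvalue so the compiler has to emit the load.
// A pointer into an unmapped page therefore faults here, at the call to Set.
void Set(int* i) {
    int probe = *static_cast<volatile int*>(i);
    (void)probe;   // the value itself is not needed
    ipt = i;
    is_set = true;
}

int Get() { return is_set ? *ipt : 0; }
```

This still ends in a seg-v, but the stack at the time of the crash now includes the caller of Set.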
I really wouldn't recommend trying to work around a bug in somebody else's code. If you're not running everything you write through a debugger while you're developing code, no amount of checks is going to help you catch all the problems. Get them to fix their code.
If you're not using a debugger, get a decent crash handler that dumps the callstack for each thread and as much additional information regarding the program state as possible. Try to figure out what could be going wrong from that.
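As a rough illustration of the crash-handler idea, here is a sketch that assumes glibc (or another libc that ships execinfo.h). It dumps only the stack of the crashing thread, not every thread, and backtrace is not formally async-signal-safe, so treat it as a debugging aid. The function names are made up for this example.

```cpp
#include <cassert>
#include <execinfo.h>   // backtrace, backtrace_symbols_fd (glibc extension)
#include <signal.h>     // sigaction, sigemptyset, SIGSEGV
#include <unistd.h>     // STDERR_FILENO, _exit

// Signal handler: write the current call stack to stderr, then exit.
extern "C" void dump_stack_and_exit(int sig) {
    void* frames[64];
    const int n = backtrace(frames, 64);
    backtrace_symbols_fd(frames, n, STDERR_FILENO);  // writes to the fd without calling malloc
    _exit(128 + sig);
}

// Installs the handler for SIGSEGV; returns true on success.
bool install_crash_handler() {
    struct sigaction sa = {};
    sa.sa_handler = dump_stack_and_exit;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}
```

Calling install_crash_handler() once at the start of main is enough; readable symbol names usually require linking with -rdynamic.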
Regularly running your code through static analysis tools can also help here.
Remember that it might not be someone forgetting to initialise a pointer; it could be someone else overwriting that pointer through a bad memory write from somewhere completely unrelated. There are tools which can help track down such things too.
Regarding the NULL vs 0 debate, #define NULL 0 is better for a couple of reasons:
1) You can more easily see when you're dealing with a pointer.
2) Using NULL offers no less or more safety than using 0. So why not make your code more readable?
3) When C++11 is finally released, #define NULL nullptr is a lot easier to change than all those zeros. (You could go the other way and #define nullptr 0 today I suppose, but that will probably cause problems in the future if you're developing cross platform code.)
And for the record, the C++ standard explicitly states that a null pointer constant is an integral constant expression rvalue of integer type that evaluates to zero. So please let's not have any more nonsense about null pointers not having to equal zero.
One reason, among many, you cannot do this in a portable fashion is that NULL is not guaranteed to be 0. It is only specified that null pointers will compare equal to 0. You may write a 0 (or the preprocessor macro "NULL") in your code, but the compiler knows that this 0 is in a pointer context so it generates the appropriate code to compare it to a null pointer, whatever the actual implementation of a null pointer is. See here and here for more information on that. Reinterpreting a NULL pointer as an integral type may cause it to have a true value instead of false.
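Whether a given platform stores a null pointer as all-zero bits can be checked directly. The sketch below uses a made-up helper name and inspects only int*; on mainstream platforms I would expect it to return true, but the standard does not promise that.

```cpp
#include <cassert>
#include <cstring>

// True when this platform stores a null int* as all-zero bytes.
inline bool null_is_all_bits_zero() {
    int* p = nullptr;
    unsigned char zeros[sizeof p] = {};
    return std::memcmp(&p, zeros, sizeof p) == 0;
}
```

If this returns false, a bit-shifting check on the reinterpreted pointer value would not even detect NULL itself.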
You'd have to consider your particular operating system and hardware architecture. If you're only interested in detecting pointers that are "close to null" then you could use ASSERT(i > pageSize), assuming that the first page is always write protected in your OS.
But ... the obvious question is: Why bother? The OS will detect the null in this case and SEGV as you pointed out, which is just as good as an ASSERT, isn't it?