Which C compilers have pointer subtraction underflows?
So, as I learned from Michael Burr's comments to this answer, the C standard doesn't support subtracting an integer from a pointer when the result would point before the first element of an array (which I suppose includes any allocated memory).
From section 6.5.6 of the combined C99 + TC1 + TC2 (pdf):
If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
I love pointer arithmetic, but this has never been something I've worried about before. I've always assumed that given:
int a[1];
int * b = a - 3;
int * c = b + 3;
then c == a.
So while I believe I've done that sort of thing before, and not gotten bitten, it must have been due to the kindness of the various compilers I've worked with - that they've gone above and beyond what the standards require to make pointer arithmetic work the way I thought it did.
So my question is, how common is that? Are there commonly used compilers that don't do that kindness for me? Is proper pointer arithmetic beyond the bounds of an array a de facto standard?
Comments (4)
MS-DOS FAR pointers had problems like this, which were usually covered over by "clever" use of the overlap of the segment register with the offset register in real mode. The effect there was that the 16-bit segment was shifted left 4 bits and added to the 16-bit offset, which gave a 20-bit physical address that could address 1MB, which was plenty because everyone knew that no one would ever need more than 640KB of RAM. ;-)
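To make the shift-and-add concrete, here is a minimal sketch in C. The function name `real_mode_phys` is made up for illustration, and masking the result to 20 bits is an assumption that models the original 8086, where addresses wrap at 1MB.

```c
#include <stdint.h>

/* Hypothetical helper: physical address for a real-mode segment:offset pair.
 * The segment is shifted left 4 bits and added to the offset.
 * Assumption: the result is masked to 20 bits, as on an 8086. */
uint32_t real_mode_phys(uint16_t segment, uint16_t offset)
{
    uint32_t linear = ((uint32_t)segment << 4) + (uint32_t)offset;
    return linear & 0xFFFFFu;
}
```

Because segments overlap every 16 bytes, many segment:offset pairs name the same byte: `real_mode_phys(0x1234, 0x0010)` and `real_mode_phys(0x1235, 0x0000)` both give `0x12350`. That overlap is what the "clever" code relied on, and it is what stops working once segments become descriptor indexes.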
In protected mode, the segment register was actually an index into a table of memory descriptors. A typical DOS-extender runtime would usually arrange things so that many segments could be treated just like they would have been in real mode, which made porting code from real mode easy. But it had some defects. Primarily, the segment before an allocation was not part of the allocation, and so its descriptor might not even be valid.
On the 80286 in protected mode, just loading a segment register with a value that would cause an invalid descriptor to load would cause an exception, whether or not the descriptor was actually used to refer to memory.
A similar issue potentially occurs at one byte past the allocation. The last ++ on the pointer might have carried over to the segment register, causing it to load a new descriptor. In this case, it is reasonable to expect that the memory allocator could arrange for one safe descriptor past the end of the allocated range, but it would be unreasonable to expect it to arrange for any more than that.
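The asymmetry between "one past the end" and "one before the start" shows up in ordinary loops. Below is a small sketch (the function name `reverse_copy` is just an illustration) of walking an array backwards while only ever forming pointers in the range the standard allows. The tempting version, `for (p = src + n - 1; p >= src; --p)`, ends by computing `src - 1`, which is exactly the case that could fault on a segmented machine.

```c
#include <stddef.h>

/* Copies src[0..n) into dst in reverse order.
 * Only pointers from src up to src + n (one past the end) are ever formed. */
void reverse_copy(int *dst, const int *src, size_t n)
{
    const int *p = src + n;   /* one past the end: fine to form, not to dereference */
    while (p != src) {
        --p;                  /* step back before use, so p never drops below src */
        *dst++ = *p;
    }
}
```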
This is not "implementation-defined" by the Standard; it is "undefined" by the Standard. That means you can't count on a compiler supporting it, and you can't say, "well, this code is safe on compiler X". By invoking undefined behavior, your program is undefined.
The practical answer isn't "how (where, when, on what compiler) can I get away with this"; the practical answer is "don't do this".
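One way to follow "don't do this" and still keep the convenience is to do the arithmetic on integer offsets and only turn the result into a pointer once it is known to be in range. The sketch below is one possible shape for that; `offset_checked` and its parameters are hypothetical names, not anything from the standard library.

```c
#include <stddef.h>

/* Hypothetical helper: returns base + index + delta if that position lies in
 * [0, len] (len itself is the one-past-the-end position), otherwise NULL.
 * All range checks are done on integers, so no out-of-range pointer is formed. */
int *offset_checked(int *base, size_t len, ptrdiff_t index, ptrdiff_t delta)
{
    if (base == NULL || index < 0 || (size_t)index > len)
        return NULL;
    if (delta < 0 && delta < -index)                          /* would land before base[0] */
        return NULL;
    if (delta > 0 && (size_t)delta > len - (size_t)index)     /* would pass base + len */
        return NULL;
    return base + (index + delta);
}
```

With `int a[1];`, the question's `a - 3` becomes `offset_checked(a, 1, 0, -3)`, which returns NULL instead of invoking undefined behavior. Keeping the `-3` as a plain integer and adding `3` back before converting gives you `a` again, on every compiler.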
Another reason is that there are optional conservative garbage collectors (like the Boehm-Demers-Weiser GC) that assume a pointer always points inside the allocated range; if it does not, they are allowed to free the memory at any time.
There is one popular, commercial-quality library in real use that does break this assumption: the Judy Trees library from HP, which uses pointer algorithms to implement a very complex hash structure.
ZETA-C for the TI Explorer; pointers are implemented as arrays and indexes or displaced arrays, IIRC, so your example probably wouldn't work. Start from zcprim>pointer-subtract in zcprim.lisp to figure out what the behavior would be. No idea whether this was correct per the standard, but I get the impression that it was.