Accessing an element beyond the end of an array in C
I've been reading K&R's book on C, and found that pointer arithmetic in C allows access to one element beyond the end of an array. I know C lets you do almost anything with memory, but I just don't understand: what is the purpose of this peculiarity?
Comments (4)
C doesn't allow access to memory beyond the end of the array. It does, however, allow a pointer to point at one element beyond the end of the array. The distinction is important.
Thus, forming a pointer `end` that points one past the last element of an array is OK. (Doing `*end` would be an error.)
And that shows the reason why this feature is useful: a pointer pointing at the (non-existent) element after the end of the array is useful for comparisons, such as in loops.
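The snippet below is a minimal sketch of that idea, not the answerer's original code. The names `find_int` and `demo_index_of` are hypothetical. It shows the one-past-the-end pointer used purely as a comparison value: it marks where the search stops and doubles as the "not found" result, and it is never dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: search the half-open range [begin, end) for `wanted`.
 * Returns a pointer to the first match, or `end` if there is none.
 * `end` is only ever compared against, never dereferenced. */
const int *find_int(const int *begin, const int *end, int wanted)
{
    assert(begin <= end);
    for (const int *p = begin; p != end; ++p) {
        if (*p == wanted) {
            return p;
        }
    }
    return end;
}

/* Hypothetical demo: index of `wanted` in a fixed array, or -1 if absent. */
int demo_index_of(int wanted)
{
    static const int values[5] = {10, 20, 30, 40, 50};
    const int *end = values + 5;   /* OK: one past the last element */
    /* int bad = *end; */          /* NOT OK: undefined behavior */

    const int *hit = find_int(values, end, wanted);
    return (hit == end) ? -1 : (int)(hit - values);
}
```

Calling `demo_index_of(30)` walks the array until it finds the value, while a value that is not present makes `find_int` return `end`, which the caller detects with a plain pointer comparison.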
Technically speaking, that is everything the C standard allows. In practice, however, a C implementation (compiler and runtime) does not check whether you access memory beyond the end of the array, whether by one element or more. Doing so would require bounds checking, which would slow down program execution. The kinds of programs C is best suited for (systems programming, general-purpose libraries) tend to benefit more from the speed than from the security and safety that bounds checking would give.
That means C is perhaps not a good tool for general purpose application programming.
Often it is useful to denote the "end" position, which is one past the actual allocation, so that a loop can walk a pointer from the start of the array and stop when it reaches that position.
The C standard explicitly says that this one-past-the-end address is a valid pointer value, but dereferencing it is still undefined behavior.
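As a rough sketch of the kind of loop meant here (the function name `sum_ints` and its parameters are assumptions for illustration, not the original code), the end pointer is computed once and then used only in the loop condition:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sum `count` ints starting at `data`. */
long sum_ints(const int *data, size_t count)
{
    assert(data != NULL);
    const int *end = data + count;  /* one past the last element: valid to form */
    long total = 0;

    /* `end` appears only in the comparison; *end is never read. */
    for (const int *p = data; p < end; ++p) {
        total += *p;
    }
    return total;
}
```

With `count` equal to zero, `end` equals `data` and the loop body never runs, so the empty case needs no special handling.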
Why does it have this guarantee? Let's say you had a machine with 2^16 bytes of memory, addresses 0000-FFFF, and 16-bit pointers. Say you created a 16-byte array. Could the memory be allocated at FFF0?
There are 16 contiguous bytes free there, but the one-past-the-end pointer would be FFF0 + 16, which wraps to 0000 because of the pointer size. A loop condition that checks whether the current pointer is still less than the end pointer would then compare FFF0 against 0000 and be false from the very first test.
Instead of iterating over the array, the loop would do nothing. This would break a lot of code, so the C standard says that such an allocation isn't permissible.
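The wrap-around can be illustrated without real pointers. The sketch below is an assumption-laden model, not how any particular machine works: it uses `uint16_t` to stand in for a 16-bit address, and the hypothetical function `count_iterations` counts how many times a `p < end` loop would run.

```c
#include <assert.h>
#include <stdint.h>

/* Model a 16-bit address with uint16_t (an illustration only; no real
 * pointers are involved). Returns how many times the loop
 *     for (p = start; p < end; ++p)
 * would run when end = start + size is computed in 16 bits. */
unsigned count_iterations(uint16_t start, uint16_t size)
{
    uint16_t end = (uint16_t)(start + size);  /* wraps modulo 2^16 */
    unsigned iterations = 0;

    for (uint16_t p = start; p < end; ++p) {
        ++iterations;
    }
    return iterations;
}
```

For a 16-byte block at FFE0 the end address is FFF0 and the loop runs 16 times. For a block at FFF0 the end address wraps to 0000, the condition is false immediately, and the loop runs zero times, which is the failure the guarantee rules out.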
If you read or write beyond allocated memory, the C standard says it is "undefined behaviour".
That means just about anything could happen: maybe now, maybe in a week, maybe in five years' time, or maybe never, and you get away with it.
My boss had a couple of maxims:
"There is no such thing as a correct C program, just one that hasn't gone wrong yet"
"The only sensible thing you can say about memory corruption, is nothing."
He was always right.
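You can go well beyond one past the end of the array. For example, reading and printing past the end of a word string will print garbage after the end of the word: whatever was sitting in memory beforehand. The compiler will not stop you, but the behavior is undefined.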