Code example with a buffer overflow (gets). Why does it not behave as expected?
This is an extract from a C program that should demonstrate a buffer overflow.
void foo()
{
    char arr[8];
    printf(" enter bla bla bla");
    gets(arr);
    printf(" you entered %s\n", arr);
}
The question was "What is the maximum number of input characters a user can enter without causing a buffer overflow?"
My initial answer was 8, because the char array is 8 bytes long.
Although I was pretty certain my answer was correct, I tried a larger number of characters and found that the most I can enter before getting a segmentation fault is 11. (I'm running this on Ubuntu in VirtualBox.)
So my question is: why is it possible to enter 11 chars into that 8-byte array?
Comments (5)
Your characters are actually exceeding the bounds of the defined array, leading to undefined results. You don't see the effect until you overwrite some memory that is being used for something else.
The language and runtime aren't doing anything to prevent you from overflowing the buffer, which is precisely why these bugs are so bad and sometimes hard to track down.
For these reasons, functions like gets are deprecated (C11 removed gets from the standard library entirely) in favor of safer functions such as fgets or getline, which take the size of the buffer where they will store data. See: http://crasseux.com/books/ctutorial/gets.html
Also, you can only reliably store 7 characters because you need the 8th for a null terminator.
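As a rough illustration of the bounded read suggested above, here is a minimal sketch built on fgets. The helper name read_line_bounded and its return convention are assumptions made up for this example; they are not part of the original program.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original post).
 * Reads one line from `in` into `buf`, storing at most size-1 characters
 * plus the terminating '\0'. The trailing newline is stripped; if the line
 * was too long, the remainder is read and discarded.
 * Returns the number of characters stored, or -1 on EOF or read error. */
int read_line_bounded(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);

    if (fgets(buf, (int)size, in) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[--len] = '\0';            /* whole line fit: drop the newline */
    } else {
        int c;                        /* line was truncated: skip the rest */
        while ((c = getc(in)) != EOF && c != '\n')
            ;
    }
    return (int)len;
}
```

In foo(), the gets(arr) call could then become read_line_bounded(stdin, arr, sizeof arr), which stores at most 7 characters plus the terminator no matter how much the user types.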
Probably because of alignment and/or padding. There might be some "spare" memory there, that is not actually used so when you overwrite it, nothing breaks. That doesn't mean it's correct or that it works, just that it doesn't fail right now, for you, on that machine using that compiler, chair, hair color and so on.
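To make the idea of "spare" bytes concrete, here is a small sketch that measures padding inside a struct. A struct is used only because its padding can be measured portably; the stack frame in the question is laid out by the compiler in a similar spirit, but C does not specify that layout. The names demo, NAME_LEN, and padding_after_name are hypothetical.

```c
#include <stddef.h>

enum { NAME_LEN = 5 };

/* Hypothetical struct, used only to illustrate padding. */
struct demo {
    char name[NAME_LEN];  /* occupies offsets 0 .. NAME_LEN-1 */
    int  count;           /* must start at an offset suitably aligned for int */
};

/* Number of unused padding bytes the compiler inserted between
 * name and count. The value is implementation-defined. */
size_t padding_after_name(void)
{
    return offsetof(struct demo, count) - NAME_LEN;
}
```

Writing past name into that padding is still undefined behavior; it merely tends to go unnoticed because nothing reads those bytes.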
11 + 1 for the zero terminator = 12 characters. In other words, the crash occurs when gets() writes 13 characters into arr[8].
You haven't posted a precise stack trace, but in my experience it should have crashed after foo() returned.
The stack frame (for void foo() + gets()) would look like (*):
Of all that information, the most important pieces are the return address and the saved stack pointer. Writing the 13th byte has likely corrupted the saved stack pointer of foo() in your case. The following printf() call would very likely still succeed, because the stack pointer in use is still valid (it was last changed on return from gets()). But returning from foo() restores foo()'s saved stack pointer (now corrupt), and from then on anything in the calling function that accesses the stack goes to a bad address.
In my experience this is the likeliest scenario. Once the stack is corrupt, it is really hard to say for sure what will happen.
(*) For precise details of how a stack frame is constructed, look up the ABI (Application Binary Interface) for your architecture: for example, the IA-32 ABI for Intel i386 or the AMD64 ABI for AMD64.
There just happened to be enough free memory for you to store the extra data. You should never rely on that; always keep data within the bounds of the array.
The number of characters actually permitted in your example is 7 "standard" characters plus the terminating null character (8 bytes total).
In C there is nothing stopping you from going past the end of an array. When you read the string into memory it will fill up your array and then keep filling memory until something actually stops it. In larger programs this can mean overwriting other variables in memory, which leads to bugs that are very difficult to track down. On a separate note, you may want to think about how C finds the end of a string when determining how big a string will fit in a memory location.
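A short sketch of that last point: C marks the end of a string with a '\0' byte, so a string of n visible characters needs n + 1 bytes of storage. The helper name fits_in_buffer is hypothetical and exists only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if string s, including its terminating '\0',
 * fits in a buffer of buf_size bytes. strlen counts the characters before
 * the terminator, so the string needs strlen(s) + 1 bytes in total. */
bool fits_in_buffer(const char *s, size_t buf_size)
{
    return strlen(s) < buf_size;
}
```

For the 8-byte arr in the question, fits_in_buffer("abcdefg", 8) is true, while fits_in_buffer("abcdefgh", 8) is false because the terminator would land in a ninth byte.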