Code example with a buffer overflow (gets() function). Why doesn't it behave as expected?

Posted 2024-09-05 04:56:21

This is an extract from a C program that should demonstrate a buffer overflow.

void foo()
{
  char arr[8];
  printf(" enter bla bla bla");
  gets(arr);
  printf(" you entered %s\n", arr);
}

The question was: "What is the maximum number of characters a user can enter without causing a buffer overflow?"

My initial answer was 8, because the char array is 8 bytes long.
Although I was pretty certain my answer was correct, I tried a larger number of chars and found that the limit I can enter before I get a segmentation fault is 11. (I'm running this on a VirtualBox Ubuntu.)

So my question is: Why is it possible to enter 11 chars into that 8-byte array?

Comments (5)

单身狗的梦 2024-09-12 04:56:21

Your characters are actually exceeding the bounds of the array, which is undefined behavior. You don't see any effect until you overwrite memory that is being used for something else.

The language and runtime do nothing to prevent you from overflowing the buffer, which is precisely why these bugs are so bad and sometimes hard to track down.

For these reasons, gets has been deprecated and was removed from the standard in C11. Use a safer function instead: fgets takes the size of the array where it will store data, and getline (POSIX) allocates a buffer of the right size itself.

See: http://crasseux.com/books/ctutorial/gets.html

Also, you can only reliably store 7 characters because you need the 8th byte for the null terminator.
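
As a rough sketch of the bounded alternative, the function below reads a line with fgets so that at most 7 characters plus the terminator ever land in the 8-byte array. The names read_line_bounded and foo_safe are made up for this illustration and do not come from the original program; discarding the excess input is one possible policy, not the only one.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into buf, whose capacity is
 * `size` bytes including the terminating '\0'. Returns the number of
 * characters stored, or -1 on EOF or error. Characters that do not fit are
 * discarded instead of being written past the end of buf. */
int read_line_bounded(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[--len] = '\0';              /* strip the newline that fgets keeps */
    } else {
        int c;                          /* line too long: drain the rest of it */
        while ((c = getc(in)) != '\n' && c != EOF)
            ;
    }
    return (int)len;
}

/* Hypothetical rewrite of foo() from the question. */
void foo_safe(void)
{
    char arr[8];                        /* room for 7 characters plus '\0' */
    printf(" enter bla bla bla");
    if (read_line_bounded(stdin, arr, sizeof arr) >= 0)
        printf(" you entered %s\n", arr);
}
```

With this version, typing 11 characters stores only the first 7 and drops the rest, so nothing outside arr is touched.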

爱情眠于流年 2024-09-12 04:56:21

Probably because of alignment and/or padding. There might be some "spare" memory there, that is not actually used so when you overwrite it, nothing breaks. That doesn't mean it's correct or that it works, just that it doesn't fail right now, for you, on that machine using that compiler, chair, hair color and so on.
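
The layout of a real stack frame cannot be inspected portably, but the same kind of "spare" bytes can be measured in a struct, so the sketch below uses one as an analogy only. The struct frame_like, the size ARR_LEN and the function slack_after_arr are all invented for this illustration.

```c
#include <stddef.h>

#define ARR_LEN 5   /* hypothetical size, chosen so that padding is likely */

/* Hypothetical stand-in for "a small buffer followed by something else". */
struct frame_like {
    char arr[ARR_LEN];
    int  next;      /* must start at an address suitably aligned for int */
};

/* Bytes between the end of arr and the start of next that hold no data. */
size_t slack_after_arr(void)
{
    return offsetof(struct frame_like, next) - ARR_LEN;
}
```

On many platforms, where int is 4 bytes wide and 4-byte aligned, this reports 3 unused bytes. Writing into them would still be out of bounds for arr; it would just not damage anything visible, which is the situation this answer describes.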

盗心人 2024-09-12 04:56:21

"(I'm running this on a VirtualBox Ubuntu) So my question is: Why is it possible to enter 11 chars into that 8-byte array?"

11 + 1 for the zero terminator = 12 bytes. In other words, the crash occurs when gets() writes 13 bytes into arr[8].
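
The arithmetic can be spelled out in a few lines. This is only a sketch of the counting, with hypothetical function names; it says nothing about where the extra bytes land on a particular machine.

```c
#include <stddef.h>

/* gets() stores every input character plus one terminating '\0'. */
size_t bytes_written_by_gets(size_t input_chars)
{
    return input_chars + 1;
}

/* Number of bytes written past the end of a buffer of `capacity` bytes. */
size_t bytes_past_end(size_t input_chars, size_t capacity)
{
    size_t written = bytes_written_by_gets(input_chars);
    return written > capacity ? written - capacity : 0;
}
```

For the 8-byte array, 7 characters write nothing past the end, 8 characters already write 1 byte too many, 11 characters write 4 bytes too many (without a visible crash in the asker's setup), and 12 characters write 5.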

You haven't posted a precise stack trace, but in my experience the crash should occur after foo() returns.

The stack (for void foo() plus gets()) would look like this (*):

  • <lower memory address>
  • gets() local variables
  • saved stack pointer at the moment of the gets() call (stored by the so-called "prologue")
  • return address, points to foo()
  • foo() local variables (your char arr[8])
  • saved stack pointer at the moment of the foo() call
  • return address, points to caller of foo()
  • <higher memory address>

Of all this information, the most important pieces are the return address and the saved stack pointer. Writing the 13th byte in your case has likely corrupted the saved stack pointer of foo(). The following printf() call would most likely succeed, because the stack pointer is still valid (it was last changed by returning from gets()). But returning from foo() restores foo()'s saved stack pointer (now corrupt), and then any stack access from inside the calling function goes to a bad address.

In my experience this is the likeliest scenario. When the stack is corrupt, it is really hard to tell for sure what will happen.

(*) For precise details of how a stack frame is constructed, look up the ABI (Application Binary Interface) for your architecture: for example, the IA-32 ABI for Intel i386 or the AMD64 ABI for AMD64.

梦一生花开无言 2024-09-12 04:56:21

There just happened to be enough free memory for you to store the extra data. You should never rely on that; always keep data within the bounds of the array.

The number of characters actually permitted in your example is 7 "standard" characters plus the terminating null character (8 total).

夜访吸血鬼 2024-09-12 04:56:21

In C there is nothing stopping you from going past the end of an array. When you read the string into memory, it fills up your array and then keeps filling memory until something actually stops it. In larger programs this can mean overwriting other variables in memory, which leads to bugs that are very difficult to track down. On a separate note, you may want to think about how C finds the end of a string when determining how big a string will fit in a memory location.
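
To make that last point concrete: C marks the end of a string with a '\0' byte, so a string of n characters occupies n + 1 bytes. The check below is a small sketch; the name fits_in_buffer is invented for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the string s, including its terminating '\0', fits in a buffer
 * of `capacity` bytes. strlen() counts characters up to, but not
 * including, the terminator. */
bool fits_in_buffer(const char *s, size_t capacity)
{
    return strlen(s) + 1 <= capacity;
}
```

For the 8-byte array in the question, "1234567" fits and "12345678" does not, because the eighth byte is needed for the terminator.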
