奇怪的堆栈溢出?

发布于 2024-12-27 18:01:35 字数 1906 浏览 1 评论 0原文

我遇到了一个奇怪的情况,将一个指针传递给一个结构体,该结构体在 struct{} 定义中定义了一个非常大的数组,一个大小约为 34MB 的浮点数组。简而言之,伪代码如下所示:

typedef config_t{
  ...
  float values[64000][64];
} CONFIG;


int32_t Create_Structures(CONFIG **the_config)
{
  CONFIG  *local_config;
  int32_t number_nodes;

  number_nodes = Find_Nodes();

  local_config = (CONFIG *)calloc(number_nodes,sizeof(CONFIG));
  *the_config = local_config;
  return(number_nodes);
}


int32_t Read_Config_File(CONFIG *the_config)
{
    /* do init work here */
    return(SUCCESS);
}


main()
{
    CONFIG *the_config;
    int32_t number_nodes,rc;

    number_nodes = Create_Structures(&the_config);

    rc = Read_Config_File(the_config);
    ...
    exit(0);
}

代码编译得很好,但是当我尝试运行它时,我会在 Read_Config_File() 下面的 { 处得到一个 SIGSEGV。

(gdb) run
...
Program received signal SIGSEGV, Segmentation fault.
0x0000000000407d0a in Read_Config_File (the_config=Cannot access memory at address 0x7ffffdf45428
) at ../src/config_parsing.c:763
763 {
(gdb) bt
#0  0x0000000000407d0a in Read_Config_File (the_config=Cannot access memory at address 0x7ffffdf45428
) at ../src/config_parsing.c:763
#1  0x00000000004068d2 in main (argc=1, argv=0x7fffffffe448) at ../src/main.c:148

我一直用较小的数组做过这种事情。奇怪的是,0x7fffffffe448 - 0x7ffffdf45428 = 0x20B8EF8,或者大约是我的浮点数组的 34MB。

Valgrind 会给我类似的输出:

==10894== Warning: client switching stacks?  SP change: 0x7ff000290 --> 0x7fcf47398
==10894==          to suppress, use: --max-stackframe=34311928 or greater
==10894== Invalid write of size 8
==10894==    at 0x407D0A: Read_Config_File (config_parsing.c:763)
==10894==    by 0x4068D1: main (main.c:148)
==10894==  Address 0x7fcf47398 is on thread 1's stack

错误消息都指向我破坏堆栈指针,但是a)我从来没有遇到过在函数入口处崩溃的错误,b)我正在传递指针,而不是实际的数组。

有人可以帮我解决这个问题吗?我使用的是 64 位 CentOS 机器,运行内核 2.6.18 和 gcc 4.1.2

谢谢!

马特

I'm running into a strange situation with passing a pointer to a structure with a very large array defined in the struct{} definition, a float array around 34MB in size. In a nutshell, the psuedo-code looks like this:

typedef config_t{
  ...
  float values[64000][64];
} CONFIG;


int32_t Create_Structures(CONFIG **the_config)
{
  CONFIG  *local_config;
  int32_t number_nodes;

  number_nodes = Find_Nodes();

  local_config = (CONFIG *)calloc(number_nodes,sizeof(CONFIG));
  *the_config = local_config;
  return(number_nodes);
}


int32_t Read_Config_File(CONFIG *the_config)
{
    /* do init work here */
    return(SUCCESS);
}


main()
{
    CONFIG *the_config;
    int32_t number_nodes,rc;

    number_nodes = Create_Structures(&the_config);

    rc = Read_Config_File(the_config);
    ...
    exit(0);
}

The code compiles fine, but when I try to run it, I'll get a SIGSEGV at the { beneath Read_Config_File().

(gdb) run
...
Program received signal SIGSEGV, Segmentation fault.
0x0000000000407d0a in Read_Config_File (the_config=Cannot access memory at address 0x7ffffdf45428
) at ../src/config_parsing.c:763
763 {
(gdb) bt
#0  0x0000000000407d0a in Read_Config_File (the_config=Cannot access memory at address 0x7ffffdf45428
) at ../src/config_parsing.c:763
#1  0x00000000004068d2 in main (argc=1, argv=0x7fffffffe448) at ../src/main.c:148

I've done this sort of thing all the time, with smaller arrays. And strangely, 0x7fffffffe448 - 0x7ffffdf45428 = 0x20B8EF8, or about the 34MB of my float array.

Valgrind will give me similar output:

==10894== Warning: client switching stacks?  SP change: 0x7ff000290 --> 0x7fcf47398
==10894==          to suppress, use: --max-stackframe=34311928 or greater
==10894== Invalid write of size 8
==10894==    at 0x407D0A: Read_Config_File (config_parsing.c:763)
==10894==    by 0x4068D1: main (main.c:148)
==10894==  Address 0x7fcf47398 is on thread 1's stack

The error messages all point to me clobbering the stack pointer, but a) I've never run across one that crashes on entry of the function and b) I'm passing pointers around, not the actual array.

Can someone help me out with this? I'm on a 64-bit CentOS box running kernel 2.6.18 and gcc 4.1.2

Thanks!

Matt

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

街角迷惘 2025-01-03 18:01:35

通过将这些巨大的 config_t 结构之一分配到堆栈上,您已经炸毁了堆栈。 gdb 输出中的两个堆栈指针 0x7fffffffe448 和 0x7ffffdf45428 非常能说明这一点。

$ gdb
GNU gdb 6.3.50-20050815 ...blahblahblah...
(gdb) p 0x7fffffffe448 - 0x7ffffdf45428  
$1 = 34312224

有 ~34MB 常量与 config_t 结构的大小相匹配。默认情况下,系统不会为您提供那么多的堆栈空间,因此要么将对象移出堆栈,要么增加堆栈空间。

You've blown up the stack by allocating one of these huge config_t structs onto it. The two stack pointers on evidence in the gdb output, 0x7fffffffe448 and 0x7ffffdf45428, are very suggestive of this.

$ gdb
GNU gdb 6.3.50-20050815 ...blahblahblah...
(gdb) p 0x7fffffffe448 - 0x7ffffdf45428  
$1 = 34312224

There's your ~34MB constant that matches the size of the config_t struct. Systems don't give you that much stack space by default, so either move the object off the stack or increase your stack space.

凉风有信 2025-01-03 18:01:35

简短的答案是,必须在某处声明一个 config_t 为局部变量,这会将其放入堆栈中。可能是一个拼写错误:在某处 CONFIG 声明后缺少 *

The short answer is that there must be a config_t declared as a local variable somewhere, which would put it on the stack. Probably a typo: missing * after a CONFIG declaration somewhere.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文