When is memory typically allocated for local variables in C++?

Posted on 2024-11-29 17:24:40

I'm debugging a rather weird stack overflow, supposedly caused by allocating variables that are too large on the stack, and I'd like to clarify the following.

Suppose I have the following function:

void function()
{
    char buffer[1 * 1024];
    if( condition ) {
        char buffer[1 * 1024];
        doSomething( buffer, sizeof( buffer ) );
    } else {
        char buffer[512 * 1024];
        doSomething( buffer, sizeof( buffer ) );
    }
}

I understand that it's compiler-dependent and also depends on what the optimizer decides, but what is the typical strategy for allocating memory for those local variables?

Will the worst case (1 + 512 kilobytes) be allocated immediately once the function is entered, or will 1 kilobyte be allocated first and then, depending on the condition, either 1 or 512 kilobytes be allocated additionally?

清晨说晚安 2024-12-06 17:24:41

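Your local (stack) variables are allocated in the same space as the stack frames. When the function is called, the stack pointer is adjusted to "make room" for the stack frame, typically in a single step on function entry. If your local variables use up the stack, you get a stack overflow.

In any case, ~512 KB is really too large for the stack; you should allocate it on the heap using std::vector.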
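A minimal sketch of the std::vector suggestion, assuming the large buffer only needs to live for the duration of the call. The doSomething body here is a hypothetical stand-in (the question never shows it), and function is changed to take the condition as a parameter and return a size purely so the example can be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's doSomething():
// fills the buffer and reports how many bytes it touched.
std::size_t doSomething(char* buffer, std::size_t size) {
    assert(buffer != nullptr);
    for (std::size_t i = 0; i < size; ++i) {
        buffer[i] = 'x';
    }
    return size;
}

std::size_t function(bool condition) {
    if (condition) {
        // 1 KB is small enough to keep as an ordinary local array.
        char buffer[1 * 1024];
        return doSomething(buffer, sizeof(buffer));
    }
    // Only the small vector object lives in the stack frame;
    // the 512 KB of elements are allocated dynamically and
    // released when the vector goes out of scope.
    std::vector<char> buffer(512 * 1024);
    return doSomething(buffer.data(), buffer.size());
}
```

With this change the frame of function no longer depends on the 512 KB branch, whichever allocation strategy the compiler uses.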

半步萧音过轻尘 2024-12-06 17:24:41

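As you say, it is compiler-dependent, but you could consider using alloca to work around this. The buffers would still be allocated on the stack and still be freed automatically (when the calling function returns, not at the end of the enclosing block), but you control when and whether the stack space is allocated.

Although alloca is generally discouraged, it does have its uses in situations like the one above.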
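A rough sketch of the alloca idea, assuming a glibc or BSD-style system where alloca is declared in <alloca.h> (it is not part of standard C++; MSVC spells it _alloca in <malloc.h>). fillBuffer is a hypothetical stand-in for the question's doSomething, and the bool parameter and return value are added only so the example can be checked.

```cpp
#include <alloca.h>  // non-standard header; an assumption about the platform
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for doSomething(): fills the buffer, returns its size.
std::size_t fillBuffer(char* buffer, std::size_t size) {
    assert(buffer != nullptr);
    std::memset(buffer, 0x2A, size);
    return size;
}

std::size_t function(bool condition) {
    if (condition) {
        // Stack space is claimed only when this branch actually runs.
        char* buffer = static_cast<char*>(alloca(1 * 1024));
        return fillBuffer(buffer, 1 * 1024);
    } else {
        // Still 512 KB of stack: alloca has no way to report failure,
        // so an overflow here is undefined behaviour.
        char* buffer = static_cast<char*>(alloca(512 * 1024));
        return fillBuffer(buffer, 512 * 1024);
    }
    // Either block is released when function() returns,
    // not at the closing brace of the if/else.
}
```

Note that this only decides when the space is taken; the 512 KB branch can still overflow a small stack.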

野生奥特曼 2024-12-06 17:24:40

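On many platforms/ABIs, the entire stack frame (including memory for every local variable) is allocated when you enter the function. On others, it's common to push and pop memory bit by bit, as it is needed.

Of course, even where the entire stack frame is allocated in one go, different compilers might still decide on different stack frame sizes. In your case, some compilers would miss an optimization opportunity and allocate unique memory for every local variable, even those in different branches of the code (both the 1 * 1024 array and the 512 * 1024 one in your case), whereas a better optimizing compiler should allocate only the maximum memory required by any path through the function (the else path in your case, so about 512 KB, plus the outer 1 KB buffer if it is actually used, should be enough). If you want to know what your platform does, look at the disassembly.

But it wouldn't surprise me to see the entire chunk of memory allocated immediately.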
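To put numbers on the two strategies, here is a back-of-the-envelope sketch of the arithmetic for the buffers in the question. All names are made up for illustration, and the figures ignore alignment padding, saved registers, the return address and anything else a real frame contains, so treat them as lower bounds rather than what any particular compiler emits.

```cpp
#include <algorithm>
#include <cstddef>

// Sizes of the three arrays in the question (names are hypothetical).
constexpr std::size_t kOuterBuffer = 1 * 1024;
constexpr std::size_t kThenBuffer  = 1 * 1024;
constexpr std::size_t kElseBuffer  = 512 * 1024;

// Strategy 1: every local gets its own slot, even across branches.
constexpr std::size_t frameWithUniqueSlots() {
    return kOuterBuffer + kThenBuffer + kElseBuffer;
}

// Strategy 2: the branch-local buffers are never live at the same
// time, so they can share one slot sized for the larger of the two.
constexpr std::size_t frameWithSharedSlots() {
    return kOuterBuffer + std::max(kThenBuffer, kElseBuffer);
}

static_assert(frameWithUniqueSlots() == 514 * 1024, "1 KB + 1 KB + 512 KB");
static_assert(frameWithSharedSlots() == 513 * 1024, "1 KB + max(1 KB, 512 KB)");
```

Either way the function needs more than half a megabyte of stack on entry if the frame is allocated up front, so the difference between the two strategies is only 1 KB here.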

被你宠の有点坏 2024-12-06 17:24:40

I checked on LLVM:

void doSomething(char*,char*);

void function(bool b)
{
    char b1[1 * 1024];
    if( b ) {
        char b2[1 * 1024];
        doSomething(b1, b2);
    } else {
        char b3[512 * 1024];
        doSomething(b1, b3);
    }
}

Yields:

; ModuleID = '/tmp/webcompile/_28066_0.bc'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-unknown-linux-gnu"

define void @_Z8functionb(i1 zeroext %b) {
entry:
  %b1 = alloca [1024 x i8], align 1               ; <[1024 x i8]*> [#uses=1]
  %b2 = alloca [1024 x i8], align 1               ; <[1024 x i8]*> [#uses=1]
  %b3 = alloca [524288 x i8], align 1            ; <[524288 x i8]*> [#uses=1]
  %arraydecay = getelementptr inbounds [1024 x i8]* %b1, i64 0, i64 0 ; <i8*> [#uses=2]
  br i1 %b, label %if.then, label %if.else

if.then:                                          ; preds = %entry
  %arraydecay2 = getelementptr inbounds [1024 x i8]* %b2, i64 0, i64 0 ; <i8*> [#uses=1]
  call void @_Z11doSomethingPcS_(i8* %arraydecay, i8* %arraydecay2)
  ret void

if.else:                                          ; preds = %entry
  %arraydecay6 = getelementptr inbounds [524288 x i8]* %b3, i64 0, i64 0 ; <i8*> [#uses=1]
  call void @_Z11doSomethingPcS_(i8* %arraydecay, i8* %arraydecay6)
  ret void
}

declare void @_Z11doSomethingPcS_(i8*, i8*)

You can see the three alloca instructions at the top of the function.

I must admit I am slightly disappointed that b2 and b3 are not folded together in the IR, since only one of them will ever be used.
