Why can a variable be initialized (and used) without its declaration and definition being "run"?

Posted on 2025-01-09 10:03:06

C++ disallows "goto-ing over a definition":

goto jumpover;
int something = 3;
jumpover:
std::cout << something << std::endl;

This raises an error as expected, because "something" won't be declared (or defined).

However, I jumped over using assembly code:

#include<iostream>
using namespace std;
int main(){
    asm("\njmp tag\n");
    int ptr=9000;//jumped over
    cout << "Ran" << endl;
    asm("\ntag:\n");
    cout << ptr << endl;
    return 0;
}

It printed 9000, even though the line int ptr=9000;//jumped over was NOT executed: the program did not print Ran. I expected that using ptr would cause memory corruption or an undefined value, because the memory isn't allocated (although the compiler thinks it is, because it does not understand the asm). How can it know that ptr is 9000?

Does that mean ptr is created and assigned at the start of main() (and therefore not skipped, due to some optimization or whatever), or is there some other reason?

Comments (1)

做个少女永远怀春 2025-01-16 10:03:06

Jumping between asm() statements is not supported by GCC;
your code has undefined behaviour.
Literally anything is allowed to happen.

There's no __builtin_unreachable() after the first asm statement, and you didn't even use asm goto("" : : : : label) (GCC manual) to tell the compiler about a C label the asm statement might or might not jump to.
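
As a rough sketch of what describing the jump to the compiler could look like, the function below uses GCC's asm goto with a C label in the GotoLabels list. The function name skip_with_asm_goto and the label skipped are made up for illustration, and the jmp / b mnemonics are assumptions about the target (x86 or AArch64); on any other target the sketch falls back to an empty template that never jumps.

```cpp
// Hypothetical helper: jumps over an assignment in a way the compiler knows about.
int skip_with_asm_goto() {
    int value = 0;  // initialized before the jump, so it is valid on both paths
#if defined(__x86_64__) || defined(__i386__)
    // %l0 expands to the assembler-level name of the C label `skipped`.
    asm goto("jmp %l0" : : : : skipped);
#elif defined(__aarch64__)
    asm goto("b %l0" : : : : skipped);
#else
    // Fallback (assumption: unknown target): declares the label but never jumps.
    asm goto("" : : : : skipped);
#endif
    value = 9000;   // skipped at run time when the asm really jumps
skipped:
    return value;   // 0 if the jump was taken, 9000 on the fallback path
}
```

Because the label is listed, the compiler treats both "falls through" and "jumps to skipped" as possible, so it cannot constant-propagate 9000 into the return value.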

Whatever happens in practice with different versions of gcc/clang and different optimization levels when you do that is a coincidence / implementation detail / result of whatever the optimizer actually did.

For example, with optimization enabled it would do constant-propagation on the assumption that the int ptr=9000; statement is reached, because it's allowed to assume that execution comes out the end of the first asm statement.

You'd have to look at the compiler's full asm output to see what actually happened. For example, https://godbolt.org/z/MbGhEnK3b shows GCC at -O0 and -O2. With -O0 the program does indeed read uninitialized stack space, since it jumps over a mov DWORD PTR [rbp-4], 9000. With -O2 you get constant-propagation: mov esi, 9000 before the call to the std::basic_ostream<char,... operator<<(int) overload.

"because the memory isn't allocated"

Space for it actually is allocated in the function prologue; compilers don't generate code to move the stack pointer every time they encounter a declaration inside a scope. They allocate space once at the start of a function. Even the one-pass Tiny C Compiler works this way, not using a separate push to alloc+init separate int vars. (This is actually a missed optimization in some cases when push would be useful to alloc + init in one instruction: What C/C++ compiler can use push pop instructions for creating local variables, instead of just increasing esp once?)
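
One way to see, from standard C++ alone, that a local's storage does not depend on execution passing over its declaration: the language permits jumping past the declaration of a scalar that has no initializer. A minimal sketch follows (the function name jump_past_declaration is hypothetical; that the storage comes from a single stack adjustment in the prologue is an implementation detail, not something the standard promises).

```cpp
// Hypothetical helper: a goto that skips a declaration WITHOUT an initializer.
int jump_past_declaration() {
    goto skip;   // well-formed: `value` is a scalar declared without an initializer
    int value;   // execution never passes over this declaration
skip:
    value = 42;  // the variable is in scope and its storage is usable here
    return value;
}
```

Changing the declaration to int value = 42; turns the goto into the compile-time error from the question, because the jump would then bypass an initialization.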


Even more so than most other kinds of C undefined behaviour, this is not something the compiler can actually detect at run-time to warn you about. asm statements just insert text into GCC's asm output, which is then fed to the assembler. You need to accurately describe to the compiler what the asm does (using constraints and things like asm goto) to give it enough information to generate correct code around your asm statement.

GCC does not parse the instructions in the asm template; it just copies them directly to the asm output. (Or, for Extended asm, it substitutes the %0, %1 etc. operands with text generated according to the operand constraints.)
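
To illustrate that the constraints, not the template text, are what the compiler reasons about, here is a small sketch assuming GCC or clang. The function name pass_through_asm is made up for illustration. The template is empty, so no instruction is emitted, yet the "+r" constraint alone makes the compiler assume the register holding x was read and possibly rewritten.

```cpp
// Hypothetical helper: an empty asm statement that "touches" x via its constraint.
int pass_through_asm(int x) {
    // "+r"(x): x is both an input and an output, held in a general-purpose register.
    // The compiler cannot see that the template does nothing, so it must stop
    // treating x as a known constant after this statement.
    asm volatile("" : "+r"(x));
    return x;  // unchanged at run time, since no instruction was emitted
}
```

Calling pass_through_asm(9000) still returns 9000, but an optimizing build has to produce the value in a register and read it back instead of folding the call to a constant.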
