Where are rvalues stored in C?

Posted 2024-08-09 23:21:37

In C, I have this piece of code:

int a;
a = 10 + 5 - 3;

I want to ask: where is (10+5-3) stored?
(As far as I know, a is located on the stack; what about (10+5-3)? How is this rvalue calculated?)

会傲 2024-08-16 23:21:37

Typically, the r-value is "stored" within the program itself.

In other words, the compiler itself (before the program is ever run) computes the value of 10 + 5 - 3 (it can do so since the expression is built entirely from constant immediate values), and it emits the assembly code to store the result of this calculation in whatever l-value is the target of the assignment (in this case the variable named a, which the compiler probably knows as an address relative to a data segment origin of sorts or, for a local variable, relative to the stack frame).

The r-value, which has a value of 12, is therefore only found inside the binary of the program, within an assembly instruction that looks like

  mov <some dest, typically DS-relative>, $0C

$0C is the "r-value".

If the r-value happened to be the result of a calculation that can only be done at run time, say if the underlying C code was a = 17 * x; (with x some run-time variable), the r-value would also be "stored" (or rather materialized) as a series of instructions within the program binary. The difference from the simple "mov dest, imm" above is that it would take several instructions to load the variable x into an accumulator, multiply it by 17, and store the result at the address of the variable a. The compiler may "authorize itself" ;-) to use the stack for some intermediate results, but such use would be
a) completely compiler dependent
b) transient
c) and typically would only involve part of the r-value
It is therefore safe to say that the r-value is a compile-time concept which is encapsulated in parts of the program (not the data), and isn't stored anywhere but in the program binary; at run time its value exists at most transiently, in a register or a temporary stack slot.
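The two cases described above can be put side by side in a minimal sketch. The function names constant_case, runtime_case and check_cases are made up for illustration, and what a given compiler actually emits for them is an assumption that depends on the compiler and its optimization settings.

```c
#include <assert.h>

/* Compile-time case: 10 + 5 - 3 is built only from constants, so a
   compiler is free to emit 12 directly as an immediate operand. */
int constant_case(void)
{
    int a;
    a = 10 + 5 - 3;
    return a;
}

/* Run-time case: x is unknown until the call, so the product has to be
   computed by instructions (typically in a register) and then stored. */
int runtime_case(int x)
{
    int a;
    a = 17 * x;
    return a;
}

/* Hypothetical self-check: both functions return the expected values,
   whichever way the compiler chose to materialize the r-values. */
void check_cases(void)
{
    assert(constant_case() == 12);
    assert(runtime_case(3) == 51);
}
```

Compiling this to assembly (for example with an option such as -S on GCC or Clang) and comparing the two functions is a quick way to see the difference on a particular toolchain.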

In response to paxdiablo: the explanation offered above is indeed restrictive of the possibilities, because the C standard does not dictate anything of that nature. Nevertheless, almost any r-value is eventually materialized, at least in part, by instructions which set things up so that the proper value, whether calculated at run time or immediate, ends up where it is needed.

不弃不离 2024-08-16 23:21:37

Constants are probably simplified at compile time, so your question as literally posed may not help. But something like, say, i - j + k that does need to be computed at runtime from some variables, may be "stored" wherever the compiler likes, depending on the CPU architecture: the compiler will typically try to do its best to use registers, e.g.

 LOAD AX, i
 SUB AX, j
 ADD AX, k

to compute such an expression "storing" it in the accumulator register AX, before assigning it to some memory location with STORE AX, dest or the like. I'd be pretty surprised if a modern optimizing compiler on an even semi-decent CPU architecture (yeah, x86 included!-) needed to spill registers to memory for any reasonably simple expression!
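As a rough C-level counterpart to the register sketch above (the names combine and check_combine are invented for this example), the value of i - j + k has no name and no address of its own until it is assigned:

```c
#include <assert.h>

/* i - j + k is an r-value: it has a value but designates no object,
   so an expression such as &(i - j + k) would not compile.
   Only after the assignment does the value live in the object dest. */
int combine(int i, int j, int k)
{
    int dest;
    dest = i - j + k;   /* intermediate result, typically kept in a register */
    return dest;
}

/* Hypothetical self-check for the function above. */
void check_combine(void)
{
    assert(combine(7, 2, 4) == 9);
}
```

Where the intermediate value sits while it is being computed is left to the compiler; the C code only fixes where it ends up.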

瑾兮 2024-08-16 23:21:37

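This is compiler dependent. Usually the value (12) will be computed by the compiler. It is then stored in the code, typically as part of a load/move immediate assembly instruction.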

栀子花开つ 2024-08-16 23:21:37

The result of the computation on the RHS (right-hand side) is computed by the compiler in a step usually called "constant folding" (closely related to constant propagation).
It is then stored as an operand of the assembly instruction that moves the value into a.

Here's a disassembly from MSVC:

  int a;
  a = 10 + 5 - 3;

0041338E  mov         dword ptr [a],0Ch
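One way to see from within C itself that 10 + 5 - 3 is evaluated by the compiler is to use it where the language demands an integer constant expression. This is a sketch assuming a C11 compiler (for _Static_assert); the names TWELVE, table, table_length and check_constant_expression are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Rejected at compile time if the expression did not evaluate to 12. */
_Static_assert(10 + 5 - 3 == 12, "10 + 5 - 3 should evaluate to 12");

/* An enumeration constant must be an integer constant expression. */
enum { TWELVE = 10 + 5 - 3 };

/* A file-scope array size must be an integer constant expression too. */
static int table[10 + 5 - 3];

/* Number of elements in table, derived from sizes known at compile time. */
size_t table_length(void)
{
    return sizeof table / sizeof table[0];
}

/* Hypothetical self-check for the declarations above. */
void check_constant_expression(void)
{
    assert(TWELVE == 12);
    assert(table_length() == 12);
}
```

None of these uses would be accepted if the value were only available at run time, which matches the immediate operand 0Ch in the disassembly.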

一枫情书 2024-08-16 23:21:37

Where it stores it is actually totally up to the compiler. The standard does not dictate this behavior.

A typical place can be seen by actually compiling the code and looking at the assembler output:

int main (int argc, char *argv[]) {
    int a;
    a = 10 + 5 - 3;
    return 0;
}

which produces:

        .file   "qq.c"
        .def    ___main;
            .scl    2;
            .type   32;
        .endef
        .text
.globl _main
        .def    _main;
            .scl    2;
            .type   32;
        .endef
_main:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $8, %esp
        andl    $-16, %esp
        movl    $0, %eax
        addl    $15, %eax
        addl    $15, %eax
        shrl    $4, %eax
        sall    $4, %eax
        movl    %eax, -8(%ebp)
        movl    -8(%ebp), %eax
        call    __alloca
        call    ___main
        movl    $12, -4(%ebp)         ;*****
        movl    $0, %eax
        leave
        ret

The relevant bit is marked ;***** and you can see that the value is created by the compiler and just inserted directly into a mov type instruction.

Note that it's only this simple because the expression is a constant value. As soon as you introduce non-constant values (like variables), the code becomes a little more complicated. That's because you have to look those variables up in memory (or they may already be in a register) and then manipulate the values at run-time, not compile-time.

As to how the compiler calculates what the value should be, that's to do with expression evaluation and is a whole other question :-)
