How does the CPU know the address of a variable?

Posted on 2024-12-24 18:17:28

Say you do:

void something()
{
   int* number = new int(16);

   int* sixteen = number;
}

How does the CPU know the address that I want to assign to sixteen?

Thanks

Comments (2)

孤独岁月 2024-12-31 18:17:28

There's no magic in your example code. Take this snippet, for example:

int x = 5;
int y = x;

Your code with pointers works exactly the same way: the computer doesn't need any magic information, it just copies whatever is in number into sixteen.

As to your comment below:

but how does it know where x or y are in memory. If I ask to copy x into y, how does it know where either of those are.

In practice, on most machines these days, probably neither of them will be in memory; they'll be in registers. But if they are in memory, then yes, the compiler will emit code that keeps track of all of those addresses as necessary. In this case, they'd be on the stack, so the machine code would read the stack pointer register (or a frame pointer such as rbp) and dereference it at compiler-decided offsets, one for the storage of each particular variable.

Here's an example. This simple function:

int f(void)
{
  int x = 5;
  int y = x;
  return y;
}

When compiled with clang and no optimizations, gives me the following output on my machine:

_f:
 pushq  %rbp               ; save caller's base pointer
 movq   %rsp,%rbp          ; copy stack pointer into base pointer
 movl   $5,0xfc(%rbp)      ; store constant 5 to stack at rbp-4
 movl   0xfc(%rbp),%eax    ; copy value at rbp-4 to register eax
 movl   %eax,0xf8(%rbp)    ; copy value from eax to stack at rbp-8
 movl   0xf8(%rbp),%eax    ; copy value off stack to return value register eax
 popq   %rbp               ; restore caller's base pointer
 ret                       ; return from function

I added some comments to explain what each line of the generated code does. The important thing to see is that there are two variables on the stack: one at 0xf8(%rbp) and one at 0xfc(%rbp). The displacements are printed as unsigned bytes; read as signed two's-complement bytes, 0xf8 is -8 and 0xfc is -4, so the variables live at rbp-8 and rbp-4. The basic algorithm is just what the original code shows: the constant 5 gets saved into x at rbp-4, then that value gets copied over into y at rbp-8.

"But where does the stack come from?" you might ask. The answer to that question depends on the operating system and the compiler. It's all set up before your program's main function is called, at the same time as the other runtime setup required by your operating system.

辞慾 2024-12-31 18:17:28

The CPU knows because your program tells it. The magic here is in the compiler. First, I built this program in Visual Studio 2010.

This is the disassembly that it generates (in DEBUG mode):

void something()
{
003A13C0  push        ebp  
003A13C1  mov         ebp,esp  
003A13C3  sub         esp,0E8h  
003A13C9  push        ebx  
003A13CA  push        esi  
003A13CB  push        edi  
003A13CC  lea         edi,[ebp-0E8h]  
003A13D2  mov         ecx,3Ah  
003A13D7  mov         eax,0CCCCCCCCh  
003A13DC  rep stos    dword ptr es:[edi]  
   int* number = new int(16);
003A13DE  push        4  
003A13E0  call        operator new (3A1186h)  

After the call to operator new, EAX = 00097C58, which is the address that the memory manager decided to give me on this run of the program. This is the address that will be used whenever you dereference number.

003A13E5  add         esp,4  
003A13E8  mov         dword ptr [ebp-0E0h],eax  
003A13EE  cmp         dword ptr [ebp-0E0h],0  
003A13F5  je          something+51h (3A1411h)  
003A13F7  mov         eax,dword ptr [ebp-0E0h]  
003A13FD  mov         dword ptr [eax],10h  
003A1403  mov         ecx,dword ptr [ebp-0E0h]  
003A1409  mov         dword ptr [ebp-0E8h],ecx  
003A140F  jmp         something+5Bh (3A141Bh)  
003A1411  mov         dword ptr [ebp-0E8h],0  
003A141B  mov         edx,dword ptr [ebp-0E8h]  
003A1421  mov         dword ptr [number],edx  
   int* sixteen = number;
003A1424  mov         eax,dword ptr [number]  
003A1427  mov         dword ptr [sixteen],eax  

Here you're just copying the value of number into sixteen, so now both point at the same address.

}

You can verify by inspecting them in the Locals debug window:

+       number  0x00097c58  int *
+       sixteen 0x00097c58  int *

You can do this experiment and step through the disassembly. It is often very enlightening.
