The call stack does not say "where you came from", but "where you are going next"?

Posted on 2024-11-19 05:11:45

In a previous question (Get object call hierarchy), I got this interesting answer:

The call stack is not there to tell you where you came from. It is to tell you where you are going next.

As far as I know, when arriving at a function call, a program generally does the following:

  1. In calling code:

    • store return address (on the call stack)
    • save registers' states (on the call stack)
    • write parameters that will be passed to function (on the call stack or in registers)
    • jump to target function
  2. In called target code:

    • Retrieve stored variables (if needed)
  3. Return process: undo what we did when we called the function, i.e. unwind/pop the call stack:

    • remove local variables from the call stack
    • remove function variables from the call stack
    • restore the registers' state (the one we stored before)
    • jump to the return address (the one we stored before)
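The steps listed above can be sketched as a toy interpreter. This is only an illustrative model in Python, not how any real CPU or the CLR lays out its frames: the instruction names (`call`, `load_arg`, `emit`, `ret`, `halt`), the single `acc` register, the addresses, and the frame layout are all assumptions invented for the example.

```python
def run(program):
    """Run a toy program and return the list of emitted values."""
    stack = []                  # the call stack
    registers = {"acc": 0}      # one hypothetical register
    output = []
    pc = 0                      # address of the next instruction
    while pc is not None:
        op, *args = program[pc]
        if op == "call":
            target, argument = args
            stack.append(pc + 1)           # store the return address
            stack.append(dict(registers))  # save the register state
            stack.append(argument)         # write the parameter
            pc = target                    # jump to the target function
        elif op == "load_arg":
            registers["acc"] = stack[-1]   # callee retrieves its parameter
            pc += 1
        elif op == "emit":
            output.append(registers["acc"])
            pc += 1
        elif op == "ret":
            stack.pop()                    # remove the parameter
            registers = stack.pop()        # restore the saved registers
            pc = stack.pop()               # jump to the stored return address
        elif op == "halt":
            pc = None
    return output


PROGRAM = {
    0: ("call", 10, 42),   # caller: call the function at address 10 with 42
    1: ("emit",),          # caller continues here; acc was restored to 0
    2: ("halt",),
    10: ("load_arg",),     # callee: acc = 42
    11: ("emit",),
    12: ("ret",),
}

print(run(PROGRAM))  # [42, 0]
```

Note that `ret` never asks who the caller was. It pops an address and jumps to it, whatever that address happens to be.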

Question:

How can this be viewed as something that "tells you where you are going next" rather than "tells you where you came from"?

Is there something in C#'s JIT or C#'s runtime environment that makes that call stack work differently?

Thanks for any pointers to documentation about this description of a call stack — there's plenty of documentation about how a traditional call stack works.

Comments (5)

甚是思念 2024-11-26 05:11:45

You've explained it yourself. The "return address" by definition tells you where you are going next.

There is no requirement whatsoever that the return address that is put on the stack is an address inside the method that called the method you're in now. It typically is, which sure makes it easier to debug. But there is not a requirement that the return address be an address inside the caller. The optimizer is permitted to -- and sometimes does -- muck with the return address if doing so makes the program faster (or smaller, or whatever it is optimizing for) without changing its meaning.

The purpose of the stack is to make sure that when this subroutine finishes, its continuation -- what happens next -- is correct. The purpose of the stack is not to tell you where you came from. That it usually does so is a happy accident.

Moreover: the stack is just an implementation detail of the concepts of continuation and activation. There is no requirement that both concepts be implemented by the same stack; there could be two stacks, one for activations (local variables) and one for continuation (return addresses). Such architectures are obviously much more resistant to stack smashing attacks by malware because the return address is nowhere near the data.
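A rough way to picture the two-stack idea is the sketch below. It is a deliberately simplified Python model, assuming a frame made of two buffer slots followed by one more slot; the names (`overflow`, `unified_stack`, `split_stacks`) and the constants are hypothetical, and real stack layouts differ by platform and compiler.

```python
GOOD_RETURN = 0x1000   # where the subroutine is supposed to continue
EVIL = 0xBAD           # value an attacker manages to write


def overflow(region, values):
    # Unchecked copy: writes as many slots as there are values,
    # even if that runs past the two-slot buffer at the start of region.
    for i, value in enumerate(values):
        region[i] = value


def unified_stack():
    # One stack: two buffer slots, then the return address right behind them.
    frame = [0, 0, GOOD_RETURN]
    overflow(frame, [EVIL, EVIL, EVIL])
    return frame[2]            # the return address that will be used


def split_stacks():
    # Two stacks: activations (locals) and continuations (return addresses).
    activation = [0, 0, 0]     # two buffer slots plus one other local
    continuation = [GOOD_RETURN]
    overflow(activation, [EVIL, EVIL, EVIL])
    return continuation[-1]    # untouched by the overflow


print(hex(unified_stack()))  # the overwritten address
print(hex(split_stacks()))   # the original address
```

In the split version the overflow still corrupts a neighbouring local, so the data is damaged, but the place the subroutine continues to is not.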

More interestingly, there is no requirement that there be any stack at all! We use call stacks to implement continuation because they are convenient for the kind of programming we typically do: subroutine-based synchronous calls. We could choose to implement C# as a "Continuation Passing Style" language, where the continuation is actually reified as an object on the heap, not as a bunch of bytes pushed on a million byte system stack. That object is then passed around from method to method, none of which use any stack. (Activations are then reified by breaking each method up into possibly many delegates, each of which is associated with an activation object.)

In continuation passing style there simply is no stack, and no way at all to tell where you came from; the continuation object does not have that information. It only knows where you are going next.
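As a minimal sketch of the idea (in Python rather than C#, and with a factorial function chosen purely for illustration), the continuation `k` below is an ordinary heap object that says what to do with the result. Python does not eliminate tail calls, so the sketch assumes a small `trampoline` loop to keep the native stack shallow; `factorial_cps` and `trampoline` are hypothetical names.

```python
def factorial_cps(n, k):
    """Return a thunk that computes n! and hands the result to continuation k."""
    if n == 0:
        return lambda: k(1)
    # New continuation: multiply by n, then carry on with the old continuation k.
    return lambda: factorial_cps(n - 1, lambda r: (lambda: k(n * r)))


def trampoline(step):
    # Keep running thunks until a non-callable final value appears.
    # (This sketch assumes the final result itself is never callable.)
    while callable(step):
        step = step()
    return step


print(trampoline(factorial_cps(5, lambda result: result)))  # 120
```

At no point does any step know which function created its continuation. It only knows that `k` is where the result goes next, and the chain of pending work lives in closures on the heap instead of in stack frames.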

This might seem to be highfalutin theoretical mumbo jumbo, but we essentially are making C# and VB into continuation passing style languages in the next version; the coming "async" feature is just continuation passing style in a thin disguise. In the next version, if you use the async feature you will essentially be giving up stack-based programming; there will be no way to look at the call stack and know how you got here, because the stack will frequently be empty.
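The effect on the call stack can be imitated with a hand-rolled scheduler. The sketch below is an analogy in Python, not the state machine the C# compiler actually generates: `pending`, `start_work`, `after_work`, and `run_loop` are made-up names, and the "await" is modelled by simply queueing a continuation and returning.

```python
import inspect
from collections import deque

pending = deque()   # continuations waiting to run; heap objects, not stack frames


def start_work(log):
    # Model of an "await": hand the continuation to the scheduler and return.
    pending.append(lambda: after_work(log))


def after_work(log):
    # Record the names of the functions currently on the call stack.
    log.append([frame.function for frame in inspect.stack()])


def run_loop():
    # The scheduler resumes whatever continuation is next in the queue.
    while pending:
        pending.popleft()()


def main():
    log = []
    start_work(log)   # returns long before after_work runs
    run_loop()
    return log[0]


names = main()
print("start_work" in names)  # False
print("run_loop" in names)    # True
```

When `after_work` finally runs, the stack shows the scheduler that resumed it, not `start_work`, which is the code that logically led there.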

Continuations reified as something other than a call stack is a hard idea for a lot of people to get their minds around; it certainly was for me. But once you get it, it just clicks and makes perfect sense. For a gentle introduction, here are a number of articles I've written on the subject:

An introduction to CPS, with examples in JScript:

http://blogs.msdn.com/b/ericlippert/archive/2005/08/08/recursion-part-four-continuation-passing-style.aspx

http://blogs.msdn.com/b/ericlippert/archive/2005/08/11/recursion-part-five-more-on-cps.aspx

http://blogs.msdn.com/b/ericlippert/archive/2005/08/15/recursion-part-six-making-cps-work.aspx

Here are a dozen articles that start by doing a deeper dive into CPS, and then explain how this all works with the coming "async" feature. Start from the bottom:

http://blogs.msdn.com/b/ericlippert/archive/tags/async/

Languages that support continuation passing style often have a magic control flow primitive called "call with current continuation", or "call/cc" for short. In this Stack Overflow question, I explain the trivial difference between "await" and "call/cc":

How could the new async feature in c# 5.0 be implemented with call/cc?

To get your hands on the official "documentation" (a bunch of white papers), and a preview release of C# and VB's new "async await" feature, plus a forum for support Q&A, go to:

http://msdn.com/vstudio/async

蓝海 2024-11-26 05:11:45

Consider the following code:

void Main()
{
    // do something
    A();
    // do something else
}

void A()
{
    // do some processing
    B();
}

void B()
{
}

Here, the last thing function A does is call B; A returns immediately after that. A clever optimizer might optimize out the call to B and replace it with just a jump to B's start address. (I'm not sure whether current C# compilers do such optimizations, but almost all C++ compilers do.) Why would this work? Because the address of A's caller is already on the stack, so when B finishes, it returns not to A but directly to A's caller.

So you can see that the stack does not necessarily contain information about where execution came from, but rather about where it should go.

Without optimization, inside B the call stack is (I omit the local variables and other stuff for clarity):

----------------------------------------
|address of the code calling A         |
----------------------------------------
|address of the return instruction in A|
----------------------------------------

So the return from B goes back to A, which then immediately exits.

With the optimization, the call stack is just

----------------------------------------
|address of the code calling A         |
----------------------------------------

So B returns directly to Main.

In his answer, Eric mentions other (more complicated) cases where the stack information doesn't contain the real caller.
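The two stack pictures above can be reproduced with a small simulation. This is a Python sketch over an invented instruction set (`call`, `jump`, `snapshot`, `note`, `ret`) with made-up addresses: Main lives at 0, A at 10, and B at 20. It only models return addresses, as in the diagrams, and says nothing about what a particular compiler really emits.

```python
def run(program):
    """Execute a toy program; return (stack seen inside B, notes logged)."""
    stack = []         # holds return addresses only
    seen_in_b = None
    notes = []
    pc = 0             # Main starts at address 0
    while True:
        op, *args = program[pc]
        if op == "call":
            stack.append(pc + 1)   # push where to continue afterwards
            pc = args[0]
        elif op == "jump":
            pc = args[0]           # tail call: nothing is pushed
        elif op == "snapshot":
            seen_in_b = list(stack)
            pc += 1
        elif op == "note":
            notes.append(args[0])
            pc += 1
        elif op == "ret":
            if not stack:          # returning from Main ends the program
                return seen_in_b, notes
            pc = stack.pop()


MAIN = {0: ("call", 10), 1: ("note", "Main: do something else"), 2: ("ret",)}
B = {20: ("snapshot",), 21: ("ret",)}

plain = {**MAIN, 10: ("call", 20), 11: ("ret",), **B}   # A calls B, then returns
optimized = {**MAIN, 10: ("jump", 20), **B}             # A jumps straight to B

print(run(plain)[0])      # [1, 11]
print(run(optimized)[0])  # [1]
```

Both versions log the same note from Main, so the program's meaning is unchanged, yet in the optimized run nothing on the stack records that A was ever involved.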

萌能量女王 2024-11-26 05:11:45

What Eric is saying in his post is that the execution pointer does not need to know where it has come from, only where it has to go when the current method ends. Superficially these two things seem to be the same, but in the case of (for instance) tail recursion, where we came from and where we are going next can diverge.

书间行客 2024-11-26 05:11:45

There is more to this than you think.

In C it is entirely possible to have a program rewrite the call stack. Indeed, that technique is the very basis of a style of exploit known as return-oriented programming.

I've also written code in one language which gave you direct control over the call stack. You could pop off the function that called yours, and push some other one in its place. You could duplicate the item on the top of the call stack, so the rest of the code in the calling function would get executed twice, and a bunch of other interesting things. In fact direct manipulation of the call stack was the primary control structure provided by this language. (Challenge: can anybody identify the language from this description?)
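To give a feel for that kind of control structure, here is a small Python model. It is not the language alluded to above; the driver `run`, the helper `make_caller`, and the three callees are hypothetical, and each stack entry is simply a function meaning "what to do next".

```python
def run(entry):
    """Pop and execute stack entries until nothing is left; return the log."""
    stack = [entry]
    log = []
    while stack:
        stack.pop()(stack, log)
    return log


def rest_of_caller(stack, log):
    log.append("rest of caller")


def somewhere_else(stack, log):
    log.append("somewhere else")


def make_caller(callee):
    def caller(stack, log):
        log.append("caller begins")
        stack.append(rest_of_caller)   # the return point
        stack.append(callee)           # control goes to the callee next
    return caller


def plain_callee(stack, log):
    log.append("callee")


def duplicating_callee(stack, log):
    log.append("callee")
    stack.append(stack[-1])            # duplicate the top: caller's rest runs twice


def replacing_callee(stack, log):
    log.append("callee")
    stack.pop()                        # discard the real return point
    stack.append(somewhere_else)       # and continue somewhere else instead


print(run(make_caller(plain_callee)))
print(run(make_caller(duplicating_callee)))
print(run(make_caller(replacing_callee)))
```

In the last run the callee never goes back to the code that called it, because the stack entry it consumes is only a destination.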

It did clearly show that the call stack indicates where you are going, not where you have been.

香草可樂 2024-11-26 05:11:45

I think he's trying to say that it tells the called method where to go next.

  • Method A calls Method B.
  • Method B completes. Where does it go next?

It pops the return address off the top of the stack and goes there.

So Method B knows where to go after it completes. Method B doesn't really care where it came from.
