How does callvirt work under the covers?

Posted 2024-10-04 12:37:13

I am trying to understand how the CLR implements reference types and polymorphism. I have referred to Don Box's Essential .NET Vol 1, which is a great help in clarifying most of it. But I got stuck on the following issue when I played around with some IL code to understand things better.

I will try to explain the problem as best I can. Consider the following code:

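using System;

namespace Console1   // matches the Console1.Base / Console1.Derived names in the IL below
{
    class Base
    {
        public void m()   // non-virtual
        {
            Console.WriteLine("Base.m");
        }
    }
    class Derived : Base
    {
        public void m()   // hides Base.m; it is not an override
        {
            Console.WriteLine("Derived.m");
        }
    }
}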

Now consider a simple console application whose Main method has the IL shown below. I manually tweaked the compiler-generated IL and reassembled it with ILAsm.exe.

.class private auto ansi beforefieldinit Console1.Program
       extends [mscorlib]System.Object
{
    .method private hidebysig static void  Main(string[] args) cil managed
    {
      .entrypoint
      // Code size       44 (0x2c)
      .maxstack  1
      .locals init ([0] class Console1.Base d)
      nop
      newobj     instance void Console1.Base::.ctor()
      stloc.0
      ldloc.0
      callvirt   instance void Console1.Derived::m()
      nop
      call       string [mscorlib]System.Console::ReadLine()
      pop
      ret
    } // end of method Program::Main
} // end of class Console1.Program

I expected this code NOT to run: the object reference points to an instance of Base, and the method table of a Base object cannot have an entry for the method m() defined in the Derived class.

But, as if by magic, this code executes Derived.m()!

So there are two things I don't understand in the code above:

  1. What is the significance of the type specified in the IL line below? I experimented by changing it to different types (e.g. System.Exception!) and no errors were reported. Why?

    .locals init ([0] class Console1.Base d)

  2. How exactly does callvirt work? How did the call get routed to Derived.m()?

Thanks in advance!!

Regards,
Ajay

Comments (5)

つ可否回来 2024-10-11 12:37:13

My guess is that the jitter realizes that Derived.m isn't virtual and thus can never point anywhere else. So the callvirt reduces to a null check and a direct call instead of a call through the v-table.

Try making Derived.m virtual. I bet it'll throw then.

The C# compiler emits callvirt instructions even when calling non-virtual methods if it can't prove that this != null, so that the call gets a null check. In that case the jitter is intelligent enough to replace the virtual call with a normal call to a fixed address (or even inline it).

And you should check whether your code is verifiable. I think it isn't.
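The null check mentioned in this answer can be observed from ordinary C#. The following is a minimal sketch, not taken from the original post: `Greeter`, `Hello`, `NullCheckDemo`, and `Probe` are hypothetical names chosen for illustration. The method is non-virtual and never touches `this`, yet calling it through a null reference still throws, which is consistent with the compiler emitting callvirt for the call.

```csharp
using System;

class Greeter
{
    // Non-virtual, and the body never reads any instance state.
    public string Hello() { return "hello"; }
}

static class NullCheckDemo
{
    // Returns the method's result, or the name of the exception if the
    // call site's null check fires before the method body is entered.
    public static string Probe(Greeter g)
    {
        try { return g.Hello(); }
        catch (NullReferenceException) { return "NullReferenceException"; }
    }

    static void Main()
    {
        Console.WriteLine(Probe(new Greeter())); // prints "hello"
        Console.WriteLine(Probe(null));          // prints "NullReferenceException"
    }
}
```

Had the compiler emitted a plain `call` here, the body would run with a null `this` and return "hello", since it never dereferences it.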

树深时见影 2024-10-11 12:37:13

Your code isn't verifiable (run it through peverify). I've written a blog post about how callvirt works under the hood that might help you understand what it does and how your code executes.

Bear in mind that the CLR does try to execute non-verifiable code if run as a normal program; only if it actually causes a problem does it bork.

In your example, calling Derived.m() on an instance of Base works because the actual run-time binary representation of the object instances is the same; the this object is basically the same, and no instance fields of the objects are accessed.

Try putting an instance field access into both methods and see what happens...
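A starting point for that experiment might look like the sketch below. It is an assumption-laden illustration, not the answerer's code: the field names and the `FieldDemo` helper are hypothetical, and `m()` is changed to return the field value so the result is easy to inspect. As written, the program is ordinary verifiable C#. To reproduce the situation in the question you would still need to disassemble it, retarget the callvirt at Derived::m() on a Base instance, and reassemble with ILAsm.exe; what happens then is not predicted here, because Derived.m would read a field that a Base instance does not contain.

```csharp
using System;

class Base
{
    private int baseField = 1;

    // Reads a field that every Base instance has.
    public int m() { return baseField; }
}

class Derived : Base
{
    private int derivedField = 2;

    // Reads a field that exists only in Derived instances.
    public new int m() { return derivedField; }
}

static class FieldDemo
{
    // Type-correct calls: each method runs on an instance of its own class.
    public static int CallBase() { return new Base().m(); }
    public static int CallDerived() { return new Derived().m(); }

    static void Main()
    {
        Console.WriteLine(CallBase());    // prints 1
        Console.WriteLine(CallDerived()); // prints 2
    }
}
```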

素手挽清风 2024-10-11 12:37:13

Please note that by default, code executed from the local machine is not verified. This means that invalid code can be written and executed. I suspect your Main method will not pass verification as-is. The PEVerify tool can check an assembly to ensure the code is type-safe, or you can enable these checks for code from the local machine or from a specific location via Security Policy Administration.

The purpose of the type in the locals statement is to declare the type of the local variable. This provides the information needed by the type verifier to verify that member accesses on the local variable are operating on an object of the correct type.

Callvirt could be implemented in several ways. The most likely is the way C++ vtables are implemented: each object holds a pointer to its type's table of function pointers, and each virtual method sits at a predefined offset in that table. To call the method, the address at that offset is loaded and called. Note that in some cases the CLR could apply additional optimizations if the type of the object is known; whether it does, I don't know.
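The table-of-function-pointers idea can be modelled in a few lines of C#. This is only a toy model under stated assumptions, not how the CLR actually lays out memory: `MethodTable`, `Obj`, `DispatchSketch`, and the slot index are all hypothetical. It contrasts a virtual call, which reads the target out of the object's table, with a non-virtual call, whose target is fixed by the call site and never consults the object's type.

```csharp
using System;

// Hypothetical per-type table: one slot per virtual method.
sealed class MethodTable
{
    public readonly string TypeName;
    public readonly Func<Obj, string>[] Slots;

    public MethodTable(string typeName, params Func<Obj, string>[] slots)
    {
        TypeName = typeName;
        Slots = slots;
    }
}

// Every object refers to its type's table; it does not own a copy.
sealed class Obj
{
    public readonly MethodTable Table;
    public Obj(MethodTable table) { Table = table; }
}

static class DispatchSketch
{
    const int SlotDescribe = 0; // slot index fixed when the call site is compiled

    public static readonly MethodTable BaseTable =
        new MethodTable("Base", self => "Base.Describe");
    public static readonly MethodTable DerivedTable =
        new MethodTable("Derived", self => "Derived.Describe");

    // Virtual dispatch: null check, load the table from the object, index the slot, call.
    public static string CallVirtual(Obj o)
    {
        if (o == null) throw new NullReferenceException();
        return o.Table.Slots[SlotDescribe](o);
    }

    // Non-virtual target: null check, then call the address baked into the call site.
    public static string CallNonVirtual(Obj o, Func<Obj, string> fixedTarget)
    {
        if (o == null) throw new NullReferenceException();
        return fixedTarget(o);
    }

    static void Main()
    {
        Console.WriteLine(CallVirtual(new Obj(BaseTable)));    // prints "Base.Describe"
        Console.WriteLine(CallVirtual(new Obj(DerivedTable))); // prints "Derived.Describe"
        // The object's own table is never consulted here:
        Console.WriteLine(CallNonVirtual(new Obj(BaseTable), self => "Derived.m")); // prints "Derived.m"
    }
}
```

In this model the last call mirrors the question: the object is a "Base", but the call site names a fixed target, so that target runs regardless of what the object's table says.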

风情万种。 2024-10-11 12:37:13

I think this is a side effect of a JIT compiler optimization. If the m() method were virtual, the JIT compiler would have to generate machine code to dig the method table pointer out of the object and then make the virtual call. But this method isn't virtual, and the JIT compiler already knows the method table pointer for the Derived class. So it bypasses the pointer retrieval and supplies it directly, which makes the call work as you observed. You can verify my guess by checking the generated machine code.

Yeah, the IL verifier isn't scoring any points here. You could make it more interesting by having the Derived.m() method tinker with a field that's only declared in Derived. I've seen too much Reflection.Emit code crash with an AccessViolation to be greatly surprised by this. It may well be intentional, however: there is no need to verify IL that crashes anyway. I'm not sure; exploiting this kind of verification loophole isn't (yet) common. Thankfully.
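The virtual versus non-virtual distinction this answer relies on is visible in plain, verifiable C# as well. The sketch below uses hypothetical types (`Animal`, `Dog`, `BindingDemo`) that are not part of the original post. A non-virtual method is chosen from the static type of the reference at compile time, while a virtual method is chosen from the run-time type of the object.

```csharp
using System;

class Animal
{
    public string Name() { return "Animal.Name"; }            // non-virtual
    public virtual string Sound() { return "Animal.Sound"; }  // virtual
}

class Dog : Animal
{
    public new string Name() { return "Dog.Name"; }           // hides Animal.Name
    public override string Sound() { return "Dog.Sound"; }    // overrides Animal.Sound
}

static class BindingDemo
{
    public static string[] Run()
    {
        Animal a = new Dog();
        return new[]
        {
            a.Name(),        // bound to Animal.Name by the static type of 'a'
            a.Sound(),       // dispatched to Dog.Sound by the run-time type
            ((Dog)a).Name()  // bound to Dog.Name once the static type is Dog
        };
    }

    static void Main()
    {
        foreach (string s in Run()) Console.WriteLine(s);
        // prints "Animal.Name", "Dog.Sound", "Dog.Name"
    }
}
```

Because the target of a non-virtual call is already named in the call instruction, the run-time type of the object plays no part in choosing it, which fits the behaviour described in the question.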

绅刃 2024-10-11 12:37:13

For more information about how this works even deeper under the hood, check out this StackExchange question/answer:
How does the callvirt .NET instruction work for interfaces?