How does callvirt work under the hood?

Posted on 2024-10-04 12:37:13

I am trying to understand how the CLR implements reference types and polymorphism. I have referred to Don Box's Essential .NET, Volume 1, which has been a great help in clarifying most of it. But I got stuck on the following issue when I played around with some IL code to understand things better.

I will try to explain the problem as best I can.
Consider the following code:

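using System;

class Base
{
    // Non-virtual instance method.
    public void m()
    {
        Console.WriteLine("Base.m");
    }
}
class Derived : Base
{
    // Also non-virtual: this hides Base.m rather than overriding it.
    public void m()
    {
        Console.WriteLine("Derived.m");
    }
}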

Now consider a simple console application whose Main method has the IL shown below.
I hand-edited the IL emitted by the compiler to experiment with it, then reassembled it with ILAsm.exe.

.class private auto ansi beforefieldinit Console1.Program
       extends [mscorlib]System.Object
{
    .method private hidebysig static void  Main(string[] args) cil managed
    {
      .entrypoint
      // Code size       44 (0x2c)
      .maxstack  1
      .locals init ([0] class Console1.Base d)
      nop
      newobj     instance void Console1.Base::.ctor()
      stloc.0
      ldloc.0
      callvirt   instance void Console1.Derived::m()
      nop
      call       string [mscorlib]System.Console::ReadLine()
      pop
      ret
    } // end of method Program::Main
} // end of class Console1.Program

I expected this code NOT to run, because the object reference points to an instance of Base, and the method table of a Base object cannot have an entry for the method m() defined in the Derived class.

But magically, this code executes Derived.m()!

So there are two things I don't understand in the above code:

1. What is the significance of the type specified in the IL line below? I experimented by changing it to different types (e.g. System.Exception!) and no errors were reported. Why?
   .locals init ([0] class Console1.Base d)
2. How exactly does callvirt work? How did the call get routed to Derived.m()?

Thanks in advance!!

Regards,
Ajay

つ可否回来 2024-10-11 12:37:13

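My guess is that the jitter realizes that Derived.m is non-virtual and therefore can never point anywhere else. So the callvirt gets reduced to a null check plus a direct call, instead of a call through the v-table.

Try making Derived.m virtual. I bet it will throw then.

The C# compiler emits a callvirt instruction even when calling non-virtual methods if it cannot prove that this != null, because callvirt performs a null check. In that case the jitter is smart enough to replace the virtual call with a normal call to a fixed address (or even to inline it).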
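The null-check behavior described above can be observed from plain C#. The following is a minimal sketch, not taken from the original post: the class name NullCheckDemo and the helper CallOnNull are made up for illustration. It calls a non-virtual method that never touches this on a null reference, and the call still fails because of the null check that comes with callvirt.

```csharp
using System;

class Base
{
    // Non-virtual, and the body never reads `this`.
    public void m() { Console.WriteLine("Base.m"); }
}

// Hypothetical demo class, not part of the original question.
static class NullCheckDemo
{
    // Returns the name of the exception raised by calling m() on null,
    // or "none" if the call went through.
    public static string CallOnNull()
    {
        Base b = null;
        try
        {
            b.m(); // compiled to callvirt, which null-checks `b` before the call
            return "none";
        }
        catch (NullReferenceException)
        {
            return "NullReferenceException";
        }
    }

    static void Main()
    {
        Console.WriteLine(CallOnNull());
    }
}
```

If the compiler had emitted a plain call instruction instead, the method body would have been entered with this == null, since nothing in m() dereferences it.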

You should also check whether your code is verifiable. I don't think it is.

树深时见影 2024-10-11 12:37:13

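Your code is unverifiable (run it through peverify). I have written a blog post on how callvirt works under the hood that might help you understand what it does and how the code executes.

Remember that the CLR does try to execute unverifiable code when it is run as a normal program; it only falls over if the code actually causes a problem.

In your example, calling Derived.m() on an instance of Base works because the actual runtime binary representation of the two object instances is the same: the this object is essentially identical, and no instance fields of the object are accessed.

Try putting instance field accesses into the two methods and see what happens...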

素手挽清风 2024-10-11 12:37:13

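Note that, by default, code executed from the local machine is not verified. This means it is possible to write and execute invalid code. I suspect your Main function would not pass verification as written. The PEVerify tool can check an assembly to make sure its code is type safe, or you can control verification through security policy.

The purpose of the type in the .locals statement is to declare the type of the local variable. This gives the type verifier the information it needs to check that member accesses on the local variable are made on an object of the correct type.

Callvirt can be implemented in a number of ways. The most likely one is the way C++ vtables are implemented: an object contains a table of function pointers, and each function sits at a predefined offset in that table. To call the function, the address at the predefined offset is loaded and called. Note that in some cases, when the type of the object is known, the CLR could perform other optimizations. Whether it does so, I don't know.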
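The table-of-function-pointers idea can be modeled in a few lines of C#. This is only a sketch under stated assumptions: MethodTable, ObjectModel, SlotM and the two call helpers are invented names, delegates stand in for function pointers, and nothing here reflects the CLR's real object layout. It contrasts a lookup through the object's own table with a call whose target was fixed at the call site.

```csharp
using System;

// Hypothetical model of a per-type method table (not the CLR's real layout).
sealed class MethodTable
{
    public readonly string TypeName;
    public readonly Func<string>[] Slots; // slot i holds the same virtual method for every subtype

    public MethodTable(string typeName, Func<string>[] slots)
    {
        TypeName = typeName;
        Slots = slots;
    }
}

// Every modeled object carries a reference to the table of its runtime type.
sealed class ObjectModel
{
    public readonly MethodTable Table;
    public ObjectModel(MethodTable table) { Table = table; }
}

static class VTableDemo
{
    const int SlotM = 0; // predefined slot for m(), shared by Base and Derived

    public static readonly MethodTable BaseTable =
        new MethodTable("Base", new Func<string>[] { () => "Base.m" });
    public static readonly MethodTable DerivedTable =
        new MethodTable("Derived", new Func<string>[] { () => "Derived.m" });

    // Virtual dispatch: load the table from the object, index the slot, call it.
    public static string CallVirtual(ObjectModel obj)
    {
        return obj.Table.Slots[SlotM]();
    }

    // Direct call: the target is fixed at the call site; the object is only null-checked.
    public static string CallDirectDerivedM(ObjectModel obj)
    {
        if (obj == null) throw new NullReferenceException();
        return "Derived.m";
    }

    static void Main()
    {
        var baseObj = new ObjectModel(BaseTable);
        Console.WriteLine(CallVirtual(baseObj));        // goes through baseObj's table
        Console.WriteLine(CallDirectDerivedM(baseObj)); // never consults the table
    }
}
```

In this model, the direct call reaches the Derived body even for an object whose table belongs to Base, which is the shape of what the question observed for a non-virtual m().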

风情万种。 2024-10-11 12:37:13

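I think this is a side effect of an optimization in the JIT compiler. If m() were a virtual method, the JIT compiler would have to generate machine code that digs the method table pointer out of the object and then makes the virtual call. But the method is not virtual, and the JIT compiler already knows the method table pointer of the Derived class. So it skips the pointer retrieval and supplies the target directly, which makes the call behave the way you observed. You can verify my guess by inspecting the generated machine code.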
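The difference between the two kinds of call target is visible in ordinary, verifiable C# as well. The sketch below uses made-up names (NonVirtual, Virtual, BindingDemo) and does not reproduce the hand-edited IL from the question; it only shows that a non-virtual call is bound to the method named at the call site, while a virtual call is resolved from the object at run time.

```csharp
using System;

class Base
{
    public string NonVirtual() { return "Base.NonVirtual"; }
    public virtual string Virtual() { return "Base.Virtual"; }
}

class Derived : Base
{
    // `new` hides the base method; it does not take part in virtual dispatch.
    public new string NonVirtual() { return "Derived.NonVirtual"; }
    public override string Virtual() { return "Derived.Virtual"; }
}

// Hypothetical demo class for illustration.
static class BindingDemo
{
    public static string[] Run()
    {
        Base b = new Derived(); // static type Base, runtime type Derived
        return new[]
        {
            b.NonVirtual(), // bound to Base.NonVirtual from the static type
            b.Virtual()     // resolved through the object's runtime type
        };
    }

    static void Main()
    {
        foreach (string line in Run())
            Console.WriteLine(line);
    }
}
```

Run() yields "Base.NonVirtual" followed by "Derived.Virtual". The IL in the question is the mirror image of the first call: the call site names Derived::m, so the non-virtual target is Derived.m regardless of the object that is actually on the stack.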

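Yes, the IL verifier doesn't score any points here. You can make this more interesting by having Derived.m() modify a field that is declared only in Derived. I have seen too many AccessViolation crashes in Reflection.Emit code to be very surprised by this. It may well be intentional, though: there is no need to verify IL that is going to crash anyway. I'm not sure; exploiting this kind of verification hole is not common. Thankfully.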
