Calling an instance method on a null reference sometimes succeeds
Sorry for the wall of text, but I wanted to give a good background on the situation. I know you can call methods on null references in IL, but still don't understand a few very strange things that happen when you do it, in regards to my understanding of how the CLR works. The few other questions I've found here regarding this didn't cover the behavior I'm seeing here.
Here is some IL:
.assembly MrSandbox {}
.class private MrSandbox.AClass {
    .field private int32 myField
    .method public int32 GetAnInt() cil managed {
        .maxstack 1
        .locals init ([0] int32 retval)
        ldc.i4.3
        stloc retval
        ldloc retval
        ret
    }
    .method public int32 GetAnotherInt() cil managed {
        .maxstack 1
        .locals init ([0] int32 retval)
        ldarg.0
        ldfld int32 MrSandbox.AClass::myField
        stloc retval
        ldloc retval
        ret
    }
}
.class private MrSandbox.Program {
    .method private static void Main(string[] args) cil managed {
        .entrypoint
        .maxstack 1
        .locals init ([0] class MrSandbox.AClass p,
                      [1] int32 myInt)
        ldnull
        stloc p
        ldloc p
        call instance int32 MrSandbox.AClass::GetAnotherInt()
        stloc myInt
        ldloc myInt
        call void [mscorlib]System.Console::WriteLine(int32)
        ret
    }
}
Now, when this code runs, we get what I expect to happen, kind of. callvirt will check for null, where call doesn't; however, here on the call a NullReferenceException is thrown. This isn't clear to me, as I would expect a System.AccessViolationException instead. I'll explain my reasoning at the end of this question.
If we replace the code inside Main(string[] args) with this (after the .locals lines):
ldnull
stloc p
ldloc p
call instance int32 MrSandbox.AClass::GetAnInt()
stloc myInt
ldloc myInt
call void [mscorlib]System.Console::WriteLine(int32)
ret
This one, to my surprise, runs and prints 3 to the console, exiting successfully. I am calling a function on a null reference, and it's executing properly. My guess is that it has something to do with the fact that no instance fields are being accessed, so the CLR can successfully execute the code.
Finally, and this is where the real confusion sets in for me, replace the code in Main(string[] args) with this (after the .locals lines):
ldnull
stloc p
ldloc p
call instance int32 MrSandbox.AClass::GetAnInt()
stloc myInt
ldloc myInt
call void [mscorlib]System.Console::WriteLine(int32)
call valuetype [mscorlib]System.ConsoleKeyInfo [mscorlib]System.Console::ReadKey()
pop
call instance int32 MrSandbox.AClass::GetAnotherInt()
stloc myInt
ldloc myInt
call void [mscorlib]System.Console::WriteLine(int32)
ret
Now, what would you expect this code to do? I expected the code to write 3 out to the console, read a key from the console, and then fail on a NullReferenceException. Well, none of that happens. Instead, no values are printed to the screen, except for a System.AccessViolationException. Why is it inconsistent?
With the background out of the way, here are my questions:
1) MSDN lists that callvirt will throw a NullReferenceException if obj is null, but call just says that it must not be null. Why then is it throwing an NRE by default instead of an access violation? It seems to me that call by contract would try to access the memory and fail, instead of doing what callvirt does by checking for null first.
2) Is the reason why the second example works due to the fact that it accesses no class-level fields and that call doesn't do a null check? If so, how can a non-static method be invoked on a null reference and return successfully? My understanding is that when a reference type is put on the stack, only the Type object is put on the heap. So is the method being called from the type object?
3) Why the difference in exceptions thrown between the first and the last example? In my opinion, the 3rd example throws the correct exception, an AccessViolationException, since that's exactly what it's trying to do: accessing unallocated memory.
Before the "The behavior is undefined" answers roll in, I know that this is not AT ALL a proper way of writing things, I'm just hoping someone can help to shed some insight on the above questions.
Thanks.
Comments (3)
1) The processor is raising an access violation. The CLR traps the exception and translates it, based on the exception's access address. Any access within the first 64KB of the address space is re-raised as a managed NullReferenceException. Check this answer for reference.
2) Yes, the CLR does not enforce a non-null this value. The C++/CLI compiler, for example, generates code that doesn't perform this check, much like native C++ does. As long as the method never uses the this reference, this will not cause an exception. The C# compiler explicitly generates code to verify the value of this before the method call, by emitting callvirt. See this blog post for reference.
3) You got the IL wrong, GetAnotherInt() is an instance method but you forgot to write the ldloc instruction. You get an AV because the reference pointer is random.
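Point 2 can be tried from C# without ilasm. The following is a minimal sketch, assuming System.Reflection.Emit's DynamicMethod is available on the runtime in use. AClass mirrors the class from the question; NullThisDemo and CallOnNull are hypothetical names introduced only for this illustration. The emitted body is ldnull followed by call (not callvirt), so nothing checks this before the method is entered.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Mirrors MrSandbox.AClass from the question.
class AClass
{
    private int myField;

    // Never touches 'this'.
    public int GetAnInt() { return 3; }

    // Reads a field through 'this'.
    public int GetAnotherInt() { return myField; }
}

static class NullThisDemo
{
    // Hypothetical helper: builds a method whose body is
    //   ldnull
    //   call instance int32 AClass::<methodName>()
    //   ret
    static Func<int> CallOnNull(string methodName)
    {
        MethodInfo target = typeof(AClass).GetMethod(methodName);
        var dm = new DynamicMethod("CallOnNull_" + methodName,
                                   typeof(int), Type.EmptyTypes, typeof(AClass));
        ILGenerator il = dm.GetILGenerator();
        il.Emit(OpCodes.Ldnull);        // push a null 'this'
        il.Emit(OpCodes.Call, target);  // 'call', so no null check is emitted
        il.Emit(OpCodes.Ret);
        return (Func<int>)dm.CreateDelegate(typeof(Func<int>));
    }

    static void Main()
    {
        // 'this' is never dereferenced in GetAnInt.
        Console.WriteLine(CallOnNull("GetAnInt")());

        try
        {
            // ldfld dereferences the null 'this' inside GetAnotherInt.
            Console.WriteLine(CallOnNull("GetAnotherInt")());
        }
        catch (NullReferenceException e)
        {
            Console.WriteLine(e.GetType().Name);
        }
    }
}
```

If the runtime behaves as described in points 1 and 2, the first call should return normally and the second should fail only at the field read inside GetAnotherInt, not at the call site.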
I can't answer 2) for sure, but here are my answers to 1) and 3).
A NullReferenceException is the same thing as an AccessViolationException; in the early days of the CLR, there was no AccessViolationException at all and dereferencing an invalid but non-null pointer still gave a NullReferenceException.
This is because on today's computers, it's less expensive to let the hardware do the sanity check. Your conception of which exception to throw is based on the idea that the CLR does explicit null checks (if (foo == null) throw new NullReferenceException()), but this isn't the case on Microsoft's implementation for Windows PCs.
When you dereference an invalid address, your program is interrupted because it did something invalid; the CLR is hooked to that interrupt, and will throw either a NullReferenceException or an AccessViolationException depending on the address that triggered the fault. That way, it doesn't need to insert any memory check and it will still behave in a predictable way.
If I remember correctly, accessing any address under 0xFFFF will result in a NullReferenceException and anything above will be an AccessViolationException. You can verify with unsafe code and pointers. I have myself never used unsafe code in C#, so the following snippet might not work, but I expect the fixes required to test to be trivial. (A friend tested this with the .NET Framework 3 or 3.5 when it was current, so there's a possibility that this data isn't up-to-date.)
My not-too-long-shot about question 2 is that the address of the method to call is determined at compile-time since it cannot vary. The reason a callvirt faults on null references is that it needs to access the vtable of the object, and by doing so it needs to read the object's header. With a regular call, since the method to call does not need to be determined at runtime, there's nothing to look up and the CLR can proceed directly. (At least, that's roughly how it works for C++, so I suppose it's not too far away from how the CLR works.)
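A minimal sketch of the kind of unsafe-pointer test described above (not the answer's original snippet), assuming the project is compiled with unsafe code enabled (/unsafe or AllowUnsafeBlocks). NullPageProbe and ReadAt are hypothetical names. It reads one int from an address given in hex on the command line, so addresses below and above the 0xFFFF boundary can be tried in separate runs.

```csharp
using System;

static class NullPageProbe
{
    // Hypothetical helper: dereferences an arbitrary address.
    static unsafe int ReadAt(long address)
    {
        return *(int*)address;
    }

    static void Main(string[] args)
    {
        // Address in hex, e.g. "10" or "10000"; defaults to 0x10.
        long address = args.Length > 0 ? Convert.ToInt64(args[0], 16) : 0x10;
        try
        {
            Console.WriteLine(ReadAt(address));
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("NullReferenceException at 0x" + address.ToString("X"));
        }
        catch (AccessViolationException)
        {
            // Assumption: this handler may never run. Newer runtimes may not
            // deliver AccessViolationException to catch blocks and can
            // terminate the process instead.
            Console.WriteLine("AccessViolationException at 0x" + address.ToString("X"));
        }
    }
}
```

Running it once with a small address and once with a larger unmapped one shows which exception, if any, the runtime in use reports for each range. The exact boundary is the answer's recollection and may differ between runtime versions and platforms.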
This is a bit weird, as the OP states PEVerify does not fail.
That last call to GetAnotherInt looks invalid. There is nothing on the stack at that moment.
That explains the AccessViolationException at least ;P
Not sure why PEVerify allows it.
Update:
PEVerify does indeed fail.