为什么 Calli 比委托调用更快?
我正在使用 Reflection.Emit 并发现了很少使用的 EmitCalli
。出于好奇,我想知道它与常规方法调用是否有什么不同,所以我编写了下面的代码:
using System;
using System.Diagnostics;
using System.Reflection.Emit;
using System.Runtime.InteropServices;
using System.Security;
[SuppressUnmanagedCodeSecurity]
static class Program
{
const long COUNT = 1 << 22;
static readonly byte[] multiply = IntPtr.Size == sizeof(int) ?
new byte[] { 0x8B, 0x44, 0x24, 0x04, 0x0F, 0xAF, 0x44, 0x24, 0x08, 0xC3 }
: new byte[] { 0x0f, 0xaf, 0xca, 0x8b, 0xc1, 0xc3 };
static void Main()
{
var handle = GCHandle.Alloc(multiply, GCHandleType.Pinned);
try
{
//Make the native method executable
uint old;
VirtualProtect(handle.AddrOfPinnedObject(),
(IntPtr)multiply.Length, 0x40, out old);
var mulDelegate = (BinaryOp)Marshal.GetDelegateForFunctionPointer(
handle.AddrOfPinnedObject(), typeof(BinaryOp));
var T = typeof(uint); //To avoid redundant typing
//Generate the method
var method = new DynamicMethod("Mul", T,
new Type[] { T, T }, T.Module);
var gen = method.GetILGenerator();
gen.Emit(OpCodes.Ldarg_0);
gen.Emit(OpCodes.Ldarg_1);
gen.Emit(OpCodes.Ldc_I8, (long)handle.AddrOfPinnedObject());
gen.Emit(OpCodes.Conv_I);
gen.EmitCalli(OpCodes.Calli, CallingConvention.StdCall,
T, new Type[] { T, T });
gen.Emit(OpCodes.Ret);
var mulCalli = (BinaryOp)method.CreateDelegate(typeof(BinaryOp));
var sw = Stopwatch.StartNew();
for (int i = 0; i < COUNT; i++) { mulDelegate(2, 3); }
Console.WriteLine("Delegate: {0:N0}", sw.ElapsedMilliseconds);
sw.Reset();
sw.Start();
for (int i = 0; i < COUNT; i++) { mulCalli(2, 3); }
Console.WriteLine("Calli: {0:N0}", sw.ElapsedMilliseconds);
}
finally { handle.Free(); }
}
delegate uint BinaryOp(uint a, uint b);
[DllImport("kernel32.dll", SetLastError = true)]
static extern bool VirtualProtect(
IntPtr address, IntPtr size, uint protect, out uint oldProtect);
}
我在 x86 模式和 x64 模式下运行了代码。结果?
32 位:
- 代理版本:994
- 审美干扰镜版本:46
64 位:
- 代理版本:326
- 审美干扰镜版本:83
我想现在问题已经很明显了......为什么会有如此巨大的速度差异?
更新:
我还创建了一个 64 位 P/Invoke 版本:
- 代理版本:284
- 审美干扰镜版本:77
- P/调用版本:31
显然,P/Invoke 更快...这是我的基准测试的问题,还是发生了我不明白的事情? (顺便说一下,我正处于发布模式。)
I was playing around with Reflection.Emit and found about about the little-used EmitCalli
. Intrigued, I wondered if it's any different from a regular method call, so I whipped up the code below:
using System;
using System.Diagnostics;
using System.Reflection.Emit;
using System.Runtime.InteropServices;
using System.Security;
[SuppressUnmanagedCodeSecurity]
static class Program
{
const long COUNT = 1 << 22;
static readonly byte[] multiply = IntPtr.Size == sizeof(int) ?
new byte[] { 0x8B, 0x44, 0x24, 0x04, 0x0F, 0xAF, 0x44, 0x24, 0x08, 0xC3 }
: new byte[] { 0x0f, 0xaf, 0xca, 0x8b, 0xc1, 0xc3 };
static void Main()
{
var handle = GCHandle.Alloc(multiply, GCHandleType.Pinned);
try
{
//Make the native method executable
uint old;
VirtualProtect(handle.AddrOfPinnedObject(),
(IntPtr)multiply.Length, 0x40, out old);
var mulDelegate = (BinaryOp)Marshal.GetDelegateForFunctionPointer(
handle.AddrOfPinnedObject(), typeof(BinaryOp));
var T = typeof(uint); //To avoid redundant typing
//Generate the method
var method = new DynamicMethod("Mul", T,
new Type[] { T, T }, T.Module);
var gen = method.GetILGenerator();
gen.Emit(OpCodes.Ldarg_0);
gen.Emit(OpCodes.Ldarg_1);
gen.Emit(OpCodes.Ldc_I8, (long)handle.AddrOfPinnedObject());
gen.Emit(OpCodes.Conv_I);
gen.EmitCalli(OpCodes.Calli, CallingConvention.StdCall,
T, new Type[] { T, T });
gen.Emit(OpCodes.Ret);
var mulCalli = (BinaryOp)method.CreateDelegate(typeof(BinaryOp));
var sw = Stopwatch.StartNew();
for (int i = 0; i < COUNT; i++) { mulDelegate(2, 3); }
Console.WriteLine("Delegate: {0:N0}", sw.ElapsedMilliseconds);
sw.Reset();
sw.Start();
for (int i = 0; i < COUNT; i++) { mulCalli(2, 3); }
Console.WriteLine("Calli: {0:N0}", sw.ElapsedMilliseconds);
}
finally { handle.Free(); }
}
delegate uint BinaryOp(uint a, uint b);
[DllImport("kernel32.dll", SetLastError = true)]
static extern bool VirtualProtect(
IntPtr address, IntPtr size, uint protect, out uint oldProtect);
}
I ran the code in x86 mode and x64 mode. The results?
32-bit:
- Delegate version: 994
- Calli version: 46
64-bit:
- Delegate version: 326
- Calli version: 83
I guess the question's obvious by now... why is there such a huge speed difference?
Update:
I created a 64-bit P/Invoke version as well:
- Delegate version: 284
- Calli version: 77
- P/Invoke version: 31
Apparently, P/Invoke is faster... is this a problem with my benchmarking, or is there something going on I don't understand? (I'm in release mode, by the way.)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
发布评论
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
鉴于您的性能数据,我假设您一定使用 2.0 框架或类似的框架? 4.0 中的数字要好得多,但“Marshal.GetDelegate”版本仍然较慢。
问题是,并非所有代表都是生而平等的。
托管代码函数的委托本质上只是一个直接函数调用(在 x86 上,这是一个 __fastcall),如果您调用静态函数,则添加一点“switcheroo”(但在 x86 上这只是 3 或 4 条指令)。
另一方面,由“Marshal.GetDelegateForFunctionPointer”创建的委托是对“存根”函数的直接函数调用,它在调用非托管函数之前会产生一些开销(编组等)。在这种情况下,几乎没有编组,并且此调用的编组似乎在 4.0 中得到了相当多的优化(但很可能仍然通过 2.0 上的 ML 解释器) - 但即使在 4.0 中,也有一个 stackWalk 要求非托管代码权限,不是您的审美干扰镜代表的一部分。
我通常发现,如果不了解 .NET 开发团队中的某个人,要了解托管/非托管互操作的情况,最好的办法就是使用 WinDbg 和 SOS 进行一些挖掘。
Given your performance numbers, I assume you must be using the 2.0 framework, or something similar? The numbers are much better in 4.0, but the "Marshal.GetDelegate" version is still slower.
The thing is that not all delegates are created equal.
Delegates for managed code functions are essentially just a straight function call (on x86, that's a __fastcall), with the addition of a little "switcheroo" if you're calling a static function (but that's just 3 or 4 instructions on x86).
Delegates created by "Marshal.GetDelegateForFunctionPointer", on the other hand - are a straight function call into a "stub" function, which does a little overhead (marshalling and whatnot) before calling the unmanaged function. In this case there's very little marshalling, and the marshalling for this call appears to be pretty much optimized out in 4.0 (but most likely still goes through the ML interpreter on 2.0) - but even in 4.0, there's a stackWalk demanding unmanaged code permissions that isn't part of your calli delegate.
I've generally found that, short of knowing someone on the .NET dev team, your best bet on figuring out what's going on w/ managed/unmanaged interop is to do a little digging with WinDbg and SOS.