Replacing instructions in a method's MethodBody

Posted on 2024-08-31 23:13:36

(First of all, this is a very lengthy post, but don't worry: I've already implemented all of it, I'm just asking your opinion, or possible alternatives.)

I'm having trouble implementing the following; I'd appreciate some help:

1. I get a Type as parameter.
2. I define a subclass using reflection. Notice that I don't intend to modify the original type, but create a new one.
3. I create a property per field of the original class, like so:

public class OriginalClass {
    private int x;
}

public class Subclass : OriginalClass {
    private int x;

    public int X {
        get { return x; }
        set { x = value; }
    }
}

4. For every method of the superclass, I create an analogous method in the subclass. The method's body must be the same except that I replace the instructions ldfld x with callvirt this.get_X, that is, instead of reading from the field directly I call the get accessor.
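A minimal sketch of steps 2 and 3, assuming System.Reflection.Emit is available and the base type is public, non-sealed, and has an accessible parameterless constructor. The names SubclassSketch, BuildSubclass, SketchAssembly, and the "Proxy" suffix are hypothetical, and the field and property are hard-coded to x and X for brevity; real code would loop over the fields of the original type.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public class OriginalClass { }

public static class SubclassSketch {
    // Builds a runtime subclass of baseType with a private int field "x"
    // and a public int property "X" that reads and writes it.
    public static Type BuildSubclass(Type baseType) {
        AssemblyBuilder asm = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("SketchAssembly"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = asm.DefineDynamicModule("SketchModule");
        TypeBuilder typeBuilder = module.DefineType(
            baseType.Name + "Proxy",
            TypeAttributes.Public | TypeAttributes.Class,
            baseType);

        FieldBuilder field = typeBuilder.DefineField(
            "x", typeof(int), FieldAttributes.Private);
        PropertyBuilder prop = typeBuilder.DefineProperty(
            "X", PropertyAttributes.None, typeof(int), null);

        MethodAttributes accessorAttrs = MethodAttributes.Public
            | MethodAttributes.SpecialName | MethodAttributes.HideBySig;

        // get_X: return this.x;
        MethodBuilder getter = typeBuilder.DefineMethod(
            "get_X", accessorAttrs, typeof(int), Type.EmptyTypes);
        ILGenerator il = getter.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldfld, field);
        il.Emit(OpCodes.Ret);
        prop.SetGetMethod(getter);

        // set_X: this.x = value;
        MethodBuilder setter = typeBuilder.DefineMethod(
            "set_X", accessorAttrs, typeof(void), new[] { typeof(int) });
        il = setter.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Stfld, field);
        il.Emit(OpCodes.Ret);
        prop.SetSetMethod(setter);

        // No constructor is defined, so a default one is generated.
        return typeBuilder.CreateType();
    }
}
```

The resulting type can be instantiated with Activator.CreateInstance and its X property accessed through reflection. On targets where TypeBuilder.CreateType is not exposed, CreateTypeInfo would be the call to try instead.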

I'm having trouble with step 4. I know you're not supposed to manipulate code like this, but I really need to.

Here's what I've tried:

Attempt #1: Use Mono.Cecil. This would allow me to parse the body of the method into human-readable Instructions, and easily replace instructions. However, the original type isn't in a .dll file, so I can't find a way to load it with Mono.Cecil. Writing the type to a .dll, loading it, modifying it, writing the new type to disk (which I think is the way you create a type with Mono.Cecil), and then loading that seems like a huge overhead.

Attempt #2: Use Mono.Reflection. This would also allow me to parse the body into Instructions, but then I have no support for replacing instructions. I've implemented a very ugly and inefficient solution using Mono.Reflection, but it doesn't yet support methods that contain try-catch statements (although I guess I can implement this) and I'm concerned that there may be other scenarios in which it won't work, since I'm using the ILGenerator in a somewhat unusual way. Also, it's very ugly ;). Here's what I've done:

private void TransformMethod(MethodInfo methodInfo) {

    // Create a method with the same signature.
    ParameterInfo[] paramList = methodInfo.GetParameters();
    Type[] args = new Type[paramList.Length];
    for (int i = 0; i < args.Length; i++) {
        args[i] = paramList[i].ParameterType;
    }
    MethodBuilder methodBuilder = typeBuilder.DefineMethod(
        methodInfo.Name, methodInfo.Attributes, methodInfo.ReturnType, args);
    ILGenerator ilGen = methodBuilder.GetILGenerator();

    // Declare the same local variables as in the original method.
    IList<LocalVariableInfo> locals = methodInfo.GetMethodBody().LocalVariables;
    foreach (LocalVariableInfo local in locals) {
        ilGen.DeclareLocal(local.LocalType);
    }

    // Get readable instructions.
    IList<Instruction> instructions = methodInfo.GetInstructions();

    // I first need to define labels for every instruction in case I
    // later find a jump to that instruction. Once the instruction has
    // been emitted I cannot label it, so I'll need to do it in advance.
    // Since I'm doing a first pass on the method's body anyway, I could
    // instead just create labels where they are truly needed, but for
    // now I'm using this quick fix.
    Dictionary<int, Label> labels = new Dictionary<int, Label>();
    foreach (Instruction instr in instructions) {
        labels[instr.Offset] = ilGen.DefineLabel();
    }

    foreach (Instruction instr in instructions) {

        // Mark this instruction with a label, in case there's a branch
        // instruction that jumps here.
        ilGen.MarkLabel(labels[instr.Offset]);

        // If this is the instruction that I want to replace (ldfld x)...
        if (instr.OpCode == OpCodes.Ldfld) {
            // ...get the get accessor for the accessed field (get_X())
            // (I have the accessors in a dictionary; this isn't relevant),
            MethodInfo safeReadAccessor = dataMembersSafeAccessors[((FieldInfo) instr.Operand).Name][0];
            // ...instead of emitting the original instruction (ldfld x),
            // emit a call to the get accessor,
            ilGen.Emit(OpCodes.Callvirt, safeReadAccessor);

        // Else (it's any other instruction), reemit the instruction, unaltered.
        } else {
            Reemit(instr, ilGen, labels);
        }

    }

}

And here comes the horrible, horrible Reemit method:

private void Reemit(Instruction instr, ILGenerator ilGen, Dictionary<int, Label> labels) {

    // If the instruction doesn't have an operand, emit the opcode and return.
    if (instr.Operand == null) {
        ilGen.Emit(instr.OpCode);
        return;
    }

    // Else (it has an operand)...

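    // If it's a branch instruction (conditional or not), retrieve the
    // corresponding label (to which we want to jump), emit the instruction
    // and return. Checking OperandType also catches conditional branches,
    // whose FlowControl is Cond_Branch rather than Branch.
    if (instr.OpCode.OperandType == OperandType.InlineBrTarget ||
        instr.OpCode.OperandType == OperandType.ShortInlineBrTarget) {
        ilGen.Emit(instr.OpCode, labels[Int32.Parse(instr.Operand.ToString())]);
        return;
    }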

    // Otherwise, simply emit the instruction. I need to use the right
    // Emit call, so I need to cast the operand to its type.
    Type operandType = instr.Operand.GetType();
    if (typeof(byte).IsAssignableFrom(operandType))
        ilGen.Emit(instr.OpCode, (byte) instr.Operand);
    else if (typeof(double).IsAssignableFrom(operandType))
        ilGen.Emit(instr.OpCode, (double) instr.Operand);
    else if (typeof(float).IsAssignableFrom(operandType))
        ilGen.Emit(instr.OpCode, (float) instr.Operand);
    else if (typeof(int).IsAssignableFrom(operandType))
        ilGen.Emit(instr.OpCode, (int) instr.Operand);
    // ... you get the idea. This is a pretty long method, all like this.
}

Branch instructions are a special case because instr.Operand is SByte, but Emit expects an operand of type Label. Hence the need for the Dictionary labels.
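To illustrate why Emit wants a Label for branches, here is a small self-contained sketch that uses only System.Reflection.Emit. It builds a Max(a, b) function with a forward jump: the label is defined before the branch is emitted and marked later, when the target instruction is reached. The names LabelSketch and BuildMax are hypothetical. Note that bge is a conditional branch, so its FlowControl is Cond_Branch, not Branch.

```csharp
using System;
using System.Reflection.Emit;

public static class LabelSketch {
    // Emits: return a >= b ? a : b;
    public static Func<int, int, int> BuildMax() {
        var dm = new DynamicMethod(
            "Max", typeof(int), new[] { typeof(int), typeof(int) });
        ILGenerator il = dm.GetILGenerator();

        // Define the label up front; its position is not known yet.
        Label returnFirst = il.DefineLabel();

        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Bge, returnFirst); // jump if a >= b

        il.Emit(OpCodes.Ldarg_1);          // fall through: return b
        il.Emit(OpCodes.Ret);

        il.MarkLabel(returnFirst);         // the branch target is here
        il.Emit(OpCodes.Ldarg_0);          // return a
        il.Emit(OpCodes.Ret);

        return (Func<int, int, int>)dm.CreateDelegate(
            typeof(Func<int, int, int>));
    }
}
```

Because ILGenerator resolves the label to an offset itself, the raw offset read from the original method body cannot be passed through directly; some offset-to-Label mapping is needed.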

As you can see, this is pretty horrible. What's more, it doesn't work in all cases, for instance with methods that contain try-catch statements, since I haven't emitted them using methods BeginExceptionBlock, BeginCatchBlock, etc, of ILGenerator. This is getting complicated. I guess I can do it: MethodBody has a list of ExceptionHandlingClause that should contain the necessary information to do this. But I don't like this solution anyway, so I'll save this as a last-resort solution.
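For reference, a minimal sketch of how a try-catch region is expressed through ILGenerator, independent of the re-emit code above. It builds a division that returns -1 on DivideByZeroException. The names ExceptionBlockSketch and BuildSafeDivide are hypothetical. When replaying an existing body, the offsets and lengths in MethodBody.ExceptionHandlingClauses would presumably decide where each Begin/End call goes.

```csharp
using System;
using System.Reflection.Emit;

public static class ExceptionBlockSketch {
    // Emits: try { r = a / b; } catch (DivideByZeroException) { r = -1; } return r;
    public static Func<int, int, int> BuildSafeDivide() {
        var dm = new DynamicMethod(
            "SafeDivide", typeof(int), new[] { typeof(int), typeof(int) });
        ILGenerator il = dm.GetILGenerator();
        LocalBuilder result = il.DeclareLocal(typeof(int));

        il.BeginExceptionBlock();                 // start of the try region
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Div);
        il.Emit(OpCodes.Stloc, result);

        // Closes the try region (ILGenerator emits the leave itself)
        // and opens the handler.
        il.BeginCatchBlock(typeof(DivideByZeroException));
        il.Emit(OpCodes.Pop);                     // discard the exception object
        il.Emit(OpCodes.Ldc_I4_M1);
        il.Emit(OpCodes.Stloc, result);

        il.EndExceptionBlock();                   // closes the handler

        il.Emit(OpCodes.Ldloc, result);
        il.Emit(OpCodes.Ret);

        return (Func<int, int, int>)dm.CreateDelegate(
            typeof(Func<int, int, int>));
    }
}
```

One thing to watch when re-emitting an existing body: ILGenerator inserts its own leave instructions at BeginCatchBlock and EndExceptionBlock, so the leave instructions copied from the original IL end up duplicated next to the generated ones.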

Attempt #3: Go bare-back and just copy the byte array returned by MethodBody.GetILAsByteArray(), since I only want to replace a single instruction with another single instruction of the same size that produces the exact same result: it loads the same type of object on the stack, etc. So no labels will shift and everything should work exactly the same. I've done this, replacing specific bytes of the array and then calling MethodBuilder.CreateMethodBody(byte[], int), but I still get the same error with exceptions, and I still need to declare the local variables or I'll get an error... even when I simply copy the method's body and don't change anything. So this is more efficient, but I still have to take care of the exceptions, etc.
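The same-size substitution can be sketched in isolation. Both ldfld (0x7B) and callvirt (0x6F) are one-byte opcodes followed by a four-byte little-endian metadata token, so the patch is an in-place overwrite of five bytes. The names RawPatchSketch and ReplaceLdfldWithCallvirt are hypothetical, and the token is assumed to be already valid in the target module.

```csharp
using System;
using System.Reflection.Emit;

public static class RawPatchSketch {
    // Overwrites the ldfld instruction at 'offset' with callvirt <methodToken>.
    public static void ReplaceLdfldWithCallvirt(byte[] il, int offset, int methodToken) {
        if (il[offset] != (byte)OpCodes.Ldfld.Value)
            throw new ArgumentException("No ldfld at the given offset.", "offset");

        il[offset] = (byte)OpCodes.Callvirt.Value;

        // IL operands are stored little-endian regardless of the host.
        byte[] token = BitConverter.GetBytes(methodToken);
        if (!BitConverter.IsLittleEndian)
            Array.Reverse(token);
        Buffer.BlockCopy(token, 0, il, offset + 1, 4);
    }
}
```

The offset has to come from a proper instruction walk, as in the implementation that follows: scanning for the byte 0x7B alone would also match operand bytes that happen to have that value.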

Sigh.

Here's the implementation of attempt #3, in case anyone is interested:

private void TransformMethod(MethodInfo methodInfo, Dictionary<string, MethodInfo[]> dataMembersSafeAccessors, ModuleBuilder moduleBuilder) {

    ParameterInfo[] paramList = methodInfo.GetParameters();
    Type[] args = new Type[paramList.Length];
    for (int i = 0; i < args.Length; i++) {
        args[i] = paramList[i].ParameterType;
    }
    MethodBuilder methodBuilder = typeBuilder.DefineMethod(
        methodInfo.Name, methodInfo.Attributes, methodInfo.ReturnType, args);

    ILGenerator ilGen = methodBuilder.GetILGenerator();

    IList<LocalVariableInfo> locals = methodInfo.GetMethodBody().LocalVariables;
    foreach (LocalVariableInfo local in locals) {
        ilGen.DeclareLocal(local.LocalType);
    }

    byte[] rawInstructions = methodInfo.GetMethodBody().GetILAsByteArray();
    IList<Instruction> instructions = methodInfo.GetInstructions();

    int k = 0;
    foreach (Instruction instr in instructions) {

        if (instr.OpCode == OpCodes.Ldfld) {

            MethodInfo safeReadAccessor = dataMembersSafeAccessors[((FieldInfo) instr.Operand).Name][0];

            // Copy the opcode: Callvirt.
            byte[] bytes = toByteArray(OpCodes.Callvirt.Value);
            for (int m = 0; m < OpCodes.Callvirt.Size; m++) {
                rawInstructions[k++] = bytes[bytes.Length - 1 - m];
            }

            // Copy the operand: the accessor's metadata token.
            bytes = toByteArray(moduleBuilder.GetMethodToken(safeReadAccessor).Token);
            for (int m = instr.Size - OpCodes.Ldfld.Size - 1; m >= 0; m--) {
                rawInstructions[k++] = bytes[m];
            }

        // Skip this instruction (do not replace it).
        } else {
            k += instr.Size;
        }

    }

    methodBuilder.CreateMethodBody(rawInstructions, rawInstructions.Length);

}


private static byte[] toByteArray(int intValue) {
    byte[] intBytes = BitConverter.GetBytes(intValue);
    if (BitConverter.IsLittleEndian)
        Array.Reverse(intBytes);
    return intBytes;
}



private static byte[] toByteArray(short shortValue) {
    byte[] intBytes = BitConverter.GetBytes(shortValue);
    if (BitConverter.IsLittleEndian)
        Array.Reverse(intBytes);
    return intBytes;
}

(I know it isn't pretty. Sorry. I put it quickly together to see if it would work.)

I don't have much hope, but can anyone suggest anything better than this?

Sorry about the extremely lengthy post, and thanks.

UPDATE #1: Aggh... I've just read this in the msdn documentation:

[The CreateMethodBody method] is currently not fully supported. The user cannot supply the location of token fix ups and exception handlers.

I should really read the documentation before trying anything. Some day I'll learn...

This means option #3 can't support try-catch statements, which makes it useless for me. Do I really have to use the horrible #2? :/ Help! :P

UPDATE #2: I've successfully implemented attempt #2 with support for exceptions. It's quite ugly, but it works. I'll post it here when I refine the code a bit. It's not a priority, so it may be a couple of weeks from now. Just letting you know in case someone is interested in this.

Thanks for your suggestions.

↘人皮目录ツ 2024-09-07 23:13:36

I am trying to do a very similar thing. I have already tried your #1 approach, and I agree, that creates a huge overhead (I haven't measured it exactly though).

There is a DynamicMethod class which, according to MSDN, "Defines and represents a dynamic method that can be compiled, executed, and discarded. Discarded methods are available for garbage collection."

Performance-wise, it sounds good.

With the ILReader library I could convert a normal MethodInfo to a DynamicMethod. When you look into the ConvertFrom method of the DynamicMethodHelper class of the ILReader library, you can find the code we'd need:

byte[] code = body.GetILAsByteArray();
ILReader reader = new ILReader(method);
ILInfoGetTokenVisitor visitor = new ILInfoGetTokenVisitor(ilInfo, code);
reader.Accept(visitor);

ilInfo.SetCode(code, body.MaxStackSize);
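The SetCode call in that snippet comes from DynamicILInfo, which lets a DynamicMethod take a raw IL byte array instead of going through ILGenerator. A minimal sketch without the ILReader library, assuming DynamicILInfo is available on the target runtime. The body is hand-assembled (ldarg.0, ldarg.1, add, ret) and uses no metadata tokens. The names DynamicILInfoSketch and BuildAdd are hypothetical.

```csharp
using System;
using System.Reflection.Emit;

public static class DynamicILInfoSketch {
    // Builds int Add(int a, int b) from raw IL bytes.
    public static Func<int, int, int> BuildAdd() {
        var dm = new DynamicMethod(
            "Add", typeof(int), new[] { typeof(int), typeof(int) });
        DynamicILInfo info = dm.GetDynamicILInfo();

        // ldarg.0, ldarg.1, add, ret
        byte[] code = { 0x02, 0x03, 0x58, 0x2A };

        // An empty local-variable signature: this body declares no locals.
        info.SetLocalSignature(
            SignatureHelper.GetLocalVarSigHelper().GetSignature());

        // Second argument is the maximum evaluation stack depth.
        info.SetCode(code, 2);

        return (Func<int, int, int>)dm.CreateDelegate(
            typeof(Func<int, int, int>));
    }
}
```

A body copied from an existing method would also need every embedded token rewritten through DynamicILInfo.GetTokenFor, which appears to be the job of the visitor in the snippet above, and any exception clauses supplied through SetExceptions.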

Theoretically, this lets us modify the code of an existing method and run it as a dynamic method.

My only problem now is that Mono.Cecil does not allow us to save the bytecode of a method (at least I could not find the way to do it). When you download the Mono.Cecil source code it has a CodeWriter class to accomplish the task, but it is not public.

Another problem I have with this approach is that the MethodInfo -> DynamicMethod transformation with ILReader works only for static methods. But this can be worked around.

The performance of the invocation depends on the method I use. I got the following results after calling a short method 10'000'000 times:

Reflection.Invoke - 14 sec
DynamicMethod.Invoke - 26 sec
DynamicMethod with delegates - 9 sec
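A rough way to reproduce this kind of comparison, as a sketch only: the method under test (Twice), the iteration count, and the Stopwatch harness are all assumptions, not the setup used for the numbers above. The Action wrapper adds the same small overhead to every variant, so only the relative ordering is meaningful.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;
using System.Reflection.Emit;

public static class InvokeTimingSketch {
    public static int Twice(int v) { return v * 2; }

    // The same function as a DynamicMethod: ldarg.0, ldc.i4.2, mul, ret.
    public static DynamicMethod BuildTwice() {
        var dm = new DynamicMethod("Twice", typeof(int), new[] { typeof(int) });
        ILGenerator il = dm.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldc_I4_2);
        il.Emit(OpCodes.Mul);
        il.Emit(OpCodes.Ret);
        return dm;
    }

    // Runs 'call' the given number of times and returns elapsed milliseconds.
    static long Time(int iterations, Action call) {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) call();
        return sw.ElapsedMilliseconds;
    }

    public static void Main() {
        const int iterations = 1000000;
        object[] args = { 21 };

        MethodInfo reflected = typeof(InvokeTimingSketch).GetMethod("Twice");
        DynamicMethod dm = BuildTwice();
        var viaDelegate = (Func<int, int>)dm.CreateDelegate(typeof(Func<int, int>));

        Console.WriteLine("MethodInfo.Invoke:      {0} ms",
            Time(iterations, () => reflected.Invoke(null, args)));
        Console.WriteLine("DynamicMethod.Invoke:   {0} ms",
            Time(iterations, () => dm.Invoke(null, args)));
        Console.WriteLine("DynamicMethod delegate: {0} ms",
            Time(iterations, () => viaDelegate(21)));
    }
}
```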

The next thing I'm going to try is:

load the original method with Cecil
modify the code in Cecil
strip the unmodified code from the assembly
save the assembly to a MemoryStream instead of a file
load the new assembly (from memory) with Reflection
call the method with reflection invoke if it's a one-time call
generate DynamicMethod delegates and store them if I want to call that method regularly
try to find out if I can unload the unnecessary assemblies from memory (free up both the MemoryStream and the run-time assembly representation)
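The load-from-memory and invoke steps of that plan can be sketched with the base class library alone. The Cecil rewriting step is not shown: as a stand-in, the sketch simply rereads its own assembly from disk, which assumes Assembly.Location is a real file path. The names InMemoryLoadSketch, LoadFromMemory, InvokeStatic, and Greet are hypothetical.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class InMemoryLoadSketch {
    public static string Greet(string name) { return "Hello, " + name; }

    // Loads an assembly image from a byte array instead of a file path.
    public static Assembly LoadFromMemory(byte[] image) {
        return Assembly.Load(image);
    }

    // One-off call of a public static method through reflection.
    public static object InvokeStatic(
            Assembly asm, string typeName, string methodName, params object[] args) {
        Type type = asm.GetType(typeName, true);
        MethodInfo method = type.GetMethod(
            methodName, BindingFlags.Public | BindingFlags.Static);
        return method.Invoke(null, args);
    }

    public static void Main() {
        // Stand-in for the rewritten image: in the plan above these bytes
        // would come from the MemoryStream the modified assembly was saved to.
        byte[] image = File.ReadAllBytes(
            typeof(InMemoryLoadSketch).Assembly.Location);

        Assembly copy = LoadFromMemory(image);
        Console.WriteLine(
            InvokeStatic(copy, "InMemoryLoadSketch", "Greet", "world"));
    }
}
```

Types from the reloaded image are distinct from the ones in the original assembly even when their names match, so casts between the two fail. As for unloading: on .NET Framework an assembly can only be dropped by unloading its whole AppDomain, while .NET Core 3.0 and later offer collectible AssemblyLoadContext instances for this purpose.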

It sounds like a lot of work and it might not work, we'll see :)

I hope it helps, let me know what you think.
