Function calls in a virtual machine are killing performance
I wrote a virtual machine in C, which has a call table populated by pointers to functions that provide the functionality of the VM's opcodes. When the virtual machine is run, it first interprets a program, creating an array of indexes corresponding to the appropriate function in the call table for the opcode provided. It then loops through the array, calling each function until it reaches the end.
Each instruction is extremely small, typically one line. Perfect for inlining. The problem is that the compiler doesn't know when any of the virtual machine's instructions are going to be called, as it's decided at runtime, so it can't inline them. The overhead of function calls and argument passing is killing the performance of my VM. Any ideas on how to get around this?
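For readers who want something concrete, the design described above might look roughly like the following. This is only a sketch: the `VM` struct, the three opcodes, and the names `call_table` and `run_table` are hypothetical and not taken from the asker's code.

```c
#include <stddef.h>

/* Hypothetical VM state: a small operand stack (no bounds checks in this sketch). */
typedef struct {
    int stack[64];
    int sp;                      /* number of values currently on the stack */
} VM;

typedef void (*OpFn)(VM *vm);

/* Each instruction is a tiny function, as in the question. */
static void op_push1(VM *vm) { vm->stack[vm->sp++] = 1; }
static void op_add(VM *vm)   { vm->sp--; vm->stack[vm->sp - 1] += vm->stack[vm->sp]; }
static void op_dup(VM *vm)   { vm->stack[vm->sp] = vm->stack[vm->sp - 1]; vm->sp++; }

enum { OP_PUSH1, OP_ADD, OP_DUP, OP_COUNT };

/* The call table: opcode index -> handler. */
static const OpFn call_table[OP_COUNT] = {
    [OP_PUSH1] = op_push1,
    [OP_ADD]   = op_add,
    [OP_DUP]   = op_dup,
};

/* Walks the decoded index array and makes one indirect call per instruction.
   The compiler cannot see which handler runs, so it cannot inline any of them. */
int run_table(const unsigned char *code, size_t n) {
    VM vm = { {0}, 0 };
    for (size_t i = 0; i < n; i++)
        call_table[code[i]](&vm);
    return vm.sp > 0 ? vm.stack[vm.sp - 1] : 0;   /* top of stack, or 0 if empty */
}
```

Calling `run_table` on the index array `{OP_PUSH1, OP_DUP, OP_ADD, OP_PUSH1, OP_ADD}` pushes 1, duplicates it, adds, pushes 1 again, and adds, leaving 3 on top of the stack.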
Comments (2)
Here are some options for reducing the overhead:
Declare your functions as fastcall (or something similar) to reduce the overhead of argument passing.
Eventually you're going to get to the point of JIT-compiling, on-line profiling and reoptimization, and all sorts of other awesome stuff.
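As a rough illustration of the fastcall suggestion, the sketch below wraps the calling convention in a macro. `VM_FASTCALL`, `BinOp`, `fc_add`, `fc_sub` and `fc_apply` are made-up names for this example. Note that fastcall only exists on 32-bit x86; on x86-64 the standard calling conventions already pass the first few integer arguments in registers, so the macro expands to nothing there.

```c
/* VM_FASTCALL is a hypothetical macro: it selects a register-based calling
   convention on 32-bit x86 and expands to nothing on other targets. */
#if defined(_MSC_VER) && defined(_M_IX86)
#  define VM_FASTCALL __fastcall
#elif defined(__GNUC__) && defined(__i386__)
#  define VM_FASTCALL __attribute__((fastcall))
#else
#  define VM_FASTCALL
#endif

/* The pointer type must carry the same convention as the handlers it points to. */
typedef int (VM_FASTCALL *BinOp)(int a, int b);

static int VM_FASTCALL fc_add(int a, int b) { return a + b; }
static int VM_FASTCALL fc_sub(int a, int b) { return a - b; }

static const BinOp fc_ops[] = { fc_add, fc_sub };

/* Dispatches through the table; the call is still indirect, but on 32-bit x86
   the two arguments travel in registers instead of on the stack. */
int fc_apply(int opcode, int a, int b) {
    return fc_ops[opcode](a, b);
}
```

This only trims argument-passing cost. The indirect call itself remains, so the handlers still cannot be inlined.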
There are many good techniques you might want to look into. Here are two that I'm familiar with:
Inline caching: Essentially, find the targets that keep getting called, then switch from a vtable lookup to a short chain of if statements that dispatch to statically known locations. This technique was used to great effect in the Self language and is one of the JVM's major optimizations.
Tracing: Compile one version of a polymorphic dispatch for each type that actually ends up being used, but defer compilation until the code has run sufficiently many times. Mozilla's TraceMonkey JavaScript engine used this to get a huge performance win in many cases.
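The "bunch of if statements" idea behind inline caching can be approximated by hand in C. This is a static sketch, not a real inline cache: a real one is installed and updated at run time by a JIT, whereas here the hot handlers are simply assumed to be known in advance (say, from profiling). All names (`Vm`, `Handler`, `h_inc`, `h_double`, `h_neg`, `run_guarded`) are hypothetical.

```c
#include <stddef.h>

typedef struct { int acc; } Vm;
typedef void (*Handler)(Vm *vm);

static void h_inc(Vm *vm)    { vm->acc += 1; }
static void h_double(Vm *vm) { vm->acc *= 2; }
static void h_neg(Vm *vm)    { vm->acc = -vm->acc; }

/* Runs a decoded program given as an array of handler pointers.
   Handlers assumed to be hot are guarded by a pointer comparison and then
   called directly, so the target is statically known and can be inlined.
   Anything else falls back to the ordinary indirect call. */
int run_guarded(const Handler *code, size_t n) {
    Vm vm = { 0 };
    for (size_t i = 0; i < n; i++) {
        Handler h = code[i];
        if (h == h_inc)
            h_inc(&vm);          /* fast path: direct call */
        else if (h == h_double)
            h_double(&vm);       /* fast path: direct call */
        else
            h(&vm);              /* slow path: indirect call */
    }
    return vm.acc;
}
```

Running the program `{h_inc, h_inc, h_double, h_neg}` takes the accumulator through 1, 2, 4 and finally -4, with only the last step going through the indirect slow path. Whether the guards pay off depends on how skewed the instruction mix really is, so it is worth measuring before committing to this.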
Hope this helps!