Is Just-in-Time compilation always faster?

Published 2024-10-12 06:43:45 · 770 characters · 4 views · 0 comments

Greetings to all the compiler designers here on Stack Overflow.

I am currently working on a project which focuses on developing a new scripting language for high-performance computing. The source code is first compiled into a byte code representation. The byte code is then loaded by the runtime, which performs aggressive (and possibly time-consuming) optimizations on it (these go much further than what even most "ahead-of-time" compilers do; after all, that's the whole point of the project). Keep in mind that the result of this process is still byte code.

The byte code is then run on a virtual machine. Currently, this virtual machine is implemented using a straightforward jump table and a message pump. The virtual machine runs over the byte code with a pointer, loads the instruction under the pointer, looks up an instruction handler in the jump table and jumps into it. The instruction handler carries out the appropriate actions and finally returns control to the message loop. The virtual machine's instruction pointer is incremented and the whole process starts over again. The performance I am able to achieve with this approach is actually quite amazing. Of course, the code of the actual instruction handlers is again fine-tuned by hand.

Now, most "professional" runtime environments (like Java, .NET, etc.) use Just-in-Time compilation to translate the byte code into native code before execution. A VM using a JIT usually has much better performance than a byte code interpreter. The question is: since all an interpreter basically does is load an instruction and look up a jump target in a jump table (remember that the instruction handler itself is statically compiled into the interpreter, so it is already native code), will the use of Just-in-Time compilation result in a performance gain, or will it actually degrade performance? I cannot really imagine the interpreter's jump table degrading performance so much that it would make up for the time spent compiling the code with a JIT. I understand that a JIT can perform additional optimization on the code, but in my case very aggressive optimization is already performed at the byte code level prior to execution. Do you think I could gain more speed by replacing the interpreter with a JIT compiler? If so, why?

I understand that implementing both approaches and benchmarking will provide the most accurate answer to this question, but it might not be worth the time if there is a clear-cut answer.

Thanks.

Comments (3)

所有深爱都是秘密 2024-10-19 06:43:45

The answer lies in the ratio of single-byte-code-instruction complexity to jump table overhead. If you're modelling high-level operations like large matrix multiplications, then a little overhead will be insignificant. If you're incrementing a single integer, then of course it will be dramatically impacted by the jump table. Overall, the balance will depend on the nature of the more time-critical tasks the language is used for. If it's meant to be a general-purpose language, then it's more useful for everything to have minimal overhead, since you don't know what will be used in a tight loop. To quickly quantify the potential improvement, simply benchmark some nested loops doing simple operations (but ones that can't be optimised away) against an equivalent C or C++ program.

缺⑴份安定 2024-10-19 06:43:45

A JIT can theoretically optimize better, since it has information not available at compile time (especially about typical runtime behavior). So it can, for example, do better branch prediction, unroll loops as needed, etc.

I am sure your jump-table approach is OK, but I still think it would perform rather poorly compared to straight C code, don't you think?

自找没趣 2024-10-19 06:43:45

When you use an interpreter, the code cache in your processor caches the interpreter code, not the byte code (which may be cached in the data cache). Since code caches are 2 to 3 times faster than data caches, IIRC, you may see a performance boost if you JIT compile. Also, the native, real code you are executing is probably PIC; that is something which can be avoided for JITted code.

Everything else depends on how optimized the byte code is, IMHO.
