Branch optimization

Posted 2024-08-03 05:31:41

What is the best implementation, from a performance point of view, of branched function calls?

In the naive case we have a rather large switch statement that interprets bytecode and executes a function call depending on code.

In the normal case we have computed gotos and labels that do the same thing.

What is the absolute best way to do this?

An abstract example,

schedule: 
    swap_entity();
    goto *entity_start();

lb_code1:
    do_stuff();
    goto *next_code_item();

lb_code2:
    do_stuff();
    goto *next_code_item();

...

Edit: My reference to "branched function calls" was perhaps somewhat erroneous. Branched code execution.
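
For concreteness, the two variants described above can be sketched side by side. This is only an illustration under stated assumptions: the three opcodes and the accumulator machine are invented for the example, and the `&&label` / `goto *ptr` syntax is the GCC/Clang "labels as values" extension, not standard C.

```c
#include <stddef.h>

/* Hypothetical opcodes for a tiny accumulator machine (not from the question). */
enum { OP_INC, OP_DOUBLE, OP_HALT };

/* Switch-based dispatch: the loop has one shared dispatch point. */
int run_switch(const unsigned char *code)
{
    int acc = 0;
    for (;;) {
        switch (*code++) {
        case OP_INC:    acc += 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_HALT:   return acc;
        default:        return -1;   /* unknown opcode */
        }
    }
}

/* Computed-goto dispatch: each handler ends with its own indirect jump.
   Assumes the bytecode is well formed; there is no bounds check here. */
int run_goto(const unsigned char *code)
{
    /* Table indexed by opcode; &&label is a GCC/Clang extension. */
    static void *const labels[] = { &&lb_inc, &&lb_double, &&lb_halt };
    int acc = 0;

#define NEXT() goto *labels[*code++]
    NEXT();

lb_inc:
    acc += 1;
    NEXT();
lb_double:
    acc *= 2;
    NEXT();
lb_halt:
    return acc;
#undef NEXT
}
```

Both functions interpret the same bytecode; which one is faster depends on the compiler and CPU, so it has to be measured.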

Comments (3)

ゝ偶尔ゞ 2024-08-10 05:31:41

Maybe an array of function pointers, at a guess:

void dispatch(Message* message)
{
  //MessageType is a finite enum
  MessageType messageType = message->messageType;
  int index = (int)messageType;
  //there's an array element for each enum value
  FunctionPointer functionPointer = arrayOfFunctionPointers[index];
  (*functionPointer)(message);
}

The actual answer is hardware-dependent, and depends on things like the size of the problem and the CPU's cache.
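
A self-contained version of the snippet above might look like the following. The message types, the `payload`/`result` fields, and the three handlers are assumptions made up for illustration; only `dispatch`, `Message`, `MessageType`, `FunctionPointer`, and `arrayOfFunctionPointers` come from the answer. A range check is added so that an out-of-range enum value cannot index past the table.

```c
#include <stddef.h>

/* Hypothetical message types; MSG_TYPE_COUNT is the table size. */
typedef enum { MSG_PING, MSG_ADD, MSG_RESET, MSG_TYPE_COUNT } MessageType;

typedef struct {
    MessageType messageType;
    int payload;   /* hypothetical input field */
    int result;    /* written by the handler */
} Message;

typedef void (*FunctionPointer)(Message *);

/* Hypothetical handlers, one per message type. */
static void handle_ping(Message *m)  { m->result = 1; }
static void handle_add(Message *m)   { m->result = m->payload + 1; }
static void handle_reset(Message *m) { m->result = 0; }

/* One array element for each enum value. */
static const FunctionPointer arrayOfFunctionPointers[MSG_TYPE_COUNT] = {
    [MSG_PING]  = handle_ping,
    [MSG_ADD]   = handle_add,
    [MSG_RESET] = handle_reset,
};

/* Returns 0 on success, -1 if the message type has no handler. */
int dispatch(Message *message)
{
    int index = (int)message->messageType;
    if (index < 0 || index >= MSG_TYPE_COUNT ||
        arrayOfFunctionPointers[index] == NULL)
        return -1;
    arrayOfFunctionPointers[index](message);
    return 0;
}
```

Compared with a label table, each dispatch here pays for an indirect call and a return, which is part of why the answer is hedged as hardware-dependent.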

失退 2024-08-10 05:31:41

It depends. Some table driven approach will normally be fastest, but you may well find that is what your switch statement is implemented as. Certainly, you should not take it as read that ANY recommendation in this area from SO users is the best. If we suggest something, you need to implement it and measure the performance in a build with all compiler optimisations turned on.

傲娇萝莉攻 2024-08-10 05:31:41

If you're looking for a speed boost here, you should look at other bytecode dispatch mechanisms. There was a question which sort-of asked that before.

Basically, you now have a goto which is probably mispredicted every time, followed by a function call. With a technique like direct threading, you can probably reduce your interpreter overhead significantly. Inline threading is harder to implement, but offers a greater benefit.

I gave some further resources in the other question.
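
As a rough sketch of what direct threading means here: the bytecode is translated once into an array of handler addresses, so the run loop jumps straight through that array instead of looking each opcode up in a table. Everything below (the opcodes, `engine`, `translate`, `run_threaded`) is hypothetical and relies on the GCC/Clang "labels as values" extension; it is a minimal illustration, not the implementation from the linked resources.

```c
#include <stddef.h>

/* Hypothetical opcodes for a tiny accumulator machine. */
enum { OP_INC, OP_DOUBLE, OP_HALT, OP_COUNT };

/* One function owns the labels. Called with table_out != NULL it only
   exports the handler addresses; otherwise it runs threaded code.
   noinline keeps the label addresses stable across calls (GCC's
   documentation also mentions noclone for this purpose). */
__attribute__((noinline))
static int engine(void *const *ip, void **table_out)
{
    static void *const labels[OP_COUNT] = { &&lb_inc, &&lb_double, &&lb_halt };
    int acc = 0;

    if (table_out != NULL) {
        for (int i = 0; i < OP_COUNT; i++)
            table_out[i] = labels[i];
        return 0;
    }

#define NEXT() goto **ip++   /* jump to the address stored at ip, then advance */
    NEXT();

lb_inc:
    acc += 1;
    NEXT();
lb_double:
    acc *= 2;
    NEXT();
lb_halt:
    return acc;
#undef NEXT
}

/* Translate n opcodes into handler addresses; out must hold n entries.
   Returns 0 on success, -1 on an unknown opcode. */
int translate(const unsigned char *code, size_t n, void **out)
{
    void *table[OP_COUNT];
    engine(NULL, table);
    for (size_t i = 0; i < n; i++) {
        if (code[i] >= OP_COUNT)
            return -1;
        out[i] = table[code[i]];
    }
    return 0;
}

/* Run code produced by translate(); it must end with OP_HALT. */
int run_threaded(void *const *threaded)
{
    return engine(threaded, NULL);
}
```

The translation step costs one pass over the program, so the approach pays off only when the code is executed many times; as with the other variants, the gain should be measured rather than assumed.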
