What is the fastest way to call and execute a function in C?
I have a lot of functions (a huge list) defined and compiled. I use function pointers to call and execute them, passing arguments dynamically at runtime. It is an iterative process involving more than a hundred thousand function calls per iteration. I want to know the most efficient way of calling a compiled function. I feel my approach is slow.
Comments (7)
You need to profile your program to know if this is a problem. If you're spending 99% of your time inside the individual functions, then removing the call overhead entirely can improve things by at most 1%, and even that would be unlikely.
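The 1% bound above can be written out as a short worked calculation. This is only a sketch: the symbols $f$ (the fraction of run time spent on call dispatch rather than in function bodies), $T_{\text{old}}$, and $T_{\text{new}}$ are introduced here for illustration and do not come from the original answer.

```latex
T_{\text{new}} \ge (1 - f)\,T_{\text{old}}, \qquad
\text{speedup} = \frac{T_{\text{old}}}{T_{\text{new}}} \le \frac{1}{1 - f}, \qquad
f = 0.01 \;\Rightarrow\; \text{speedup} \le \frac{1}{0.99} \approx 1.0101
```

So even a dispatch mechanism with zero cost would make the program only about 1% faster in that scenario.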
The only way you can speed up function calls is if the compiler knows which function it will be calling.
That is, a direct call to a function whose definition is visible at the call site can be inlined, so the function body replaces the call and the call overhead disappears.
But if the function is selected at runtime through a function pointer, the compiler generally cannot predict which function will be called, and therefore cannot inline it.
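A minimal sketch of the two situations, under stated assumptions: the functions `add_one` and `times_two`, the `unary_fn` typedef, and the two wrappers are hypothetical names chosen for illustration, not code from the original answer.

```c
#include <stddef.h>

/* Hypothetical example functions; the names are illustrative only. */
static int add_one(int x)   { return x + 1; }
static int times_two(int x) { return x * 2; }

typedef int (*unary_fn)(int);

/* Direct call: the callee is known at compile time, so an optimizing
   compiler is free to inline add_one here. */
int call_direct(int x)
{
    return add_one(x);
}

/* Indirect call: the callee depends on a runtime value, so the compiler
   usually has to emit a real call through the pointer. */
int call_indirect(int which, int x)
{
    static const unary_fn table[] = { add_one, times_two };
    size_t n = sizeof table / sizeof table[0];

    if (which < 0 || (size_t)which >= n)
        return x;               /* out of range: return the input unchanged */
    return table[which](x);
}
```

Whether a given compiler actually inlines `call_direct` depends on its optimization settings, so inspecting the generated assembly is the reliable way to check.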
If your compiler supports it, you might try using __fastcall, but you need to profile your code and see if it made a positive difference. This one level of indirection isn't going to make a huge difference. Profile your code and find where the real slowdowns are.
This depends on how you're determining which of these hundreds of thousands of functions to call. If you're doing a linear search through your function pointer list, then yes, you're probably wasting a lot of time. In this case, you should look into putting the function pointers into a hash table, or at least storing them in a sorted list so you can do a binary search. Without more information about what you are doing, and how you're doing it, it's difficult to give you useful advice.
Also you definitely need to profile, as others have pointed out. It sounds like you don't know if what you're doing is slow, in which case you also don't know whether it's worth trying to optimize it.
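The sorted-list suggestion in this answer could look roughly like the sketch below, which uses the standard library's `bsearch`. It assumes the functions are looked up by a string name, which the question does not actually state; the `entry` struct, the `op_*` functions, and `lookup` are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

typedef double (*op_fn)(double);

/* Hypothetical functions standing in for the real, much larger list. */
static double op_half(double x)   { return x / 2.0; }
static double op_negate(double x) { return -x; }
static double op_square(double x) { return x * x; }

struct entry {
    const char *name;
    op_fn       fn;
};

/* Must stay sorted by name (strcmp order) for bsearch to work. */
static const struct entry table[] = {
    { "half",   op_half   },
    { "negate", op_negate },
    { "square", op_square },
};

/* bsearch passes the key first and a table element second. */
static int cmp_entry(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const struct entry *)elem)->name);
}

/* Returns the function registered under name, or NULL if absent.
   Cost is O(log n) comparisons instead of O(n) for a linear scan. */
op_fn lookup(const char *name)
{
    const struct entry *e = bsearch(name, table,
                                    sizeof table / sizeof table[0],
                                    sizeof table[0], cmp_entry);
    return e ? e->fn : NULL;
}
```

If the caller already has a small integer identifying the function, indexing an array of function pointers directly avoids the search altogether.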
The overhead of calling functions is mostly a combination of:
So to start with, ask questions:
Once you have a good algorithm and an efficient implementation, you would have to move down to lower level optimisation methods - you could use assembler to do your own function calling protocol that requires less data to be pushed on the stack. If they are "leaf functions" (that don't call other functions) you may not even need to use a stack, so can avoid a few instructions of overhead on every call. (Some of this could possibly be done in C by replacing function calls with gotos - it's very ugly though)
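As a rough illustration of the goto idea mentioned above, the sketch below folds several tiny operations into one function and jumps between them, so no call or return happens per operation. The operation codes and `run_ops` are hypothetical, and this only works when the "functions" are small enough to live in a single body.

```c
#include <stddef.h>

/* Hypothetical operation codes, for illustration only. */
enum op { OP_INC, OP_DOUBLE, OP_NEGATE };

/* Applies each operation in ops[0..n-1] to x, in order. The operation
   bodies live inside this one function and are reached with goto, so
   there is no per-operation call or return. */
long run_ops(const enum op *ops, size_t n, long x)
{
    size_t i = 0;

dispatch:
    if (i == n)
        return x;
    switch (ops[i++]) {
    case OP_INC:    goto do_inc;
    case OP_DOUBLE: goto do_double;
    case OP_NEGATE: goto do_negate;
    }
    goto dispatch;              /* unknown code: skip it */

do_inc:
    x += 1;
    goto dispatch;
do_double:
    x *= 2;
    goto dispatch;
do_negate:
    x = -x;
    goto dispatch;
}
```

A plain `switch` with the bodies written inside the cases would behave the same way and is easier to read; the explicit labels are shown only because the answer mentions gotos.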
Lastly, you can get into the realms of self-modifying code - build new machine code out of snippets representing the functions and then call the generated code. This can get very processor specific and tricky though - it's pretty low level.
Well, you could create your own function linker that links certain sequences of function "fragments" together in call order and caches them to avoid the overhead. It probably won't help you much, though.
A lot depends on the size of the functions, how close they are to each other in memory, and all sorts of other things. There would be little point in removing function pointers, for example, if the second function called sat right after the first in memory, as the start of that function would likely already be cached.
It's not a simple question to answer, even if you DID give us a few more details.
As Mark says... a profiler is your friend.
You should use a tool like QProf, Valgrind, or gprof to profile your code and see where the most execution time is being spent. Based on the results, you should optimize the function(s) that are taking up the most time.
If the list iteration procedure really is taking up most of the code's time, then you should try to optimize the calls. If you're searching through the list to find a given function, you might try making a list of the most frequently used functions, or ordering them by call frequency so that your search algorithm doesn't have to look as far into the list to find the function it's looking for.
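One way to order a linearly searched list by call frequency, sketched under assumptions: each entry carries a hit counter, and the table is re-sorted with `qsort` from time to time. The `slot` struct, the `h_*` handlers, and the helper names are all hypothetical.

```c
#include <stdlib.h>
#include <string.h>

typedef int (*handler_fn)(int);

/* Hypothetical handlers standing in for the real function list. */
static int h_add(int x) { return x + 10; }
static int h_sub(int x) { return x - 10; }
static int h_neg(int x) { return -x; }

struct slot {
    const char   *name;
    handler_fn    fn;
    unsigned long calls;        /* how often this entry has been found */
};

static struct slot slots[] = {
    { "add", h_add, 0 },
    { "sub", h_sub, 0 },
    { "neg", h_neg, 0 },
};
#define NSLOTS (sizeof slots / sizeof slots[0])

/* Linear search that counts each hit; returns NULL if name is absent. */
handler_fn find(const char *name)
{
    for (size_t i = 0; i < NSLOTS; i++) {
        if (strcmp(slots[i].name, name) == 0) {
            slots[i].calls++;
            return slots[i].fn;
        }
    }
    return NULL;
}

/* Sorts entries with the highest call count first. */
static int by_calls_desc(const void *a, const void *b)
{
    unsigned long ca = ((const struct slot *)a)->calls;
    unsigned long cb = ((const struct slot *)b)->calls;
    return (ca < cb) - (ca > cb);
}

/* Moves frequently used entries to the front so later searches end sooner. */
void reorder_by_frequency(void)
{
    qsort(slots, NSLOTS, sizeof slots[0], by_calls_desc);
}

/* Name of the entry currently searched first. */
const char *first_name(void)
{
    return slots[0].name;
}
```

Calling `reorder_by_frequency` once per outer iteration, rather than after every lookup, keeps the sorting cost small relative to the searches it speeds up.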
The number of extra instructions required to dereference a function pointer should be a very small fraction of the number of instructions that make up the body of the function. Aggressively inlining every function call will not make a huge difference. As suggested by the earlier answers, you really need to use a profiler to determine the bottlenecks.
In the big scheme of things, shaving off a few instructions here or there will not make any significant improvements. The big wins will come from improving your algorithms.
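To see how small the indirection cost is on a particular machine, a crude timing sketch like the one below can help. It is not a substitute for a real profiler: `work`, `bench_direct`, and `bench_indirect` are hypothetical, `clock` measures CPU time only coarsely, and results vary with compiler and optimization level.

```c
#include <time.h>

/* Hypothetical small function body; unsigned arithmetic wraps safely. */
static unsigned work(unsigned x) { return x * 2654435761u + 1u; }

typedef unsigned (*work_fn)(unsigned);

/* Runs n direct calls, stores elapsed CPU seconds in *secs,
   and returns the accumulated value so the loop is not optimized away. */
unsigned bench_direct(unsigned long n, double *secs)
{
    unsigned acc = 1;
    clock_t t0 = clock();

    for (unsigned long i = 0; i < n; i++)
        acc = work(acc);
    *secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
    return acc;
}

/* Same loop, but every call goes through a function pointer. */
unsigned bench_indirect(unsigned long n, double *secs)
{
    /* volatile forces the pointer to be reloaded on each call,
       which keeps the compiler from turning this into a direct call. */
    work_fn volatile fp = work;
    unsigned acc = 1;
    clock_t t0 = clock();

    for (unsigned long i = 0; i < n; i++)
        acc = fp(acc);
    *secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
    return acc;
}
```

Both functions compute the same value, so comparing the two `secs` results for a large `n` gives a rough upper bound on what removing the function pointer could save for a body this small; real function bodies are usually larger, which shrinks the relative cost further.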