ANSI C #define vs. functions
I have a question about the performance of my code.
Let's say I have a struct in C for a point:

    typedef struct _CPoint
    {
        float x, y;
    } CPoint;

and a function that uses the struct:

    float distance(CPoint p1, CPoint p2)
    {
        return sqrt(pow((p2.x-p1.x),2)+pow((p2.y-p1.y),2));
    }

I was wondering whether it would be a smart idea to replace this function with a #define:

    #define distance(p1, p2)(sqrt(pow((p2.x-p1.x),2)+pow((p2.y-p1.y),2)));

I think it will be faster because there will be no function call overhead, and I'm wondering whether I should use this approach for all the other functions in my program to increase performance. So my question is:
Should I replace all my functions with #define to increase the performance of my code?

No. You should never decide between a macro and a function based on a perceived performance difference. You should evaluate the choice solely on the merits of functions over macros. In general, choose functions.
Macros have a lot of hidden downsides that can bite you. Case in point: your translation to a macro here is incorrect (or at least does not preserve the semantics of the original function). Each argument to the macro distance is evaluated twice. Imagine a call in which both arguments are themselves function calls. In the macro version this actually results in 4 function calls, because each argument is evaluated twice. Had distance been left as a function, it would result in only 3 function calls (distance and each argument). Note: I'm ignoring the impact of sqrt and pow in the above calculations, as they're the same in both versions.

There are three things: ordinary functions such as distance above, inline functions, and preprocessor macros.
While functions guarantee some kind of type safety, they also incur a performance loss because a stack frame has to be set up at each function call. Code from inline functions is copied to the call site, so that penalty is not paid; however, your code size will increase. Macros provide no type safety and also involve textual substitution.
Choosing from all three, I'd usually use inline functions. I'd use macros only when they are very short and very useful in this form (like hlist_for_each from the Linux kernel).
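
To make "textual substitution" concrete, here is a small sketch comparing a macro with an inline function. SQUARE_MACRO and square_inline are hypothetical names, the macro is left unparenthesized on purpose, and static inline assumes C99 or later.

```c
/* Deliberately unparenthesized: the argument text is pasted in as-is. */
#define SQUARE_MACRO(x) x * x

/* The argument is evaluated to a single int before the body runs. */
static inline int square_inline(int x)
{
    return x * x;
}

/* SQUARE_MACRO(1 + 2) expands to 1 + 2 * 1 + 2. */
int macro_result(void)
{
    return SQUARE_MACRO(1 + 2);
}

/* square_inline(1 + 2) receives the value 3. */
int inline_result(void)
{
    return square_inline(1 + 2);
}
```

Here macro_result() returns 5 while inline_result() returns 9. Parenthesizing the macro body and its parameter fixes this particular case, but not the repeated evaluation of arguments.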

Jared's right, and in this specific case the cycles spent in the pow calls and the sqrt call would be roughly 2 orders of magnitude more than the cycles spent in the call to distance.
Sometimes people assume that small code equals small time. Not so.

I'd recommend an inline function rather than a macro. It'll give you any possible performance benefit of a macro, without the ugliness. (Macros have some gotchas that make them very iffy as a general replacement for functions. In particular, macro arguments are evaluated every time they're used, while function arguments are evaluated once each before the "call".)
(Note that I also replaced pow(dx, 2) with dx * dx. The two are equivalent, and multiplication is more likely to be efficient. Some compilers might try to optimize away the call to pow... but guess what they replace it with.)

If you are using a fairly mature compiler, it will probably do this for you at the assembly level if optimization is switched on.
For gcc, the -O3 option or (for "small" functions) even the -O2 option will do this.
For details, see http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html for the "-finline*" options.