C/C++最快的 cmath 日志操作
我正在尝试计算 logab (并获取浮点数,而不是整数)。我计划将其作为 log(b)/log(a)
来执行。从数学上来说,我可以使用任何 cmath
对数函数(基数为 2、e 或 10)来进行此计算;然而,我会在程序中多次运行这个计算,所以我想知道其中一个是否比其他更快(或者更好,如果有一种更快但仍然简单的方法来做到这一点)。如果重要的话,a 和 b 都是整数。
I'm trying to calculate logab (and get a floating point back, not an integer). I was planning to do this as log(b)/log(a)
. Mathematically speaking, I can use any of the cmath
log functions (base 2, e, or 10) to do this calculation; however, I will be running this calculation a lot during my program, so I was wondering if one of them is significantly faster than the others (or better yet, if there is a faster, but still simple, way to do this). If it matters, both a and b are integers.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
首先,预先计算
1.0/log(a)
并将每个log(b)
乘以该表达式。编辑:我最初说自然对数(以 e 为底)最快,但其他人指出以 2 为底由处理器直接支持并且最快。我没有理由怀疑这一点。
编辑2:我最初假设
a
是一个常量,但在重新阅读这个问题时从未陈述过。如果是这样,那么预先计算就没有任何好处。但是,如果是这样,您可以通过适当选择变量名称来保持可读性:奇怪的是,Microsoft 不提供以 2 为基数的日志函数,这解释了为什么我不熟悉它。此外,用于计算对数的 x86 指令 自动包含乘法,不同基数所需的常量也可以通过 优化指令,所以我希望 3 个不同的对数函数具有相同的时序(即使基数 2 也必须乘以 1)。
First, precalculate
1.0/log(a)
and multiply eachlog(b)
by that expression instead.Edit: I originally said that the natural logarithm (base e) would be fastest, but others state that base 2 is supported directly by the processor and would be fastest. I have no reason to doubt it.
Edit 2: I originally assumed that
a
was a constant, but in re-reading the question that is never stated. If so then there would be no benefit to precalculating. If it is however, you can maintain readability with an appropriate choice of variable names:Strangely enough Microsoft doesn't supply a base 2 log function, which explains why I was unfamiliar with it. Also the x86 instruction for calculating logs includes a multiplication automatically, and the constants required for the different bases are also available via an optimized instruction, so I'd expect the 3 different log functions to have identical timing (even base 2 would have to multiply by 1).
由于
b
和a
是整数,因此您可以使用 位旋转来查找以 2 为底的对数。以下是一些:我将让您根据需要选择最佳的“快速对数”函数。
Since
b
anda
are integers, you can use all the glory of bit twiddling to find their logs to the base 2. Here are some:I'll leave it to you to choose the best "fast-log" function for your needs.
在我有数据的平台上,
log2
比其他平台稍快,符合我的预期。但请注意,差异极其极小(只有几个百分点)。这确实不值得担心。编写一个清晰的实现。然后测量性能。
On the platforms for which I have data,
log2
is very slightly faster than the others, in line with my expectations. Note however, that the difference is extremely slight (only a couple percent). This is really not worth worrying about.Write an implementation that is clear. Then measure the performance.
在8087指令集中,只有一条以2为底的对数指令,所以我猜这条指令是最快的。
当然,这类问题很大程度上取决于你的处理器/架构,所以我建议做一个简单的测试并计时。
In the 8087 instruction set, there is only an instruction for the logarithm to base 2, so I would guess this one to be the fastest.
Of course this kind of question depends largely on your processor/architecture, so I would suggest to make a simple test and time it.
答案是:
你甚至没有提到你的CPU类型、变量类型、编译器标志、数据布局。如果您需要并行执行很多操作,我确信会有一个 SIMD 选项。只要您使用对齐和清除简单循环(如果您喜欢古老的方法,则可以使用 valarray),您的编译器就会对其进行优化。
英特尔编译器很可能在这一领域针对英特尔处理器有特定的技巧。
如果您确实想要,可以使用 CUDA 并利用 GPU。
我想,如果您不幸缺少这些指令集 您可能会陷入困境摆弄级别并编写一个算法做了一个很好的近似。在这种情况下,我可以不止一个苹果派打赌 2-log 将比任何其他基本对数更快
The answer is:
You don't even mention your CPU type, the variable type, the compiler flags, the data layout. If you need to do lot's of these in parallel, I'm sure there will be a SIMD option. Your compiler will optimize that as long as you use alignment and clear simple loops (or valarray if you like archaic approaches).
Chances are, the intel compiler has specific tricks for intel processors in this area.
If you really wanted you could use CUDA and leverage GPU.
I suppose, if you are unfortunate enough to lack these instruction sets you could go down at the bit fiddling level and write an algorithm that does a nice approximation. In this case, I can bet more than one apple-pie that 2-log is going to be faster than any other base-log