C/C++最快的 cmath 日志操作

发布于 2024-11-19 07:04:48 字数 227 浏览 7 评论 0原文

我正在尝试计算 logab (并获取浮点数,而不是整数)。我计划将其作为 log(b)/log(a) 来执行。从数学上来说,我可以使用任何 cmath 对数函数(基数为 2、e 或 10)来进行此计算;然而,我会在程序中多次运行这个计算,所以我想知道其中一个是否比其他更快(或者更好,如果有一种更快但仍然简单的方法来做到这一点)。如果重要的话,a 和 b 都是整数。

I'm trying to calculate logab (and get a floating point back, not an integer). I was planning to do this as log(b)/log(a). Mathematically speaking, I can use any of the cmath log functions (base 2, e, or 10) to do this calculation; however, I will be running this calculation a lot during my program, so I was wondering if one of them is significantly faster than the others (or better yet, if there is a faster, but still simple, way to do this). If it matters, both a and b are integers.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

追星践月 2024-11-26 07:04:48

首先,预先计算 1.0/log(a) 并将每个 log(b) 乘以该表达式。

编辑:我最初说自然对数(以 e 为底)最快,但其他人指出以 2 为底由处理器直接支持并且最快。我没有理由怀疑这一点。

编辑2:我最初假设a是一个常量,但在重新阅读这个问题时从未陈述过。如果是这样,那么预先计算就没有任何好处。但是,如果是这样,您可以通过适当选择变量名称来保持可读性:

const double base_a = 1.0 / log(a);
for (int b = 0; b < bazillions; ++b)
    double result = log(b) * base_a;

奇怪的是,Microsoft 不提供以 2 为基数的日志函数,这解释了为什么我不熟悉它。此外,用于计算对数的 x86 指令 自动包含乘法,不同基数所需的常量也可以通过 优化指令,所以我希望 3 个不同的对数函数具有相同的时序(即使基数 2 也必须乘以 1)。

First, precalculate 1.0/log(a) and multiply each log(b) by that expression instead.

Edit: I originally said that the natural logarithm (base e) would be fastest, but others state that base 2 is supported directly by the processor and would be fastest. I have no reason to doubt it.

Edit 2: I originally assumed that a was a constant, but in re-reading the question that is never stated. If so then there would be no benefit to precalculating. If it is however, you can maintain readability with an appropriate choice of variable names:

const double base_a = 1.0 / log(a);
for (int b = 0; b < bazillions; ++b)
    double result = log(b) * base_a;

Strangely enough Microsoft doesn't supply a base 2 log function, which explains why I was unfamiliar with it. Also the x86 instruction for calculating logs includes a multiplication automatically, and the constants required for the different bases are also available via an optimized instruction, so I'd expect the 3 different log functions to have identical timing (even base 2 would have to multiply by 1).

謌踐踏愛綪 2024-11-26 07:04:48

由于 ba 是整数,因此您可以使用 位旋转来查找以 2 为底的对数。以下是一些:

  • 在 O(N) 操作中查找具有 MSB N 集合的整数的以 2 为底的对数(显而易见的方法)
  • 查找整数对数以 2 为底的整数使用 64 位 IEEE 浮点数
  • 使用查找表查找整数的对数以 2 为底
  • 在 O(lg(N)) 运算中查找 N 位整数的对数以 2 为底
  • 查找 N 位整数的对数以 2 为底在 O(lg(N)) 乘法和查找运算中,

我将让您根据需要选择最佳的“快速对数”函数。

Since b and a are integers, you can use all the glory of bit twiddling to find their logs to the base 2. Here are some:

  • Find the log base 2 of an integer with the MSB N set in O(N) operations (the obvious way)
  • Find the integer log base 2 of an integer with an 64-bit IEEE float
  • Find the log base 2 of an integer with a lookup table
  • Find the log base 2 of an N-bit integer in O(lg(N)) operations
  • Find the log base 2 of an N-bit integer in O(lg(N)) operations with multiply and lookup

I'll leave it to you to choose the best "fast-log" function for your needs.

咋地 2024-11-26 07:04:48

在我有数据的平台上,log2 比其他平台稍快,符合我的预期。但请注意,差异极其极小(只有几个百分点)。这确实不值得担心。

编写一个清晰的实现。然后测量性能。

On the platforms for which I have data, log2 is very slightly faster than the others, in line with my expectations. Note however, that the difference is extremely slight (only a couple percent). This is really not worth worrying about.

Write an implementation that is clear. Then measure the performance.

左耳近心 2024-11-26 07:04:48

在8087指令集中,只有一条以2为底的对数指令,所以我猜这条指令是最快的。

当然,这类问题很大程度上取决于你的处理器/架构,所以我建议做一个简单的测试并计时。

In the 8087 instruction set, there is only an instruction for the logarithm to base 2, so I would guess this one to be the fastest.

Of course this kind of question depends largely on your processor/architecture, so I would suggest to make a simple test and time it.

请止步禁区 2024-11-26 07:04:48

答案是:

  • 这取决于
  • 分析它

你甚至没有提到你的CPU类型、变量类型、编译器标志、数据布局。如果您需要并行执行很多操作,我确信会有一个 SIMD 选项。只要您使用对齐和清除简单循环(如果您喜欢古老的方法,则可以使用 valarray),您的编译器就会对其进行优化。

英特尔编译器很可能在这一领域针对英特尔处理器有特定的技巧。

如果您确实想要,可以使用 CUDA 并利用 GPU。

我想,如果您不幸缺少这些指令集 您可能会陷入困境摆弄级别并编写一个算法做了一个很好的近似。在这种情况下,我可以不止一个苹果派打赌 2-log 将比任何其他基本对数更快

The answer is:

  • it depends
  • profile it

You don't even mention your CPU type, the variable type, the compiler flags, the data layout. If you need to do lot's of these in parallel, I'm sure there will be a SIMD option. Your compiler will optimize that as long as you use alignment and clear simple loops (or valarray if you like archaic approaches).

Chances are, the intel compiler has specific tricks for intel processors in this area.

If you really wanted you could use CUDA and leverage GPU.

I suppose, if you are unfortunate enough to lack these instruction sets you could go down at the bit fiddling level and write an algorithm that does a nice approximation. In this case, I can bet more than one apple-pie that 2-log is going to be faster than any other base-log

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文