An efficient method for computing the mathematical constant e
The standard representation of the constant e as the sum of an infinite series is very inefficient for computation because of the many division operations. So are there any alternative ways to compute the constant efficiently?
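For reference, the series in question is the Taylor expansion of the exponential function at 1:

    e = \sum_{n=0}^{\infty} \frac{1}{n!} = \frac{1}{0!} + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots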
Since it's not possible to calculate every digit of 'e', you're going to have to pick a stopping point.
double precision: 16 decimal digits
For practical applications, "the 64-bit double precision floating point value that is as close as possible to the true value of 'e' -- approximately 16 decimal digits" is more than adequate.
As KennyTM said, that value has already been pre-calculated for you in the math library.
If you want to calculate it yourself, as Hans Passant pointed out, factorial already grows very fast.
The first 22 terms in the series are already overkill for calculating to that precision -- adding further terms from the series won't change the result if it's stored in a 64-bit double-precision floating point variable.
I think it will take you longer to blink than for your computer to do 22 divides. So I don't see any reason to optimize this further.
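As a quick illustration (a sketch, not part of the original answer): summing those 22 terms in double precision gives a value that already matches the library exp(1) to roughly 15-16 significant digits.

    #include <cmath>
    #include <cstdio>

    int main() {
        // Sum the first 22 terms of e = 1/0! + 1/1! + ... + 1/21!.
        double term = 1.0;          // holds 1/n!, starting at 1/0!
        double sum = 0.0;
        for (int n = 0; n < 22; ++n) {
            sum += term;
            term /= (n + 1);        // next term: 1/(n+1)!
        }
        std::printf("22-term sum: %.17g\n", sum);
        std::printf("exp(1)     : %.17g\n", std::exp(1.0));
    }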
thousands, millions, or billions of decimal digits
As Matthieu M. pointed out, this value has already been calculated, and you can download it from Yee's web site.
If you want to calculate it yourself, that many digits won't fit in a standard double-precision floating-point number.
You need a "bignum" library.
As always, you can either use one of the many free bignum libraries already available, or re-invent the wheel by building yet another bignum library of your own, with its own special quirks.
The result -- a long file of digits -- is not terribly useful, but programs to calculate it are sometimes used as benchmarks to test the performance and accuracy of "bignum" library software, and as stress tests to check the stability and cooling capacity of new machine hardware.
One page very briefly describes the algorithms Yee uses to calculate mathematical constants.
The Wikipedia "binary splitting" article goes into much more detail.
I think the part you are looking for is the number representation:
instead of internally storing all numbers as a long series of digits before and after the decimal point (or a binary point),
Yee stores each term and each partial sum as a rational number -- as two integers, each of which is a long series of digits.
For example, say one of the worker CPUs was assigned a partial sum of the series. Instead of doing the division first for each term, then adding, and then returning a single million-digit fixed-point result to the manager CPU, that CPU can add all the terms in the series together first with rational arithmetic, and return the rational result to the manager CPU: two integers of perhaps a few hundred digits each.
After thousands of terms have been added together in this way, the manager CPU does the one and only division at the very end to get the decimal digits after the decimal point.
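A toy sketch of that idea (my illustration, using 64-bit integers and the binary-splitting recurrence; a real implementation would use a bignum type and hand the sub-ranges to worker CPUs):

    #include <cmath>
    #include <cstdio>

    // p(a,b)/q(a,b) = sum of a!/k! for k in (a, b], with q(a,b) = (a+1)*(a+2)*...*b.
    struct PQ { long long p, q; };

    PQ sumRange(int a, int b) {
        if (b == a + 1) return { 1, b };                 // single term: a!/(a+1)! = 1/(a+1)
        int m = (a + b) / 2;                             // binary splitting: halve the range
        PQ left = sumRange(a, m), right = sumRange(m, b);
        // Combine the two partial sums with rational arithmetic only -- no division yet.
        return { left.p * right.q + right.p, left.q * right.q };
    }

    int main() {
        PQ s = sumRange(0, 18);                          // 1/1! + 1/2! + ... + 1/18! as p/q
        std::printf("p = %lld, q = %lld\n", s.p, s.q);   // two integers, no digits lost
        double e = 1.0 + (double)s.p / (double)s.q;      // the one and only division, at the end
        std::printf("binary splitting: %.17g\n", e);
        std::printf("exp(1)          : %.17g\n", std::exp(1.0));
    }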
Remember to avoid PrematureOptimization, and
always ProfileBeforeOptimizing.
If you're using double or float, there is an M_E constant in math.h already. There are other representations of e in http://en.wikipedia.org/wiki/Representations_of_e#As_an_infinite_series; all of them will involve division.
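For illustration (note that M_E is a common extension -- POSIX, and _USE_MATH_DEFINES on MSVC -- rather than something the C or C++ standards guarantee):

    #include <cmath>
    #include <cstdio>

    int main() {
        std::printf("M_E     : %.17g\n", M_E);           // predefined constant, where provided
        std::printf("exp(1.0): %.17g\n", std::exp(1.0)); // portable fallback
    }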
I'm not aware of any "faster" computation than the Taylor expansion of the series, i.e.:
e = 1/0! + 1/1! + 1/2! + ...
or
1/e = 1/0! - 1/1! + 1/2! - 1/3! + ...
Considering that these were used by A. Yee, who calculated the first 500 billion digits of e, I guess there's not much optimising to do (or rather, it could be optimised, but nobody has found a way yet, AFAIK)
EDIT
A very rough implementation and its output:
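A sketch of such a rough implementation (an illustration, not the answerer's original listing); it prints the running partial sums of the series, and the last line agrees with e to the precision of a double:

    #include <cstdio>

    int main() {
        // e = 1/0! + 1/1! + 1/2! + ...
        double sum = 0.0, term = 1.0;            // term holds 1/n!
        for (int n = 0; n <= 17; ++n) {
            sum += term;
            term /= (n + 1);                     // next term: 1/(n+1)!
            std::printf("%2d terms: %.15f\n", n + 1, sum);
        }
    }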
This page has a nice rundown of different calculation methods.
I gave this answer at CodeReviews on the question regarding computing e by its definition via Taylor series (so, other methods were not an option). The cross-post here was suggested in the comments. I've removed my remarks relevant to that other topic; those interested in further explanations might want to check the original post.
The solution in C (should be easy enough to adapt to C++):
Output: it agrees with what expl(1) returns, as noted at the end of this answer.
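A sketch in that spirit (an illustration, not the poster's exact listing): it evaluates the series in Horner form, so the smallest terms are added first and the factorial is built implicitly in one pass.

    #include <cmath>
    #include <cstdio>

    int main() {
        // Horner-style evaluation of 1/0! + 1/1! + ... + 1/32!, smallest term first:
        // e = 1 + (1/1)*(1 + (1/2)*(1 + (1/3)*( ... (1 + 1/32) ... )))
        long double e = 1.0L;
        for (int n = 32; n > 0; --n)
            e = e / n + 1.0L;
        std::printf("computed: %.20Lg\n", e);
        std::printf("expl(1) : %.20Lg\n", std::expl(1.0L));
    }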
There are two important points:
This code doesn't compute 1, 1*2, 1*2*3,... which is O(n^2), but computes 1*2*3*... in one pass (which is O(n)).
It starts from smaller numbers. If we tried to compute 1/1 + 1/2 + 1/6 + ... + 1/20! and then tried to add 1/21! to it, we'd be adding 1/21! = 1/51090942171709440000 = 2E-20 to 2.something, which has no effect on the result (a double holds about 16 significant digits). This effect is often called underflow (more precisely, the small term is simply absorbed by rounding). However, when we start with these numbers, i.e., if we compute 1/32! + 1/31! + ..., they all have some impact.
This solution seems to be in accordance with what C computes with its expl function, on my 64-bit machine, compiled with gcc 4.7.2 20120921.
You may be able to gain some efficiency: since each term involves the next factorial, you can remember the last value of the factorial rather than recomputing it.
Expanding the equation:
Instead of computing each factorial from scratch, the denominator is multiplied by the next integer at each step. So keeping the denominator in a variable and multiplying it will produce some optimization, as sketched below.
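A minimal sketch of that suggestion (not the answerer's code):

    #include <cstdio>

    int main() {
        // Keep the factorial in a running variable instead of recomputing it:
        // denom goes 1, 1*2, 1*2*3, ... with one multiplication per term.
        double sum = 1.0;            // the leading 1 (= 1/0!)
        double denom = 1.0;
        for (int n = 1; n <= 20; ++n) {
            denom *= n;              // n! built incrementally
            sum += 1.0 / denom;
        }
        std::printf("%.17g\n", sum);
    }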
If you're ok with an approximation up to seven digits, use
If you want the exact value:
where j is the imaginary unit and pi the well-known mathematical constant (Euler's Identity)
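For illustration only (the literal and the identity below are my own rendering of the suggestion, not the answerer's exact snippet):

    #include <cmath>
    #include <complex>
    #include <cstdio>

    int main() {
        const float approx = 2.718282f;                  // a ~7-significant-digit literal
        std::printf("approx  : %.7g\n", approx);

        // Euler's identity: e^(j*pi) = -1, hence e = (-1)^(1/(j*pi)) = (-1)^(-j/pi).
        const double pi = std::acos(-1.0);
        const std::complex<double> e = std::pow(std::complex<double>(-1.0, 0.0),
                                                std::complex<double>(0.0, -1.0 / pi));
        std::printf("identity: %.17g (imaginary part %.1g)\n", e.real(), e.imag());
    }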
There are several "spigot" algorithms which compute digits sequentially in an unbounded manner. This is useful because you can simply calculate the "next" digit through a constant number of basic arithmetic operations, without defining beforehand how many digits you wish to produce.
These apply a series of successive transformations such that the next digit comes to the 1's place, so that they are not affected by float rounding errors. The efficiency is high because these transformations can be formulated as matrix multiplications, which reduce to integer addition and multiplication.
In short, the Taylor series expansion can be rewritten by factoring out fractional parts of the factorials (note that to make the series regular we've moved the 1 to the left side). We can define a series of functions f1(x) ... fn(x) thus:
The value of e is found from the composition of all of these functions:
We can observe that the value of x in each function is determined by the next function, and that each of these values is bounded on the range [1,2] - that is, for any of these functions, the value of x will satisfy 1 <= x <= 2.
Since this is the case, we can set a lower and upper bound for e by using the values 1 and 2 for x respectively:
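One plausible way to write those formulas out (a reconstruction from the description above; the exact indexing in the original answer may differ, but it matches the 1.7 example below):

    e - 1 = 1 + \frac{1}{2}\Bigl(1 + \frac{1}{3}\Bigl(1 + \frac{1}{4}\bigl(1 + \cdots\bigr)\Bigr)\Bigr)

    f_k(x) = 1 + \frac{x}{k}, \qquad e - 1 = f_2(f_3(f_4(\cdots)))

    \text{lower bound: } f_2(f_3(f_4(1))) = \tfrac{41}{24} \approx 1.708, \qquad
    \text{upper bound: } f_2(f_3(f_4(2))) = \tfrac{7}{4} = 1.75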
We can increase precision by composing the functions defined above, and when a digit matches in the lower and upper bound, we know that our computed value of e is precise to that digit:
Since the 1s and 10ths digits match, we can say that an approximation of (e-1) with precision of 10ths is 1.7. When the first digit matches between the upper and lower bounds, we subtract it off and then multiply by 10 - this way the digit in question is always in the 1's place where floating-point precision is high.
The real optimization comes from the technique in linear algebra of describing a linear function as a transformation matrix. Composing functions maps to matrix multiplication, so all of those nested functions can be reduced to simple integer multiplication and addition. The procedure of subtracting the digit and multiplying by 10 also constitutes a linear transformation, and therefore can also be accomplished by matrix multiplication.
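A compact sketch of that scheme (my illustration, not the answer's code): each transformation is a 2x2 integer matrix acting as a Möbius transformation, a digit is emitted as soon as the x = 1 and x = 2 bounds agree, and the "subtract the digit, multiply by 10" step is just another matrix. With plain 64-bit integers the entries overflow after roughly 14-15 digits, which is where a bignum type would take over.

    #include <cstdio>

    int main() {
        // State z(x) = (a*x + b) / (c*x + d); start with z(x) = x + 1 so that
        // e = z(f_2(f_3(f_4(...)))) with f_k(x) = 1 + x/k and x in [1,2].
        long long a = 1, b = 1, c = 0, d = 1;
        int k = 2;                                     // next term f_k to absorb
        int emitted = 0;
        while (emitted < 14) {                         // 64-bit entries overflow past ~14 digits
            long long lo = (a * 1 + b) / (c * 1 + d);  // floor of z(1): lower bound
            long long hi = (a * 2 + b) / (c * 2 + d);  // floor of z(2): upper bound
            if (lo == hi) {                            // bounds agree: the digit is certain
                std::printf("%lld", lo);
                if (emitted == 0) std::printf(".");
                ++emitted;
                // z := (x -> 10*(x - digit)) o z, i.e. premultiply by [[10, -10*digit], [0, 1]]
                a = 10 * a - 10 * lo * c;
                b = 10 * b - 10 * lo * d;
            } else {
                // z := z o f_k, i.e. postmultiply by [[1, k], [0, k]]
                b = (a + b) * k;
                d = (c + d) * k;
                ++k;
            }
        }
        std::printf("\n");
    }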
Another explanation of the method:
http://www.hulver.com/scoop/story/2004/7/22/153549/352
The paper that describes the algorithm:
http://www.cs.ox.ac.uk/people/jeremy.gibbons/publications/spigot.pdf
A quick intro to performing linear transformations via matrix arithmetic:
https://people.math.gatech.edu/~cain/notes/cal6.pdf
NB this algorithm makes use of Möbius transformations, which are a type of linear fractional transformation described briefly in the Gibbons paper.
From my point of view, the most efficient way to compute e up to a desired precision is to use the following representation:
e := lim (n -> inf): (1 + (1/n))^n
Especially if you choose n = 2^x, you can compute the power with just x multiplications, since a^n = (a^2)^(n/2) if n % 2 = 0.
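A sketch of that computation (my illustration): with n = 2^26, x = 26 squarings suffice, though the limit (1 + 1/n)^n converges only like O(1/n), so even this large n yields only about 7-8 correct digits in double precision.

    #include <cmath>
    #include <cstdio>

    int main() {
        const int x = 26;                              // n = 2^26
        double a = 1.0 + 1.0 / (double)(1u << x);
        for (int i = 0; i < x; ++i)
            a *= a;                                    // a^(2^i) -> a^(2^(i+1))
        std::printf("(1+1/n)^n: %.17g\n", a);          // ~2.71828181..., 7-8 digits correct
        std::printf("exp(1)   : %.17g\n", std::exp(1.0));
    }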
The binary splitting method lends itself nicely to a template metaprogram which produces a type representing a rational corresponding to an approximation of e. 13 iterations seems to be the maximum - any higher will produce an "integral constant overflow" error.
Another (non-metaprogram) templated variation will, at compile-time, calculate a double approximating e. This one doesn't have the limit on the number of iterations.
In an optimised build the expression CalcE<16>::result() will be replaced by the actual double value. Both are arguably quite efficient since they calculate e at compile time :-)
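One possible shape for such a CalcE-style template (a reconstruction, not the answerer's actual code), computing the Taylor sum with constexpr recursion at compile time:

    #include <cstdio>

    template <int K>
    struct Factorial { static constexpr double value = K * Factorial<K - 1>::value; };
    template <>
    struct Factorial<0> { static constexpr double value = 1.0; };

    template <int N>
    struct CalcE {
        // sum of 1/k! for k = 0..N
        static constexpr double result() { return CalcE<N - 1>::result() + 1.0 / Factorial<N>::value; }
    };
    template <>
    struct CalcE<0> {
        static constexpr double result() { return 1.0; }
    };

    int main() {
        constexpr double e = CalcE<16>::result();   // evaluated at compile time; ~15 correct digits
        std::printf("%.17g\n", e);
    }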
@nico Re:
Here are ways to algebraically improve the convergence of Newton’s method:
https://www.researchgate.net/publication/52005980_Improving_the_Convergence_of_Newton's_Series_Approximation_for_e
It appears to be an open question as to whether they can be used in conjunction with binary splitting to computationally speed things up. Nonetheless, here is an example from Damian Conway using Perl that illustrates the improvement in direct computational efficiency for this new approach. It’s in the section titled “???? is for estimation”:
http://blogs.perl.org/users/damian_conway/2019/09/to-compute-a-constant-of-calculusa-treatise-on-multiple-ways.html
(This comment is too long to post as a reply to the answer from Jun 12 '10 at 10:28)
From Wikipedia, replace x with 1:
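Presumably this refers to the power series for the exponential function; substituting x = 1 gives the series for e:

    e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}
    \quad\Longrightarrow\quad
    e = e^1 = \sum_{n=0}^{\infty} \frac{1}{n!}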