任意精度算术 解释
我正在尝试学习 C,但遇到了无法处理非常大的数字(即 100 位数字、1000 位数字等)的情况。 我知道存在可以执行此操作的库,但我想尝试自己实现它。
我只是想知道是否有人拥有或可以提供任意精度算术的非常详细的、简单化的解释。
I'm trying to learn C and have come across the inability to work with REALLY big numbers (i.e., 100 digits, 1000 digits, etc.). I am aware that there exist libraries to do this, but I want to attempt to implement it myself.
I just want to know if anyone has or can provide a very detailed, dumbed down explanation of arbitrary-precision arithmetic.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(8)
虽然重新发明轮子对你的个人熏陶和学习非常有好处,但它也是一项极其艰巨的任务。 我不想劝阻您,因为这是一项重要的练习,而且是我自己做过的一项,但您应该意识到,较大的软件包可以解决工作中存在的微妙而复杂的问题。
例如,乘法。 天真地,你可能会想到“小学生”的方法,即将一个数字写在另一个数字之上,然后像你在学校学到的那样进行长乘法。 例子:
但是这个方法非常慢(O(n^2),n是位数)。 相反,现代 bignum 包使用离散傅立叶变换或数值变换将其转换为本质上 O(n ln(n)) 的运算。
这仅适用于整数。 当您使用某种类型的实际数字表示(log、sqrt、exp 等)的更复杂的函数时,事情会变得更加复杂。
如果您想了解一些理论背景,我强烈建议您阅读 Yap 的书的第一章,"基础知识算法代数问题”。 正如已经提到的,gmp bignum 库是一个优秀的库。 对于实数,我使用了 MPFR 并且喜欢它。
While re-inventing the wheel is extremely good for your personal edification and learning, its also an extremely large task. I don't want to dissuade you as its an important exercise and one that I've done myself, but you should be aware that there are subtle and complex issues at work that larger packages address.
For example, multiplication. Naively, you might think of the 'schoolboy' method, i.e. write one number above the other, then do long multiplication as you learned in school. example:
but this method is extremely slow (O(n^2), n being the number of digits). Instead, modern bignum packages use either a discrete Fourier transform or a Numeric transform to turn this into an essentially O(n ln(n)) operation.
And this is just for integers. When you get into more complicated functions on some type of real representation of number (log, sqrt, exp, etc.) things get even more complicated.
If you'd like some theoretical background, I highly recommend reading the first chapter of Yap's book, "Fundamental Problems of Algorithmic Algebra". As already mentioned, the gmp bignum library is an excellent library. For real numbers, I've used MPFR and liked it.
不要重新发明轮子:它可能会变成方形!
使用经过尝试和测试的第三方库,例如 GNU MP。
Don't reinvent the wheel: it might turn out to be square!
Use a third party library, such as GNU MP, that is tried and tested.
您的操作方式基本上与使用铅笔和纸相同...
malloc
和 < code>realloc) 根据需要16 位字的字节作为基本计算单位,
由您的体系结构决定。
二进制或十进制基数的选择取决于您对最大空间效率、人类可读性的期望以及芯片上是否存在二进制编码十进制 (BCD) 数学支持。
You do it in basically the same way you do with pencil and paper...
malloc
andrealloc
) as neededTypically you will use as you basic unit of computation
as dictated by your architecture.
The choice of binary or decimal base depends on you desires for maximum space efficiency, human readability, and the presence of absence of Binary Coded Decimal (BCD) math support on your chip.
你可以用高中数学水平来做到这一点。 尽管现实中使用了更先进的算法。 例如,将两个 1024 字节的数字相加:
结果必须大
一位
,以防相加时要考虑最大值。 看看这个:如果您想学习,TTMath 是一个很棒的库。 它是使用 C++ 构建的。 上面的例子很愚蠢,但这就是一般加法和减法的完成方式!
关于该主题的一个很好的参考是数学运算的计算复杂性。 它告诉您要实施的每个操作需要多少空间。 例如,如果您有两个
N位
数字,那么您需要2N位
来存储乘法结果。正如Mitch所说,到目前为止,这并不是一件容易实现的任务! 如果您了解 C++,我建议您看一下 TTMath。
You can do it with high school level of mathematics. Though more advanced algorithms are used in reality. So for example to add two 1024-byte numbers :
result will have to be bigger by
one place
in case of addition to take care of maximum values. Look at this :TTMath is a great library if you want to learn. It is built using C++. The above example was silly one, but this is how addition and subtraction is done in general!
A good reference about the subject is Computational complexity of mathematical operations. It tells you how much space is required for each operation you want to implement. For example, If you have two
N-digit
numbers, then you need2N digits
to store the result of multiplication.As Mitch said, it is by far not an easy task to implement! I recommend you take a look at TTMath if you know C++.
最终参考文献之一(恕我直言)是 Knuth 的 TAOCP 第 II 卷。 它解释了许多用于表示数字的算法以及对这些表示的算术运算。
One of the ultimate references (IMHO) is Knuth's TAOCP Volume II. It explains lots of algorithms for representing numbers and arithmetic operations on these representations.
假设您希望自己编写一个大整数代码,这可能非常简单,就像最近这样做过的人所说的那样(尽管是在 MATLAB 中)。以下是我使用的一些技巧:
我存储了每个单独的整数作为双精度数的十进制数字。 这使得许多操作变得简单,尤其是输出。 虽然它确实占用了比您希望的更多的存储空间,但这里的内存很便宜,并且如果您可以有效地对一对向量进行卷积,那么乘法就会变得非常高效。 或者,您可以将多个十进制数字存储在双精度数中,但请注意,执行乘法的卷积可能会导致非常大的数字出现数值问题。
单独存储一个符号位。
两数相加主要是数字相加,然后每一步检查进位。
一对数字的乘法最好以卷积后跟进位步骤的方式完成,至少如果您有快速卷积代码的话。
一对数字的乘法最好
即使将数字存储为一串单独的十进制数字,也可以通过除法(也称为 mod/rem 运算)在结果中一次获得大约 13 位十进制数字。 这比一次仅处理 1 个十进制数字的除法要高效得多。
要计算整数的整数幂,请计算指数的二进制表示形式。 然后根据需要使用重复的平方运算来计算幂。
许多操作(因式分解、素性测试等)将受益于 powermod 操作。 也就是说,当您计算 mod(a^p,N) 时,请在求幂的每一步减少结果 mod N,其中 p 已以二进制形式表示。 不要先计算 a^p,然后尝试减少它 mod N。
Assuming that you wish to write a big integer code yourself, this can be surprisingly simple to do, spoken as someone who did it recently (though in MATLAB.) Here are a few of the tricks I used:
I stored each individual decimal digit as a double number. This makes many operations simple, especially output. While it does take up more storage than you might wish, memory is cheap here, and it makes multiplication very efficient if you can convolve a pair of vectors efficiently. Alternatively, you can store several decimal digits in a double, but beware then that convolution to do the multiplication can cause numerical problems on very large numbers.
Store a sign bit separately.
Addition of two numbers is mainly a matter of adding the digits, then check for a carry at each step.
Multiplication of a pair of numbers is best done as convolution followed by a carry step, at least if you have a fast convolution code on tap.
Even when you store the numbers as a string of individual decimal digits, division (also mod/rem ops) can be done to gain roughly 13 decimal digits at a time in the result. This is much more efficient than a divide that works on only 1 decimal digit at a time.
To compute an integer power of an integer, compute the binary representation of the exponent. Then use repeated squaring operations to compute the powers as needed.
Many operations (factoring, primality tests, etc.) will benefit from a powermod operation. That is, when you compute mod(a^p,N), reduce the result mod N at each step of the exponentiation where p has been expressed in a binary form. Do not compute a^p first, and then try to reduce it mod N.
这是我用 PHP 做的一个简单(天真的)示例。
我实现了“加”和“乘”并将其用于指数示例。
http://adevsoft.com/simple-php- Arbitration- precision-integer-big-num-example/
代码片段
Here's a simple ( naive ) example I did in PHP.
I implemented "Add" and "Multiply" and used that for an exponent example.
http://adevsoft.com/simple-php-arbitrary-precision-integer-big-num-example/
Code snip
这完全取决于足够的存储和算法将数字视为较小的部分。 假设您有一个编译器,其中
int
只能是 0 到 99,并且您希望处理最大 999999 的数字(为了简单起见,我们在这里只关心正数)。为此,您可以为每个数字指定三个
int
,并使用您(应该)在小学时学到的加法、减法和其他基本运算的相同规则。在任意精度库中,用于表示数字的基本类型的数量没有固定限制,只要内存可以容纳即可。
例如加法:
123456 + 78
:从最低有效端开始计算:
事实上,这就是加法在 CPU 内部的位级上通常的工作方式。
减法类似(使用基类型减法并借位而不是进位),乘法可以通过重复加法(非常慢)或叉积(更快)来完成,除法更棘手,但可以通过数字的移位和减法来完成涉及(你小时候会学到的长除法)。
实际上,我已经编写了库来使用十的最大幂来执行此类操作,该十的最大幂在平方时可以适合整数(以防止将两个 int 相乘时溢出,例如 16-位
int
被限制为 0 到 99,在平方时生成 9,801 (<32,768),或者 32 位int
使用 0 到 9,999 生成 99,980,001 (<2,147,483,648) )这极大地简化了算法。一些需要注意的技巧。
1/ 当数字相加或相乘时,预先分配所需的最大空间,然后如果发现太多则减少。 例如,添加两个 100“数字”(其中数字是
int
)数字永远不会超过 101 个数字。 将 12 位数字乘以 3 位数字将永远不会生成超过 15 位数字(加上位数)。2/ 为了提高速度,仅在绝对必要时才对数字进行标准化(减少所需的存储空间) - 我的库将其作为单独的调用,以便用户可以在速度和存储问题之间做出决定。
3/ 正数和负数的加法是减法,减去负数与加上等值的正数相同。 通过在调整符号后互相调用加法和减法方法,可以节省大量代码。
4/ 避免用小数减去大数,因为你总是会得到这样的数字:
相反,从 11 中减去 10,然后对其求反:
以下是来自我必须执行此操作的库之一的注释(已转换为文本)。 不幸的是,代码本身是受版权保护的,但您也许能够找到足够的信息来处理四个基本操作。 假设下面的
-a
和-b
表示负数,a
和b
为零或正数。对于加法,如果符号不同,则使用负数的减法:
对于减法,如果符号不同,则使用负数的加法:
还有特殊处理以确保我们大数减去小数:
乘法使用入门级数学,如下所示:
实现此目的的方法包括一次(向后)提取 32 个数字的每个数字,然后使用 add 计算要添加到结果中的值(最初为零)。
ShiftLeft
和ShiftRight
运算用于快速将LongInt
乘以或除以换行值(10 表示“真实”数学)。 在上面的示例中,我们将 475 加零 2 次(32 的最后一位)得到 950(结果 = 0 + 950 = 950)。然后我们左移 475 得到 4750,右移 32 得到 3。将 4750 与零相加 3 次得到 14250,然后与 950 的结果相加得到 15200。
左移 4750 得到 47500,右移 3 得到 0。右移 32 现在为零,我们完成了,事实上 475 x 32 确实等于 15200。
除法也很棘手,但基于早期算术(“进入”的“gazinta”方法) )。 考虑以下
12345 / 27
的长除法:因此
12345 / 27
是457
,余数为6
。 验证:这是通过使用缩减变量(最初为零)一次缩减 12345 的一个部分,直到它大于或等于 27 来实现的。
然后我们只需从中减去 27,直到低于 27 - 数字减法是添加到顶行的部分。
当没有更多的部分需要拆除时,我们就得到了结果。
请记住,这些是非常基本的算法。 如果你的数字特别大,有更好的方法来进行复杂的算术。 您可以查看类似 GNU 多精度算术库 - 它比我自己的库更好、更快。
它确实有一个相当不幸的错误功能,因为如果内存不足,它就会简单地退出(在我看来,对于通用库来说这是一个相当致命的缺陷),但是,如果你能忽略这一点,它的功能非常好。
如果您因许可原因而无法使用它(或者因为您不希望您的应用程序无缘无故地退出),您至少可以从那里获取算法以集成到您自己的代码中。
我还发现 MPIR(GMP 的一个分支)的机构更愿意讨论潜在的问题变化——他们似乎对开发者更加友好。
It's all a matter of adequate storage and algorithms to treat numbers as smaller parts. Let's assume you have a compiler in which an
int
can only be 0 through 99 and you want to handle numbers up to 999999 (we'll only worry about positive numbers here to keep it simple).You do that by giving each number three
int
s and using the same rules you (should have) learned back in primary school for addition, subtraction and the other basic operations.In an arbitrary precision library, there's no fixed limit on the number of base types used to represent our numbers, just whatever memory can hold.
Addition for example:
123456 + 78
:Working from the least significant end:
This is, in fact, how addition generally works at the bit level inside your CPU.
Subtraction is similar (using subtraction of the base type and borrow instead of carry), multiplication can be done with repeated additions (very slow) or cross-products (faster) and division is trickier but can be done by shifting and subtraction of the numbers involved (the long division you would have learned as a kid).
I've actually written libraries to do this sort of stuff using the maximum powers of ten that can be fit into an integer when squared (to prevent overflow when multiplying two
int
s together, such as a 16-bitint
being limited to 0 through 99 to generate 9,801 (<32,768) when squared, or 32-bitint
using 0 through 9,999 to generate 99,980,001 (<2,147,483,648)) which greatly eased the algorithms.Some tricks to watch out for.
1/ When adding or multiplying numbers, pre-allocate the maximum space needed then reduce later if you find it's too much. For example, adding two 100-"digit" (where digit is an
int
) numbers will never give you more than 101 digits. Multiply a 12-digit number by a 3 digit number will never generate more than 15 digits (add the digit counts).2/ For added speed, normalise (reduce the storage required for) the numbers only if absolutely necessary - my library had this as a separate call so the user can decide between speed and storage concerns.
3/ Addition of a positive and negative number is subtraction, and subtracting a negative number is the same as adding the equivalent positive. You can save quite a bit of code by having the add and subtract methods call each other after adjusting signs.
4/ Avoid subtracting big numbers from small ones since you invariably end up with numbers like:
Instead, subtract 10 from 11, then negate it:
Here are the comments (turned into text) from one of the libraries I had to do this for. The code itself is, unfortunately, copyrighted, but you may be able to pick out enough information to handle the four basic operations. Assume in the following that
-a
and-b
represent negative numbers anda
andb
are zero or positive numbers.For addition, if signs are different, use subtraction of the negation:
For subtraction, if signs are different, use addition of the negation:
Also special handling to ensure we're subtracting small numbers from large:
Multiplication uses entry-level math as follows:
The way in which this is achieved involves extracting each of the digits of 32 one at a time (backwards) then using add to calculate a value to be added to the result (initially zero).
ShiftLeft
andShiftRight
operations are used to quickly multiply or divide aLongInt
by the wrap value (10 for "real" math). In the example above, we add 475 to zero 2 times (the last digit of 32) to get 950 (result = 0 + 950 = 950).Then we left shift 475 to get 4750 and right shift 32 to get 3. Add 4750 to zero 3 times to get 14250 then add to result of 950 to get 15200.
Left shift 4750 to get 47500, right shift 3 to get 0. Since the right shifted 32 is now zero, we're finished and, in fact 475 x 32 does equal 15200.
Division is also tricky but based on early arithmetic (the "gazinta" method for "goes into"). Consider the following long division for
12345 / 27
:Therefore
12345 / 27
is457
with remainder6
. Verify:This is implemented by using a draw-down variable (initially zero) to bring down the segments of 12345 one at a time until it's greater or equal to 27.
Then we simply subtract 27 from that until we get below 27 - the number of subtractions is the segment added to the top line.
When there are no more segments to bring down, we have our result.
Keep in mind these are pretty basic algorithms. There are far better ways to do complex arithmetic if your numbers are going to be particularly large. You can look into something like GNU Multiple Precision Arithmetic Library - it's substantially better and faster than my own libraries.
It does have the rather unfortunate misfeature in that it will simply exit if it runs out of memory (a rather fatal flaw for a general purpose library in my opinion) but, if you can look past that, it's pretty good at what it does.
If you cannot use it for licensing reasons (or because you don't want your application just exiting for no apparent reason), you could at least get the algorithms from there for integrating into your own code.
I've also found that the bods over at MPIR (a fork of GMP) are more amenable to discussions on potential changes - they seem a more developer-friendly bunch.