C++ BigInt 乘法概念问题

发布于 2024-08-29 20:01:18 字数 1353 浏览 9 评论 0原文

我正在用 C++ 构建一个小型 BigInt 库，以便在我的编程语言中使用。

结构如下：

short digits[ 1000 ];
int   len;

我有一个函数，通过将字符串拆分为单个字符并将它们放入数字中，将字符串转换为bigint。

数字中的数字全部颠倒过来，因此数字 123 如下所示：

digits[0]=3 digits[1]=3 digits[2]=1

我已经成功编写了添加函数，该函数运行良好。

它的工作原理有点像这样：（

overflow = 0
for i ++ until length of both numbers exceeded:
  add numberA[ i ] to numberB[ i ]
  add overflow to the result
  set overflow to 0
  if the result is bigger than 10:
    substract 10 from the result
    overflow = 1
  put the result into numberReturn[ i ]

在这种情况下，溢出是当我将 1 加到 9 时发生的情况：从 10 中减去 10，将 1 添加到溢出，溢出添加到下一个数字）

所以想想如何存储两个数字，就像那些：

   0 | 1 | 2
   ---------
A  2   -   -
B  0   0   1

上面表示 bigint 2 (A) 和 100 (B) 的数字。 - 表示未初始化的数字，无法访问它们。

所以添加上面的数字效果很好：从 0 开始，加 2 + 0，转到 1，加 0，转到 2，加 1

但是：

当我想用上面的结构进行乘法时，我的程序最终执行以下操作：

从 0 开始，将 2 与 0 相乘（eek），转到 1，...

所以很明显，对于乘法，我必须得到这样的顺序：

   0 | 1 | 2
   ---------
A  -   -   2
B  0   0   1

然后，一切都会是明确：从 0 开始，将 0 与 0 相乘，转到 1，将 0 与 0 相乘，转到 2，将 1 与 2 相乘

我如何设法将数字放入乘法的正确形式？
我不想做任何阵列移动/翻转 - 我需要性能！

原文

I'm building a small BigInt library in C++ for use in my programming language.

The structure is like the following:

short digits[ 1000 ];
int   len;

I have a function that converts a string into a bigint by splitting it up into single chars and putting them into digits.

The numbers in digits are all reversed, so the number 123 would look like the following:

digits[0]=3 digits[1]=3 digits[2]=1

I have already managed to code the adding function, which works perfectly.

It works somewhat like this:

overflow = 0
for i ++ until length of both numbers exceeded:
  add numberA[ i ] to numberB[ i ]
  add overflow to the result
  set overflow to 0
  if the result is bigger than 10:
    substract 10 from the result
    overflow = 1
  put the result into numberReturn[ i ]

(Overflow is in this case what happens when I add 1 to 9: Substract 10 from 10, add 1 to overflow, overflow gets added to the next digit)

So think of how two numbers are stored, like those:

   0 | 1 | 2
   ---------
A  2   -   -
B  0   0   1

The above represents the digits of the bigints 2 (A) and 100 (B).
- means uninitialized digits, they aren't accessed.

So adding the above number works fine: start at 0, add 2 + 0, go to 1, add 0, go to 2, add 1

But:

When I want to do multiplication with the above structure, my program ends up doing the following:

Start at 0, multiply 2 with 0 (eek), go to 1, ...

So it is obvious that, for multiplication, I have to get an order like this:

   0 | 1 | 2
   ---------
A  -   -   2
B  0   0   1

Then, everything would be clear: Start at 0, multiply 0 with 0, go to 1, multiply 0 with 0, go to 2, multiply 1 with 2

How can I manage to get digits into the correct form for multiplication?
I don't want to do any array moving/flipping - I need performance!

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

红衣飘飘貌似仙 2024-09-05 20:01:18

为什么你使用 short 在 [0..9] 中存储数字 char 就足够了
你对乘法的思考不正确。在乘法的情况下，您需要一个双 for 循环，将 B 与 A 中的每个数字相乘，并将它们按正确的 10 次方移位求和。

编辑：由于一些匿名者在没有评论的情况下对此投了反对票，这基本上是乘法算法：

bigint prod = 0
for i in A
    prod += B * A[i] * (10 ^ i)

B与A[i]的乘法是通过一个额外的 for 循环，您还可以在其中跟踪进位。 (10 ^ i) 是通过偏移目标索引来实现的，因为 bigint 以 10 为基数。

Why are you using short to store digits in the [0..9] a char would suffice
You're thinking incorrectly about the multiplication. In the case of multiplication you need a double for loop that multiplies B with each digit in A and sums them up shifted with the correct power of ten.

EDIT: Since some anonymous downvoted this without a comment this is basically the multiplication algorithm:

bigint prod = 0
for i in A
    prod += B * A[i] * (10 ^ i)

The multiplication of B with A[i] is done by an extra for loop where you also keep track of the carry. The (10 ^ i) is achieved by offseting the destination indices since bigint is in base 10.

回复收藏 0 原文

傲鸠 2024-09-05 20:01:18

在我看来，你在问题中的例子是过度设计的。由于涉及乘法和加法的剪切次数，您的方法最终将比正常的长乘法更慢。当您一次可以乘以大约 9 时，不要限制自己一次只计算一位基数！。将 base10 字符串转换为一个巨大的值，然后对其进行操作。不要直接对字符串进行操作。你会发疯的。这是一些演示加法和乘法的代码。更改 M 以使用更大的类型。您也可以使用 std::vector，但是您会错过一些优化。

#include <iostream>
#include <string>
#include <algorithm>
#include <sstream>
#include <cstdlib>
#include <cstdio>
#include <iomanip>

#ifdef _DEBUG
#include <assert.h>
#define ASSERT(x) assert(x)
#else
#define ASSERT(x)
#endif

namespace Arithmetic
{
    const int M = 64;
    const int B = (M-1)*32;

    struct Flags
    {
        Flags() : C(false),Z(false),V(false),N(false){}
        void Clear()
        {
            C = false;
            Z = false;
            V = false;
            N = false;
        }
        bool C,Z,V,N;
    };

    static unsigned int hvAdd(unsigned int a, unsigned int b, Flags& f)
    {
        unsigned int c;
        f.Clear();
        //b = -(signed)b;
        c = a + b;

        f.N = (c >> 31UL) & 0x1;
        f.C = (c < a) && (c < b);
        f.Z = !c;
        f.V = (((signed)a < (signed)b) != f.N);

        return c;
    }

    static unsigned int hvSub(unsigned int a, unsigned int b, Flags& f)
    {
        unsigned int c;
        f.Clear();
        c = a - b;

        //f.N = ((signed)c < 0);
        f.N = (c >> 31UL) & 0x1;
        f.C = (c < a) && (c < b);
        f.Z = !c;
        f.V = (((signed)a < (signed)b) != f.N);

        return c;
    }


    struct HugeVal
    {
        HugeVal()
        {
            std::fill(part, part + M, 0);
        }
        HugeVal(const HugeVal& h)
        {
            std::copy(h.part, h.part + M, part);
        }
        HugeVal(const std::string& str)
        {
            Flags f;
            unsigned int tmp = 0;

            std::fill(part, part + M, 0);

            for(unsigned int i=0; i < str.length(); ++i){
                unsigned int digit = (unsigned int)str[i] - 48UL;
                unsigned int carry_last = 0;
                unsigned int carry_next = 0;
                for(int i=0; i<M; ++i){
                    tmp = part[i]; //the value *before* the carry add
                    part[i] = hvAdd(part[i], carry_last, f);
                    carry_last = 0;
                    if(f.C)
                        ++carry_last;
                    for(int j=1; j<10; ++j){
                        part[i] = hvAdd(part[i], tmp, f);
                        if(f.C)
                            ++carry_last;
                    }
                }
                part[0] = hvAdd(part[0], digit, f);
                int index = 1;
                while(f.C && index < M){
                    part[index] = hvAdd(part[index], 1, f);
                    ++index;
                }
            }
        }
        /*
        HugeVal operator= (const HugeVal& h)
        {
            *this = HugeVal(h);
        }
        */
        HugeVal operator+ (const HugeVal& h) const
        {
            HugeVal tmp;
            Flags f;
            int index = 0;
            unsigned int carry_last = 0;
            for(int j=0; j<M; ++j){
                if(carry_last){
                    tmp.part[j] = hvAdd(tmp.part[j], carry_last, f);
                    carry_last = 0;
                }
                tmp.part[j] = hvAdd(tmp.part[j], part[j], f);
                if(f.C)
                    ++carry_last;
                tmp.part[j] = hvAdd(tmp.part[j], h.part[j], f);
                if(f.C)
                    ++carry_last;
            }
            return tmp;
        }
        HugeVal operator* (const HugeVal& h) const
        {
            HugeVal tmp;

            for(int j=0; j<M; ++j){
                unsigned int carry_next = 0;
                for(int i=0;i<M; ++i){

                    Flags f;

                    unsigned int accum1 = 0;
                    unsigned int accum2 = 0;
                    unsigned int accum3 = 0;
                    unsigned int accum4 = 0;

                    /* Split into 16-bit values */
                    unsigned int j_LO = part[j]&0xFFFF;
                    unsigned int j_HI = part[j]>>16;
                    unsigned int i_LO = h.part[i]&0xFFFF;
                    unsigned int i_HI = h.part[i]>>16;

                    size_t index = i+j;
                    size_t index2 = index+1;

                    /* These multiplications are safe now. Can't overflow */
                    accum1 = j_LO * i_LO;
                    accum2 = j_LO * i_HI;
                    accum3 = j_HI * i_LO;
                    accum4 = j_HI * i_HI;


                    if(carry_next){ //carry from last iteration
                        accum1 = hvAdd(accum1, carry_next, f); //add to LSB
                        carry_next = 0;
                        if(f.C) //LSB produced carry
                            ++carry_next;
                    }

                    /* Add the lower 16-bit parts of accum2 and accum3 to accum1 */
                    accum1 = hvAdd(accum1, (accum2 << 16), f);
                    if(f.C)
                        ++carry_next;
                    accum1 = hvAdd(accum1, (accum3 << 16), f);
                    if(f.C)
                        ++carry_next;



                    if(carry_next){ //carry from LSB
                        accum4 = hvAdd(accum4, carry_next, f); //add to MSB
                        carry_next = 0;
                        ASSERT(f.C == false);
                    }

                    /* Add the higher 16-bit parts of accum2 and accum3 to accum4 */
                    /* Can't overflow */
                    accum4 = hvAdd(accum4, (accum2 >> 16), f);
                    ASSERT(f.C == false);
                    accum4 = hvAdd(accum4, (accum3 >> 16), f);
                    ASSERT(f.C == false);
                    if(index < M){
                        tmp.part[index] = hvAdd(tmp.part[index], accum1, f);
                        if(f.C)
                            ++carry_next;
                    }
                    carry_next += accum4;
                }
            }
            return tmp;
        }
        void Print() const
        {
            for(int i=(M-1); i>=0; --i){

                printf("%.8X", part[i]);
            }
            printf("\n");
        }
        unsigned int part[M];
    };

}


int main(int argc, char* argv[])
{

    std::string a1("273847238974823947823941");
    std::string a2("324230432432895745949");

    Arithmetic::HugeVal a = a1;
    Arithmetic::HugeVal b = a2;

    Arithmetic::HugeVal d = a + b;
    Arithmetic::HugeVal e = a * b;

    a.Print();
    b.Print();
    d.Print();
    e.Print();
    system("pause");
}

Your example in the question is over-engineering at its best in my opinion. Your approach will end up even slower than normal long multiplication, because of the shear number of multiplications and additions involved. Don't limit yourself to working at one base digit at a time when you can multiply approximately 9 at a time!. Convert the base10 string to a hugeval, and then do operations on it. Don't do operations directly on the string. You will go crazy. Here is some code which demonstrates addition and multiplication. Change M to use a bigger type. You could also use std::vector, but then you miss out on some optimizations.

#include <iostream>
#include <string>
#include <algorithm>
#include <sstream>
#include <cstdlib>
#include <cstdio>
#include <iomanip>

#ifdef _DEBUG
#include <assert.h>
#define ASSERT(x) assert(x)
#else
#define ASSERT(x)
#endif

namespace Arithmetic
{
    const int M = 64;
    const int B = (M-1)*32;

    struct Flags
    {
        Flags() : C(false),Z(false),V(false),N(false){}
        void Clear()
        {
            C = false;
            Z = false;
            V = false;
            N = false;
        }
        bool C,Z,V,N;
    };

    static unsigned int hvAdd(unsigned int a, unsigned int b, Flags& f)
    {
        unsigned int c;
        f.Clear();
        //b = -(signed)b;
        c = a + b;

        f.N = (c >> 31UL) & 0x1;
        f.C = (c < a) && (c < b);
        f.Z = !c;
        f.V = (((signed)a < (signed)b) != f.N);

        return c;
    }

    static unsigned int hvSub(unsigned int a, unsigned int b, Flags& f)
    {
        unsigned int c;
        f.Clear();
        c = a - b;

        //f.N = ((signed)c < 0);
        f.N = (c >> 31UL) & 0x1;
        f.C = (c < a) && (c < b);
        f.Z = !c;
        f.V = (((signed)a < (signed)b) != f.N);

        return c;
    }


    struct HugeVal
    {
        HugeVal()
        {
            std::fill(part, part + M, 0);
        }
        HugeVal(const HugeVal& h)
        {
            std::copy(h.part, h.part + M, part);
        }
        HugeVal(const std::string& str)
        {
            Flags f;
            unsigned int tmp = 0;

            std::fill(part, part + M, 0);

            for(unsigned int i=0; i < str.length(); ++i){
                unsigned int digit = (unsigned int)str[i] - 48UL;
                unsigned int carry_last = 0;
                unsigned int carry_next = 0;
                for(int i=0; i<M; ++i){
                    tmp = part[i]; //the value *before* the carry add
                    part[i] = hvAdd(part[i], carry_last, f);
                    carry_last = 0;
                    if(f.C)
                        ++carry_last;
                    for(int j=1; j<10; ++j){
                        part[i] = hvAdd(part[i], tmp, f);
                        if(f.C)
                            ++carry_last;
                    }
                }
                part[0] = hvAdd(part[0], digit, f);
                int index = 1;
                while(f.C && index < M){
                    part[index] = hvAdd(part[index], 1, f);
                    ++index;
                }
            }
        }
        /*
        HugeVal operator= (const HugeVal& h)
        {
            *this = HugeVal(h);
        }
        */
        HugeVal operator+ (const HugeVal& h) const
        {
            HugeVal tmp;
            Flags f;
            int index = 0;
            unsigned int carry_last = 0;
            for(int j=0; j<M; ++j){
                if(carry_last){
                    tmp.part[j] = hvAdd(tmp.part[j], carry_last, f);
                    carry_last = 0;
                }
                tmp.part[j] = hvAdd(tmp.part[j], part[j], f);
                if(f.C)
                    ++carry_last;
                tmp.part[j] = hvAdd(tmp.part[j], h.part[j], f);
                if(f.C)
                    ++carry_last;
            }
            return tmp;
        }
        HugeVal operator* (const HugeVal& h) const
        {
            HugeVal tmp;

            for(int j=0; j<M; ++j){
                unsigned int carry_next = 0;
                for(int i=0;i<M; ++i){

                    Flags f;

                    unsigned int accum1 = 0;
                    unsigned int accum2 = 0;
                    unsigned int accum3 = 0;
                    unsigned int accum4 = 0;

                    /* Split into 16-bit values */
                    unsigned int j_LO = part[j]&0xFFFF;
                    unsigned int j_HI = part[j]>>16;
                    unsigned int i_LO = h.part[i]&0xFFFF;
                    unsigned int i_HI = h.part[i]>>16;

                    size_t index = i+j;
                    size_t index2 = index+1;

                    /* These multiplications are safe now. Can't overflow */
                    accum1 = j_LO * i_LO;
                    accum2 = j_LO * i_HI;
                    accum3 = j_HI * i_LO;
                    accum4 = j_HI * i_HI;


                    if(carry_next){ //carry from last iteration
                        accum1 = hvAdd(accum1, carry_next, f); //add to LSB
                        carry_next = 0;
                        if(f.C) //LSB produced carry
                            ++carry_next;
                    }

                    /* Add the lower 16-bit parts of accum2 and accum3 to accum1 */
                    accum1 = hvAdd(accum1, (accum2 << 16), f);
                    if(f.C)
                        ++carry_next;
                    accum1 = hvAdd(accum1, (accum3 << 16), f);
                    if(f.C)
                        ++carry_next;



                    if(carry_next){ //carry from LSB
                        accum4 = hvAdd(accum4, carry_next, f); //add to MSB
                        carry_next = 0;
                        ASSERT(f.C == false);
                    }

                    /* Add the higher 16-bit parts of accum2 and accum3 to accum4 */
                    /* Can't overflow */
                    accum4 = hvAdd(accum4, (accum2 >> 16), f);
                    ASSERT(f.C == false);
                    accum4 = hvAdd(accum4, (accum3 >> 16), f);
                    ASSERT(f.C == false);
                    if(index < M){
                        tmp.part[index] = hvAdd(tmp.part[index], accum1, f);
                        if(f.C)
                            ++carry_next;
                    }
                    carry_next += accum4;
                }
            }
            return tmp;
        }
        void Print() const
        {
            for(int i=(M-1); i>=0; --i){

                printf("%.8X", part[i]);
            }
            printf("\n");
        }
        unsigned int part[M];
    };

}


int main(int argc, char* argv[])
{

    std::string a1("273847238974823947823941");
    std::string a2("324230432432895745949");

    Arithmetic::HugeVal a = a1;
    Arithmetic::HugeVal b = a2;

    Arithmetic::HugeVal d = a + b;
    Arithmetic::HugeVal e = a * b;

    a.Print();
    b.Print();
    d.Print();
    e.Print();
    system("pause");
}

回复收藏 0 原文

尸血腥色 2024-09-05 20:01:18

好吧，看到这个问题大约 11 年前就得到了回答，我想我将为正在编写自己的 BigInt 库的人提供一些指导。

首先，如果您想要的是纯粹的性能而不是研究如何实际编写高性能代码，请学习如何使用 GMP 或 OpenSSL。要达到 GMP 的性能水平，有一个非常陡峭的学习曲线。

好吧，让我们开始吧。

当可以使用更大的基数时，不要使用基数 10。
CPU 擅长加减乘除，所以要充分利用它们。

假设您有两个 BigInt，

a = {9,7,4,2,6,1,6,8} // length 8
b = {3,6,7,2,4,6,7,8} // length 8
// Frustrating writing for-loops to calculate a*b

当它们可以对基数 2^32 进行 1 次计算时，不要让它们以基数 10 进行 50 次计算：

a = {97426168}
b = {36724678}
// Literally only need to type a*b

如果您的计算机可以表示的最大数字是 2^64-1，则使用 2^32-1 作为BigInt 的基数，因为它将解决乘法时实际溢出的问题。

使用支持动态内存的结构。扩展你的程序来处理两个一百万位数的乘法可能会破坏你的程序，因为它在堆栈上没有足够的内存。在 C 中使用 std::vector 代替 std::array 或 raw int[] 来利用内存。
了解SIMD以提升您的计算性能。菜鸟代码中的典型循环无法同时处理多个数据。学习这一点应该可以将速度提高 3 到 12 倍。
了解如何编写自己的内存分配器。如果您使用 std::vector 来存储无符号整数，那么稍后您可能会遇到性能问题，因为 std::vector 仅用于一般目的。尝试根据自己的需要定制分配器，以避免每次执行计算时都进行分配和重新分配。
了解计算机的体系结构和内存布局。编写自己的汇编代码来适应特定的 CPU 架构肯定会提高您的性能。这也有助于编写您自己的内存分配器和 SIMD。
算法。对于小型 BigInt，您可以依靠小学乘法，但随着输入的增加，一定要好好看看 Karatsuba、Toom-Cook，最后是 FFT，以便在您的库中实现。

如果您遇到困难，请访问我的 BigInt 库。它没有自定义分配器、SIMD 代码或自定义汇编代码，但对于 BigInteger 的初学者来说应该足够了。

Alright seeing that this question is answered almost 11 years ago, I figure I'll provide some pointers for the one who is writing their own BigInt library.

First off, if what you want is purely performance instead of studying how to actually write performant code, please just learn how to use GMP or OpenSSL. There is a really really steep learning curve to reach the level of GMP's performance.

Ok, let's get right into it.

Don't use base 10 when you can use a bigger base.
CPUs are god-level good at addition, subtraction, multiplication, and division, so take advantage of them.

Suppose you have two BigInt

a = {9,7,4,2,6,1,6,8} // length 8
b = {3,6,7,2,4,6,7,8} // length 8
// Frustrating writing for-loops to calculate a*b

Don't make them do 50 calculations in base 10 when they could do 1 calculations of base 2^32:

a = {97426168}
b = {36724678}
// Literally only need to type a*b

If the biggest number your computer can represent is 2^64-1, use 2^32-1 as the base for your BigInt, as it will solve the problem of actually overflowing when doing multiplication.

Use a structure that supports dynamic memory. Scaling your program to handle the multiplication of two 1-million digits numbers would probably break you program since it doesn't have enough memory on the stack. Use a std::vector instead of std::array or raw int[] in C to make use of your memory.
Learn about SIMD to give your calculation a boost in performance. Typical loops in noobs' codes can't process multiple data at the same time. Learning this should speed things up from 3 to 12 times.
Learn how to write your own memory allocators. If you use std::vector to store your unsigned integers, chances are, later on, you'll suffer performance problems as std::vector is only for general purposes only. Try to tailor your allocator to your own need to avoid allocating and reallocating every time a calculation is performed.
Learn about your computer's architecture and memory layout. Writing your own assembly code to fit certain CPU architecture would certainly boost your performance. This helps with writing your own memory allocator and SIMD too.
Algorithms. For small BigInt you can rely on your grade school multiplication but as the input grows, certainly take a good look at Karatsuba, Toom-Cook, and finally FFT to implement in your library.

If you're stuck, please visit my BigInt library. It doesn't have custom allocator, SIMD code or custom assembly code, but for starters of BigInteger it should be enough.

回复收藏 0 原文

ら栖息 2024-09-05 20:01:18

安德烈亚斯是对的，你必须将一个数字乘以另一个数字的每一位，然后将结果相加。我认为最好将较长的数字乘以较短的数字。如果您在数组 char 中存储十进制数字确实就足够了，但是如果您想要性能，也许您应该考虑更大的类型。我不知道您的平台是什么，但例如在 x86 的情况下，您可以使用 32 位整数和硬件支持来给出 32 位乘法的 64 位结果。

回复收藏 0 原文