如何在 C 或 C++ 中对 128 位整数进行加法和减法 如果我的编译器不支持它们?
我正在为 128 位数字的长流编写一个压缩器。 我想将数字存储为差异 - 仅存储数字之间的差异而不是数字本身,因为我可以将差异打包在更少的字节中,因为它们更小。
然而,为了压缩,我需要减去这些 128 位值,为了解压缩,我需要添加这些值。 我的编译器的最大整数大小是 64 位宽。
有人有什么想法可以有效地做到这一点吗?
I'm writing a compressor for a long stream of 128 bit numbers. I would like to store the numbers as differences -- storing only the difference between the numbers rather than the numbers themselves because I can pack the differences in fewer bytes because they are smaller.
However, for compression then I need to subtract these 128 bit values, and for decompression I need to add these values. Maximum integer size for my compiler is 64 bits wide.
Anyone have any ideas for doing this efficiently?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(7)
如果您需要的只是加法和减法,并且您已经有了二进制形式的 128 位值,那么库可能会很方便,但并不是绝对必要的。 这个数学对于你自己来说是微不足道的。
我不知道您的编译器对 64 位类型使用什么,因此我将使用 INT64 和 UINT64 来表示有符号和无符号 64 位整数。
If all you need is addition and subtraction, and you already have your 128-bit values in binary form, a library might be handy but isn't strictly necessary. This math is trivial to do yourself.
I don't know what your compiler uses for 64-bit types, so I'll use INT64 and UINT64 for signed and unsigned 64-bit integer quantities.
查看 GMP。
输出是
Take a look at GMP.
The output is
完全是偶然发现这篇相对较旧的文章,我认为有必要详细阐述 Volte 之前的猜想,以帮助没有经验的读者。
首先,128 位数字的有符号范围是 -2127 到 2127-1,而不是 -2127 到 2127 按照最初规定。
其次,由于有限算术的循环性质,两个 128 位数字之间所需的最大差为 -2127 到 2127-1,其存储先决条件为128 位,而不是 129 位。尽管 (2127-1) - (-2127) = 2128-1 显然更大大于最大 2127-1 正整数,算术溢出始终确保任意两个 n 位数字之间的最近距离始终落在 0 到 2 范围内n-1,因此隐式 -2n-1 到 2n- 1-1。
为了清楚起见,让我们首先检查假设的 3 位处理器如何实现二进制加法。 作为示例,请考虑下表,该表描述了 3 位整数的绝对无符号范围。
0 = 000b
1 = 001b
2 = 010b
3 = 011b
4 = 100b
5 = 101b
6 = 110b
7 = 111b ---> [溢出时循环回到 000b]
从上表中可以明显看出:
001b(1) + 010b(2) = 011b(3)
同样明显的是,将这些数字中的任何一个与其数字补码始终生成 2n-1:
010b(2) + 101b([2 的补码] = 5) = 111b(7) = (23-1)
由于当两个 n 位数字相加导致 (n+1 ) 位结果,因此,将这些数字中的任何一个与其数字补码 + 1 相加将始终产生 0:
010b(2) + 110b([2 的补码] + 1) = 000b(0 )
因此我们可以说 [n 的补码] + 1 = -n,因此 n + [n< 的补码/i>] + 1 = n + (-n) = 0。此外,如果我们现在知道 n + [< 的补集i>n] + 1 = 0,则 n + [n 的补码 - x] + 1 必须 = n - (n-x) = x。
将此应用于我们原始的 3 位表,得出:
0 = 000b = [0 的补码] + 1 = 0
1 = 001b = [7 的补码] + 1 = -7
2 = 010b = [6 的补码] + 1 = -6
3 = 011b = [5 的补码] + 1 = -5
4 = 100b = [4 的补码] + 1 = -4
5 = 101b = [3 的补码] + 1 = -3
6 = 110b = [2 的补码] + 1 = -2
7 = 111b = [1的补码] + 1 = -1 ---> [溢出时循环回到 000b]
无论表示抽象是正数、负数还是两者的组合(如带符号补码算术所暗示的那样),我们现在有 2n < i>n 位模式,可无缝服务正 0 至 2n-1 和负 0 至 -(2n )-1 根据需要而变化。 事实上,所有现代处理器都采用这样的系统来实现用于加法和减法运算的通用 ALU 电路。 当CPU遇到
i1 - i2
减法指令时,它会在内部对i2
执行[补+1]运算,然后通过加法电路处理操作数以计算i1
+ [i2
的补码] + 1。除了额外的进位/符号异或门溢出标志、有符号和无符号加法以及隐式减法之外,都是隐含的。如果我们将上表应用到输入序列 [-2n-1, 2n-1- 1, -2n-1] 正如 Volte 的原始回复中所示,我们现在能够计算以下 n 位差分:
diff #1:
(2n-1-1) - (-2n-1) =
3 - (-4) = 3 + 4 =
(-1) = 7 = 111b
差异 #2:
(-2n-1) - (2n-1-1) =
(-4) - 3 = (-4) + (5) =
(-7) = 1 = 001b
从我们的种子 -2n-1 开始,我们现在可以通过应用来重现原始输入序列上述每个差异依次为:
(-2n-1) + (diff #1) =
(-4) + 7 = 3 =
2n-1-1
(2n-1-1) + (差异 #2) =
3 + (-7) = (-4) =
-2n-1
您当然可能希望采用更哲学的方法来解决这个问题并推测为什么 2n 个循环顺序数需要超过 2n 个循环顺序微分?
塔利东。
Having stumbled across this relatively old post entirely by accident, I thought it pertinent to elaborate on Volte's previous conjecture for the benefit of inexperienced readers.
Firstly, the signed range of a 128-bit number is -2127 to 2127-1 and not -2127 to 2127 as originally stipulated.
Secondly, due to the cyclic nature of finite arithmetic the largest required differential between two 128-bit numbers is -2127 to 2127-1, which has a storage prerequisite of 128-bits, not 129. Although (2127-1) - (-2127) = 2128-1 which is clearly greater than our maximum 2127-1 positive integer, arithmetic overflow always ensures that the nearest distance between any two n-bit numbers always falls within the range 0 to 2n-1 and thus implicitly -2n-1 to 2n-1-1.
In order to clarify, let us first examine how a hypothetical 3-bit processor would implement binary addition. As an example, consider the following table which depicts the absolute unsigned range of a 3-bit integer.
0 = 000b
1 = 001b
2 = 010b
3 = 011b
4 = 100b
5 = 101b
6 = 110b
7 = 111b ---> [Cycles back to 000b on overflow]
From the above table it is readily apparent that:
001b(1) + 010b(2) = 011b(3)
It is also apparent that adding any of these numbers with its numeric complement always yields 2n-1:
010b(2) + 101b([complement of 2] = 5) = 111b(7) = (23-1)
Due to the cyclic overflow which occurs when the addition of two n-bit numbers results in an (n+1)-bit result, it therefore follows that adding any of these numbers with its numeric complement + 1 will always yield 0:
010b(2) + 110b([complement of 2] + 1) = 000b(0)
Thus we can say that [complement of n] + 1 = -n, so that n + [complement of n] + 1 = n + (-n) = 0. Furthermore, if we now know that n + [complement of n] + 1 = 0, then n + [complement of n - x] + 1 must = n - (n-x) = x.
Applying this to our original 3-bit table yields:
0 = 000b = [complement of 0] + 1 = 0
1 = 001b = [complement of 7] + 1 = -7
2 = 010b = [complement of 6] + 1 = -6
3 = 011b = [complement of 5] + 1 = -5
4 = 100b = [complement of 4] + 1 = -4
5 = 101b = [complement of 3] + 1 = -3
6 = 110b = [complement of 2] + 1 = -2
7 = 111b = [complement of 1] + 1 = -1 ---> [Cycles back to 000b on overflow]
Whether the representational abstraction is positive, negative or a combination of both as implied with signed twos-complement arithmetic, we now have 2n n-bit patterns which can seamlessly serve both positive 0 to 2n-1 and negative 0 to -(2n)-1 ranges as and when required. In point of fact, all modern processors employ just such a system in order to implement common ALU circuitry for both addition and subtraction operations. When a CPU encounters an
i1 - i2
subtraction instruction, it internally performs a [complement + 1] operation oni2
and subsequently processes the operands through the addition circuitry in order to computei1
+ [complement ofi2
] + 1. With the exception of an additional carry/sign XOR-gated overflow flag, both signed and unsigned addition, and by implication subtraction, are each implicit.If we apply the above table to the input sequence [-2n-1, 2n-1-1, -2n-1] as presented in Volte's original reply, we are now able to compute the following n-bit differentials:
diff #1:
(2n-1-1) - (-2n-1) =
3 - (-4) = 3 + 4 =
(-1) = 7 = 111b
diff #2:
(-2n-1) - (2n-1-1) =
(-4) - 3 = (-4) + (5) =
(-7) = 1 = 001b
Starting with our seed -2n-1, we are now able to reproduce the original input sequence by applying each of the above differentials sequentially:
(-2n-1) + (diff #1) =
(-4) + 7 = 3 =
2n-1-1
(2n-1-1) + (diff #2) =
3 + (-7) = (-4) =
-2n-1
You may of course wish to adopt a more philosophical approach to this problem and conjecture as to why 2n cyclically-sequential numbers would require more than 2n cyclically-sequential differentials?
Taliadon.
Boost 1.53 现在包含多精度:
输出:
Boost 1.53 now includes multiprecision:
Output:
有很多关于大整数数学的文献。 您可以使用免费提供的库之一(请参阅链接),也可以创建自己的库。 尽管我应该警告您,对于比加法和减法(以及移位)更复杂的事情,您需要使用不平凡的算法。
要进行加法和减法,您可以创建一个包含两个 64 位整数的类/结构。 您可以使用简单的学校数学来进行加法和减法。 基本上,就像用铅笔和纸一样进行加法或减法,并仔细考虑进位/借位。
搜索大整数。 顺便说一句,最新版本的 VC++、IntelC++ 和 GCC 编译器具有 128 位整数类型,尽管我不确定它们是否像您希望的那样易于访问(它们旨在与 sse2/xmms 寄存器一起使用)。
There is a lot of literature regarding large integer math. You can use one of the libraries freely available (see links) or you can roll your own. Although I should warn you, for anything more complicated than addition and subtraction (and shifts), you'll need to use non-trivial algorithms.
To add and subtract, you can create a class/structure that holds two 64-bit integers. You can use simple school math to do the addition and subtraction. Basically, do what you do with a pencil and paper to add or subtract, with careful consideration to carries/borrows.
Search for large integer. Btw recent versions of VC++, IntelC++ and GCC compilers have 128-bit integer types, although I'm not sure they are as easily accessible as you might like (they are intended to be used with sse2/xmms registers).
另外值得注意的是:如果目标仅仅是通过预处理来改进数字流的压缩,那么预处理后的流不必由精确的算术差异组成。 您可以使用 XOR (
^
) 代替+
和-
。 优点是 128 位 XOR 与 64 位部分上的两个独立 XOR 完全相同,因此既简单又高效。Also worth noting: if the goal is merely to improve the compression of a stream of numbers by preprocessing it, then the preprocessed stream doesn't have to be made of exactly arithmetic differences. You can use XOR (
^
) instead of+
and-
. The advantage is that a 128-bit XOR is exactly the same as two independent XORs on the 64-bit parts, so it is both simple and efficient.TomsFastMath 有点像 GMP(上面提到的),但是是公共领域,并且是根据速度极快(甚至包含针对 x86、x86-64、ARM、SSE2、PPC32 和 AVR32 的汇编代码优化)。
TomsFastMath is a bit like GMP (mentioned above), but is public domain, and was designed from the ground up to be extremely fast (it even contains assembly code optimizations for x86, x86-64, ARM, SSE2, PPC32, and AVR32).