How to do a double-chunk add with no undefined behaviour?
EDIT Public health warning - this question includes a false assumption about undefined behaviour. See accepted answer.
After reading a recent blog post, I've been thinking a lot about the practicality of avoiding all standards-undefined assumptions in C and C++ code. Here is a snippet cut out of C++, to do an unsigned 128-bit addition...
    void c_UInt64_Pair::operator+= (const c_UInt64_Pair &p)
    {
        m_Low  += p.m_Low;
        m_High += p.m_High;
        if (m_Low < p.m_Low) m_High++;
    }
This clearly relies on assumptions about overflow behaviour. Obviously most machines can support a binary integer of the right kind (though perhaps building from 32-bit chunks or whatever), but there's apparently a growing chance that the optimiser may exploit the standards-undefined behaviour here. That is, the only way that the m_Low < p.m_Low condition can pass is if m_Low += p.m_Low overflows, which is undefined behaviour, so the optimiser can legally decide that the condition always fails. In which case, this code is simply broken.
The question is, therefore...
How can you write a reasonably efficient version of the above without relying on undefined behaviour?
Assume that you have an appropriate 64-bit binary machine integer, but that you have a malicious compiler that will always interpret your undefined behaviour in the worst possible (or impossible) way. Also, assume that you don't have some special built-in, intrinsic, library or whatever to do it for you.
EDIT minor clarification - this isn't just about detecting overflow, but also ensuring that both m_Low and m_High end up with the correct modulo 2^64 results, which is also standards-undefined.
Comments (2)
From the C++ 1998 standard, 3.9.1(4): "Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer." Note that "integer", here, refers to any integer type rather than just int. Therefore, assuming that those are unsigned integers, like the "UInt64" in the type suggests, this is defined behavior in C++ and should work as expected.
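To make the point above concrete, here is a minimal sketch of the same addition relying only on unsigned modulo 2^64 wraparound. The struct `UInt64Pair` and its members are hypothetical stand-ins for the question's `c_UInt64_Pair`, and it assumes the members are `std::uint64_t`. One deliberate difference from the question's snippet: the operand's low word is copied first, so that adding an object to itself still detects the carry.

```cpp
#include <cstdint>

// Hypothetical stand-in for the question's c_UInt64_Pair.
// Assumption: both halves are unsigned 64-bit integers.
struct UInt64Pair {
    std::uint64_t low;
    std::uint64_t high;

    UInt64Pair& operator+=(const UInt64Pair& p) {
        // Copy the operand's low word first, in case p refers to *this.
        const std::uint64_t addend_low = p.low;

        high += p.high;      // wraps modulo 2^64; defined for unsigned types
        low  += addend_low;  // wraps modulo 2^64; defined for unsigned types

        // After wraparound the sum is smaller than the addend
        // exactly when a carry out of the low word occurred.
        if (low < addend_low) {
            ++high;
        }
        return *this;
    }
};
```

For instance, adding {low = 1, high = 0} to {low = UINT64_MAX, high = 0} should leave low at 0 and high at 1.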
If you want an actually efficient method, you'll have to code in something other than C or C++. For a reasonably efficient one, you have to ensure that overflow never happens, and detect and compensate for the cases where it would have.
Basically, for each 64-bit component, you need to calculate the addition of the low 63 bits separately from the addition of the highest bit. From these separate calculations you can work out what the 64-bit total was, and whether there was a carry.
Then when you do the upper 64-bit add, you add in the carry, if there is one. If a carry results from that, then you've overflowed your 128-bit variable, and you'll need to trigger an exception, or otherwise handle the case.
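The split described above could be sketched roughly as follows. The names `add_with_carry`, `SplitPair` and `add_in_place` are made up for illustration, and the sketch assumes `std::uint64_t` is available. No intermediate sum here can exceed 64 bits, so it never relies on wraparound at all.

```cpp
#include <cstdint>

// Hypothetical helper: adds a + b + carry_in using only sub-additions
// that cannot exceed 64 bits. Reports the carry out of bit 63 in carry_out.
inline std::uint64_t add_with_carry(std::uint64_t a, std::uint64_t b,
                                    bool carry_in, bool& carry_out) {
    const std::uint64_t low_mask = (std::uint64_t{1} << 63) - 1;

    // Low 63 bits plus carry_in: at most 2 * (2^63 - 1) + 1 = 2^64 - 1.
    const std::uint64_t low_sum =
        (a & low_mask) + (b & low_mask) + (carry_in ? 1u : 0u);

    // Top bit of each operand plus the carry out of the low 63 bits: 0..3.
    const std::uint64_t top_sum = (a >> 63) + (b >> 63) + (low_sum >> 63);

    carry_out = (top_sum >> 1) != 0;
    return (low_sum & low_mask) | ((top_sum & 1) << 63);
}

// Hypothetical 128-bit value held as two 64-bit halves.
struct SplitPair {
    std::uint64_t low;
    std::uint64_t high;
};

// Adds y into x; returns true if the 128-bit result overflowed.
inline bool add_in_place(SplitPair& x, const SplitPair& y) {
    const std::uint64_t y_low = y.low;    // copied in case y aliases x
    const std::uint64_t y_high = y.high;

    bool carry = false;
    x.low = add_with_carry(x.low, y_low, false, carry);

    bool overflow = false;
    x.high = add_with_carry(x.high, y_high, carry, overflow);
    return overflow;
}
```

The caller decides what to do with the returned overflow flag, for example throwing an exception or saturating, as the answer suggests.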