有符号 64 x 32 整数除法

发布于 2024-07-27 02:39:18 字数 790 浏览 5 评论 0原文

假设您有一个机器指令 udive,它通过采用 (32 位被除数 << 32) / 32 位除数来执行特殊情况 64 x 32 无符号除法,我们可以使用以下命令执行完整的 64 x 32 除法:

// assume: a / b guaranteed not to overflow
a = 64bit dividend, a.h & a.l are hi & lo 32bits respectively
b = 32bit divisor

q1 = udive(a.h, b)  // (a.h << 32) / b
r1 = -(q1 * b)      // remainder of the above, shortcut since a.h & 0xffffffff == 0
q2 = a.l / b        // a.l / b using regular unsigned division
r2 = a.l - (q2 * b) // remainder of the above
q = q1 + q2
r = r1 + r2

// r < r2, r overflowed and is >32bits, implies r > b since b is 32bits
// r >= b, quotient too small by 1, adjust
if (r < r2) or (r >= b)
    q = q + 1
return q

但是有符号情况给我带来了问题。 假设有一个等效的 sdive 指令执行 udive 的签名版本,我无法完全弄清楚如何处理余数等。

Assuming you have a machine instruction udive that does a special case 64 by 32 unsigned division by taking a (32bit dividend << 32) / 32bit divisor, we can do a full 64 by 32 division using the following:

// assume: a / b guaranteed not to overflow
a = 64bit dividend, a.h & a.l are hi & lo 32bits respectively
b = 32bit divisor

q1 = udive(a.h, b)  // (a.h << 32) / b
r1 = -(q1 * b)      // remainder of the above, shortcut since a.h & 0xffffffff == 0
q2 = a.l / b        // a.l / b using regular unsigned division
r2 = a.l - (q2 * b) // remainder of the above
q = q1 + q2
r = r1 + r2

// r < r2, r overflowed and is >32bits, implies r > b since b is 32bits
// r >= b, quotient too small by 1, adjust
if (r < r2) or (r >= b)
    q = q + 1
return q

However the signed case is giving me problems. Assuming an equivalent sdive instruction that does the signed version of udive, I can't quite work out how to deal with the remainders and whatnot.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

歌枕肩 2024-08-03 02:39:18

我认为,如果明确说明哪些变量是 32 位、哪些变量是 64 位以及比较是有符号的还是无符号的,那么你的无符号代码会更容易阅读。

黑客之乐这本书通常适合这类低级算术知识。 我目前手头没有副本,但它在给定 64/32->32 的情况下执行 64/64->64 的代码在线: http://www.hackersdelight.org/HDcode/newCode/divDouble.c

这通过简单地取绝对值来完成签名的情况输入,进行无符号除法,然后如果输入具有不同的符号则翻转结果位的符号。 这对我来说这可能是最好的方法(它肯定比其他方法更容易证明正确)。 您可能需要将股息作为特殊情况,使其成为可能的最小整数(如果它不会随波逐流)。

I think your unsigned code would be easier to read if it was explicit about which variables were 32 bit, which were 64 bit and whether the comparisons were signed or unsigned.

The book Hacker's Delight is generally good for this sort of low-level arithmetic stuff. I don't have a copy to hand at the moment, but its code for doing 64/64->64 given 64/32->32 is online: http://www.hackersdelight.org/HDcode/newCode/divDouble.c

That does the signed case by simply taking the absolute values of the inputs, doing an unsigned division and then flipping the sign of the result bit if the inputs had different sign. This suggests to me that this is probably the best approach (it's certainly easier to prove correct than the alternative). You might need to special case the dividend being the minimum possible integer if it doesn't just fall out in the wash.

江心雾 2024-08-03 02:39:18

谢谢回答。 我看过《黑客之乐》。 如果您查看 HD 中的 divdi3 函数,就会发现当除数为 32 位并且已知结果不会溢出时,会调用 DIVS(64/32->32 指令)。 我所在的机器没有通用的 64/32->32 指令,它有上面提到的特殊情况。 上面的 64/32->32 函数是我在未签名情况下对 DIVU 的实现,所以我试图为 DIVS 找出类似的东西。

我可以忘记 DIVS 路径,但这是绝大多数常见的情况,我想尽可能多地使用它,这只是一个棘手的实现。

很抱歉伪代码不清楚,但所有内容都是无符号的,而且大部分都是 32 位的。

DIVU(uint64 a, uint32 b) {
// assume: a / b guaranteed not to overflow
// a = 64bit dividend, a.h & a.l are hi & lo 32bits respectively
// b = 32bit divisor

uint32 q1 = udive(a.h, b)      // (a.h << 32) / b
uint32 r1 = -(q1 * b)          // remainder of the above, shortcut for (a.h << 32) - (q1 * b) since (a.h << 32) & 0xffffffff == 0
uint32 q2 = a.l / b            // a.l / b using regular unsigned division
uint32 r2 = a.l - (q2 * b)     // remainder of the above
uint32 q = q1 + q2
uint32 r = r1 + r2

// r < r2, r overflowed and is >32bits, implies r > b since b is 32bits
// r >= b, quotient too small by 1, adjust
if (r < r2) or (r >= b)
        q = q + 1
return q
}

thanks for answering. I've looked at Hacker's Delight. If you look at the divdi3 function in HD there's a call to DIVS, the 64/32->32 instruction, when the divisor is 32-bit and the result is known not to overflow. The machine I'm on doesn't have a general purpose 64/32->32 instruction, it has the special case mentioned above. The above function for 64/32->32 is my implementation for DIVU in the unsigned case, so I'm trying to work out something similar for DIVS.

I could just forget about the DIVS path, but that's the overwhelmingly common case and I want to hit it as much as possible, it's just a tricky implementation.

Sorry for the unclear pseudo code, but everything is unsigned and mostly 32bit.

DIVU(uint64 a, uint32 b) {
// assume: a / b guaranteed not to overflow
// a = 64bit dividend, a.h & a.l are hi & lo 32bits respectively
// b = 32bit divisor

uint32 q1 = udive(a.h, b)      // (a.h << 32) / b
uint32 r1 = -(q1 * b)          // remainder of the above, shortcut for (a.h << 32) - (q1 * b) since (a.h << 32) & 0xffffffff == 0
uint32 q2 = a.l / b            // a.l / b using regular unsigned division
uint32 r2 = a.l - (q2 * b)     // remainder of the above
uint32 q = q1 + q2
uint32 r = r1 + r2

// r < r2, r overflowed and is >32bits, implies r > b since b is 32bits
// r >= b, quotient too small by 1, adjust
if (r < r2) or (r >= b)
        q = q + 1
return q
}
压抑⊿情绪 2024-08-03 02:39:18

可以忽略divdi3调用div的特殊优化情况; 我想提请注意的是,当 divdi3 需要进行全强度除法时,它是通过调用 udivdi3 而不是通过与 udivdi3 算法等效的有符号除法来实现的。

看看 Knuth vol 2,我发现他基本上还说,你可以通过取绝对值的过程来进行有符号除法,进行无符号除法,然后修复符号位。 这对我来说很直观,因为有符号的 2s 补数不具有无符号数 a == (ah * 2^32) + al 所具有的便利属性,因此通过操作来组装 64 位操作并不容易两半分开。

前后摆弄不应该在无符号除法上有那么多额外的周期...

PS:这到底是什么奇怪的CPU? :-)

You can ignore divdi3's special optimisation case of calling divs; the thing I wanted to draw attention to was the fact that when divdi3 needs to do full-strength division it does it by calling udivdi3 rather than by having a signed-division equivalent to the udivdi3 algorithm.

Looking in Knuth vol 2 I see that he also basically says that you do signed division by the process of taking absolute values, doing an unsigned divide and then fixing up the sign bit afterwards. This makes intuitive sense to me, because signed 2s complement numbers don't have the convenient property that a == (a.h * 2^32) + a.l that unsigned numbers do, so it's not as easy to assemble 64 bit operations by operating on the two halves separately.

The before-and-after fiddling shouldn't be that many extra cycles over the unsigned divide...

PS: what weirdo CPU is this, anyway? :-)

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文