Assembly Language - How to do Modulo?
Is there something like a modulo operator or instruction in x86 assembly?
If your modulus / divisor is a known constant, and you care about performance, see this and this. A multiplicative inverse is even possible for loop-invariant values that aren't known until runtime, e.g. see https://libdivide.com/ (but without JIT code-gen, that's less efficient than hard-coding just the steps necessary for one constant).
Never use `div` for known powers of 2: it's much slower than `and` for remainder, or right-shift for divide. Look at C compiler output for examples of unsigned or signed division by powers of 2, e.g. on the Godbolt compiler explorer. If you know a runtime input is a power of 2, use `lea eax, [esi-1]` ; `and eax, edi` or something like that to do `x & (y-1)`. Modulo 256 is even more efficient: `movzx eax, cl` has zero latency on recent Intel CPUs (mov-elimination), as long as the two registers are separate.
In the simple/general case: unknown value at runtime
The `DIV` instruction (and its counterpart `IDIV` for signed numbers) gives both the quotient and remainder. For unsigned, remainder and modulus are the same thing. For signed `idiv`, it gives you the remainder (not modulus), which can be negative: e.g. `-5 / 2 = -2 rem -1`. x86 division semantics exactly match C99's `%` operator.
`DIV r32` divides a 64-bit number in `EDX:EAX` by a 32-bit operand (in any register or memory) and stores the quotient in `EAX` and the remainder in `EDX`. It faults on overflow of the quotient.
Unsigned 32-bit example (works in any mode)
In 16-bit assembly you can do `div bx` to divide a 32-bit operand in `DX:AX` by `BX`. See Intel's Architectures Software Developer's Manuals for more information.
Normally always use `xor edx,edx` before unsigned `div` to zero-extend EAX into EDX:EAX. This is how you do "normal" 32-bit / 32-bit => 32-bit division.
For signed division, use `cdq` before `idiv` to sign-extend EAX into EDX:EAX. See also "Why should EDX be 0 before using the DIV instruction?". For other operand-sizes, use `cbw` (AL->AX), `cwd` (AX->DX:AX), `cdq` (EAX->EDX:EAX), or `cqo` (RAX->RDX:RAX) to set the top half to `0` or `-1` according to the sign bit of the low half.
`div` / `idiv` are available in operand-sizes of 8, 16, 32, and (in 64-bit mode) 64-bit. 64-bit operand-size is much slower than 32-bit or smaller on current Intel CPUs, but AMD CPUs only care about the actual magnitude of the numbers, regardless of operand-size.
Note that 8-bit operand-size is special: the implicit inputs/outputs are in AH:AL (aka AX), not DL:AL. See "8086 assembly on DOSBox: Bug with idiv instruction?" for an example.
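To make the remainder-versus-modulus distinction concrete, here is a minimal C sketch. It relies on the point made above that C99's `%` matches `idiv` semantics, so `rem_trunc` stands in for what `idiv` leaves in the remainder register. `mod_floor` is a hypothetical helper (not something from the answer) that shows one way to get a floored modulo, whose sign follows the divisor.

```c
#include <assert.h>

/* Remainder with truncation toward zero, as C99 `%` (and x86 idiv) define it.
 * The result takes the sign of the dividend.
 * Callers must avoid b == 0 and the INT_MIN % -1 overflow case. */
int rem_trunc(int a, int b)
{
    assert(b != 0);
    return a % b;
}

/* Hypothetical helper: floored modulo, result takes the sign of the divisor. */
int mod_floor(int a, int b)
{
    assert(b != 0);
    int r = a % b;
    /* If the remainder is non-zero and its sign differs from the divisor's,
     * shift it by one divisor to land in the floored range. */
    if (r != 0 && ((r < 0) != (b < 0)))
        r += b;
    return r;
}
```

With these definitions, `rem_trunc(-5, 2)` gives -1 (the `idiv` result), while `mod_floor(-5, 2)` gives 1.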
Signed 64-bit division example (requires 64-bit mode)
Limitations / common mistakes
`div dword 10` is not encodeable into machine code (so your assembler will report an error about invalid operands).
Unlike with `mul` / `imul` (where you should normally use the faster 2-operand `imul r32, r/m32` or 3-operand `imul r32, r/m32, imm8/32` forms, which don't waste time writing a high-half result), there is no newer opcode for division by an immediate, or 32-bit/32-bit => 32-bit division or remainder without the high-half dividend input.
Division is so slow and (hopefully) rare that they didn't bother to add a way to let you avoid EAX and EDX, or to use an immediate directly.
div and idiv will fault if the quotient doesn't fit into one register (AL / AX / EAX / RAX, the same width as the dividend). This includes division by zero, but will also happen with a non-zero EDX and a smaller divisor. This is why C compilers just zero-extend or sign-extend instead of splitting up a 32-bit value into DX:AX.
And also why `INT_MIN / -1` is C undefined behaviour: it overflows the signed quotient on 2's complement systems like x86. See "Why does integer division by -1 (negative one) result in FPE?" for an example of x86 vs. ARM. x86 `idiv` does indeed fault in this case.
The x86 exception is `#DE` - divide exception. On Unix/Linux systems, the kernel delivers a SIGFPE arithmetic exception signal to processes that cause a #DE exception. (On which platforms does integer divide by zero trigger a floating point exception?)
For `div`, using a dividend with `high_half < divisor` is safe. e.g. `0x11:23 / 0x12` is less than `0xff` so it fits in an 8-bit quotient.
Extended-precision division of a huge number by a small number can be implemented by using the remainder from one chunk as the high-half dividend (EDX) for the next chunk. This is probably why they chose remainder=EDX quotient=EAX instead of the other way around.
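The chunk-by-chunk scheme described above can be sketched in C, with a 64-bit intermediate standing in for the EDX:EAX pair. The function name `bignum_div_u32` and the most-significant-limb-first layout are assumptions made for this illustration. Because each step's remainder is smaller than the divisor, the next step's high half satisfies `high_half < divisor`, so every partial quotient fits in 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Divide a big unsigned number by a 32-bit divisor.
 * limbs[0] is the most significant 32-bit chunk (an assumed layout).
 * The quotient overwrites the limbs in place; the final remainder is returned. */
uint32_t bignum_div_u32(uint32_t *limbs, size_t n, uint32_t divisor)
{
    assert(divisor != 0);
    uint32_t rem = 0;   /* plays the role of EDX carried between steps */
    for (size_t i = 0; i < n; i++) {
        /* rem:limbs[i] is the 64-bit dividend, like EDX:EAX before div r32.
         * rem < divisor, so the quotient of this step fits in 32 bits. */
        uint64_t dividend = ((uint64_t)rem << 32) | limbs[i];
        limbs[i] = (uint32_t)(dividend / divisor);
        rem = (uint32_t)(dividend % divisor);
    }
    return rem;
}
```

For instance, dividing the two-limb value `{1, 0}` (that is, 2^32) by 10 leaves `{0, 429496729}` in the array and returns a remainder of 6.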
If you compute modulo a power of two, using bitwise AND is simpler and generally faster than performing division. If `b` is a power of two, `a % b == a & (b - 1)`.
For example, let's take a value in register EAX, modulo 64.
The simplest way would be `AND EAX, 63`, because 63 is 111111 in binary.
The masked, higher digits are not of interest to us. Try it out!
Similarly, instead of using MUL or DIV with powers of two, bit-shifting is the way to go. Beware signed integers, though!
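A small C sketch of the masking identity, and of why signed values need care. The function names are made up for this illustration. For unsigned values the mask and `%` agree. For negative signed values they do not, because `%` truncates toward zero while the mask always yields a non-negative result.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned a % b for a power-of-two b: keep only the low bits. */
uint32_t mod_pow2_u32(uint32_t a, uint32_t b)
{
    assert(b != 0 && (b & (b - 1)) == 0);   /* b must be a power of two */
    return a & (b - 1);
}

/* Signed remainder by 64 with C's truncating semantics (can be negative). */
int32_t rem64_i32(int32_t a)
{
    return a % 64;
}

/* The plain mask applied to a signed value (always in 0..63). */
int32_t and_mask64_i32(int32_t a)
{
    return a & 63;
}
```

For -5, `rem64_i32` returns -5 but `and_mask64_i32` returns 59, so `AND` alone is only a drop-in replacement when the value is known to be non-negative or a floored modulo is what you want.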
An easy way to see what a modulus operator looks like on various architectures is to use the Godbolt Compiler Explorer.
https://godbolt.org/z/64zKGr
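The contents of the linked example are not reproduced here. As an assumed stand-in, functions along these lines can be pasted into the Compiler Explorer to compare what a compiler emits for a runtime divisor, a non-power-of-two constant, and a power-of-two constant. The comments describe what optimizing compilers typically produce, not a guarantee.

```c
#include <assert.h>

/* Runtime divisor: typically compiles to an actual div instruction on x86. */
unsigned mod_runtime(unsigned a, unsigned b)
{
    assert(b != 0);
    return a % b;
}

/* Constant divisor: typically a multiply by a fixed-point inverse plus shifts. */
unsigned mod_const10(unsigned a)
{
    return a % 10;
}

/* Power-of-two constant: typically a single and with 63. */
unsigned mod_const64(unsigned a)
{
    return a % 64;
}

/* Signed power-of-two constant: typically needs extra fix-up for negative inputs. */
int smod_const64(int a)
{
    return a % 64;
}
```

Compiling with optimization enabled (for example `-O2`) and switching the target architecture in the compiler drop-down shows how the same source maps to different instruction sequences.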
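If you don't care too much about performance and want to use the straightforward way, you can use either `DIV` or `IDIV`.
`DIV` or `IDIV` takes only one operand, the divisor, which can be a register or a memory location only. The dividend is implicit in fixed registers.
When the operand is a byte:
AL = AX / operand, AH = remainder (modulus).
Ex:
MOV AX,0031h ; AX = 0031h (AH must be zero, because the dividend is all of AX)
MOV BL,6 ; divisor (6 is the value implied by the results shown)
DIV BL ; AL (quotient) = 08h, AH (remainder) = 01h
When the operand is a word:
AX = (DX:AX) / operand, DX = remainder (modulus).
Ex:
XOR DX,DX ; zero the high half of the DX:AX dividend
MOV AX,9031h ; AX = 9031h
MOV BX,6 ; divisor (6 is the value implied by the results shown)
DIV BX ; AX (quotient) = 1808h, DX (remainder) = 0001h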
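For the byte-sized case, the implicit dividend is the whole of AX, so a stale value in AH changes the result or causes a fault. The C model below is a sketch of that behaviour; `div8_model` is a hypothetical name, and returning `false` stands in for the divide exception the CPU would raise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of unsigned `div r/m8`: AX is the dividend, AL gets the quotient,
 * AH gets the remainder. Returns false where the CPU would raise #DE
 * (divide by zero, or a quotient that does not fit in 8 bits). */
bool div8_model(uint16_t ax, uint8_t divisor, uint8_t *al, uint8_t *ah)
{
    assert(al && ah);
    if (divisor == 0)
        return false;
    uint16_t quotient = (uint16_t)(ax / divisor);
    if (quotient > 0xFF)
        return false;               /* quotient overflow */
    *al = (uint8_t)quotient;
    *ah = (uint8_t)(ax % divisor);
    return true;
}
```

With AX = 0031h and a divisor of 6 the model yields AL = 08h and AH = 01h. With AH left at 07h (AX = 0731h) the quotient would be 306, which does not fit in AL, so the model reports a fault.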