I'm looking at an example of Intel assembly being assembled by NASM. It has the instruction:
add byte [ebx], 32
How would I know from the documentation what "byte" does?
The book I'm reading explains in the text how "byte" tells the assembler that we're only writing a single byte to ebx. It's not clear to me how I'd know this from looking at the documentation.
From examples in the book and elsewhere, it looks like the ADD instruction has two forms:
ADD <dest> <src>
ADD <size> <dest> <src>
However, when I look at the Intel documentation[1], I don't see anything that looks like either of my forms. Each of the instructions given in the table has only a single comma which, to me, makes it seem like all the corresponding opcodes take only two inputs. There is a table that gives "Instruction Operand Encoding". Operands 3 and 4 are NA. Looking around the web, most sites don't mention anything about the size parameter (let alone whether it applies to my processor).
I'm assembling on an Intel(R) Core(TM) i7-6700HQ CPU in 386 mode:
nasm -f elf -g -F stabs -o $OBJECTFILE $1
ld -m elf_i386 -o $BUILDNAME $OBJECTFILE
Maybe the instruction takes an extra operand for 386 but not for newer architectures?
[1] "Intel® 64 and IA-32 Architectures Software Developer’s Manual Combined Volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D, and 4", Vol. 2A 3-31 page 605 in the pdf.
https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
The
byte
keyword in the asm source sets the operand-size attribute of the instruction. In machine code, that would be implied by the numeric opcode, or for 16/32/64-bit operand-size, by the current CPU mode and prefixes for the non-byte opcode. Intel's manual documents the machine-code forms, not asm source syntax. See the following re: how that gets encoded into machine code.
This is why assemblers have manuals, too, separate from the ISA manual.
For example, NASM's manual, Chapter 3: The NASM Language
3.1 Layout of a NASM Source Line describes the syntax layout of a mnemonic and operands. Unfortunately, that section neglects to mention the size overrides you can put in front of operands; it mentions only prefixes like
o16
that you can put in front of the mnemonic (a clunkier, manual way to specify the operand-size).
The manual does have examples of operand-size overrides in multiple places, e.g. 2.2 Quick Start for MASM Users points out that NASM needs
mov word [var], 2
even if var is declared as
var dw 0
which in MASM would magically imply an operand-size for that instruction. The manual also mentions the same specifiers when used with
strict
to force the encoding of the immediate, not just the operand-size. e.g.
add ecx, strict dword 123
forces the
add r/m32, imm32
form, while
add ecx, dword 123
still allows the
add r/m32, imm8
form. (https://www.felixcloutier.com/x86/add)
Some other x86 assemblers, like GAS and clang/LLVM, by default use AT&T syntax, which is very different from the syntax Intel's manuals use to talk about instructions. In AT&T syntax, operand-size is specified (if needed) by a suffix on the instruction mnemonic, like
movb $'a', (%rdi)
instead of MASM
mov byte ptr [rdi], 'a'
(note the extra
ptr
keyword) or NASM
mov byte [rdi], 'a'
Assembly syntax depends on the tool, not the ISA. Intel's manuals, especially vol.2, the part that lists every available instruction, do not specify the syntax details of how to specify operand-size in asm source when it would be ambiguous.
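As a rough sketch of how the size keyword in NASM source maps onto the forms in Intel's table: this assumes 32-bit mode as in the question and NASM's default optimization settings. The byte values in the comments are what I'd expect from Intel's opcode table, so check them against a listing file (nasm -l) rather than taking them on faith.

```nasm
bits 32                         ; 32-bit mode, as in the question

add byte  [ebx], 32             ; 80 03 20      ADD r/m8,  imm8
add word  [ebx], 32             ; 66 83 03 20   ADD r/m16, imm8 (66 = operand-size prefix)
add dword [ebx], 32             ; 83 03 20      ADD r/m32, imm8 (sign-extended)

add ecx, dword 123              ; 83 C1 7B              imm8 form is still allowed
add ecx, strict dword 123       ; 81 C1 7B 00 00 00     imm32 form is forced
```

In every case the instruction still has only the two operands Intel documents. The byte/word/dword keyword just selects which row of the table the assembler encodes.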
In asm source, a register can imply operand-size
For instructions where both operands must be the same size, a register operand implies the operand-size in asm source syntax, so you don't need to specify it. e.g.
add eax, [rdi]
doesn't need to be
add eax, dword [rdi]
But mov-immediate to memory (or any other
op mem,imm
instruction) is ambiguous, as are one-operand memory instructions like
inc [mem]
and the rare instructions where operands don't have to be the same size, like
shl [rdi], cl
(destination size could be b/w/d/q) or
movzx eax, [rdi]
(source size could be byte or word).
See When do I need to specify the size of the operand in Assembly?
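A minimal sketch of those cases in NASM syntax, assuming 64-bit mode so that rdi can be used as a pointer. The particular sizes chosen (byte, dword, word) are arbitrary picks for illustration.

```nasm
bits 64

add   eax, [rdi]                ; no keyword needed: eax implies dword
mov   byte  [rdi], 'a'          ; memory + immediate: size must be given
inc   dword [rdi]               ; one memory operand: size must be given
shl   word  [rdi], cl           ; cl says nothing about the destination size
movzx eax, byte [rdi]           ; source could be byte or word, so say which

; mov [rdi], 'a'                ; left commented out: NASM rejects this
                                ; because no operand size is specified
```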
Good assemblers like NASM will error on that ambiguity. Less-good assemblers will sometimes just pick a default. For example, GAS picks dword for instructions other than MOV, e.g.
add $1, (%rdi)
and only recently even added a warning about that!
Similarly,
[rdi + rax]
specifies 64-bit address-size, while
[edi + eax]
would be 32-bit address-size. The default address-size (in asm source) for something like
[1234]
is the bitness of the current mode, i.e. not using a
67
address-size prefix in the machine code.
Again, this is 100% about asm source-level syntax. Encoding an instruction into machine code for a certain mode necessarily implies an operand-size.
That's why you need to tell the assembler what mode the CPU will be decoding in, e.g. with NASM
bits 32
if you're making a flat binary or switching modes in a bootloader. Or, more normally, by assembling with
nasm -felf64
to make a 64-bit object file. In that case,
bits 32
would let you put mismatched machine code into the wrong object file, instead of causing an error at assemble time from
push ebx
not being encodable for 64-bit mode.
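To illustrate the mode dependence, here is a small sketch that assumes a flat binary (nasm -f bin), where switching modes mid-file is legitimate. The encodings in the comments are my reading of the opcode table, not output copied from a real build.

```nasm
bits 64
push rbx                        ; 53: in 64-bit mode, push defaults to 64-bit operand-size
; push ebx                      ; left commented out: not encodable in 64-bit mode

bits 32
push ebx                        ; 53: the same byte, decoded as a 32-bit push in 32-bit mode
```

The same byte means different things depending on the mode the CPU is in, which is why the assembler has to be told the mode instead of guessing it from the source.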