如何在汇编器中实现相对 JMP (x86)?
在为 x86 平台构建汇编器时,我在编码 JMP
指令时遇到了一些问题:(
OPCODE INSTRUCTION SIZE
EB cb JMP rel8 2
E9 cw JMP rel16 4 (because of 0x66 16-bit prefix)
E9 cd JMP rel32 5
...
来自我最喜欢的 x86 指令网站,http://siyobik.info/index.php?module=x86&id=147)
全部都是 相对跳转,其中每个编码(操作+操作数)的大小在第三列中。
现在我原来的(因此是错误的)设计为每条指令保留了最大(5 字节)空间。操作数尚不清楚,因为它跳转到未知位置。因此,我实现了一种“重写”机制,如果跳转的位置已知,则将操作数重写到内存中的正确位置,并用 NOP 填充其余部分。这在紧循环中是一个比较严重的问题。
现在我的问题是以下情况:
b: XXX
c: JMP a
e: XXX
...
XXX
d: JMP b
a: XXX (where XXX is any instruction, depending
on the to-be assembled program)
问题是我想要 JMP
指令的尽可能最小的编码(并且没有 NOP
填充)。
我必须先知道 c
处指令的大小,然后才能计算 a
处操作数的 a
和 b
之间的相对距离d。这同样适用于 c
处的 JMP
:它需要知道 d
的大小,然后才能计算 e 之间的相对距离
和 a
。
现有的汇编器如何解决这个问题,或者你会如何做到这一点?
这就是我正在思考的解决问题的方法:
首先将
JMP
与其目标之间的所有指令编码为操作码,如果该区域包含可变大小的操作码,则使用最大大小,例如5
JMP
。然后,通过选择可能的最小编码大小(3、4 或 5),将相对JMP
编码到目标,并计算距离。如果对任何可变大小的操作码进行编码,请更改之前的所有绝对操作数,以及跳过此编码指令的所有相对指令:当它们的操作数更改为选择最小可能的大小时,它们将被重新编码。此方法保证结束,因为可变大小的操作码仅可能缩小(因为它使用它们的最大大小)。
我想知道,也许这是一个过度设计的解决方案,这就是我问这个问题的原因。
While building my assembler for the x86 platform I encountered some problems with encoding the JMP
instruction:
OPCODE INSTRUCTION SIZE
EB cb JMP rel8 2
E9 cw JMP rel16 4 (because of 0x66 16-bit prefix)
E9 cd JMP rel32 5
...
(from my favourite x86 instruction website, http://siyobik.info/index.php?module=x86&id=147)
All are relative jumps, where the size of each encoding (operation + operand) is in the third column.
Now my original (and thus fault because of this) design reserved the maximum (5 bytes) space for each instruction. The operand is not yet known, because it's a jump to a yet unknown location. So I've implemented a "rewrite" mechanism, that rewrites the operands in the correct location in memory, if the location of the jump is known, and fills the rest with NOP
s. This is a somewhat serious concern in tight-loops.
Now my problem is with the following situation:
b: XXX
c: JMP a
e: XXX
...
XXX
d: JMP b
a: XXX (where XXX is any instruction, depending
on the to-be assembled program)
The problem is that I want the smallest possible encoding for a JMP
instruction (and no NOP
filling).
I have to know the size of the instruction at c
before I can calculate the relative distance between a
and b
for the operand at d
. The same applies for the JMP
at c
: it needs to know the size of d
before it can calculate the relative distance between e
and a
.
How do existing assemblers solve this problem, or how would you do this?
This is what I am thinking which solves the problem:
First encode all the instructions to opcodes between the
JMP
and it's target, if this region contains a variable-sized opcode, use the maximum size, e.g.5
for aJMP
. Then encode the relativeJMP
to it's target, by choosing the smallest possible encoding size (3, 4 or 5) and calculate the distance. If any variable-sized opcode is encoded, change all absolute operands before, and all relative instructions that skips over this encoded instruction: they are re-encoded when their operand changes to choose the smallest possible size. This method is guaranteed to end, as variable-sized opcodes only may shrink (because it uses the maximum size of them).
I wonder, perhaps this is an over-engineered solution, that's why I ask this question.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
在第一遍中,您将通过对所有跳转指令使用悲观字节计数来非常接近要使用的
jmp
代码。在第二遍中,您可以使用所选的悲观操作码填充跳转。然后,很少有跳转可以被重写以使用更少的一个或两个字节,只有那些非常接近最初的 8/16 位或 16/32 字节跳转阈值的跳转。由于候选者都是许多字节的跳转,因此它们不太可能处于关键循环情况,因此您可能会发现进一步的传递比两遍解决方案提供的好处很少或没有。
In the first pass you will have a very good approximation to which
jmp
code to use using a pessimistic byte counting for all jump instructions.On the second pass you can fill in the jumps with the pessimistic opcode chosen. Very few jumps could then be rewritten to use a byte or two less, only those that were very close to the 8/16 bit or 16/32 byte jump threshold originally. As the candidates are all jumps of many bytes, they are less likely to be in critical loop situations so you are likely to find that further passes offer little or no benefit over a two pass solution.
这是我使用过的一种方法,它可能看起来效率低下,但事实证明并不适合大多数现实生活中的代码(伪代码):
Here's one approach I've used that may seem inefficient but turns out not to be for most real-life code (pseudo-code):