Which is faster: a JMP or a string of NOPs?
I'm implementing binary translation and have to deal with sequences of NOPs (0x90) about 16 opcodes long. Is it better for performance to place a JMP (to the end) at the start of such sequences?
Comments (3)
The Intel Architecture Software Developer's Manual, Volume 2B (Instructions N-Z), contains a table (p. 4-12) about NOP: Table 4-9, Recommended Multi-Byte Sequence of NOP Instruction.
This allows you to construct "padding NOPs" of certain sizes. With two of those you can bridge 16 bytes, although I second the suggestion to check the optimization guides (for the CPU you're targeting) to see whether a JMP is faster than two such NOPs.
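To make the two options concrete, here is a minimal sketch in C. The byte sequences are the 1- to 9-byte NOP encodings as I recall them from the Intel table, so please verify them against your edition of the manual; the names NOP_TABLE, fill_nops and emit_jmp_over are hypothetical helpers, not part of any real library. The multi-byte 0F 1F form is also not available on very old processors, so check your target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* NOP encodings indexed by length in bytes (1..9); index 0 is unused.
   Transcribed from memory of the Intel table: verify before relying on it. */
static const uint8_t NOP_TABLE[10][9] = {
    {0},
    {0x90},
    {0x66, 0x90},
    {0x0F, 0x1F, 0x00},
    {0x0F, 0x1F, 0x40, 0x00},
    {0x0F, 0x1F, 0x44, 0x00, 0x00},
    {0x66, 0x0F, 0x1F, 0x44, 0x00, 0x00},
    {0x0F, 0x1F, 0x80, 0x00, 0x00, 0x00, 0x00},
    {0x0F, 0x1F, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00},
    {0x66, 0x0F, 0x1F, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00},
};

/* Fill buf[0..len) with as few NOP instructions as possible.
   Returns the number of instructions written. */
static size_t fill_nops(uint8_t *buf, size_t len)
{
    size_t count = 0;
    assert(buf != NULL || len == 0);
    while (len > 0) {
        size_t n = len > 9 ? 9 : len;   /* longest encoding that still fits */
        memcpy(buf, NOP_TABLE[n], n);
        buf += n;
        len -= n;
        count++;
    }
    return count;
}

/* Alternative: write a short JMP (EB rel8) at the start of the region that
   lands on the first byte after it. The remaining bytes are left untouched.
   rel8 is signed, so this only works for 2 <= len <= 129.
   Returns 0 on success, -1 if the region cannot be bridged this way. */
static int emit_jmp_over(uint8_t *buf, size_t len)
{
    if (buf == NULL || len < 2 || len > 129)
        return -1;
    buf[0] = 0xEB;
    buf[1] = (uint8_t)(len - 2);  /* displacement counts from the next instruction */
    return 0;
}
```

For a 16-byte region, fill_nops writes two instructions (9 bytes plus 7 bytes), while emit_jmp_over writes EB 0E. Which of the two is cheaper depends on the CPU, so measure on your target rather than trusting this sketch.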
If the NOPs are there to align the stream, then they have more value than just being no-ops. If you're concerned with pure speed, see Agner Fog's Optimization Manuals, Vol. 4.
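If the run does turn out to be alignment padding, my assumption is that the translator would want to recompute the padding for the translated address rather than copy the original byte count. A minimal sketch of that calculation, where pad_to_align is a hypothetical helper name and the alignment must be a power of two:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of padding bytes needed to move addr up to the next multiple of
   align. Returns 0 if addr is already aligned. align must be a power of two. */
static size_t pad_to_align(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size_t)((align - (addr & (align - 1))) & (align - 1));
}
```

For example, code ending at 0x1003 needs 13 bytes of padding to reach a 16-byte boundary, and code ending at 0x1000 needs none.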
Since this is binary translation, I would start by translating them into equivalent NOPs on the target system. Once things are working, then optimize out dead code. At the same time, since this string of instructions caught your eye, try to understand what they are there for (perhaps waiting on hardware to do something), and make sure that your translated system behaves the same way.
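As a starting point for the "translate first, optimize later" approach, something like the following could measure a NOP run so the translator can emit the same number of NOPs on the target. This is only a sketch: nop_run_length is a hypothetical helper, and it assumes off is already known to be an instruction boundary, because a 0x90 byte inside an immediate or displacement is not a NOP.

```c
#include <stddef.h>
#include <stdint.h>

/* Count consecutive single-byte NOPs (0x90) starting at code[off].
   off must be an instruction boundary found by the decoder.
   Returns 0 if off is out of range or the byte there is not 0x90. */
static size_t nop_run_length(const uint8_t *code, size_t size, size_t off)
{
    size_t n = 0;
    while (off + n < size && code[off + n] == 0x90)
        n++;
    return n;
}
```

Once the translated program behaves the same as the original, the recorded run lengths are the natural places to try replacing the NOPs with something cheaper.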