Which is faster: a JMP or a string of NOPs?

Posted 2024-11-25 13:22:21

I'm implementing binary translation and have to deal with sequences of NOPs (0x90) about 16 opcodes long. Would it be better for performance to place a JMP (to the end) at the start of such sequences?

Comments (3)

淡忘如思 2024-12-02 13:22:21

The Intel Architecture Software Developer's Manual, Volume 2B (Instructions N-Z), contains the following table (p. 4-12) about NOP:

Table 4-9. Recommended Multi-Byte Sequence of NOP Instruction

Length    Assembly                                   Byte Sequence
=================================================================================
2 bytes   66 NOP                                     66 90H
3 bytes   NOP DWORD ptr [EAX]                        0F 1F 00H
4 bytes   NOP DWORD ptr [EAX + 00H]                  0F 1F 40 00H
5 bytes   NOP DWORD ptr [EAX + EAX*1 + 00H]          0F 1F 44 00 00H
6 bytes   66 NOP DWORD ptr [EAX + EAX*1 + 00H]       66 0F 1F 44 00 00H
7 bytes   NOP DWORD ptr [EAX + 00000000H]            0F 1F 80 00 00 00 00H
8 bytes   NOP DWORD ptr [EAX + EAX*1 + 00000000H]    0F 1F 84 00 00 00 00 00H
9 bytes   66 NOP DWORD ptr [EAX + EAX*1 + 00000000H] 66 0F 1F 84 00 00 00 00 00H

This allows you to construct "padding NOPs" of specific sizes. With two of those (for example 8 + 8 or 9 + 7 bytes) you can bridge 16 bytes, although I second the suggestion to check the optimization guides for the CPU you're targeting to see whether a JMP is faster than two such NOPs.
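
To make the "two of those" idea concrete, here is a rough sketch in Python of filling a gap with as few NOP instructions as possible. The byte sequences are copied from the table above, plus the plain 1-byte NOP (0x90) from the question; the names MULTIBYTE_NOPS and nop_padding are made up for illustration and are not part of any real translator.

```python
# Byte sequences taken from Table 4-9 above, keyed by length in bytes.
# The 1-byte entry is the plain NOP (0x90) mentioned in the question.
MULTIBYTE_NOPS = {
    1: bytes.fromhex("90"),
    2: bytes.fromhex("66 90"),
    3: bytes.fromhex("0F 1F 00"),
    4: bytes.fromhex("0F 1F 40 00"),
    5: bytes.fromhex("0F 1F 44 00 00"),
    6: bytes.fromhex("66 0F 1F 44 00 00"),
    7: bytes.fromhex("0F 1F 80 00 00 00 00"),
    8: bytes.fromhex("0F 1F 84 00 00 00 00 00"),
    9: bytes.fromhex("66 0F 1F 84 00 00 00 00 00"),
}


def nop_padding(length):
    """Return `length` bytes of padding built from the longest NOPs first.

    Hypothetical helper: a greedy split, so 16 bytes becomes 9 + 7.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    remaining = length
    while remaining > 0:
        chunk = min(remaining, 9)  # 9 bytes is the longest entry in the table
        out += MULTIBYTE_NOPS[chunk]
        remaining -= chunk
    return bytes(out)


if __name__ == "__main__":
    # Show the 16-byte padding as hex, one space between bytes.
    print(nop_padding(16).hex(" "))
```

With this, sixteen 0x90 bytes become two instructions instead of sixteen. Whether that beats a JMP over the gap still has to be measured on the target CPU.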

下壹個目標 2024-12-02 13:22:21

If the NOPs are there to align the instruction stream, then they have more value than just being no-ops. If you're concerned with pure speed, see Agner Fog's optimization manuals, Vol. 4.
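
If the NOPs are alignment padding, the translator probably wants to recompute the padding for the translated address rather than copy the original 16 bytes. A minimal sketch, assuming a power-of-two alignment (16 is only a guess for this case); padding_to_align is a hypothetical name:

```python
def padding_to_align(address, alignment=16):
    """Number of padding bytes needed so that address + padding is aligned.

    Hypothetical helper; `alignment` must be a power of two.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # Python's % always returns a non-negative result for a positive modulus.
    return (-address) % alignment


if __name__ == "__main__":
    for addr in (0x1000, 0x1003, 0x100F):
        print(hex(addr), padding_to_align(addr))
```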

知你几分 2024-12-02 13:22:21

Since this is binary translation, I would start by translating them into equivalent NOPs on the target system. Once things are working, optimize out the dead code. At the same time, since this string of instructions caught your eye, try to understand what they were there for (perhaps waiting on hardware to do something), and make sure that your translated system functions the same.
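
For the later "optimize out dead code" step, a rough sketch of how the NOP runs could be located. It assumes the translator already has the code decoded into one bytes object per instruction (scanning raw bytes for 0x90 would also match operand bytes); find_nop_runs and the sample input are hypothetical:

```python
def find_nop_runs(instructions, min_run=2):
    """Return (start_index, count) for each run of single-byte NOPs.

    `instructions` is assumed to be a list of already-decoded instruction
    encodings, one bytes object per instruction.
    """
    runs = []
    start = None
    for i, encoding in enumerate(instructions):
        if encoding == b"\x90":
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - start))
            start = None
    # Handle a run that extends to the end of the list.
    if start is not None and len(instructions) - start >= min_run:
        runs.append((start, len(instructions) - start))
    return runs


if __name__ == "__main__":
    # push ebp; 16 x nop; ret
    decoded = [b"\x55"] + [b"\x90"] * 16 + [b"\xc3"]
    print(find_nop_runs(decoded))
```

Each run found this way can then be kept as is, shortened, or skipped, once it is clear the NOPs are not there as a timing delay.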
