Replacing "movss xmm0, cs:dword_5B27420" with "movss xmm0, immediate"
I have a linux .so file in Ida Pro and I have the following instruction:
movss xmm0, cs:dword_5B27420
Is it possible to move a fixed value into xmm0
using the same number of bytes as that instruction, or fewer?
The instruction bytes are:
F3 0F 10 05 C8 BB A 00
I want to do something like:
movss xmm0, 0.3
Comments (1)
Not in fewer bytes; if you don't have room here, you'd have to jump somewhere else and then back, or just change the RIP-relative address to load a different constant from somewhere else (e.g. from padding between two functions, or spare space in .rodata or .data if there is any).
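As a rough illustration of the "change the RIP-relative address" option, here is a minimal Python sketch that rebuilds the 8 instruction bytes for a new constant location and produces the 4 bytes of 0.3f to store there. The helper names and the addresses in the usage note are hypothetical, not taken from the binary in the question. It assumes the instruction keeps the F3 0F 10 05 prefix shown above and that the displacement is measured from the end of the instruction.

```python
import struct

# movss xmm0, dword [rip+disp32]: prefix, opcode and ModRM bytes from the question
MOVSS_XMM0_RIPREL = bytes.fromhex("F30F1005")
INSN_LEN = 8  # 4 bytes above + 4-byte little-endian displacement


def retarget_movss(insn_addr: int, new_const_addr: int) -> bytes:
    """Return the 8 instruction bytes that load from new_const_addr.

    Both addresses are hypothetical inputs; read them off the disassembler.
    """
    # RIP-relative displacements are relative to the end of the instruction.
    disp = new_const_addr - (insn_addr + INSN_LEN)
    if not -2**31 <= disp < 2**31:
        raise ValueError("target is out of signed 32-bit displacement range")
    return MOVSS_XMM0_RIPREL + struct.pack("<i", disp)


def float32_bytes(value: float) -> bytes:
    """Little-endian IEEE-754 single-precision bytes to store at the new address."""
    return struct.pack("<f", value)
```

For example, with made-up addresses, retarget_movss(0x1000, 0x2000) gives the bytes to patch over the instruction, and float32_bytes(0.3) gives the 4 bytes to write into the spare space at 0x2000.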
There is no mov-immediate to XMM registers, and mov eax, __?float32?__(0.3) (5 bytes) / movd xmm0, eax (4 bytes) would take more total bytes. (That's NASM syntax for the integer value that is the bit-pattern for the given FP constant. Some assemblers may allow mov eax, 0.3, in case that's ever useful.)

Ways other than immediates to construct FP constants include pcmpeqd xmm0,xmm0 (4 bytes) and then shifting or doing other things (like pabsd) with the all-ones bit patterns. But that's at least 2 instructions unless you want a NaN. (See Agner Fog's optimizing asm guide, and What are the best instruction sequences to generate vector constants on the fly?)

0.3f is not a simple constant you could materialize even in 3 instructions from 0xffffffff with left and right shifts, unlike 1.0f for example. (But that's still three instructions, 4 and 5 bytes each to construct set1(1.0f).)

cmpps is SSE1 non-scalar so it has a smaller opcode than pcmpeqd (no mandatory prefixes, just 0f c2), but isn't any smaller overall because it needs an immediate for the compare predicate.
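To make the shift argument concrete, the following Python sketch emulates one 32-bit lane of an all-ones register followed by a logical right shift and then a left shift. This is only an emulation of the bit arithmetic, not real SSE code, and the function names are made up for illustration. It checks that 1.0f is reachable this way while 0.3f is not, because 0.3f's bit pattern is not a single contiguous run of ones.

```python
import struct

MASK32 = 0xFFFFFFFF  # one lane after pcmpeqd xmm0, xmm0


def f32_bits(value: float) -> int:
    """Bit pattern of value as an IEEE-754 single, as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def ones_shifted(right: int, left: int) -> int:
    """Lane value after all-ones, psrld by `right`, then pslld by `left`."""
    return ((MASK32 >> right) << left) & MASK32


def reachable_by_two_shifts(bits: int) -> bool:
    """True if some right/left shift pair turns all-ones into `bits`.

    Shifting all-ones right then left yields every contiguous run of ones,
    so this also covers the left-then-right order.
    """
    return any(
        ones_shifted(r, l) == bits for r in range(32) for l in range(32)
    )
```

Under these assumptions, ones_shifted(25, 23) equals f32_bits(1.0), which corresponds to the three-instruction pcmpeqd / psrld / pslld sequence, while reachable_by_two_shifts(f32_bits(0.3)) is false.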