扫描二进制CPU功能用法
我正在调试一个应用程序,该应用程序可以在Intel CPU上正确运行,但不在另一个较新的AMD处理器上运行。我怀疑它可能已编译以使用某些特定于Intel的说明,从而导致撞车事故。但是,我正在寻找一种验证这一点的方法。我无法访问原始源代码。
是否有一个工具可以扫描二进制文件和列表,它可能会使用哪种CPU特定功能?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
有两种好的方法:
但是,您的想法是静态扫描二进制文件,无法区分仅在检查
cpuid
后才调用的函数中的代码。使用调试器查看故障指令
选择任何调试器。 GDB易于在任何Linux发行版上安装,也可能在Windows或Mac(或那里的LLDB)上安装。或选择其他任何调试器,例如使用GUID。
运行程序。故障后,使用调试器检查故障指令。
在Intel或AMD的X86 ASM参考手册中查找它,例如 https:// https:///www.felixcloutier.com/x86/ 是英特尔PDF的HTML刮擦。查看此指令的这种形式所需的ISA扩展。
例如,如果让编译器这样做,则此源可以编译以使用AVX-512指令,但首先只需要SSE2来编译。
(请在
。 AVX512 -O3 Ill.c 。
然后尝试运行它,例如我的Skylake-Client(非AVX512)GNU/Linux桌面。 (我还使用
strip A.Out
删除符号表(功能名称),例如仅二进制软件版本)。=>
指示当前程序计数器(x86-64中的RIP,但是GDB便携式定义$ PC
是任何ISA上的别名。) > vpbroadcastd xmm0,edi 。 (GCC实施
_MM_SET1_EPI32(ARGC)
我们告诉它AVX512可用时。当然,实际上试图执行不支持的指令是造成崩溃的直接原因。 (也有可能成为间接原因,例如使用
lzcnt eax,ecx
的程序,但是将其运行为bsr eax,ecx
,然后使用它,然后使用它 整数作为数组索引。不同的
vbroadcastss
等,而不是VP ...
整数指令。 (英特尔的约定是p ...
是包装的,... ps/pd
是包装的或包装的 - 流动性。)如果助记符以<<<代码> v ,您找不到条目,例如
vaddps
,这是因为该指令在AVX之前存在,并且在其Legacy-sse sse mnemonic下进行了记录,例如SSE1addps
确实列出了同时addps
和VADDPS
编码,包括允许ZMM寄存器的AVX-512编码,X/YMM16..31,以及像VADDPS YMM0 {K3} {k3} {Z},YMM1,YMM1,YMM2
。那是AVX-512F+VL指令。无论如何,回到我们的榜样。与故障指令相匹配的表条目如下。在编码操作数的MODR/M(
/r
)之前,请注意7C
OpCode字节。这是在4个字节前缀之前的,作为交叉检查,这确实是我们正在寻找的OpCode。它需要“ avx512vl avx512f”。
{k1} {z}
是可选的掩码。R32
是32位通用通用整数寄存器,例如edi
在这种情况下。XMM1
表示任何XMM寄存器都可以是该指令的第一个XMM操作数;在这种情况下,GCC选择了XMM0。我的CPU根本没有AVX-512,因此错误。
SDE指令组合
这应该在Windows或其他任何操作系统上都可以很好地工作。
具有
-mix
选项,其输出包括按必需的ISA扩展名进行分类。参见我如何监视SIMD指令用法的数量< /a> re:使用它。使用同一示例
A.Out
我与gdb一起使用:运行
/opt/sde-external-8.33.0-2019-02-02-07-lin/sde64 -.mix-./a./a 。 (有些动态链接器中有很多次。)IDK如果有一个选项可以忽略这一点,因为我希望它会在大型程序中变得非常肿。我认为,即使还有更多,它也可能只打印出前几个块。
然后,我们达到所需的部分:
ISA-SET-AVX512F_128
的1个计数是在我的CPU上错误的指令,该指令完全不支持AVX-512。 AVX512F_128是AVX512F(Foundation) + avx512vl (比其他允许其他vector length, 512位ZMM寄存器)。(也被视为
isa-ext-avx512evex
。evex是AVX-512矢量指令的机器代码前缀。AVX-512掩码指令,例如kandw k0,kandw k0,k1,k1,k2 vpermb
在Skylake-Server CPU上有任何支持AVX-512F代码>使用vex编码,例如Avx1/avx2 Simd指令, 除了AVX-512以外,每个扩展程序都有一个完全独立的名称。
静态拆卸
您 can 大多数二进制文件;如果他们没有混淆,则拆卸应该找到所有可能执行的说明。 (以及使用新指令的高性能代码不太可能使用删除拆卸器的黑客,例如跳入直线拆卸的中间,将其视为另一种说明; x86机器代码是可变的字节流- 长度指令。)
但是,这并不能告诉您哪些说明实际上确实执行;有些可能是在检查CPUID后才调用的功能,以找出是否支持必要的扩展。
(尽管我从未寻找过一个工具,但我不知道通过ISA扩展名对其进行分类的工具;通常是要确保他们不使用AVX2指令的开发人员,而AVX2指令仅在AVX1上使用的CPUS使用构建时间检查,或通过在模拟器下或在真实的CPU下运行。)
There are two good approaches:
But your idea, statically scanning the binary, can't distinguish code in functions that are only called after checking
cpuid
.Using a debugger to look at the faulting instruction
Pick any debugger. GDB is easy to install on any Linux distro, and probably also on Windows or Mac (or lldb there). Or pick any other debugger, e.g. one with a GUID.
Run the program. Once it faults, use the debugger to examine the faulting instruction.
Look it up in Intel or AMD's x86 asm reference manual, e.g. https://www.felixcloutier.com/x86/ is an HTML scrape of Intel's PDFs. See which ISA extension this form of this instruction requires.
For example, this source can compile to use AVX-512 instructions if you let the compiler do so, but only needs SSE2 to compile in the first place.
(See it on Godbolt with different compile options.)
Build with
gcc -march=skylake-avx512 -O3 ill.c
.Then try to run it, e.g. on my Skylake-client (non-AVX512) GNU/Linux desktop. (I also used
strip a.out
to remove the symbol table (function names), like a binary-only software release).The
=>
indicates the current program counter (RIP in x86-64, but GDB portably defines$pc
as an alias on any ISA.)So we faulted on
vpbroadcastd xmm0,edi
. (The way GCC implemented_mm_set1_epi32(argc)
when we told it AVX512 was available.)That doesn't involve memory access, and the fault was illegal-instruction not segmentation-fault anyway, so we can be sure that actually trying to execute an unsupported instruction was the direct cause of the crash here. (It's also possible for it to be an indirect cause, e.g. a program using
lzcnt eax, ecx
but an old CPU running it asbsr eax, ecx
, and then using that different integer as an array index. lzcnt/bsr is somewhat unlikely for your case since AMD has supported it for longer than Intel.)So let's check on vpbroadcastd: there are multiple entries for
vpbroadcast
in Intel's manual:vbroadcastss
etc., notvP...
integer instructions. (Intel's convention is thatp...
is packed-integer,...ps/pd
is packed-single or packed-float.)If the mnemonic starts with
v
and you can't find an entry, e.g.vaddps
, that's because the instruction existed before AVX, and is documented under its legacy-SSE mnemonic, like SSE1addps
which does list bothaddps
andvaddps
encodings, including the AVX-512 encodings that allow ZMM registers, x/ymm16..31, and masking likevaddps ymm0{k3}{z}, ymm1, ymm2
. That's an AVX-512F+VL instruction.Anyway, back to our example. The table entry that matches the faulting instruction was the following. Note the
7C
opcode byte before the ModR/M (/r
) that encodes the operands. That's present after the 4-byte EVEX prefix, as a cross-check that this is indeed the opcode we're looking for.It requires "AVX512VL AVX512F" according to the table. The
{k1}{z}
is optional masking.r32
is a 32-bit general-purpose integer register, likeedi
in this case.xmm1
means any XMM register can be the first xmm operand to this instruction; in this case GCC chose XMM0.My CPU doesn't have AVX-512 at all, so it faulted.
SDE instruction mix
This should work equally well on Windows or any other OS.
Intel's SDE (Software Development Emulator) has a
-mix
option, whose output includes categorizing by required ISA extension. See How do I monitor the amount of SIMD instruction usage re: using it.Using the same example
a.out
I used with GDB:Running
/opt/sde-external-8.33.0-2019-02-07-lin/sde64 -mix -- ./a.out
created a filesde-mix-out.txt
which contained a lot of stuff, including stats for how often different basic blocks were executed. (Some in the dynamic linker ran many times.) IDK if there's an option to omit that, because it would get pretty bloated for a large program, I expect. I think it might only print the top few blocks, even if there are many more.Then we get to the part we want:
The 1 count for
isa-set-AVX512F_128
is the instruction that would have faulted on my CPU, which doesn't support AVX-512 at all. AVX512F_128 is AVX512F (foundation) + AVX512VL (vector length, allowing vectors other than 512-bit ZMM registers).(It was also counted as
isa-ext-AVX512EVEX
. EVEX is the machine-code prefix for AVX-512 vector instructions. AVX-512 mask instructions likekandw k0, k1, k2
use VEX encoding, like AVX1/AVX2 SIMD instructions. But this wouldn't distinguish an Ice Lake new instruction likevpermb
faulting on a Skylake-server CPU that supports AVX-512F but not AVX512VBMI)Everything other than AVX-512 is probably simpler, since there's a fully separate name for each extension.
Static disassembly
You can disassemble most binaries; if they're not obfuscated then disassembly should find all the instructions that might ever execute. (And high-performance code that uses new instructions is unlikely to be using hacks that throw off a disassembler, like jumping into the middle of what straight-line disassembly would see as a different instruction; x86 machine code is a byte-stream of variable-length instructions.)
But that doesn't tell you which instructions actually do execute; some might be in functions that are only called after checking CPUID to find out if the necessary extensions are supported.
(And I don't know of a tool to categorize them by ISA extension, although I've never looked for one; usually developers wanting to make sure they didn't use AVX2 instructions in code that will run on AVX1-only CPUs use build-time checks, or test by running under an emulator or on a real CPU.)