0x90 (NOP) sequences appearing in legitimate code

Posted 2024-12-10 06:12:37

The background:
I've written a Python script that inspects IP packets, specifically the payload, to detect whether a packet could be used in a buffer (stack) overflow. As I understand it, a NOP sled pads the stack so that the instruction pointer eventually runs into the exploit code, and I can easily detect this by looking for repeated occurrences of 0x90. I've seen code with a lot of NOPs, and as few as 8 in the case of SQL Slammer, so I could perhaps use 8 as a minimum.

Now my question: are NOP sleds often used in legitimate code? If so, are there a few specific cases that I could look for and use to rule a packet out as harmless, or is this approach just not practical for identifying malicious code?
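For reference, the check described in the question could look roughly like the sketch below. The function name `find_nop_runs` and the constant `MIN_RUN` are made up for illustration, and the threshold of 8 is only the SQL Slammer figure mentioned above, not an established cutoff.

```python
import re

MIN_RUN = 8  # assumed threshold, taken from the SQL Slammer figure in the question


def find_nop_runs(payload: bytes, min_run: int = MIN_RUN):
    """Return (offset, length) for every run of at least min_run 0x90 bytes."""
    pattern = re.compile(b"\x90{%d,}" % min_run)
    return [(m.start(), len(m.group())) for m in pattern.finditer(payload)]


if __name__ == "__main__":
    sample = b"AAAA" + b"\x90" * 10 + b"\xcc" + b"\x90" * 3
    print(find_nop_runs(sample))
```

This only reports where long runs of 0x90 occur. It says nothing about whether a run is a sled, alignment padding, or plain data, which is what the question is really asking.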

Comments (3)

梦亿 2024-12-17 06:12:45

I just looked through the last binary I compiled (non-malicious x86 code on Linux), and found:

016b5e0 458b c9ec 90c3 9090 9090 9090 9090 9090

I think you can conclude that finding repeated sequences of 0x90 doesn't necessarily indicate malicious intent.
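To read that dump byte by byte, the sketch below assumes it came from `hexdump` in its default format on a little-endian machine, where each column is a 16-bit word printed with its two bytes swapped. The helper name `hexdump_words_to_bytes` is hypothetical.

```python
import struct


def hexdump_words_to_bytes(words: str) -> bytes:
    """Convert a row of 16-bit hex words back into file byte order.

    Assumes little-endian words, as printed by hexdump's default format
    on a little-endian host.
    """
    return b"".join(struct.pack("<H", int(w, 16)) for w in words.split())


if __name__ == "__main__":
    row = "458b c9ec 90c3 9090 9090 9090 9090 9090"  # data columns from the dump above
    raw = hexdump_words_to_bytes(row)
    trailing_nops = len(raw) - len(raw.rstrip(b"\x90"))
    print(raw.hex(), trailing_nops)
```

Under that assumption the row is 8b 45 ec c9 c3 followed by eleven 0x90 bytes: a load relative to the frame pointer, then `leave` and `ret`, then padding after the end of the function. Eleven is more than the minimum of 8 proposed in the question, so this ordinary binary would already trip such a check.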

捂风挽笑 2024-12-17 06:12:44

The compiler will generate NOPs to align code -- for instance, on some iterations of the x86, jumps execute faster if the jump destination is aligned to a 4-, 8-, or even 16-byte boundary.

Some compilers try to use "long NOPs" when possible: single instructions that take up more than one byte of space and may formally do something, but have no effect on processor state. On some iterations of the x86 architecture these are faster than a run of one-byte NOPs. For instance, 66 90 is a two-byte NOP, and 8d 74 26 00 is a four-byte NOP (technically lea 0(%esi,%eiz,1),%esi, but as you can see that just copies the value in %esi to itself, so there's no effect). However, these can't be used in all cases, and the sequences that are fastest on some x86 processors are, depressingly often, really slow on others. I haven't read the current micro-optimization guidelines, but I wouldn't be surprised if Intel and AMD were working to make a string of 90s the fastest way to do a long NOP, and had their compilers match.
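One consequence for a scanner is that padding is not always a run of 0x90. As a rough sketch, the function below measures how many bytes at a given offset are covered by back-to-back NOP encodings, using only the three encodings named in this answer. The table `KNOWN_NOPS` and the function `padding_length` are illustrative names, and the table is far from complete.

```python
# Only the encodings mentioned in the answer above; real compilers emit others too.
KNOWN_NOPS = (
    bytes.fromhex("8d742600"),  # 4-byte: lea 0(%esi,%eiz,1),%esi
    bytes.fromhex("6690"),      # 2-byte: 0x66 prefix followed by nop
    bytes.fromhex("90"),        # 1-byte: nop
)


def padding_length(data: bytes, start: int = 0) -> int:
    """Count the bytes from start that form consecutive known NOP encodings."""
    pos = start
    while True:
        for encoding in KNOWN_NOPS:
            if data.startswith(encoding, pos):
                pos += len(encoding)
                break
        else:
            # No known encoding matches at pos, so the padding ends here.
            return pos - start


if __name__ == "__main__":
    code = bytes.fromhex("6690" "8d742600" "90" "c3")
    print(padding_length(code))
```

This is plain pattern matching, not disassembly, so it can also match these byte values when they occur inside data or in the middle of another instruction.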

蓝海似她心 2024-12-17 06:12:44

From Wikipedia:

A NOP is most commonly used for timing purposes, to force memory
alignment, to prevent hazards, to occupy a branch delay slot, or as a
place-holder to be replaced by active instructions later on in program
development (or to replace removed instructions when refactoring would
be problematic or time-consuming). In some cases, a NOP can have minor
side effects; for example, on the Motorola 68000 series of processors,
the NOP opcode will cause a synchronization of the pipeline.

Furthermore, compilers could conceivably use 0x90 as filler for uninitialized arrays, so that if the data in the array is ever interpreted as opcodes, it does nothing. You see a similar effect with Visual Studio, which fills uninitialized arrays with 0xCC, the opcode for int 3, which triggers a breakpoint.

Further still, any data in the executable may contain any number of 0x90 bytes, and telling that data apart from code may not be trivial.
