Why doesn't CPU speculative execution cause OOB program crashes?

Posted 2025-01-17 06:19:20

This question stems from reading the Spectre attack paper. If I understand it correctly, the attack relies on the CPU's heuristics speculatively executing the wrong branch of code.
Consider this example (in C):

int arr[42];
if (i < 42) {
    int j = arr[i];
}

If I understand the paper correctly, int j = arr[i] can (in certain circumstances) be speculatively executed even when i >= 42. My question: when I access an array outside its bounds, my program often crashes (segmentation fault on Linux, "The program performed an illegal operation" error on Windows).

Why does speculative execution not crash the program in the case of an out-of-bounds array access?

她说她爱他 2025-01-24 06:19:20

The key point is that in modern CPUs the verb "execute" doesn't mean what you think it means.

To execute an instruction is to compute its output and its side effects, if any.
However, this doesn't change the program state.
This seems hard to grasp at first, but it's really nothing exotic.

The CPU has quite a large internal memory made up of all its registers. Most of this memory is not visible to the programmer; the part that is visible is known as the architectural state.
The architectural state (AS) is what is documented in the CPU manuals and what is altered by a sequence of instructions (a program, for example).

Altering the AS can only happen with the semantics given in the ISA (the manuals), and the ISA specifies serial semantics (instructions complete one after the other, in program order), so the AS by itself doesn't allow parallelism.
However, a modern CPU has a lot of resources (known as execution units) that can do their work independently.

To exploit all these resources, the front-end of the CPU (the part responsible for reading instructions from the memory hierarchy and feeding them to the execution units) is able to fetch, decode and output more than one instruction per cycle.
The boundary between the front-end and the back-end (where the execution units live) no longer really deals in instructions but in micro-operations (uops); that's an x86 CISC nuisance.

So now the CPU is given four to six uops to "execute" at a time, but if the ISA is serial, what could it possibly do other than queue these uops?
Well, the front-end is built so that these uops don't operate on the AS but on a shadow state (SS, my own terminology): their operands are renamed to locations in the big invisible memory of the CPU.
Altering the SS in parallel or out of order is fine, as it is not the AS.
This is what execution is: altering the SS.

Is it really worth it? After all, it is the AS that matters.
Well, transferring the SS to the AS is really fast compared to execution, so it is worth it.
It is a matter of "renaming back" (inverting the previous renaming), and it is called retiring the instructions.
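
The renaming and retiring described above can be sketched as a toy model in C. This is only an illustration under simplifying assumptions: the names RenameModel, model_execute_write, model_retire and model_squash, the register counts, and the allocator that never reclaims registers are all hypothetical and do not describe any real CPU design.

```c
#include <assert.h>

#define NUM_ARCH 4    /* architectural registers the program can name */
#define NUM_PHYS 16   /* larger hidden register file (hypothetical size) */

typedef struct {
    int phys[NUM_PHYS];         /* hidden storage backing both states */
    int committed[NUM_ARCH];    /* arch reg -> phys reg with the retired value (AS) */
    int speculative[NUM_ARCH];  /* arch reg -> phys reg with the newest value (SS) */
    int next_free;              /* naive allocator: registers are never reclaimed */
} RenameModel;

void model_init(RenameModel *m) {
    for (int p = 0; p < NUM_PHYS; p++) m->phys[p] = 0;
    for (int r = 0; r < NUM_ARCH; r++) {
        m->committed[r] = r;
        m->speculative[r] = r;
    }
    m->next_free = NUM_ARCH;
}

/* "Execute" a write to arch register r: the value lands in a fresh
 * physical register, so the architectural state is not touched. */
void model_execute_write(RenameModel *m, int r, int value) {
    assert(r >= 0 && r < NUM_ARCH);
    assert(m->next_free < NUM_PHYS);
    int p = m->next_free++;
    m->phys[p] = value;
    m->speculative[r] = p;
}

/* What the program is allowed to observe. */
int model_read_arch(const RenameModel *m, int r) {
    return m->phys[m->committed[r]];
}

/* What later in-flight uops observe. */
int model_read_spec(const RenameModel *m, int r) {
    return m->phys[m->speculative[r]];
}

/* Retire: "rename back" by making the speculative mapping architectural. */
void model_retire(RenameModel *m) {
    for (int r = 0; r < NUM_ARCH; r++) m->committed[r] = m->speculative[r];
}

/* Squash: throw the shadow state away by restoring the committed mapping. */
void model_squash(RenameModel *m) {
    for (int r = 0; r < NUM_ARCH; r++) m->speculative[r] = m->committed[r];
}
```

In this sketch, calling model_execute_write(&m, 1, 99) changes what model_read_spec returns for register 1 but not what model_read_arch returns; only model_retire makes the value architecturally visible, and model_squash makes it disappear. Note that retiring only copies a small mapping table, not the values, which is the sense in which it is cheap compared to execution.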

Actually, retiring is a bit more than that.
Since execution doesn't affect the AS, its side effects shouldn't affect it either.
These side effects include exceptions, but handling an exception speculatively is too cumbersome (it requires coordinating a lot of resources), so exception handling is delayed until retirement.
This also has the advantage of having the correct AS at the moment the exception is handled, and of raising an exception only when it must actually be raised.

The point of speculative execution is to bet: the CPU bets that the instruction sequence won't generate any exception (including page faults) and thus executes it with most checks off (I cannot rule out, off the top of my head, that some checks are performed regardless), thereby gaining a lot of performance.
When it's time to retire those instructions, the bets are checked, and if any of them fails, the SS is discarded.
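
A minimal sketch of this "record the fault now, raise it only at retirement" idea, modelling the guarded load from the question. Everything here is an assumption made for illustration: LoadEntry, execute_load, retire_load, guarded_load and the MAPPED_WORDS limit (a stand-in for "the page is not mapped") are hypothetical names, and real hardware tracks far more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ARR_LEN      42   /* length of arr in the question */
#define MAPPED_WORDS 64   /* hypothetical: indices past this are "unmapped" */

typedef struct {
    bool faulted;  /* fault recorded at execute time, not yet raised */
    int  result;   /* loaded value, held in the shadow state only */
} LoadEntry;

typedef enum { RETIRE_OK, RETIRE_FAULT, RETIRE_SQUASHED } RetireOutcome;

/* Execute stage: compute the result, but only *record* a fault. */
LoadEntry execute_load(const int *mem, size_t i) {
    LoadEntry e = { .faulted = false, .result = 0 };
    if (i >= MAPPED_WORDS)
        e.faulted = true;    /* stand-in for a page fault */
    else
        e.result = mem[i];   /* may be past the end of arr yet still mapped */
    return e;
}

/* Retire stage: the bet is checked here, and only here. */
RetireOutcome retire_load(const LoadEntry *e, bool on_correct_path, int *arch_reg) {
    assert(e != NULL && arch_reg != NULL);
    if (!on_correct_path)
        return RETIRE_SQUASHED;  /* result and recorded fault are both dropped */
    if (e->faulted)
        return RETIRE_FAULT;     /* only now would the OS deliver e.g. SIGSEGV */
    *arch_reg = e->result;       /* commit to the architectural state */
    return RETIRE_OK;
}

/* Models `if (i < 42) j = arr[i];` with the branch predicted as taken. */
RetireOutcome guarded_load(const int *mem, size_t i, int *j) {
    LoadEntry e = execute_load(mem, i);    /* runs before the branch resolves */
    bool on_correct_path = (i < ARR_LEN);  /* the branch finally resolves */
    return retire_load(&e, on_correct_path, j);
}
```

With i = 1000 the speculative load records a fault, but because the branch resolves as not taken the entry is squashed and nothing is raised. The same load retired on the correct path (that is, without the bounds check) returns RETIRE_FAULT, which corresponds to the crash the question describes.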

That's why speculative execution doesn't crash your program.

What Spectre relies on is the fact that speculative execution does alter the AS in some sense: the caches are not invalidated when a bet fails (again for performance reasons; the SS is simply not copied into the AS), so whatever the speculative loads brought into the cache stays there and timing attacks are possible.
This could be corrected in a number of ways, including performing a basic privilege check when reading from the TLB (after all, only privilege levels 0 and 3 are used, so the logic is simple) or adding a bit to each cache line to mark it as speculative (treated as invalid by non-speculative code).
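
To make the cache point concrete, here is a toy simulation in C of a squashed load leaving a trace behind. It is a sketch under loud assumptions: ToyCpu, speculate_and_squash and probe_hot_line are hypothetical names, the "cache" is just an array of flags, and no real timing is measured, so this models the idea only and is not a working attack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_LINES 256  /* one probe line per possible byte value (hypothetical) */

typedef struct {
    bool cached[NUM_LINES];  /* microarchitectural: which probe lines are cached */
    int  arch_reg;           /* architectural: the register the program can read */
} ToyCpu;

void cpu_init(ToyCpu *c) {
    assert(c != NULL);
    for (int i = 0; i < NUM_LINES; i++) c->cached[i] = false;  /* "flush" */
    c->arch_reg = 0;
}

/* Speculatively run `tmp = probe[secret]`, then lose the bet.
 * The result lives only in the shadow state and is dropped,
 * but the cache fill it triggered is not undone. */
void speculate_and_squash(ToyCpu *c, unsigned char secret) {
    int shadow = secret;       /* shadow-state result of the load */
    c->cached[secret] = true;  /* side effect: a line was brought into the cache */
    (void)shadow;              /* squash: never copied into arch_reg */
}

/* Attacker's probe. A real attack would time an access to each line;
 * this model simply reads the flag that the timing would reveal. */
int probe_hot_line(const ToyCpu *c) {
    for (int i = 0; i < NUM_LINES; i++)
        if (c->cached[i]) return i;
    return -1;
}
```

After cpu_init followed by speculate_and_squash(&c, 42), arch_reg is still 0, so the program never architecturally saw the value, yet probe_hot_line(&c) returns 42. The mitigation mentioned above that marks cache lines as speculative would amount, in this model, to the squash also clearing cached[secret].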
