Simple pipelining and superscalar architecture

Posted 2024-11-16 00:06:23 · 489 characters · 13 views · 0 comments

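Consider this instruction flow:

Instruction fetch -> instruction decode -> operand fetch -> instruction execute -> write back

Suppose the processor supports both CISC and RISC instructions, like the Intel 486.

Now, if we issue a RISC instruction, it takes one clock cycle to execute, so there is no problem. But if a CISC instruction is issued, its execution will take longer.

So assume that executing a CISC instruction takes three clock cycles, and that each stage before execute takes one clock cycle.

Now, in a superscalar structure, the two instructions issued while the first is being processed are diverted to other available functional units. But no such diversion is possible in simple pipelining, as only one functional unit is available for executing instructions.

So how is instruction congestion avoided in the simple pipelining case?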

情感失落者 2024-11-23 00:06:23

Technically speaking, the x86 is not a RISC processor. It's a CISC processor. Some of its instructions take less time than others, but that doesn't make them RISC instructions. Later Intel designs (from the P6 generation onward) decode x86 instructions into RISC-like micro-operations internally, but as far as I know the 486 does not work that way, so that's not really relevant here.

Instructions that take different amounts of time are typical of a CISC processor, although RISC processors have multi-cycle instructions too (multiply and divide, for example). Pipelining a CISC processor is harder, but it has been done: the 486 itself has a five-stage pipeline. There are many other things you can do inside the CPU to speed up execution, such as out-of-order execution. In a simple in-order pipeline, though, every instruction goes through the single execute stage in program order, so a slow instruction makes the ones behind it wait; nothing gets routed around it.

Now if we issue a RISC instruction, it takes one clock cycle to execute, so there is no problem... but if a CISC instruction is issued, its execution will take time...

A RISC instruction does not necessarily take one clock cycle. On the classic five-stage MIPS pipeline, each instruction takes 5 cycles from fetch to writeback. The point of pipelining is that the stages of successive instructions overlap, so the next instruction completes one clock cycle after the current one finishes.
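To make the latency-versus-throughput point concrete, here is a minimal sketch of the cycle counts. The function names are hypothetical, and it assumes an ideal pipeline with no stalls and the five stages used in this answer.

```python
def unpipelined_cycles(n_instructions, n_stages=5):
    """No overlap: each instruction runs all stages before the next starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages=5):
    """Ideal pipeline (assumed stall-free): the first instruction takes
    n_stages cycles, and each later one completes one cycle after the
    instruction in front of it."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


if __name__ == "__main__":
    for n in (1, 4, 100):
        print(n, unpipelined_cycles(n), pipelined_cycles(n))
```

For four instructions this gives 20 cycles without pipelining and 8 with it, even though each individual instruction still takes 5 cycles.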

Now in a superscalar structure, the two instructions issued while the first is being processed are diverted into other functional units available...

In a superscalar architecture, two instructions are issued together and finish at the same time. In a pure superscalar architecture, with no pipelining between successive pairs, the timing looks like this (F = Fetch, D = Decode, X = eXecute, M = Memory, W = Writeback):

(inst. 1) F D X M W
(inst. 2) F D X M W
(inst. 3)           F D X M W
(inst. 4)           F D X M W

But there is no such diversion possible in simple pipelining, as only one functional unit is available for execution of instructions...

Right, so the cycle looks like this:

(inst. 1) F D X M W
(inst. 2)   F D X M W
(inst. 3)     F D X M W
(inst. 4)       F D X M W
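The staggered diagram above can be generated mechanically. This is only an illustrative sketch: `render_simple_pipeline` is a made-up helper, and it assumes every stage takes exactly one cycle, with each cycle drawn as two characters.

```python
STAGES = "FDXMW"  # Fetch, Decode, eXecute, Memory, Writeback


def render_simple_pipeline(n_instructions, stages=STAGES):
    """Return one text row per instruction for an ideal single-issue
    pipeline: instruction i enters fetch one cycle after instruction i-1."""
    rows = []
    for i in range(n_instructions):
        indent = "  " * i  # one cycle of delay = two characters
        rows.append(f"(inst. {i + 1}) {indent}{' '.join(stages)}")
    return rows


if __name__ == "__main__":
    print("\n".join(render_simple_pipeline(4)))
```

Each row is shifted by one cycle, which is why the fourth instruction writes back in cycle 8 rather than cycle 20.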

Now, if we have instructions that take a varying amount of time (as on a CISC machine), it's harder to pipeline, because there's only one execution unit and we may have to wait for a previous instruction to finish executing. In this example, instruction 1 takes 2 execution cycles, instruction 2 takes 5, instruction 3 takes 2, and instruction 4 takes only 1:

(inst. 1) F D X X M W
(inst. 2)         F D X X X X X M W
(inst. 3)                       F D X X M W
(inst. 4)                               F D X M W

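Thus a simple pipeline loses much of its benefit on long CISC instructions: in this picture, we must wait for the execute stage to finish before we can go on to the next instruction. A real in-order design can still overlap the fetch and decode of the next instruction with a long execute, so only the execute stage itself stalls. The problem mostly doesn't arise on MIPS, because nearly every instruction spends a single cycle in execute, and because it can determine whether an instruction is a branch, and its destination, in the decode phase.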
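As a rough sketch of how much a single execute unit costs, the code below computes the cycle in which each instruction finishes writeback under two models. Both function names are hypothetical. The first follows the variable-latency diagram in this answer, where the next fetch waits until the previous instruction has left execute. The second is an assumption not drawn in the answer: a simplified in-order pipeline with no data hazards, in which fetch and decode overlap and only the execute stage waits.

```python
def serialized_finish_times(exec_latencies):
    """Model from the diagram: instruction i+1 is fetched in the cycle
    after instruction i leaves the execute stage."""
    finish = []
    fetch = 1  # cycle in which the current instruction is fetched
    for lat in exec_latencies:
        x_end = fetch + 2 + lat - 1  # F, D, then `lat` cycles of X
        finish.append(x_end + 2)     # M and W follow X
        fetch = x_end + 1
    return finish


def stalled_pipeline_finish_times(exec_latencies):
    """Assumed in-order pipeline with one execute unit: F and D overlap
    with the previous instruction; X waits until the unit is free."""
    finish = []
    prev_x_end = 0
    for i, lat in enumerate(exec_latencies):
        # Earliest X is cycle i + 3 (after its own F and D); it may have
        # to wait longer for the previous instruction to leave X.
        x_start = max(i + 3, prev_x_end + 1)
        prev_x_end = x_start + lat - 1
        finish.append(prev_x_end + 2)
    return finish


if __name__ == "__main__":
    latencies = [2, 5, 2, 1]  # execute cycles from the example above
    print(serialized_finish_times(latencies))
    print(stalled_pipeline_finish_times(latencies))
```

With the latencies from the example, the serialized model finishes the fourth instruction in cycle 20, while the overlapped model finishes it in cycle 14. So in a simple pipeline the congestion is not avoided; later instructions stall behind the long one, and overlapping fetch and decode only hides part of the delay.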
