Why is there no register that contains the high bytes of EAX?

Posted 2024-07-07 01:07:15

%AX = (%AH + %AL)

So why not %EAX = (%SOME_REGISTER + %AX) for some register %SOME_REGISTER?

烦人精 2024-07-14 01:07:17

There are a lot of answers posted here, but none really answer the given question: Why isn't there a register that directly encodes the high 16 bits of EAX, or the high 32 bits of RAX? The answer boils down to the limitations of the x86 instruction encoding itself.

16-Bit History Lesson

When Intel designed the 8086, they used a variable-length encoding scheme for many of the instructions. This meant that certain extremely-common instructions, like POP AX, could be represented as a single byte (58), while rare (but still potentially useful) instructions like MOV CX, [BX+SI+1023] could still be represented, even if it took several bytes to store them (in this example, 8B 88 FF 03).

This may seem like a reasonable solution, but when they designed it, they filled out most of the available space. So, for example, there were eight POP instructions for the eight individual registers (AX, CX, DX, BX, SP, BP, SI, DI), and they filled out opcodes 58 through 5F; opcode 57 just below is PUSH DI, and opcode 60 just above is something else entirely (PUSHA, from the 80186 onward). There's no room left over for anything after or before those. Even pushing and popping the segment registers, which is conceptually nearly identical to pushing and popping the general-purpose registers, had to be encoded in a different location (down around 06/0E/16/1E) just because there wasn't room beside the rest of the push/pop instructions.
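As a rough illustration of how tightly these one-byte opcodes are packed, here is a small Python sketch. The helper names are made up for this example; the opcode bases (50 for PUSH, 58 for POP) and the register order are the ones quoted above.

```python
# Register numbers as the 8086 encodes them (3 bits, values 0..7).
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]

PUSH_BASE = 0x50  # PUSH r16 occupies opcodes 50..57
POP_BASE = 0x58   # POP r16 occupies opcodes 58..5F


def push_opcode(reg: str) -> int:
    """Hypothetical helper: the one-byte opcode for PUSH <reg>."""
    return PUSH_BASE + REG16.index(reg)


def pop_opcode(reg: str) -> int:
    """Hypothetical helper: the one-byte opcode for POP <reg>."""
    return POP_BASE + REG16.index(reg)


if __name__ == "__main__":
    for name in REG16:
        print(f"PUSH {name}: {push_opcode(name):02X}   POP {name}: {pop_opcode(name):02X}")
```

The register number is simply added to the base opcode, so each block of eight opcodes is completely used up by the eight registers.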

Likewise, the "mod r/m" byte used for a complex instruction like MOV CX, [BX+SI+1023] only has three bits for encoding the register, which means it can only represent eight registers total. That's fine if you only have eight registers, but presents a real problem if you want to have more.
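To make the three-bit register field concrete, the following Python sketch assembles the example instruction by hand. The function names are invented for illustration, and it only handles the one form used above (opcode 8B with a 16-bit displacement), so treat it as a sketch rather than an assembler.

```python
# 3-bit register numbers for the 16-bit registers.
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]

# 3-bit r/m values for 16-bit memory addressing (used here with mod = 10).
RM16 = ["BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX"]


def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the "mod r/m" byte: 2 bits of mod, 3 bits of reg, 3 bits of r/m."""
    if not (0 <= mod < 4 and 0 <= reg < 8 and 0 <= rm < 8):
        raise ValueError("field out of range")
    return (mod << 6) | (reg << 3) | rm


def encode_mov_r16_mem(dest: str, base: str, disp: int) -> bytes:
    """Hypothetical helper: MOV dest, [base + disp16] using opcode 8B."""
    opcode = 0x8B  # MOV r16, r/m16
    mod = 0b10     # memory operand followed by a 16-bit displacement
    byte = modrm(mod, REG16.index(dest), RM16.index(base))
    return bytes([opcode, byte]) + (disp & 0xFFFF).to_bytes(2, "little")


if __name__ == "__main__":
    print(encode_mov_r16_mem("CX", "BX+SI", 1023).hex(" ").upper())
```

Because `reg` must fit in three bits, there is no value left over that could name a ninth register.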

(There's an excellent map of all these byte allocations in the x86 architecture here: https://i.sstatic.net/9u8BS.png . Notice how there's no space left in the primary map, with some instructions overlapping bytes, and even how much of the secondary "0F" map is used now thanks to the MMX and SSE instructions.)

Toward 32 and 64 Bits

So to even allow the CPU design to be extended from 16 bits to 32 bits, they already had a design problem, and they solved that with prefix bytes: by putting a special "66" operand-size prefix in front of a standard 16-bit instruction, you tell the CPU you want the same instruction but the 32-bit version (EAX) instead of the 16-bit version (AX). (In 32-bit code the default is reversed, and the same prefix selects the 16-bit version.) The rest of the design stayed the same: there were still only eight general-purpose registers in the overall CPU architecture.
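The effect of the "66" prefix can be modeled in a few lines of Python. This is a simplified sketch: the function name is made up, and it only considers the 16-bit and 32-bit default operand sizes, ignoring 64-bit mode.

```python
OPERAND_SIZE_PREFIX = 0x66


def effective_operand_size(default_bits: int, prefixes: bytes) -> int:
    """Hypothetical helper: operand size of an instruction given the code
    segment's default size (16 or 32) and the prefix bytes in front of it."""
    if default_bits not in (16, 32):
        raise ValueError("this sketch only models 16-bit and 32-bit defaults")
    if OPERAND_SIZE_PREFIX in prefixes:
        # The prefix toggles to the other size; it does not mean "32-bit" as such.
        return 32 if default_bits == 16 else 16
    return default_bits


if __name__ == "__main__":
    # Opcode 58 is POP AX / POP EAX depending on the operand size.
    print(effective_operand_size(16, b""))       # 16-bit code, bytes "58"
    print(effective_operand_size(16, b"\x66"))   # 16-bit code, bytes "66 58"
    print(effective_operand_size(32, b"\x66"))   # 32-bit code, bytes "66 58"
```

The prefix changes how wide the operand is but adds no new register numbers, which is why the register count stayed at eight.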

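Similar hackery had to be done to extend the architecture to 64 bits (RAX and friends); there, the problem was solved by adding yet another set of prefix codes (REX, 40-4F). One bit of the REX prefix means "64-bit", and three more each add a fourth bit to one of the register fields (the two in the "mod r/m" byte and the index field of the SIB byte), which is how x86-64 reaches sixteen registers. Making room for it also meant discarding old encodings in 64-bit mode (the 40-4F bytes themselves used to be the one-byte INC and DEC instructions) and reusing their byte codes for newer stuff.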
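The REX prefix layout (binary 0100WRXB) can be sketched in Python as follows. The helper names are invented for this example, and the encoder covers only the register-to-register form of opcode 8B, so it is an illustration rather than a complete assembler.

```python
REG64 = ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI",
         "R8", "R9", "R10", "R11", "R12", "R13", "R14", "R15"]


def rex(w: int = 0, r: int = 0, x: int = 0, b: int = 0) -> int:
    """Build a REX prefix byte: 0100 W R X B (always in the range 40..4F)."""
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b


def split_reg(index: int) -> tuple:
    """Split a 4-bit register number into (REX extension bit, low 3 bits)."""
    return index >> 3, index & 7


def encode_mov_r64_r64(dest: str, src: str) -> bytes:
    """Hypothetical helper: MOV dest, src via opcode 8B (MOV r64, r/m64)."""
    r_ext, reg = split_reg(REG64.index(dest))  # destination goes in the reg field
    b_ext, rm = split_reg(REG64.index(src))    # source goes in the r/m field
    modrm = 0xC0 | (reg << 3) | rm             # mod = 11: register-direct
    return bytes([rex(w=1, r=r_ext, b=b_ext), 0x8B, modrm])


if __name__ == "__main__":
    for dest, src in [("RAX", "RCX"), ("R8", "RAX"), ("RAX", "R8")]:
        print(f"MOV {dest}, {src}: {encode_mov_r64_r64(dest, src).hex(' ').upper()}")
```

Note that the "mod r/m" byte itself still carries only three bits per register; the fourth bit lives in the prefix.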

An Aside on 8-Bit Registers

One of the bigger questions to ask, then, is how the heck things like AH and AL ever worked in the first place if there's only really room in the design for eight registers. The first part of the answer is that there's no such thing as "PUSH AL": some instructions simply can't operate on the byte-sized registers at all! The only ones that can are a few special oddities (like AAD and XLAT, which use AL implicitly) and byte-sized versions of the "mod r/m" instructions: a single width bit, usually the lowest bit of the opcode byte, switches those instructions to operate on the 8-bit registers instead of the 16-bit ones. It just so happens that there are exactly eight 8-bit registers, too: AL, CL, DL, BL, AH, CH, DH, and BH (in that order), and that lines up very nicely with the eight register slots available in the "mod r/m" byte.
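Here is a small Python sketch of how the same three-bit register number names a different register depending on the operand width. It assumes the common 8086 convention that the lowest bit of the opcode byte is the width bit (as in 8A versus 8B for MOV); the function name is made up, and only register-direct operands are handled.

```python
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def decode_reg_operands(opcode: int, modrm: int) -> tuple:
    """Hypothetical helper: return (reg, r/m) register names for a
    register-direct "mod r/m" byte (mod == 11)."""
    if modrm >> 6 != 0b11:
        raise ValueError("this sketch only handles register-direct operands")
    # Width bit: 0 selects the byte registers, 1 selects the word registers.
    table = REG16 if opcode & 1 else REG8
    return table[(modrm >> 3) & 7], table[modrm & 7]


if __name__ == "__main__":
    # Same "mod r/m" byte E0, two different opcodes.
    print("8A E0 -> MOV %s, %s" % decode_reg_operands(0x8A, 0xE0))
    print("8B E0 -> MOV %s, %s" % decode_reg_operands(0x8B, 0xE0))
```

Register number 4 means AH in the byte-sized form and SP in the word-sized form, which shows that AH is not an extra ninth register but a reuse of an existing slot.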

Intel noted at the time that the 8086 design was supposed to be "source compatible" with the 8080/8085: There was an equivalent instruction in the 8086 for each of the 8080/8085 instructions, but it didn't use the same byte codes (they aren't even close), and you'd have to translate and reassemble your program to get it to use the new byte codes. But "source compatible" was a way forward for old software, and it allowed the 8085's individual A, B, C, etc. and combo "BC" and "DE" registers to still work on the new processor, even if they were now called "AL", "CH", "CL", "CX", and "DX" (Intel's conversion mapping put BC in CX, DE in DX, and HL in BX).

So that's really the real answer: It's not that Intel or AMD intentionally "left out" a high 16-bit register for EAX, or a high 32-bit register for RAX: It's that the high 8-bit registers are a weird leftover historical anomaly, and replicating their design at higher bit sizes would be really difficult given the requirement that the architecture be backward-compatible.

A Performance Consideration

There is one other consideration as to why those "high registers" haven't been added since, as well: inside many modern processor designs, for performance reasons, the variably-sized registers don't always overlap for real. On some microarchitectures, AH, AL, and AX can be tracked as separate registers from the full register under the hood, and the processor keeps track of which copy is current so that it knows it will need to merge the data when you read the larger register after writing a smaller one. (The exact behavior differs from one microarchitecture to another. In 64-bit mode the EAX/RAX case is sidestepped entirely, because writing EAX zeroes the upper 32 bits of RAX.)

(For example, on such a design: If you set AL = 5, the processor doesn't immediately update its copy of AX. But if you then read from AX, the processor first has to merge that 5 from AL into AX's bottom bits.)

By keeping the registers separate, the CPU can do all sorts of clever things like invisible register renaming to make your code run faster, but that means that your code can run slower if you do use the old pattern of treating the small registers as pieces of larger registers, because the processor may have to stall or insert extra work to merge them. To keep all of this internal bookkeeping from getting out of hand, the CPU designers wisely chose to add separate registers on the newer processors rather than to add more overlapping registers.

(And yes, that means that it really is faster on modern processors to explicitly "MOVZX EAX, value" than to do it the old, sloppier way of "MOV AX, value / use EAX".)

Conclusion

With all that said, could Intel and AMD add more "overlapping" registers if they really, really wanted to? Sure. There are ways to worm them in if there were enough demand. But given the significant historical baggage, the current architectural limitations, the notable performance limitations, and the fact that most code these days is generated by compilers optimized for non-overlapping registers, it's highly unlikely they'll add such things any time soon.

神经大条 2024-07-14 01:07:17

In the old 8-bit days, there was the A register.

In the 16-bit days, there was the 16 bit AX register, which was split into two 8 bit parts, AH and AL, for those times when you still wanted to work with 8 bit values.

In the 32-bit days, the 32 bit EAX register was introduced, but the AX, AH, and AL registers were all kept. The designers did not feel it necessary to introduce a new 16 bit register that addressed bits 16 through 31 of EAX.

一杆小烟枪 2024-07-14 01:07:15

Just for some clarification. In the early microprocessor days of the 1970s, CPUs had only a small number of registers and a very limited instruction set. Typically, the arithmetic unit could only operate on a single CPU register, often referred to as the "accumulator". The accumulator on the 8-bit 8080 & Z80 processors was called "A". There were 6 other general purpose 8-bit registers: B, C, D, E, H & L. These six registers could be paired up to form three 16-bit registers: BC, DE & HL. Internally, the accumulator was combined with the Flags register to form the AF 16-bit register.

When Intel developed the 16 bit 8086 family they wanted to be able to port 8080 code, so they kept the same basic register structure:

8080/Z80  8086
A         AL
BC        CX
DE        DX
HL        BX
IX        SI
IY        DI

Because of the need to port 8 bit code they needed to be able to refer to the individual 8 bit parts of AX, BX, CX & DX. These are called AL, AH for the low & high bytes of AX and so on for BL/BH, CL/CH & DL/DH. IX & IY on the Z80 were only ever used as 16 bit pointer registers so there was no need to access the two halves of SI & DI.

When the 80386 was released in the mid 1980s they created "extended" versions of all the registers. So, AX became EAX, BX became EBX etc. There was no need to access the top 16 bits of these new extended registers, so they didn't create an EAXH pseudo register.

AMD applied the same trick when they produced the first 64 bit processors. The 64 bit version of the AX register is called RAX. So, now you have something that looks like this:

|63..32|31..16|15..8|7..0|
              |AH...|AL..|
              |AX........|
       |EAX..............|
|RAX.....................|
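The layout in the diagram can be modeled in Python as one 64-bit value with several views onto it. This is only a sketch of the programmer-visible behavior in 64-bit mode; the class and the `eax_high16` property are invented for illustration (there is no such register, which is the point of the question).

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


class RaxModel:
    """Programmer-visible view of RAX and its named sub-registers."""

    def __init__(self, value: int = 0):
        self.rax = value & MASK64

    @property
    def al(self) -> int:
        return self.rax & 0xFF

    @al.setter
    def al(self, v: int) -> None:
        self.rax = (self.rax & ~0xFF) | (v & 0xFF)

    @property
    def ah(self) -> int:
        return (self.rax >> 8) & 0xFF

    @ah.setter
    def ah(self, v: int) -> None:
        self.rax = (self.rax & ~0xFF00) | ((v & 0xFF) << 8)

    @property
    def ax(self) -> int:
        return self.rax & 0xFFFF

    @ax.setter
    def ax(self, v: int) -> None:
        # Writing AX leaves bits 16..63 untouched.
        self.rax = (self.rax & ~0xFFFF) | (v & 0xFFFF)

    @property
    def eax(self) -> int:
        return self.rax & 0xFFFFFFFF

    @eax.setter
    def eax(self, v: int) -> None:
        # In 64-bit mode, writing EAX zeroes the upper 32 bits of RAX.
        self.rax = v & 0xFFFFFFFF

    @property
    def eax_high16(self) -> int:
        """Hypothetical "EAXH": without a named register, you shift and mask."""
        return (self.eax >> 16) & 0xFFFF


if __name__ == "__main__":
    r = RaxModel(0x1122334455667788)
    print(hex(r.al), hex(r.ah), hex(r.ax), hex(r.eax), hex(r.eax_high16))
    r.ah = 0xFF
    print(hex(r.rax))
```

In real code the equivalent of `eax_high16` is a shift (or rotate) followed by a 16-bit access, which is cheap enough that a dedicated register name was never needed.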