围绕数字的硬件表示进行思考：一个假设的二进制补码问题

发布于 2024-09-29 14:34:38 字数 839 浏览 13 评论 0原文

这是一个超级天真的问题（我知道），但我认为这将是一个很好的起点，可以考虑 CPU 的基本指令集实际上是如何执行的：

在二进制补码系统中，你不能反转符号您的实现可以表示的最大负数。理论上的原因很明显，因为最大负数的求反将超出实现的范围（范围总是类似于
-128 至 127）。

然而，当您尝试对最大负数执行求反运算时，实际发生的情况非常奇怪。例如，在 8 位表示中，最大的负数是 -128，即二进制的 1000 0000。通常，要对一个数字求负，您需要翻转所有位，然后加一位。但是，如果您尝试使用 -128 执行此操作，您最终会得到：

1000 0000 ->
0111 1111 ->
1000 0000

与开始时相同的数字。因此，维基百科称其为“奇怪的数字”。

在同一篇 wikipedia 文章中，它说上述否定

被检测为溢出条件，因为有进位进入但没有从最高有效位进位。

所以我的问题是这样的：
A) 这到底是什么意思？和
B) 似乎 CPU 每次执行基本算术运算时都需要执行额外的错误检查步骤，以避免与此否定相关的事故，从而产生巨大的开销。如果是这种情况，为什么不直接截断可以表示的数字范围以留下奇怪的数字（即 8 位的 -127 到 127）？如果情况并非如此，如何在不产生额外开销的情况下实现此类错误检查？

原文

This is a super naive question (I know), but I think that it will make for a good jumping off point into considering how the basic instruction set of a CPU actually gets carried out:

In a two's complement system, you cannot invert the sign of the most negative number that your implementation can represent. The theoretical reason for this is obvious in that the negation of the most negative number would be out of the range of the implementation (the range is always something like
-128 to 127).

However, what actually happens when you try to carry out the negation operation on the most negative number is pretty strange. For example, in an 8 bit representation, the most negative number is -128, or 1000 0000 in binary. Normally, to negate a number you would flip all the bits and then add one. However, if you try to do this with -128 you end up with:

1000 0000 ->
0111 1111 ->
1000 0000

the same number that you started out with. For this reason, wikipedia calls it "the weird number".

In that same wikipedia article, it says that the above negation

is detected as an overflow condition since there was a carry into but not out of the most-significant bit.

So my question is this:
A) What the heck does that mean? and
B) It seems like the CPU would need to perform an extra error checking step each and every time it carried out a basic arithmetic operation in order to avoid accidents relating to this negation, creating significant overhead. If that is the case, why not just truncate the range of numbers that can be represented to leave the weird number out (i.e. -127 to 127 for 8 bits)? If that isn't the case, how can you implement such error checking without creating extra overhead?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

我不会写诗 2024-10-06 14:34:38

MSB 的进位位用作标志来指示我们
需要更多位。没有它，我们将拥有一个模块化系统
算术¹没有任何方式检测我们何时回绕。

在模算术中，你不处理数字，而是处理
具有相同余数的数字的等价类。在这样的
在一个系统中，127 加 1 后，你会得到 -128，你会
得出+128 和−128 属于同一等价类的结论。

如果您将自己限制在 −127 到 +127 范围内的数字，您
就必须重新定义加法，因为 127 + 1 = −127 是无意义的。

当计算机向您呈现二进制补码算术时，
本质上是模块化算术，具有检测溢出的能力。

这就是 4 位加法器将 0001 添加到
0111。可以看到在MSB中进位和出位是
不同：

     0        0        0        1
     | 0      | 1      | 1      | 1
     | |      | |      | |      | |
     v v      v v      v v      v v
0 <- ADD <-1- ADD <-1- ADD <-1- ADD <- 0
^     |    ^   |        |        |
      v        v        v        v
      1        0        0        0

ALU 正是使用这个标志来表示发生了溢出，
无需任何额外步骤。

^{1.模运算的范围是从 0 到 255，而不是 -127 到 128，但基本思想是相同的。}

The carry-out bit from the MSB is used as a flag to indicate that we
need more bits. Without it, we would have a system of modular
arithmetic¹ without any way of detecting when we wrap around.

In modular arithmetic, you don’t deal with numbers but with
equivalence classes of numbers that have the same remainder. In such
a system, after adding 1 to 127, you would get −128, and you would
conclude that +128 and −128 belong to the same equivalence class.

If you restricted yourself to numbers in the range −127 to +127, you
would have to redefine addition, since 127 + 1 = −127 is nonsense.

Two’s-complement arithmetic, when presented to you by a computer, is
essentially modular arithmetic with the ability to detect an overflow.

This is what a 4-bit adder would look like when adding 0001 to
0111. You can see that in the MSB the carry-in and carry-out are
different:

     0        0        0        1
     | 0      | 1      | 1      | 1
     | |      | |      | |      | |
     v v      v v      v v      v v
0 <- ADD <-1- ADD <-1- ADD <-1- ADD <- 0
^     |    ^   |        |        |
      v        v        v        v
      1        0        0        0

It is this flag that the ALU uses to signal that an overflow occurred,
without any extra steps.

^{1. Modular arithmetic goes from 0 to 255 instead of −127 to 128, but the basic idea is the same.}

回复收藏 0 原文

嘿哥们儿 2024-10-06 14:34:38

这并不是说 CPU 会进行另一次检查，而是晶体管会在发生这种情况时发出通知。它们之所以如此构建，是因为工程师在开始设计之前就选择了两个补码。

结果是它发生在与返回非溢出结果相同的时钟周期内。

它是如何运作的？

“add 1”阶段实现了级联逻辑：从 LSB 开始，每个位依次服从真值表

old-bit  carry-in  new-bit  carry-out
-------------------------------------
   0        0        0         0
   0        1        1         0
   1        0        1         0
   1        1        0         1

（即 new-bit = old-bit xorarry-in 和 进位输出 = 旧位和进位输入）。 LSB 的“进位”是我们添加的 1，对于其余位，它是前一个位的“进位”（这就是为什么必须级联完成）。

最后一个电路只是添加了一个用于signed-overflow =（进位而不是进位）的电路。

It's not that the CPU does another check, its that the transistors are arranged to notice when this happens. And they are built that way because the engineers picked two-complement before they started designing the thing.

The result is that it happens during the same clock cycle as a non-overflowing result would be returned.

How does it work?

The "add 1" stage implements a cascade logic: starting with the LSB each bit is subjected in turn to the truth table

old-bit  carry-in  new-bit  carry-out
-------------------------------------
   0        0        0         0
   0        1        1         0
   1        0        1         0
   1        1        0         1

(that is new-bit = old-bit xor carry-in and carry-out = old-bit and carry-in). The "carry-in" for the LSB is the 1 that we're adding, and for the rest of the bits it is the "carry-out" of the previous one (which is why this has to be done in a cascade).

The last of these circuits just adds a circuit for signed-overflow = (carry-in and not carry-out).

回复收藏 0 原文

一笔一画续写前缘 2024-10-06 14:34:38

首先，维基百科文章指出不能将负符号数否定为有符号数。他们的意思是因为需要 9 位来表示正 128，这是 8 位寄存器无法做到的。如果您要从负有符号转换为正无符号，那么您就有足够的位。当你否定 0x80 时，硬件应该给你 0x80，因为这是正确的答案。

对于加法、减法、乘法等，二进制补码的加法与小学的十进制数学没有什么不同。您将二进制数排列起来，添加列，该列的结果是最低有效数字，其余的将转移到下一列。因此，例如将 0b001 与 0b001 相加

，将最右边一列中的两个相加，结果为 0b10（十进制 2），写零然后进位一，一加零加零是一，什么都不进位，零加零是零，结果是0b010。

最右边的一列，1 加 1 是 0b10，我们写 0 进位 1，进位 1 同时是最右边列的进位输出，也是第二列的进位输入。另外，在铅笔和纸上的数学中，我们通常只讨论非零的进位，但如果你仔细想想，你总是携带一个数字，就像我们的第二列一加零就是一进位零。

您可以将二进制补码否定视为反转并加一，或者遍历位并反转到某个点然后不反转，或者取零减去数字的结果。

您可以使用铅笔和纸以二进制形式进行减法运算，与十进制相比，在借用时会让您头疼，但确实有效。对于你所问的问题，请考虑反转并加一。

如果你把它减少到少于 8 位，你就更容易理解这个问题了，3 是一个可以管理的数字，一切都从那里开始扩展。

因此，下面的第一列是输入，第二列是反转版本，第三列是第二列加一。第四列是 msbit 的进位，第五列是 msbit 的进位。

000 111 000 1 1
001 110 111 0 0
010 101 110 0 0
011 100 101 0 0
100 011 100 1 0
101 010 011 0 0
110 001 010 0 0
111 000 001 0 0

真正快速了解一下将一位添加到两位：

对于将一添加到数字的情况，从第二位执行到第三位的唯一情况是当您的位全部为 1 时，其中有一个零停止级联进位位。因此，在上面的三位反转表中，只有两种情况下进位到 msbit 中的是 111 和 011，因为这是唯一两种低位全部被设置的情况。对于 111 情况，msbit 有一个进位输入和一个进位输出。对于 011 情况，msbit 有进位输入，但没有进位输出。

因此，正如其他人所说，芯片中连接有晶体管，如果设置了 msbit 进位且未设置 msbit 进位，则在某处设置一些标志，否则清除该标志。

因此请注意上面的三位示例的规模。如果在反转之后和添加之前您有 0b01111111 那么您将得到一个进位而没有进位。如果你有 0b11111111 那么你会得到一个进位和一个进位。请注意，零也是一个数字，当您将其反转时，您会得到相同的数字，不同之处在于，当这些位被视为有符号时，零可以在求反时表示，而全零的 1 则不能。

但最重要的是，这不是一场危机或世界末日，处理器中有大量数学和其他运算，其中进位位和有效位从一侧或另一侧脱落，并且溢出和下溢正在触发大多数时候，程序员从不检查此类情况，这些位只是掉到地板上，有时会导致程序崩溃，有时程序员使用 16 位数字进行 8 位数学运算只是为了确保不会发生任何不良情况，或者出于同样的原因使用 8 位数字进行 5 位数学运算。

请注意，硬件不知道加法和减法有符号或无符号。硬件也不知道如何减法。硬件加法器是三位加法器（两个操作数和进位），具有结果和进位。将其中的 8 个连接起来，您就有一个 8 位加法器或减法器，不带进位的加法是直接与 lsbit 进位输入的零连接的两个操作数。带进位的加法是直接与连接到 lsbit 进位的进位位连接的两个操作数。 lsbit 进位。减法是加法，第二个操作数反转，进位位加 1。至少从高层次的角度来看，这些逻辑都可以以两种在随意检查时难以理解的方式得到优化和实现。

真正有趣的练习是乘法，想想用铅笔和纸做二进制乘法，然后意识到它比十进制容易得多，因为它只是一系列移位和加法。给定足够的门，您可以将每个结果位表示为方程，方程的输入作为操作数。这意味着如果您愿意，您可以进行单时钟乘法，在早期，门太多了，因此完成了多时钟移位和加法，今天我们烧毁门并获得单时钟乘法。另请注意，理解这一点还意味着，如果您确实说 16 位 = 8 位乘以 8 位乘法，则无论是有符号乘法还是无符号乘法，低 8 位结果都是相同的。因为大多数人都会做类似 int = int * int; 的事情如果您关心的只是结果位（不检查标志等），那么您确实不需要有符号或无符号乘法。有趣的东西..

First off the wikipedia article states it cannot be negated from a negative signed number to a signed number. And what they mean is because it takes 9 bits to represent positive 128, which you cannot do with an 8 bit register. If you are going from negative signed to positive unsigned as a conversion, then you have enough bits. And the hardware should give you 0x80 when you negate 0x80 because that is the right answer.

For add, subtract, multiply, etc addition in twos complement is no different than decimal math from elementary school. You line up your binary numbers, add the columns, the result for that column is the least significant digit and the rest is carried over to the next column. So adding a 0b001 to 0b001 for example

Add the two ones in the rightmost column, the result is 0b10 (2 decimal), write zero then carry the one, one plus zero plus zero is one, nothing to carry, zero plus zero is zero, the result is 0b010.

The right most column where 1 plus 1 is 0b10 and we write 0 carry the one, well that carry the one is at the same time the carry out of the right most column and is the carry in of the second column. Also, with pencil and paper math we normally only talk about carry of something when it is non-zero but if you think about it you are always carrying a number like our second columns one plus zero is one carry the zero.

You can think of a twos complement negate as invert and add one, or walking the bits and inverting up to a point then not inverting, or taking the result of zero minus the number.

You can work subtract in binary using pencil and paper for what it is worth, makes your head hurt when borrowing compared to decimal, but works. For what you are asking though think of invert and add one.

It is easier to wrap your head around this if you take it down to even fewer bits than 8, three is a manageable number, it all scales from there.

So the first column below is the input, the second column is the inverted version and the third column is the second column plus one. The fourth column is the carry in to the msbit, the fifth column is the carry out of the msbit.

000 111 000 1 1
001 110 111 0 0
010 101 110 0 0
011 100 101 0 0
100 011 100 1 0
101 010 011 0 0
110 001 010 0 0
111 000 001 0 0

Real quick look at at adding a one to two bits:

For the case of adding one to a number, the only case where you carry out from the second bit into the third bit is when your bits are all ones, a single zero in there stops the cascading carry bits. So in the three bit inversion table above the only two cases where you have a carry into the msbit is 111 and 011 because those are the only two cases where those lower bits are all set. For the 111 case the msbit has a carry in and a carry out. for the 011 case the msbit has a carry in but not a carry out.

So as stated by someone else there are transistors wired up in the chip, if msbit carry in is set and msbit carry out is not set then set some flag somewhere, otherwise clear the flag.

So note that the three bit examples above scale. if after you invert and before you add one you have 0b01111111 then you are going to get a carry in without the carry out. If you have 0b11111111 then you get a carry in and a carry out. Note that zero is also a number where you get the same number back when you invert it, the difference is that when the bits are considered as signed, zero can be represented when negated, 1 with all zeros cannot.

The bottom line though is that this is not a crisis or end of the world thing there is a whole lot of math and other operations in the processor where carry bits and significant bits are falling off one side or the other and overflows and underflows are firing off, etc. Most of the time the programmers never check for such conditions and those bits just fall on the floor, sometimes causing the program to crash or sometimes the programmer has used 16 bit numbers for 8 bit math just to make sure nothing bad happens, or uses 8 bit numbers for 5 bit math for the same reason.

Note that the hardware doesnt know signed or unsigned for addition and subtraction. also the hardware doesnt know how to subtract. Hardware adders are three bit adders (two operands and carry in) with a result and carry out. Wire 8 of these up you have an 8 bit adder or subtractor, add without carry is the two operands wired directly with a zero wired in as the lsbit carry in. Add with carry is the two operands wired directly with the carry bit wired to the lsbit carry in. Subtract is add with the second operand inverted and a one on the carry in bit. At least from a high level perspective, that logic can all get optimized and implemented in ways often two hard to understand on casual inspection.

The really fun exercise is multiply, think about doing binary multiplication with pencil and paper, then realize it is much easier than decimal, because it is just a series of shifts and adds. given enough gates you can represent each result bit as a equation with the inputs to the equation being the operands. meaning you can do a single clock multiply if you wish, in the early days that was too many gates, so multi clock shift and adds were done, today we burn the gates and get single clock multiplies. Also note that understanding this also means that if you do say a 16 bit = 8 bit times 8 bit multiply, the lower 8 bit result is the same whether it is a signed multiply or unsigned. Since most folks do things like int = int * int; you really dont need a signed or unsigned multiply if all you care about is the result bits (no checking of flags, etc). fun stuff..

回复收藏 0 原文

§普罗旺斯的薰衣草 2024-10-06 14:34:38

在ARM体系结构手册（DDI100E）中：

OverflowFrom
     Returns 1 if the addition or subtraction specified as its parameter 
     caused a 32-bit signed overflow. [...]
     Subtraction causes an overflow if the operands have different signs,
     and the first operand and the result have different signs.

NEG
     [...]
     V Flag = OverflowFrom(0 - Rm)

NEG是计算数字的负数的指令，即二进制补码。

V 标志表示有符号溢出，可用于条件分支。它与其他三个标志 Z（零）、C（进位）和 N（负）一起在不同的处理器架构中是相当标准的。

对于 0 - (-128) = 0 + 128 = -128，第一个操作数是 0，第二个操作数以及结果是 -128 code>，因此满足溢出条件，并且设置V标志。

In the ARM Architecture Manual (DDI100E):

OverflowFrom
     Returns 1 if the addition or subtraction specified as its parameter 
     caused a 32-bit signed overflow. [...]
     Subtraction causes an overflow if the operands have different signs,
     and the first operand and the result have different signs.

NEG
     [...]
     V Flag = OverflowFrom(0 - Rm)

NEG is the instruction for computing the negation of a number, i.e. the twos complement.

The V flag signals signed overflow and can be used for conditional branching. It's fairly standard across different processor architectures, together with the three other flags Z (zero), C (carry) and N (negative).

For 0 - (-128) = 0 + 128 = -128 the first operand is 0 and the second operand as well as the result is -128, so the condition for overflow is satisfied, and the V flag is set.

回复收藏 0 原文

~没有更多了~