当前位置：文江博客话题详情

33字节的错误检测代码，检测前32字节中的位翻转

发布于 2024-12-01 12:01:20 字数 96 浏览 5 评论 0原文

您能否建议一个错误检测方案来检测 33 字节消息的前 32 字节中的一个可能的位翻转使用附加数据不超过 8 位？

Pearson 哈希可以成为解决方案吗？

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

千纸鹤带着心事 2024-12-08 12:01:20

检测任何消息中的单个位翻转只需要一位额外的位，与消息的长度无关：只需将消息中的所有位异或在一起并将其添加到末尾。如果任何一位翻转，末尾的奇偶校验位将不匹配。

如果您要求检测哪个位翻转，这是无法完成的，一个简单的参数就可以说明这一点：额外的 8 位可以表示最多 256 类 32 字节消息，但零消息和 256 类消息每一位都必须属于不同的类别。因此，必须明确分类的消息有 257 条，而类别只有 256 个。

回复收藏 0 原文

寒江雪… 2024-12-08 12:01:20

您可以在任何长度的消息中仅用一位额外的位来检测一位翻转（如 @Daniel Wagner 所述）。简单地说，奇偶校验位可以指示1位的总数是奇数还是偶数。显然，如果错误的位数是偶数，则奇偶校验位将失败，因此无法检测到 2 位错误。

现在，为了更容易地理解为什么不能仅用 8 位对 32 字节（256 位）进行纠错，请阅读汉明码（如 ECC 内存中使用的）。这种方案使用特殊的纠错奇偶校验位（以下称为“EC奇偶校验”），仅对总位数的子集的奇偶校验进行编码。对于每 2^m - 1 总位，您需要使用 m EC 位。这些代表遵循“x 位打开，x 位关闭”模式的每个可能的不同掩码，其中 x 是 2 的幂。因此，一次位数越大，数据/奇偶校验就越好你得到的比特率。例如，总共 7 个位将允许在丢失 3 个 EC 位后仅编码 4 个数据位，但总共 31 个位可在丢失 5 个 EC 位后编码 26 个数据位。

现在，要真正理解这一点，可能需要举一个例子。考虑以下几组面具。前两行要从上到下读取，指示位数（我标记为 MSB 的“最高有效字节”）：

  MSB                                LSB
   |                                  |
   v                                  v
   33222222 22221111 11111100 0000000|0
   10987654 32109876 54321098 7654321|0
   -------- -------- -------- -------|-
1: 10101010 10101010 10101010 1010101|0
2: 11001100 11001100 11001100 1100110|0
3: 11110000 11110000 11110000 1111000|0
4: 11111111 00000000 11111111 0000000|0
5: 11111111 11111111 00000000 0000000|0

首先要注意的是，每列中表示 0 到 31 的二进制值从右到左（读取第 1 行到第 5 行中的位）。这意味着每个垂直列都彼此不同（重要部分）。出于特定原因，我在位号 0 和 1 之间放置了一条垂直的额外行：第 0 列无用，因为它没有设置任何位。

为了执行纠错，我们将接收到的数据位与每个 EC 位的预定义掩码进行按位与，然后将所得奇偶校验与 EC 位进行比较。对于发现不匹配的任何计算奇偶校验，请查找仅设置这些位的列。例如，如果根据接收到的数据值计算时纠错位 1、4 和 5 是错误的，则列 #25（仅在这些掩码中包含 1）一定是错误位，并且可以通过翻转它来纠正。如果只有一个纠错位出错，则错误就出在该纠错位上。这是一个类比，可以帮助您理解为什么会这样：

一共有 32 个相同的盒子，其中一个装有弹珠。您的任务是仅使用老式秤（带有两个平衡平台来比较不同物体重量的秤）来定位大理石，并且您只能尝试 5 次称重。解决方案相当简单：在秤的每一侧放 16 个盒子，较重的一侧表示弹珠在哪一侧。丢弃较轻的 16 个盒子，然后保留较重的 8 和 8 个盒子，然后是 4 和 4，然后是 2 和 2，最后通过比较最后 2 个盒子的重量 1 比 1 来找到弹珠：最重的盒子里装有大理石。您仅用 32、16、8、4 和 2 个盒子 5 次称量就完成了任务。

类似地，我们的位模式将盒子分为 5 个不同的组。向后看，第五个 EC 位确定错误是在左侧还是右侧。在我们的场景中，位 #25 是错误的，因此我们知道错误位位于组的左侧（位 16-31）。在 EC 位#4 的下一个掩码中（仍然向后退一步），我们只考虑位 16-31，并且我们发现“较重”的一侧又是左侧，因此我们缩小了位 24-31 的范围。沿着决策树向下，每次将可能的列数减半，当我们到达 EC 位 1 时，只剩下 1 个可能的位——我们的“盒子里的弹珠”。

注意：这个类比很有用，但并不完美：1 位不是用弹珠来表示的——错误位的位置是用弹珠来表示的。

现在，一些人尝试使用这些掩码并思考如何安排，就会发现存在一个问题：如果我们尝试制作所有 31 位数据位，那么我们还需要 5 位用于 EC。但是，我们如何判断 EC 位本身是否错误呢？只要有一个 EC 位错误就会错误地告诉我们某些数据位需要纠正，并且我们会错误地翻转该数据位。 EC 位必须以某种方式为自己编码！解决方案是将奇偶校验位定位在数据的内部，位于上面位模式中仅设置一位的列中。这样，任何数据位错误都会触发两个个EC位错误，这样，如果只有一个EC位错误，我们就知道它本身是错误的，而不是表明某个数据位是错误的。错误的。满足一位条件的列是 1、2、4、8 和 16。数据位将从位置 2 开始在这些列之间交错。（请记住，我们不使用位置 0，因为它永远不会提供任何信息--我们的 EC 位根本不会被设置）。

最后，为整体奇偶校验再添加一位将允许检测 2 位错误并可靠地纠正 1 位错误，因为我们可以随后将 EC 位与其进行比较：如果 EC 位表示有问题，但奇偶校验位表示其他情况，我们知道有 2 位错误，无法进行纠正。我们可以使用丢弃的位 #0 作为奇偶校验位！事实上，现在我们正在编码以下模式：

0: 11111111 11111111 11111111 11111111

这为我们提供了最终总共 6 个错误检查和纠正 (ECC) 位。无限期地扩展使用不同掩码的方案如下所示：

32 bits - 6 ECC bits = 26 data
64 bits - 7 ECC bits = 57 data
128 bits - 8 ECC bits = 120 data
256 bits - 9 ECC bits = 247 data
512 bits - 10 ECC bits = 502 data

现在，如果我们确定只会得到 1 位错误，则可以省去 #0 奇偶校验位，因此我们有以下结果：

31 bits - 5 ECC bits = 26 data
63 bits - 6 ECC bits = 57 data
127 bits - 7 ECC bits = 120 data
255 bits - 8 ECC bits = 247 data
511 bits - 9 ECC bits = 502 data

这没有变化，因为我们没有得到更多的数据位。哎呀！您请求的 32 字节（256 位）无法用单个字节进行纠错，即使我们知道最坏情况下只能有 1 位错误，并且我们知道 ECC 位将是正确的（允许我们将它们移出数据区域并将它们全部用于数据）。我们需要比现有的多两位——必须向上滑动到下一个 512 位范围，然后省略 246 个数据位以获得 256 个数据位。因此，这又是一个 ECC 位和一个数据位（因为我们只有 255 位，这正是 Daniel 告诉你的）。

摘要：：您需要 33 个字节 + 1 位来检测前 32 个字节中哪一位翻转。

注意：如果您要发送 64 个字节，那么您的比率为 32:1，因为你只需 10 位就可以纠正错误。但在现实应用中，ECC 的“帧大小”无法无限期地增加，原因如下： 1) 一次处理的位数可能远小于帧大小，从而导致总体效率低下（想想 ECC RAM）。 2）能够准确纠正一位的机会越来越少，因为帧越大，出错的机会就越大，2个错误就打败了纠错能力，而3个或更多就可以打败甚至错误-检测能力。 3) 一旦检测到错误，帧大小越大，必须重传的损坏片的大小也越大。

You can detect one bit flip with just one extra bit in any length message (as stated by @Daniel Wagner). The parity bit can, simply put, indicate whether the total number of 1-bits is odd or even. Obviously, if the number of bits that are wrong is even, then the parity bit will fail, so you cannot detect 2-bit errors.

Now, for a more accessible understanding of why you can't error-correct 32 bytes (256 bits) with just 8 bits, please read about the Hamming code (like used in ECC memory). Such a scheme uses special error-correcting parity bits (henceforth called "EC parity") that only encode the parity of a subset of the total number of bits. For every 2^m - 1 total bits, you need to use m EC bits. These represent each possible different mask following the pattern "x bits on, x bits off" where x is a power of 2. Thus, the larger the number of bits at once, the better the data/parity bit ratio you get. For example, 7 total bits would allow encoding only 4 data bits after losing 3 EC bits, but 31 total bits can encode 26 data bits after losing 5 EC bits.

Now, to really understand this probably will take an example. Consider the following sets of masks. The first two rows are to be read top down, indicating the bit number (the "Most Significant Byte" I've labeled MSB):

  MSB                                LSB
   |                                  |
   v                                  v
   33222222 22221111 11111100 0000000|0
   10987654 32109876 54321098 7654321|0
   -------- -------- -------- -------|-
1: 10101010 10101010 10101010 1010101|0
2: 11001100 11001100 11001100 1100110|0
3: 11110000 11110000 11110000 1111000|0
4: 11111111 00000000 11111111 0000000|0
5: 11111111 11111111 00000000 0000000|0

The first thing to notice is that the binary values for 0 to 31 are represented in each column going from right to left (reading the bits in rows 1 through 5). This means that each vertical column is different from each other one (the important part). I put a vertical extra line between bit numbers 0 and 1 for a particular reason: Column 0 is useless because it has no bits set in it.

To perform error-correcting, we will bitwise-AND the received data bits against each EC bit's predefined mask, then compare the resulting parity to the EC bit. For any calculated parities discovered to not match, find the column in which only those bits are set. For example, if error-correcting bits 1, 4, and 5 are wrong when calculated from the received data value, then column #25--containing 1s in only those masks--must be the incorrect bit and can be corrected by flipping it. If only a single error-correcting bit is wrong, then the error is in that error-correcting bit. Here's an analogy to help you understand why this works:

There are 32 identical boxes, with one containing a marble. Your task is to locate the marble using just an old-style scale (the kind with two balanced platforms to compare the weights of different objects) and you are only allowed 5 weighing attempts. The solution is fairly easy: you put 16 boxes on each side of the scale and the heavier side indicates which side the marble is on. Discarding the 16 boxes on the lighter side, you then weigh 8 and 8 boxes keeping the heavier, then 4 and 4, then 2 and 2, and finally locate the marble by comparing the weights of the last 2 boxes 1 to 1: the heaviest box contains the marble. You have completed the task in only 5 weighings of 32, 16, 8, 4, and 2 boxes.

Similarly, our bit patterns have divided up the boxes in 5 different groups. Going backwards, the fifth EC bit determines whether an error is on the left side or the right side. In our scenario with bit #25, it is wrong, so we know that the error bit is on the left side of the group (bits 16-31). In our next mask for EC bit #4 (still stepping backward), we only consider bits 16-31, and we find that the "heavier" side is the left one again, so we have narrowed down the bits 24-31. Following the decision tree downward and cutting the number of possible columns in half each time, by the time we reach EC bit 1 there is only 1 possible bit left--our "marble in a box".

Note: The analogy is useful, though not perfect: 1-bits are not represented by marbles--the erroring bit location is represented by the marble.

Now, some playing around with these masks and thinking how to arrange things will reveal that there is a problem: If we try to make all 31 bits data bits, then we need 5 more bits for EC. But how, then, will we tell if the EC bits themselves are wrong? Just a single EC bit wrong will incorrectly tell us that some data bit needs correction, and we'll wrongly flip that data bit. The EC bits have to somehow encode for themselves! The solution is to position the parity bits inside of the data, in columns from the bit patterns above where only one bit is set. This way, any data bit being wrong will trigger two EC bits to be wrong, making it so that if only one EC bit is wrong, we know it is wrong itself instead of it signifying a data bit is wrong. The columns that satisfy the one-bit condition are 1, 2, 4, 8, and 16. The data bits will be interleaved between these starting at position 2. (Remember, we are not using position 0 as it would never provide any information--none of our EC bits would be set at all).

Finally, adding one more bit for overall parity will allow detecting 2-bit errors and reliably correcting 1-bit errors, as we can then compare the EC bits to it: if the EC bits say something is wrong, but the parity bit says otherwise, we know there are 2 bits wrong and cannot perform correction. We can use the discarded bit #0 as our parity bit! In fact, now we are encoding the following pattern:

0: 11111111 11111111 11111111 11111111

This gives us a final total of 6 Error-Checking and Correcting (ECC) bits. Extending the scheme of using different masks indefinitely looks like this:

32 bits - 6 ECC bits = 26 data
64 bits - 7 ECC bits = 57 data
128 bits - 8 ECC bits = 120 data
256 bits - 9 ECC bits = 247 data
512 bits - 10 ECC bits = 502 data

Now, if we are sure that we only will get a 1-bit error, we can dispense with the #0 parity bit, so we have the following:

31 bits - 5 ECC bits = 26 data
63 bits - 6 ECC bits = 57 data
127 bits - 7 ECC bits = 120 data
255 bits - 8 ECC bits = 247 data
511 bits - 9 ECC bits = 502 data

This is no change because we don't get any more data bits. Oops! 32 bytes (256 bits) as you requested cannot be error-corrected with a single byte, even if we know we can have only a 1-bit error at worst, and we know the ECC bits will be correct (allowing us to move them out of the data region and use them all for data). We need TWO more bits than we have--one must slide up to the next range of 512 bits, then leave out 246 data bits to get our 256 data bits. So that's one more ECC bit AND one more data bit (as we only have 255, exactly what Daniel told you).

Summary:: You need 33 bytes + 1 bit to detect which bit flipped in the first 32 bytes.

Note: if you are going to send 64 bytes, then you're under the 32:1 ratio, as you can error correct that in just 10 bits. But it's that in real world applications, the "frame size" of your ECC can't keep going up indefinitely for a few reasons: 1) The number of bits being worked with at once may be much smaller than the frame size, leading to gross inefficiencies (think ECC RAM). 2) The chance of being able to accurately correct a bit gets less and less, since the larger the frame, the greater the chance it will have more errors, and 2 errors defeats error-correction ability, while 3 or more can defeat even error-detection ability. 3) Once an error is detected, the larger the frame size, the larger the size of the corrupted piece that must be retransmitted.

回复收藏 0 原文