汉明码如何工作?

发布于 2024-07-10 13:54:41 字数 120 浏览 6 评论 0原文

传输数据时,汉明码显然允许您重新创建已通过线路损坏的数据(纠错码)。

它是如何工作的?它有哪些限制(如果有)?

是否有更好的纠错解决方案(相对于重传)? 是否存在重传效果更好的情况?

When transmitting data, the Hamming code apparently allows you to recreate data that has been corrupted over the wire (an error correcting code).

How does this work and what are its limitations, if any?

Are there any better solutions for error correction (as opposed to retransmission)? Are there circumstances where retransmission is better?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

一花一树开 2024-07-17 13:54:41

让我们尝试解释一下:

我们有一个 3 位数字。 可能性可以表示为一个立方体,其中每个位代表一个轴。 八种可能性都在角落里。

000 --------001
 | \         | \
 | 100---------101
 |  |        |  |
 |  |        |  |
010-|-------011 |
   \|          \|
   110---------111

每个缺陷(例如 101 读为 100)都会导致立方体上的一条线发生偏移。

如果我们使用两位用于数据,第三位用于奇偶校验(例如偶校验)。 我们丢失了 4 个数据点。 但它的优点是我们可以检测单个位故障(将偶数个 1 转换为奇数个 1)。
奇数用 * 标记。 我们看到每个奇数(错误传输的)单词都被偶数(正确传输的)单词所包围。 因此,如果我们收到 100,我们就可以确定它是错误的。

但是(如果出现一位故障)正确的表示可能是 000、101 或 110。因此,我们可以检测到出现了错误,但我们无法检测到错误所在:

 000 -------*001
  | \         | \
  |*100---------101
  |  |        |  |
  |  |        |  |
*010-|-------011 |
    \|          \|
    110--------*111

这称为一位错误检测代码。

如果我们使用另一位进行检查,从而删除一位数据。 我们剩下 1 个数据位和 2 个“校验位”。
在本例中,假设 000 和 111 是有效的数据表示,而其他 6 个不是。 现在,如果有一点在运输过程中被损坏,我们就会遇到一个有趣的情况。
如果我们发送 000 并接收 010,我们会看到 010 有一个有效邻居 (000) 和两个无效邻居(110 和 011)。 现在我们知道我们打算发送 000 并且能够纠正它:

 000 -------*001
  | \         | \
  |*100--------*101
  |  |        |  |
  |  |        |  |
*010-|------*011 |
    \|          \|
   *110---------111

这称为一位纠错码。

请注意,一位纠错码也是两位错误检测码。

更笼统地说。

如果有 n 个校验位,则有一个位错误检测代码。
如果有 2n 个校验位,则有一个位纠错码。

当然,您应该订购“有效”代码,以便它们不会彼此相邻。

Let's try to explain it a bit:

We have a 3 bit number. The possibilities can be presented as a cube where each bit represent an axis. The eight possibilities are on the corners.

000 --------001
 | \         | \
 | 100---------101
 |  |        |  |
 |  |        |  |
010-|-------011 |
   \|          \|
   110---------111

Each defect (for example 101 is read as 100) results in a shift on a line on the cube.

If we use two bits for the data and the third for a parity check (say for example even parity). We lose 4 datapoints. But it has the advantage that we can detect a single bit failure (which transforms an even count of 1's into an odd count of ones).
The odd numbers are marked with a *. And we see that each odd (wrongly transmitted) word is cornered by even (rightfully transmitted) words. So if we receive 100, we can be sure it is wrong.

But (with a single bit failure) the correct representation could have been 000, 101 or 110. So we can detect something has been wrong but we cannot detect what was wrong:

 000 -------*001
  | \         | \
  |*100---------101
  |  |        |  |
  |  |        |  |
*010-|-------011 |
    \|          \|
    110--------*111

This is called a one bit error detecting code.

If we use another bit for checking and thus remove one for the data. We are left with 1 databit and 2 "check bits".
In this case, lets assume 000 and 111 are valid data representations and the other six are not. Now we have an interesting situation if one bit is mangled during transport.
If we send 000 and receive 010, we see that 010 has one valid neighbor (000) and two invalid ones (110 and 011). So now we know that we intended to send 000 and are able to correct that:

 000 -------*001
  | \         | \
  |*100--------*101
  |  |        |  |
  |  |        |  |
*010-|------*011 |
    \|          \|
   *110---------111

This is called a one bit error correcting code.

Please note that a one bit error correcting code is also a two bit error detecting code.

And to put it more generally.

If you have n check bits, you have a n bit error detecting code.
If you have 2n check bits, you have a n bit error correcting code.

Of course you should order the "valid" codes so that they do not border on each other.

韶华倾负 2024-07-17 13:54:41

这是真正的高级概述。

Suppose that every time I send a message, I send thrie copies of the text.
Suppose that every time I send z message, I send three copies of the teyt.
Suppose that every tyme I send a message, I send three copies if the tezt.

通过比较字符并在每个位置进行简单多数投票,您可以纠正单个错误字符。 然而,这种方案的成本(必须发送的数据量)很高,并且它没有捕获不同副本的相应位置中不太可能但可能出现两个错误的情况(如上面示例的最后一个词)。

汉明码(以及其他类型的纠错码,例如里德所罗门码)基于计算额外数据(而不是简单的复制)的公式。 添加的位取决于数据位的组合,当在接收端重复计算时,复制错误会产生可检测的变化模式。

为了便于说明,让我们从简单的奇奇偶校验开始,添加一位以确保消息中的总位数为奇数。 因此,消息 10110110 变为 101101100(五个 1,不需要额外的 1),并且消息 10010110 变为 100101101 (四个 1,需要额外的 1)。 如果您收到一条 101101101 消息并看到有六个 1,则您知道存在错误,但不知道错误在哪里。 假设我们添加更多奇偶校验位,每个奇偶校验位仅取决于消息的部分,如下所示,通过复制考虑的位并使用“-”表示忽略的位:

10110110
1-1-0-1- => 0
-0-1-1-0 =>  1
10--01-- =>   1
--11--10 =>    0
1011---- =>     0
----0110 =>      1

因此完整的消息为10110110011001。 现在假设传输错误更改了消息中的第三位,使其读取为 10010110011001。 当接收器重新运行错误校验计算时,它无法匹配:

10010110
1-0-0-1- => 1*
-0-1-1-0 =>  1
10--01-- =>   1
--01--10 =>    1*
1001---- =>     1*
----0110 =>      1

并且无法匹配的校验位正是受第三个数据位影响的校验位。 这不是真正的、鲁棒的纠错方案; 这只是一个草图,用于说明构建冗余如何帮助识别错误的确切性质。

Here's the really high-level overview.

Suppose that every time I send a message, I send thrie copies of the text.
Suppose that every time I send z message, I send three copies of the teyt.
Suppose that every tyme I send a message, I send three copies if the tezt.

By comparing characters and taking a simple majority vote in each position, you can correct single erroneous characters. However the cost of this scheme (amount of data that must be sent) is high, and it doesn't catch the unlikely-but-possible case of two errors in corresponding positions of different copies (as in the last word of the sample above).

Hamming codes (and other kinds of error-correcting codes, such as Reed-Solomon) are based on formulas that compute the extra data (rather than simple duplication). The added bits depend on combinations of the data bits in a way that errors in copying make detectable patterns of changes when the computation is repeated at the receiving end.

For sake of illustration, let's start with simple odd parity, adding a single bit to ensure that the total number of bits in a message is odd. So the message 10110110 becomes 101101100 (five 1s, no extra 1 needed) and the message 10010110 becomes 100101101 (four 1s, extra 1 needed). If you receive a message of 101101101 and see that there are six 1s, you know that there's an error, but don't know where. Suppose we add more parity bits, each on depending only on a part of the message, as illustrated below by copying the bits considered and using '-' for bits ignored:

10110110
1-1-0-1- => 0
-0-1-1-0 =>  1
10--01-- =>   1
--11--10 =>    0
1011---- =>     0
----0110 =>      1

so the complete message is 10110110011001. Now suppose a transmission error changes the third bit in the message, so that it reads 10010110011001. When the receiver re-runs the error-checking computation, it fails to match:

10010110
1-0-0-1- => 1*
-0-1-1-0 =>  1
10--01-- =>   1
--01--10 =>    1*
1001---- =>     1*
----0110 =>      1

and the check bits that fail to match are exactly the ones affected by the third data bit. This is not a real, robust error-correction scheme; it's just a sketch to illustrate how building in redundancy can help in identifying the exact nature of an error.

你怎么敢 2024-07-17 13:54:41

您可以在此处找到有关其工作方式的详细信息

< a href="http://en.wikipedia.org/wiki/Error- Correcting_code" rel="nofollow noreferrer">此处

You will find details on the way it works here

More general information about Error Corecting Codes is available here

貪欢 2024-07-17 13:54:41

有关汉明码的信息可在此处此处

至于适用性,解释了原因:

  1. 像汉明这样的纠错码适用于不能请求重传的单工通道。

  2. 错误检测加重传通常是首选,因为它更有效。

为了进行比较,请考虑误码率为每比特的通道。 令块大小为 1000 位。

为了纠正单个错误(通过汉明码),每个块需要 10 个校验位。 要传输 1000 个块,需要 10,000 个校验位(开销)。
要检测单个错误,每个块一个奇偶校验位就足够了。 要传输 1000 个块,只需重传一个外部块(由于每比特的错误率),从而仅产生 2001(= 1000 + 1001)比特的开销。

Information about Hamming code is available here and here.

As for suitability , this explains why :

  1. Error-correcting codes like Hamming are suitable for simplex channels where no retransmission can be requested.

  2. Error-detection plus retransmission is often preferred because it is more efficient.

For comparison, consider a channel with error rate of per bit. Let block size be 1000 bits.

To correct a single error (by Hamming code), 10 check bits per block are needed. To transmit 1000 blocks, 10,000 check bits (overhead) are required.
To detect a single error, a single parity bit per block will suffice. To transmit 1000 blocks, only one exter block (due to the error rate of per bit) will have to be retransmitted, giving the overhead of only 2001 (= 1000 + 1001) bits.

慢慢从新开始 2024-07-17 13:54:41

汉明码是一种数学技巧,可纠正串行传输中最多 4 个丢失的位。 它还用于支持现代存储芯片中的“奇偶校验位”。

限制是可以恢复的比特数,不能超过4个。如果丢失超过4个比特,则需要重传。

不同的情况需要不同的纠错技术。 其中一些技术已在此处的其他帖子中列出。

Hamming code is a mathematical trick to correct up to 4 lost bits in a serial transmission. It is also used in favour of the "parity bit" in modern memory chips.

Limitations are in the number of bits which can be restored, which is not more than 4. If more than 4 bits are lost, retransmission is required.

Different situations require different error correction techniques. Some of these techniques are listed in the other posts here.

街道布景 2024-07-17 13:54:41

@GameCat 以及 2 位错误检测代码怎么样。

在这种情况下,假设 111 已更改为 100。那么我们可以确定有 2 位错误,我们知道这一点是因为 111 和 100 之间的距离是 2 位,而 000 和 100 之间的距离是 1 位。 因此,如果我们知道存在 2 位错误,我们就可以确定正确的值是多少。

@GameCat and what about 2-bit error detecting code.

In this case lets say 111 has changed to 100. Then we can be sure that there are 2 bits wrong and we know this because the distance between 111 and 100 is 2 bits, and the distance of 000 and 100 is 1 bit. So if we know that there is 2-bit error we can be sure what is the right value.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文