在 n 位上生成 k 大小的纠错码的算法
我想为我想要分类的 k 个不同输入生成 n 位代码。该代码的主要要求是纠错标准:不同输入的任意两个编码之间的最小成对距离最大化。我不需要它是精确的 - 近似即可,并且易于使用和计算实现的速度也是一个优先事项。
一般情况下,n为数百,k为数十。
另外,k 个不同的 n 位二进制编码之间的最小汉明距离是否存在相当严格的界限?
I want to generate a code on n bits for k different inputs that I want to classify. The main requirement of this code is the error-correcting criteria: that the minimum pairwise distance between any two encodings of different inputs is maximized. I don't need it to be exact - approximate will do, and ease of use and speed of computational implementation is a priority too.
In general, n will be in the hundreds, k in the dozens.
Also, is there a reasonably tight bound on the minimum hamming distance between k different n-bit binary encodings?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
为给定的参数找到精确的最佳纠错码是非常困难的,甚至近似最佳的编码也是困难的。最重要的是,有些代码没有任何像样的解码算法,而对于其他代码来说,解码问题非常棘手。
但是,您询问的是特定范围的参数,其中 n ≫ k,如果我理解正确,您需要长度为 n 的 k 维代码。 (因此 k 位被编码为 n 位。)在这个范围内,首先,随机码可能具有非常好的最小距离。唯一的问题是解码从不切实际到难以处理,并且实际计算最小距离也不是那么容易。
其次,如果您想要 n ≫ k 情况的显式代码,那么您可以使用 BCH 做得相当好q=2 的代码。正如维基百科页面所解释的,BCH 代码有一个很好的解码算法。
关于最小汉明距离的上限,在 n ≫ k 范围内,您应该从 汉明界,也称为体积界限或球体堆积界限。界限的想法简单而优美:如果最小距离是 t,那么代码可以纠正最大距离Floor((t-1)/2) 的错误。如果您可以纠正某个半径的错误,则意味着该半径的汉明球不会重叠。另一方面,可能的单词总数为 2n,因此如果将其除以一个汉明球中的点数(在二进制情况下是二项式系数之和),您会得到无错误码字数量的上限。打破这个界限是可能的,但对于较大的最小距离来说这并不容易。在这种制度下,这是一个非常好的界限。
The problem of finding the exact best error-correcting code for given parameters is very hard, even approximately best codes are hard. On top of that, some codes don't have any decent decoding algorithms, while for others the decoding problem is quite tricky.
However, you're asking about a particular range of parameters where n ≫ k, where if I understand correctly you want a k-dimensional code of length n. (So that k bits are encoded in n bits.) In this range, first, a random code is likely to have very good minimum distance. The only problem is that decoding is anywhere from impractical to intractible, and actually calculating the minimum distance is not that easy either.
Second, if you want an explicit code for the case n ≫ k, then you can do reasonably well with a BCH code with q=2. As the Wikipedia page explains, there is a good decoding algorithm for BCH codes.
Concerning upper bounds for the minimum Hamming distance, in the range n ≫ k you should start with the Hamming bound, also known as the volume bound or the sphere packing bound. The idea of the bound is simple and beautiful: If the minimum distance is t, then the code can correct errors up to distance floor((t-1)/2). If you can correct errors out to some radius, it means that the Hamming balls of that radius don't overlap. On the other hand, the total number of possible words is 2n, so if you divide that by the number of points in one Hamming ball (which in the binary case is a sum of binomial coefficients), you get an upper bound on the number of error-free code words. It is possible to beat this bound, but for large minimum distance it's not easy. In this regime it's a very good bound.