Python erasure coding library that can handle larger strings?
I'm looking for a Python erasure coding library that works for larger inputs. So far I've checked out:
- unireedsolomon: fails for 256-byte inputs; unmaintained
- reedsolo/reedsolomon: silently fails for a 300-byte input
- Reed-Solomon: clearly a learning project, with its bug tracker disabled
- pyeclib: fails for a 100-byte input using reed-solomon encoding, and doesn't seem to document its valid parameters, so I couldn't figure out how to test the other algorithms (nor does liberasurecode)
I want something that can handle n=10,000, k=2,000 or so, ideally larger.
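For context (an editorial sketch, not from the original post): the failures at 255–300 bytes line up with the basic Reed-Solomon length bound. A codeword over GF(2^m) has at most 2^m - 1 symbols, so byte-oriented (GF(2^8)) libraries cap at n = 255 including parity, and n = 10,000 forces a larger symbol size:

```python
def min_symbol_bits(n):
    """Smallest m such that 2^m - 1 >= n (the RS codeword length bound)."""
    m = 1
    while 2 ** m - 1 < n:
        m += 1
    return m

assert min_symbol_bits(255) == 8        # why byte-wise libraries stop at 255
assert min_symbol_bits(10_000) == 14    # implementations round up to GF(2^16)
```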
Only the field polynomial has to be irreducible (ideally primitive), not the generator polynomial. If you wanted an RS(10000, 8000, 2000) code (n = 10000, k = 8000, n-k = 2000), GF(2^16) with the primitive reduction polynomial x^16 + x^12 + x^3 + x + 1 could be used. The generator polynomial would be of degree 2000. Assuming the first consecutive root is 2, the generator polynomial = (x-2)(x-4)(x-8) ... (x-2^2000) (all of this math is done in GF(2^16), where + and - are both xor). Correction would involve generating 2000 syndromes and using a Berlekamp–Massey or Sugiyama extended-Euclid decoder. I don't know whether there are Python libraries that support GF(2^16).
https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction#Berlekamp%E2%80%93Massey_decoder
https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction#Euclidean_decoder
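The construction above can be sketched in pure Python: build GF(2^16) log/antilog tables from the stated primitive polynomial x^16 + x^12 + x^3 + x + 1 (0x1100B), then multiply out (x - 2)(x - 4)...(x - 2^nroots). This is unoptimized demonstration code, not a full codec; the degree-2000 generator is the same loop with `nroots=2000`, just slower. (For real use, the `galois` package reportedly supports GF(2^16) arithmetic, though I haven't verified its RS parameters at this scale.)

```python
PRIM = 0x1100B  # x^16 + x^12 + x^3 + x + 1, primitive over GF(2)

# Log/antilog tables: EXP[i] = 2^i, LOG[EXP[i]] = i, for fast multiplication.
EXP = [0] * 65535
LOG = [0] * 65536
v = 1
for i in range(65535):
    EXP[i] = v
    LOG[v] = i
    v <<= 1
    if v & 0x10000:
        v ^= PRIM  # reduce modulo the field polynomial

def gf_mul(a, b):
    """Multiply in GF(2^16) via the log tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 65535]

def generator_poly(nroots):
    """g(x) = (x - 2^1)(x - 2^2)...(x - 2^nroots); '-' is xor in GF(2^16).

    Returns coefficients lowest degree first; g is monic of degree nroots.
    """
    g = [1]
    for i in range(1, nroots + 1):
        root = EXP[i]  # 2^i in the field
        ng = [0] * (len(g) + 1)
        for j, c in enumerate(g):
            ng[j] ^= gf_mul(c, root)  # the (-root)*g(x) term (-root == root)
            ng[j + 1] ^= c            # the x*g(x) term
        g = ng
    return g

def gf_poly_eval(p, x):
    """Evaluate p (lowest degree first) at x by Horner's rule."""
    y = 0
    for c in reversed(p):
        y = gf_mul(y, x) ^ c
    return y
```

A quick self-check: `len(set(EXP)) == 65535` confirms 0x1100B really is primitive (every nonzero element is a power of 2), and `gf_poly_eval(generator_poly(16), EXP[i]) == 0` for i = 1..16 confirms the roots.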
Large n and k can be avoided by interleaving. Tape drives like LTO treat large data blocks as matrices, interleaved across rows (called C1) and down columns (called C2), using GF(2^8). LTO-8 uses RS(249,237,13), which I assume is the ECC used down columns to correct rows. With 32 read/write heads there's an interleave of 32, probably across rows. I don't know what the RS code across rows is, or what the interleave down columns is.
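A minimal sketch of the interleaving idea (illustrative parameters, not LTO's actual layout): stripe a large buffer into `stride` substreams, each short enough for an ordinary GF(2^8) RS code, and encode each stripe independently. `rs_encode` below is a textbook systematic encoder over GF(2^8) with primitive polynomial 0x11D and first consecutive root 2^0 = 1; `nsym` and `stride` are arbitrary demo choices.

```python
PRIM8 = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1, primitive over GF(2)

EXP8 = [0] * 255
LOG8 = [0] * 256
v = 1
for i in range(255):
    EXP8[i] = v
    LOG8[v] = i
    v <<= 1
    if v & 0x100:
        v ^= PRIM8

def mul8(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP8[(LOG8[a] + LOG8[b]) % 255]

def gen_poly8(nsym):
    """Monic generator with roots 2^0 .. 2^(nsym-1), highest degree first."""
    g = [1]
    for i in range(nsym):
        root = EXP8[i]
        ng = g + [0]              # x * g(x)
        for j, c in enumerate(g):
            ng[j + 1] ^= mul8(c, root)  # (-root) * g(x); '-' is xor
        g = ng
    return g

def rs_encode(data, nsym):
    """Systematic encoding: append the remainder of data(x)*x^nsym mod g(x)."""
    g = gen_poly8(nsym)
    rem = [0] * nsym
    for b in data:
        factor = b ^ rem[0]
        rem = rem[1:] + [0]
        for j in range(nsym):
            rem[j] ^= mul8(g[j + 1], factor)
    return bytes(data) + bytes(rem)

def encode_interleaved(data, nsym=32, stride=64):
    """Stripe i takes input bytes i, i+stride, i+2*stride, ..."""
    assert (len(data) + stride - 1) // stride + nsym <= 255, "stripe too long"
    return [rs_encode(data[i::stride], nsym) for i in range(stride)]
```

With `stride=64` and `nsym=32`, a 10,240-byte block becomes 64 independent RS(192,160) codewords, so no single codeword ever exceeds the GF(2^8) limit of 255 bytes; a burst error across consecutive bytes is spread over many codewords, which is the usual motivation for interleaving.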