DEFLATE decoding

Posted on 2024-12-28 13:21:11

I am currently reading about the DEFLATE method for encoding/decoding data. I understand that the process is composed of two parts:

i. Replace duplicate information (within a specified window) with a reference back to the previous identical piece.

ii. Use Huffman coding to reduce the size of the most commonly occurring symbols.

I have a question regarding (i). DEFLATE uses LZ77, which searches a sliding window of limited size and, if it finds any duplicate information, replaces it with a "pointer". That makes perfect sense.

However, when decoding, how does DEFLATE recognize a pointer? (Pointers are length-distance pairs; how can the decoder tell whether it is looking at a pointer or just a number that was present in the original data?)

Reference: http://en.wikipedia.org/wiki/DEFLATE#Duplicate_string_elimination

Comments (1)

属性 2025-01-04 13:21:11

It's recommended to read the DEFLATE specification, RFC 1951, which is much more precise and answers questions like this one.

In section 3.2.5, "Compressed blocks (length and distance codes)", you will see:

"the literal and length alphabets are merged into a single alphabet"

This means that, simply by decoding the next symbol, you immediately know whether it is a literal byte (0..255), a match length (257..285), or the end of the block (256). After a match length, a distance (offset) must be decoded too. Distances are encoded using a separate Huffman tree, so a literal value can never be mistaken for part of a pointer.
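
To make the idea concrete, here is a minimal Python sketch of the decode loop. It is not a real DEFLATE decoder: it assumes the Huffman decoding and extra bits have already been resolved, so the input is a plain list in which every length symbol (257..285) is followed by two hypothetical, already-resolved values, the match length and the distance. The names `classify` and `decode_block` are made up for this illustration.

```python
END_OF_BLOCK = 256


def classify(symbol):
    """Classify a symbol from the merged literal/length alphabet (0..285)."""
    if 0 <= symbol <= 255:
        return "literal"
    if symbol == END_OF_BLOCK:
        return "end_of_block"
    if 257 <= symbol <= 285:
        return "length"
    raise ValueError(f"invalid literal/length symbol: {symbol}")


def decode_block(stream):
    """Decode one block from a simplified, already entropy-decoded stream.

    Assumption (not the real bit format): after each length symbol the
    stream carries the resolved match length and then the resolved distance.
    """
    out = bytearray()
    it = iter(stream)
    for symbol in it:
        kind = classify(symbol)
        if kind == "literal":
            out.append(symbol)          # a raw byte of the original data
        elif kind == "end_of_block":
            break
        else:
            length = next(it)           # resolved match length
            distance = next(it)         # resolved backward distance
            if not 1 <= distance <= len(out):
                raise ValueError("distance points before start of output")
            start = len(out) - distance
            # Copy byte by byte so that overlapping matches
            # (length > distance) repeat the data just written.
            for i in range(length):
                out.append(out[start + i])
    return bytes(out)


print(decode_block([97, 98, 99, 257, 3, 3, 256]))  # "abc", then copy 3 from 3 back
print(decode_block([3, 256]))                      # the byte value 3 as a literal
```

The second call shows the point of the answer: the value 3 appearing as a literal symbol is simply a byte of data, while a match is always announced by a symbol in the 257..285 range, so the two cannot be confused.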
