如何从霍夫曼编码比特流中解码消息?
如何从霍夫曼编码比特流中解码消息? 我不太清楚霍夫曼算法的思想。
据我了解,假设我收到一条短信“我的名字是 XYZ”。
那么编码过程是这样的: 1.统计字符出现的频率。 2. 按值对频率进行排序。 3. 构建一棵树。 4. 将左边缘视为 0,将右边缘视为 1 来遍历树,以获取预期的消息字符。 5. 连接代码以找到比特流。
现在的问题是,如何从编码的比特流中找到原始消息?
我认为我们需要再次构建霍夫曼树。
但是如何从比特流构建霍夫曼树呢?
How to decode the message from a Huffman Encoded bit stream?
I am not clear about the idea of Huffman Algorithm.
As far as I understand it, Suppose I am given a text message "My name is XYZ".
Then the encoding process goes this way:
1. Count the frequency of the characters.
2. Sort the frequency by values.
3. Construct a tree.
4. Traverse the tree by considering left-edge as 0 and right-edge as 1 to get to the intended message character.
5. Concatenate the codes to find the bit stream.
Now the problem is, How can I find the original message from the encoded bit stream?
I think we need to construct the Huffman tree again.
But how can I construct the Huffman tree from the bit stream?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
没有原始树就无法解码消息。因此发送方必须将其包含在消息中(如果消息很长,开销会很小),或者双方在发送消息之前就树达成一致。然后过程是相反的:从流中一点一点地读取并遍历树。一旦你击中叶子,就会发出字符。
Message can't be decoded without original tree. So sending party must include it in message (in case of long messages with overhead will be small) or both parties agree about tree before sending messages. Then process is reverse: you read bit by bit from stream and traverse tree. Once you hit leaf then emit character.
为此有几种选择。一种是使用固定的霍夫曼树 - 例如,如果您正在编码普通的英语文本,字符的相对频率通常足够接近常数,您只需让发送者和接收者事先达成一致就可以相当合理地做到这一点他们都会使用这三个中的一个。
对于您描述的两遍霍夫曼算法,您几乎只能将树(以某种形式)与数据一起传输。
您还可以使用动态霍夫曼树,例如,您从上面第一个选项中概述的树开始,但在处理数据时,发射器和接收器都会调整树以适应观察到的频率。唯一的技巧是每个字符必须在调整之前使用树进行编码,然后进行调整,然后处理下一个字符。这样接收端可以与发送端保持同步。
There are a couple of choices for this. One is to used a fixed Huffman tree -- just for example, if you're encoding normal English text, the relative frequencies of characters are usually close enough to constant that you can do pretty reasonably just having the sender and receiver have a prior agreement of the three they'll both use.
For the two-pass Huffman algorithm you describe, you're pretty much stuck with transmitting the tree (in some form) along with the data.
You can also use a dynamic Huffman tree, where you (for example) start with a tree as outlined in the first option above, but as they process the data, the transmitter and receiver both adjust the tree to fit the observed frequencies. The only trick to this is that each character must be encoded using the tree before adjustment, then do the adjustment, then process the next character. This way the receiving end can stay in synch with the sending end.