Steps to compress a file using Huffman code
I know there are many questions about Huffman coding, including another one of my own, but I am wondering what the best way is to actually encode a text file. Decompression seems trivial: traverse the tree, going left on 0 and right on 1, and print the character when you reach a leaf.
But how does one go about compression? Somehow store the bit representation of each character in its node in the tree? Search the tree for the character each time it is encountered and trace the steps? Does it matter which way this is coded?
Thus far, I have a Huffman tree where the leaf nodes do not have a binary value associated with them. My trouble is assigning the binary values to each character in the tree.
Thanks
Comments (1)
Well, if you are going to encode a file on a character basis, I can't see the problem: just keep a hash table of symbols, then construct a tree and write it at the beginning of the file using whatever convention you want, then apply the new alphabet to the text. Take a look at the approach taken in DEFLATE, which is used to compress PNG images.
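To make "apply the new alphabet to the text" concrete, here is a minimal sketch in Python. It assumes a code table already exists; the table below and the names `encode` and `pack_bits` are made up for illustration. Writing the tree (or table) and the padding count to the start of the file is left out, since that layout is whatever convention you pick.

```python
def encode(text, codes):
    """Concatenate the bit string of every symbol in the text."""
    return "".join(codes[ch] for ch in text)


def pack_bits(bits):
    """Pack a '0'/'1' string into bytes.

    Returns (data, padding), where padding is the number of zero bits
    appended to fill the last byte. A decoder needs that number so it
    can ignore the filler bits.
    """
    padding = (8 - len(bits) % 8) % 8
    bits += "0" * padding
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data, padding


# Hypothetical prefix-free code table, of the kind a Huffman tree yields.
codes = {"a": "0", "b": "10", "c": "11"}

bits = encode("abca", codes)     # "0" + "10" + "11" + "0"
data, padding = pack_bits(bits)  # one byte, two padding bits
```

Building the bit string as text is wasteful for large files, but it keeps the idea visible; a real encoder would likely accumulate bits in an integer or a byte buffer.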
EDIT
It is not really clear what the problem is.
Each leaf node in the tree represents a unique symbol. You don't have to search for anything; you can construct the Huffman tree only once you have already counted each symbol's occurrences.
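A rough sketch of that order of operations (count first, then build), in Python. The tree representation is an assumption made for brevity: a leaf is just the symbol, and an internal node is a `(left, right)` tuple. The function name `build_tree` is likewise illustrative, and the input is assumed to be non-empty.

```python
import heapq
from collections import Counter
from itertools import count


def build_tree(text):
    """Build a Huffman tree from the symbol counts of a non-empty text."""
    freq = Counter(text)
    # The counter value breaks ties between equal weights, so the heap
    # never has to compare a symbol with a tuple.
    tiebreak = count()
    heap = [(weight, next(tiebreak), symbol) for symbol, weight in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees into one internal node.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))
    return heap[0][2]


tree = build_tree("aaaabbc")
```

With counts a=4, b=2, c=1, the two rarest symbols are merged first, so the most frequent symbol ends up closest to the root.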
So you have already developed an algorithm to construct the tree, and the problem is how to assign the binary values to the nodes? Or where to store these values? The tree itself represents the binary values naturally: the code for a symbol is the path from the root to its leaf. You can actually track the codes during tree construction; just keep track of an item's 'path' in the insert operation and store that value inside the node, or create a hash table if you don't want to modify the node entity.
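One way to do the hash table variant, sketched in Python. Note that this walks the finished tree once instead of tracking paths during insertion, which is a swapped-in but equivalent approach. The tree shape is an assumption: a leaf is the symbol itself and an internal node is a `(left, right)` tuple. The name `assign_codes` is illustrative.

```python
def assign_codes(node, path="", table=None):
    """Walk the tree once and record each leaf's root-to-leaf path.

    Going left appends '0' and going right appends '1', matching the
    convention the decoder uses.
    """
    if table is None:
        table = {}
    if isinstance(node, tuple):
        left, right = node
        assign_codes(left, path + "0", table)
        assign_codes(right, path + "1", table)
    else:
        # A tree with a single symbol has an empty path; give it one bit.
        table[node] = path or "0"
    return table


# Hypothetical tree for counts a=4, b=2, c=1.
tree = (("c", "b"), "a")
codes = assign_codes(tree)
```

After this, compression is a dictionary lookup per character, so the tree never has to be searched while encoding.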