Optimal Huffman tree for best compression
I am coding a Huffman string compressor and would like confirmation that my tree gives the optimal compression.
I am using this kind of tree:
Instead of this kind of tree:
I think that beyond 10 distinct characters, it is not possible to keep the codes within 8 bits.
Is the first image really the optimal one?
Comments (1)
The basic idea is to merge the two nodes with the smallest weights, creating a new node whose value is the sum of its two children.
Applying this rule repeatedly, all the way up to the root, guarantees that the resulting tree is optimal.
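As a rough illustration of that rule, here is a minimal Python sketch that only tracks how deep each symbol ends up, which is its code length. The function name `huffman_code_lengths` and the sample string are assumptions made for this example, not something taken from the question.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code_lengths(weights):
    """Return {symbol: code length in bits} for a dict of symbol weights."""
    if len(weights) == 1:
        # A lone symbol still needs one bit to be encoded.
        return {symbol: 1 for symbol in weights}
    tie = count()  # tie-breaker so the heap never has to compare the dicts
    heap = [(w, next(tie), {sym: 0}) for sym, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Take the two smallest nodes...
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # ...and merge them: every symbol below the new parent moves one level deeper.
        merged = {sym: depth + 1 for sym, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]


text = "abracadabra"  # sample input, chosen only for the demonstration
lengths = huffman_code_lengths(Counter(text))
total_bits = sum(lengths[ch] for ch in text)
print(lengths, total_bits)
```

Ties between equal weights can be broken either way, so the individual code lengths may differ between implementations, but the total encoded size stays the same.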
As a consequence, you have no control over the shape of the tree: it depends entirely on the probability distribution of the characters. If that distribution looks like a Fibonacci sequence, the result can be a degenerate tree (one branch per level).
Creating a Huffman tree with a preset maximum depth is therefore more complex, and requires breaking the usual rule of always merging the two smallest nodes. Whenever the depth limit actually changes the tree, the result is no longer optimal for the given distribution.
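To make the trade-off concrete, the sketch below builds the tree for ten symbols with Fibonacci weights and compares its cost with the crudest possible depth-capped alternative, a balanced tree where every code has the same length. The helper `huffman_depths` and the weight list are assumptions chosen for the demonstration; a real length-limited construction (package-merge, for instance) would do better than the fixed-length code but still cannot beat the unconstrained tree.

```python
import heapq
from itertools import count
from math import ceil, log2


def huffman_depths(weights):
    """Code length of each weight (same order) under the two-smallest rule."""
    tie = count()  # tie-breaker so the heap never has to compare the dicts
    heap = [(w, next(tie), {i: 0}) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {i: d + 1 for i, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    depths = heap[0][2]
    return [depths[i] for i in range(len(weights))]


fib = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]  # ten symbols, Fibonacci weights
depths = huffman_depths(fib)
huffman_cost = sum(w * d for w, d in zip(fib, depths))

# Balanced tree: every symbol gets the same code length (4 bits for 10 symbols).
fixed_len = ceil(log2(len(fib)))
fixed_cost = sum(fib) * fixed_len

print(max(depths), huffman_cost, fixed_cost)
```

For these weights the longest Huffman code is 9 bits, so it would not fit an 8-bit limit, yet the total cost is 363 bits against 572 for the 4-bit fixed-length code. Capping the depth buys shorter worst-case codes at the price of a larger output.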