我不明白这个霍夫曼算法的实现
template<class T>
void huffman(MinHeap<TreeNode<T>*> heap, int n)
{
for(int i=0;i<n-1;i++)
{
TreeNode<T> *first = heap.pop();
TreeNode<T> *second = heap.pop();
TreeNode<T> *bt = new BinaryTreeNode<T>(first, second, first.data, second.data);
heap.push(bt);
}
}
在我的 C++ 数据结构基础教科书中,它给出了 2 页的定义霍夫曼编码,以及上面的代码。对我来说,这本书不够详细,所以我进行了谷歌搜索,了解了霍夫曼编码的过程是如何工作的。教科书声称,在上述代码的末尾,生成了一棵霍夫曼树。但对我来说,这似乎是错误的,因为霍夫曼树不一定是完整的树,但由于 heap.push()
,上面的代码似乎总是给出完整的树。那么有人可以向我解释一下这段代码如何没有错误吗?
template<class T>
void huffman(MinHeap<TreeNode<T>*> heap, int n)
{
for(int i=0;i<n-1;i++)
{
TreeNode<T> *first = heap.pop();
TreeNode<T> *second = heap.pop();
TreeNode<T> *bt = new BinaryTreeNode<T>(first, second, first.data, second.data);
heap.push(bt);
}
}
In my Fundamentals of Data Structures in C++ textbook, it gave a 2 page definition of Huffman coding, and the code above. To me, the book wasn't enough detailed, so I've done the googling and I learned how the process of Huffman coding works. The textbook claims that at the end of the code above, a Huffman tree is made. But to me it seems wrong, because a Huffman tree, is not necessary a complete tree, but the code above seems to always give a complete tree because of the heap.push()
. So can someone explain to me how this piece of code is not wrong?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
堆的树结构不一定与生成的霍夫曼树相匹配——相反,堆包含部分霍夫曼树的森林,最初每个树都包含一个符号节点。然后,循环重复获取权重最小的两个节点,将它们组合成一个节点,并将所得组合节点放回原处。在该过程结束时,堆包含一棵完成的树。
The heap's tree structure does not necessarily match the resulting Huffman tree -- rather, the heap contains a forest of partial Huffman trees, initially each consisting of a single symbol node. The loop then repeatedly takes the two nodes with the least weight, combines them into one node, and puts the resulting combined node back. At the end of the process, the heap contains one finished tree.
霍夫曼编码的工作原理是在每一步取两个最低值的项。当您第一次调用该函数时(因为您的 MinHeap 是按值排序的),两个最低值的项将被弹出并“组合”到一个决策节点中,然后将其放回堆中。该节点根据其子分数的总和进行评分,然后放回堆中。将其插回堆中会根据其分数将其放置到正确的位置;如果它仍然低于任何其他项目,它将是第一个,否则它将在其他地方。
所以这个算法是从下往上构建树,当你清空堆时,你将拥有一棵完整的树。但我不明白“n”是什么意思;循环应该是
while (heap.size() > 1)
。无论如何,树不是“满的”,不同的分支将具有不同的长度,具体取决于初始堆中项目的频率如何评分。Huffman encoding works by taking the two lowest value items at each step. When you first call the function (since your MinHeap is sorted by value) the two lowest value items are popped off and "combined" into a decision node which is then put back into the heap. That node is scored by the sum of its child scores and put back into the heap. Inserting it back into the heap puts it into the right place based on its score; if it's still lower than any other items it'll be first, otherwise it'll be somewhere else.
So this algorithm is building the tree from the bottom up, and by the time you empty the heap you'll have a complete tree. I don't understand what the 'n' is for, though; the loop should be
while (heap.size() > 1)
. Regardless, the tree is not "full", different branches will be different lengths depending on how the frequencies of the items in the initial heap are scored.