高效的霍夫曼树搜索,同时记住所采取的路径

发布于 2024-07-19 13:31:58 字数 751 浏览 18 评论 0原文

作为与我的有关存储哈夫曼树的有效方式的问题相关的后续问题我想知道搜索二叉树(基于霍夫曼编码输出)并存储到特定节点的路径的最快和最有效的方法是什么。

这就是我目前所拥有的:

  • 将根节点添加到队列
  • 当队列不为空时 ,从队列中弹出项目
    • 检查这是否是我们正在寻找的内容
      • 是的: 跟随头指针回到根节点,同时在每个访问的节点上检查它是左节点还是右节点并记录下来。
      • 退出搜索
    • 将左右节点入队

由于这是一棵霍夫曼树,因此我要查找的所有条目都将存在。 上面是广度优先搜索,这被认为是霍夫曼树的最佳搜索,因为源中经常出现的项目在树中的位置较高,以获得更好的压缩,但是我找不到一种好方法来跟踪我们如何使用我放入节点中的头指针在不回溯的情况下到达特定节点。

在这种情况下,我还以相反的顺序获取所有右/左路径,例如,如果我们沿着头到根,我们发现从根开始它是右,左,左,我们得到左, 左右。 或二进制的 001,而我正在寻找的是以有效的方式获得 100。

还建议将从根到节点的路径存储为节点内的单独值,但是,如果我们有一棵树大于我们为此目的创建的变量可以容纳的位数,那么这就会崩溃。存储数据的点也会占用大量内存。

As a follow up question related to my question regarding efficient way of storing huffman tree's I was wondering what would be the fastest and most efficient way of searching a binary tree (based on the Huffman coding output) and storing the path taken to a particular node.

This is what I currently have:

  • Add root node to queue
  • while queue is not empty, pop item off queue
    • check if it is what we are looking
      • yes:
        Follow a head pointer back to the root node, while on each node we visit checking whether it is the left or right and making a note of it.
      • break out of the search
    • enqueue left, and right node

Since this is a Huffman tree, all of the entries that I am looking for will exist. The above is a breadth first search, which is considered the best for Huffman trees since items that are in the source more often are higher up in the tree to get better compression, however I can't figure out a good way to keep track of how we got to a particular node without backtracking using the head pointer I put in the node.

In this case, I am also getting all of the right/left paths in reverse order, for example, if we follow the head to the root, and we find out that from the root it is right, left, left, we get left, left, right. or 001 in binary, when what I am looking for is to get 100 in an efficient way.

Storing the path from root to the node as a separate value inside the node was also suggested, however this would break down if we ever had a tree that was larger than however many bits the variable we created for that purpose could hold, and at that point storing the data would also take up huge amounts of memory.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

原谅我要高飞 2024-07-26 13:31:58

如果您一次一位地解码霍夫曼编码的数据,您的性能将会很差。 尽管您希望避免使用查找表,但如果您关心性能,这是唯一的方法。 霍夫曼代码的创建方式从左到右都是唯一的,非常适合快速表查找。

If you're decoding Huffman-encoded data one bit at a time, your performance will be poor. As much as you'd like to avoid using lookup tables, that's the only way to go if you care about performance. The way Huffman codes are created, they are left-to-right unique and lend themselves perfectly to a fast table lookup.

顾忌 2024-07-26 13:31:58

创建值字典-> 位字符串,这将为您提供最快的查找。

如果这些值的大小已知,您可能只需使用一个位字符串数组即可通过它们的索引查找这些值。

Create a dictionary of value -> bit-string, that would give you the fastest lookup.

If the values are a known size, you can probably get by with just an array of bit-strings and look up the values by their index.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文