如何最少地表示检查树的检查状态?
如果我有一个树结构,其节点可以有零到多个子节点,每个节点都保存一些数据值以及布尔开关,那么我如何最小化具有特定开关值的节点表示该树的状态?
例如,假设我的树看起来像这样:
A[0] -> B[1] -> C[1]
|-----> D[1]
|-----> E[1]
这里我们有一个检查了 4 个节点的状态,有没有办法以简洁的方式表示这个状态?最简单的方法是将四个节点列为正在检查的节点,但如果节点 B 有 100 个子节点而不是只有 4 个节点怎么办?
我当前的思路是将每个节点的祖先存储在数据组件中,并根据祖先集来描述检查的状态,从而最大限度地减少表示状态所需的数据。在下面的树中,节点 N 的祖先表示为 n'。所以上面的树现在看起来像这样:
A[0, {a}] -> B[1, {a', b}] -> C[1, {a' b' c}]
|--------------> D[1, {a' b' d}]
|--------------> E[1, {a' b' e}]
现在您可以分析树并看到节点 A 的所有子节点都被检查,并简单地描述状态为具有数据元素 a' 的节点设置为 1,或者只是 [a' ]。如果节点 D 的状态切换为 0,则可以将树状态描述为 [a' not d]。
是否有可以用来解决此类问题的数据结构或算法?关于更好的方法有什么想法吗?对分析算法有什么想法吗?
谢谢
If i have a tree structure whose nodes can have zero to many children, with each node holding some data value along with a boolean switch, how do i minimally represent the state of this tree for nodes with a particular switch value?
For example, say that my tree looks something like:
A[0] -> B[1] -> C[1]
|-----> D[1]
|-----> E[1]
Here we have a state where 4 nodes are checked, is there a way to represent this state in a concise manner? The naive approach would be to list the four nodes as being checked, but what if node B had 100 children instead of just four?
My current line of thinking is to store each node's ancestor in the data component and describe the checked state in terms of the set of ancestors that minimize the data required to represent a state. In the tree below, an ancestor of node N is represented as n'. So the above tree would now look something like:
A[0, {a}] -> B[1, {a', b}] -> C[1, {a' b' c}]
|--------------> D[1, {a' b' d}]
|--------------> E[1, {a' b' e}]
Now you can analyze the tree and see that all of node A's children are checked, and describe the state simply as the nodes with data element a' are set to 1, or just [a']. If node D's state switched to 0, the you could describe the tree state as [a' not d].
Are there data structures or algorithms that can be used to solve a problem of this type? Any thoughts on a better approach? Any thoughts on the analysis algorithm?
Thanks
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
使用从根开始的先序树遍历。如果检查了一个节点,则不要遍历其子节点。对于每个遍历的节点,将其检查状态(布尔值 0/1)存储在布尔位图(8 位/字节)中。最后使用 zip/bzip 或任何其他压缩技术压缩结果。
当重建状态时,首先解压缩,然后使用前序树遍历,根据状态设置每个节点,如果状态被选中,则将所有子节点设置为选中并跳过它们。
Use a preorder tree traversal starting from the root. If a node is checked don't traverse its children. For each traversed node store it's checked state (boolean 0/1) in a boolean bitmap (8bits/byte). Finally compress the result with zip/bzip or any other compression technique.
When you reconstruct the state, first decompress, then use preorder tree traversal, set each node based on the state, if state is checked set all children to checked and skip them.
一般来说,没有任何技术能够始终将检查的元素存储在少于 n 位的空间中,其中 n 是树中元素的数量。这背后的基本原理是,有 2^n 种不同的可能检查状态,因此您至少需要 2^n 种不同的编码,因此必须至少有一种长度为 2^n 的编码,因为只有 2^n - 1 种编码比这个短的。
鉴于此,如果您确实想最大程度地减少空间使用,我建议使用@yi_H 建议的编码。它为每个编码精确地使用 n 位。您也许能够通过对位应用标准压缩算法来压缩大多数编码,这对于实际的检查节点集可能效果很好,但在最坏的情况下会优雅地降级。
希望这有帮助!
In general there is no technique that will always be able to store the checked elements in fewer than n bits of space, where n is the number of elements in the tree. The rationale behind this is that there are 2^n different possible check states, so you need at least 2^n different encodings, so there must be at least one coding of length 2^n since there are only 2^n - 1 encodings that are shorter than this.
Given this, if you really want to minimize space usage, I would suggest going with an encoding like the one @yi_H suggests. It uses precisely n bits for each encoding. You might be able to compress most of the encodings by applying a standard compression algorithm to the bits, which for practical sets of checked nodes might do quite well, but which degrades gracefully in the worst case.
Hope this helps!