在深度优先搜索期间检测谱系图中的循环
我正在递归加载马谱系数据。 对于一些错误的数据集,我的递归永远不会停止......那是因为数据中存在循环。
如何检测这些循环以停止重复?
我想到在重复时维护一个包含所有“访问过”的马的哈希表。 但这会发现一些误报,因为一匹马可能会在一棵树上两次。
不可能发生的是,一匹马以自己的父亲、祖父或曾祖父的身份出现。
I'm loading horse genealogical data recursively.
For some wrong sets of data my recursion never stops... and that is because there are cycles in the data.
How can I detect those cycles to stop recurring?
I thought of while recurring maintain a hashTable with all "visited" horses.
But that will find some false positives, because a horse CAN be twice in a tree.
What cannot happen is that a horse appear as a father or grandfather or great grandfather of ITSELF.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
伪代码:
基本思想是保留我们在到达当前节点的过程中已经见过的所有节点的列表; 如果回到我们已经经历过的节点,那么你知道我们已经形成了一个循环(我们应该跳过该值,或者做任何需要做的事情)
Pseudo code:
The basic idea is to keep a list of all of the nodes that we've already seen on our way down to the current node; if was get back to a node that we already went through, then you know that we've formed a cycle (and we should skip the value, or do whatever needs to be done)
维护一直到树根的所有元素的堆栈。
每次沿着树向下前进时,请扫描堆栈以查找子元素。 如果找到匹配项,那么您就发现了循环,应该跳过该子项。 否则,将子级推入堆栈并继续。 每当你回溯树时,从堆栈中弹出一个元素并丢弃。
(对于谱系数据,树中的“子”节点可能是“父”节点的生物父。)
Maintain a stack of all elements leading up to the root of the tree.
Every time you advance down the tree, scan the stack for the child element. If you find a match, then you've discovered a loop and should skip that child. Otherwise, push the child onto the stack and proceed. Whenever you backtrack up the tree, pop an element out of the stack and discard.
(In the case of genealogical data, a "child" node in the tree is presumably the biological parent of the "parent" node.)
这听起来像是您终于可以应用面试琐事问题的情况:仅使用 O(1) 内存在链表中找到一个循环。
在这种情况下,您的“链接列表”是您枚举的元素序列。 使用两个枚举器,其中一个以半速运行,如果快的枚举器遇到慢的枚举器,那么就会出现循环。 这也将是 O(n) 时间,而不是检查“已查看”列表所需的 O(n^2) 时间。 缺点是只有在某些节点被多次处理后您才会发现循环。
在示例中,我用更易于编写的“删除标记”方法替换了“半速”方法。
This sounds like a case where you can finally apply that interview trivia question: find a cycle in a linked list using only O(1) memory.
In this case your "linked list" is the sequence of elements you enumerate. Use two enumerators, run one at half speed, and if the fast one ever runs into the slow one then you have a loop. This will also be O(n) time instead of the O(n^2) time required for checking a 'seen' list. The downside is you only find out about the loop after some of the nodes have been processed multiple times.
In the example I've replaced the 'half speed' method with the simpler-to-write 'drop markers' method.
检测这一点的一个非常简单的方法是检查该约束本身:
每当您在树中插入一个节点时,请遍历树到根以确保马不作为任何类型的父节点存在。
为了加快速度,您可以将哈希表关联到每个节点,在其中缓存此类查找的答案。 这样,下次在该节点下插入马时就不必搜索整个路径。
A very simple way to detect this, is by checking that constraint itself:
Whenever you insert a node in your tree, traverse the tree to the root to make sure that the horse does not exist as a any kind of parent.
To speed this up, you can associate a hashtable to each node, where you cache the answer of such a lookup. Then you don't have to search the entire path next time you insert a horse under that node.
如果您跟踪的是节点而不是马,那么您的哈希表解决方案应该有效。 只要确保每次读取新马时都会创建一个新节点,即使值/马与前一个节点的值/马相同。
Your hash table solution should work if you keep track of nodes instead of horses. Just make sure every time you read a new horse you create a new node even if the value/horse is the same as a previous node's value/horse.
您正在处理有向无环图,而不是树。 不应该有任何循环,因为马的后代不可能也是它的祖先。
知道这一点后,您应该应用特定于有向无环图的代码技术。
You're dealing with a directed acyclic graph, not a tree. There should not be any cycles, as a horse's descendant cannot also be its ancestor.
Knowing this, you should apply code techniques specific to directed acyclic graphs.