I have a data structure that looks like this:
public class Node
{
    public string Code { get; set; }
    public string Description { get; set; }
    ...
    public List<Node> Children { get; set; }
}
I want to write a method that will return a specific node, given the specified Code. Normally I would just do a recursive walk through the hierarchy to find the node, but I'm concerned about performance. There will be several thousand nodes in the hierarchy, and this method will be called many, many times.
How do I structure this to make it faster? Can I use an existing data structure that perhaps performs a binary search on Code while retaining the hierarchical structure, without re-implementing some form of binary search myself?
Add all the nodes to a dictionary with the Code as the key. You only need to do this once, and a look-up in a dictionary is basically O(1).
If you know the root, call FillDictionary() on the root.
If you don't, you can call the FillDictionary() method on all your nodes with the same dictionary.
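The answer's original snippet did not survive on this page, so the following is only a minimal sketch of what a FillDictionary() method might look like. The method body, the parameter name index, and the choice to throw on duplicate codes are assumptions, and the question's other properties (the "...") are left out.

```csharp
using System;
using System.Collections.Generic;

public class Node
{
    public string Code { get; set; }
    public string Description { get; set; }
    public List<Node> Children { get; set; } = new List<Node>();

    // Hypothetical reconstruction: registers this node and every descendant
    // in the supplied dictionary, keyed by Code.
    public void FillDictionary(Dictionary<string, Node> index)
    {
        // Add throws ArgumentException on a duplicate key, which surfaces
        // non-unique codes instead of silently overwriting an entry.
        index.Add(Code, this);
        foreach (var child in Children)
        {
            child.FillDictionary(index);
        }
    }
}
```

After a single root.FillDictionary(index) call, each lookup is index[code] or index.TryGetValue(code, out var node). The tree itself is untouched; the dictionary only holds extra references to the same Node objects.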
Here's a full implementation of what I and others were talking about. Note that by having the index dictionary, you will use a bit more memory (only references to the nodes) in exchange for faster searches. I'm using events to dynamically update the index.
Edit: Added comments and fixed up a few things.
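The implementation referred to above is not preserved on this page. As a rough idea of how events could keep an index in sync with the tree, here is a small sketch. Every name in it (DescendantAdded, AddChild, NodeIndex, Find) is hypothetical, and removal of nodes is not handled.

```csharp
using System;
using System.Collections.Generic;

public class Node
{
    private readonly List<Node> children = new List<Node>();

    public Node(string code) { Code = code; }

    public string Code { get; }
    public IReadOnlyList<Node> Children => children;

    // Raised for every node added anywhere below this node.
    public event Action<Node> DescendantAdded;

    public void AddChild(Node child)
    {
        children.Add(child);
        // Forward additions made later under the child up to our listeners.
        child.DescendantAdded += OnDescendantAdded;
        // Announce the child and anything already attached beneath it.
        Announce(child);
    }

    private void Announce(Node node)
    {
        OnDescendantAdded(node);
        foreach (var c in node.children) Announce(c);
    }

    private void OnDescendantAdded(Node node) => DescendantAdded?.Invoke(node);
}

public class NodeIndex
{
    private readonly Dictionary<string, Node> byCode = new Dictionary<string, Node>();

    public NodeIndex(Node root)
    {
        AddRecursive(root);                               // index what already exists
        root.DescendantAdded += n => byCode[n.Code] = n;  // then track new nodes
    }

    private void AddRecursive(Node node)
    {
        byCode[node.Code] = node;
        foreach (var c in node.Children) AddRecursive(c);
    }

    // Returns the node with the given code, or null if none is indexed.
    public Node Find(string code) => byCode.TryGetValue(code, out var n) ? n : null;
}
```

The trade-off is the one described above: the index costs one extra reference per node, and in return Find is a hash lookup instead of a tree walk.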
Without any kind of comparison-based organization for Code, there's nothing you can do to prevent an O(n) walkthrough of the tree. However, if you build another data structure (either a Dictionary for O(1) access or a List kept sorted by Code for O(log n) access via binary search) at the same time you're reading through your XML file to build the Node tree, you get fast lookups without much more overhead.
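For the O(log n) variant, the built-in List<T>.BinarySearch can do the searching, so nothing has to be re-implemented. The sketch below is an illustration under assumptions: it flattens an existing tree rather than hooking into XML parsing, the helper names SortedNodeList, Build and Find are made up, and codes are compared ordinally.

```csharp
using System;
using System.Collections.Generic;

public class Node
{
    public string Code { get; set; }
    public List<Node> Children { get; } = new List<Node>();
}

public static class SortedNodeList
{
    // Orders nodes by Code using ordinal string comparison.
    private static readonly IComparer<Node> ByCode =
        Comparer<Node>.Create((a, b) => string.CompareOrdinal(a.Code, b.Code));

    // Flattens the tree into a list and sorts it once.
    public static List<Node> Build(Node root)
    {
        var all = new List<Node>();
        Collect(root, all);
        all.Sort(ByCode);
        return all;
    }

    private static void Collect(Node node, List<Node> all)
    {
        all.Add(node);
        foreach (var child in node.Children) Collect(child, all);
    }

    // Binary search with a throwaway probe node that carries only the code.
    public static Node Find(List<Node> sorted, string code)
    {
        int i = sorted.BinarySearch(new Node { Code = code }, ByCode);
        return i >= 0 ? sorted[i] : null;   // negative result means "not found"
    }
}
```

The list must be sorted with the same comparer that is passed to BinarySearch; otherwise the result is undefined.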
Store them in a Dictionary; this affords you O(1) lookup time.
Assuming n is a hydrated Node object that has been added to the dictionary under its Code, you can get a particular node item by looking it up by key, which will return your particular Node according to the provided key. You can also check beforehand whether the node exists in the dictionary.
In order to know where the node was in the original collection, you would need to add some sort of pointer hierarchy to your Node class so that each node has knowledge of the previous node.
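The snippets from this answer are missing here, so the following is a sketch of the idea only. The Parent property, the AddChild helper and the NodePath class are hypothetical names; the dictionary calls (Add, ContainsKey, the indexer) are standard Dictionary<TKey, TValue> members.

```csharp
using System;
using System.Collections.Generic;

public class Node
{
    public string Code { get; set; }
    public Node Parent { get; private set; }   // hypothetical back-pointer
    public List<Node> Children { get; } = new List<Node>();

    // Attaches a child and records this node as its parent.
    public Node AddChild(Node child)
    {
        child.Parent = this;
        Children.Add(child);
        return child;
    }
}

public static class NodePath
{
    // Walks the Parent pointers up to the root to recover the node's position.
    public static string Of(Node node)
    {
        var parts = new List<string>();
        for (var n = node; n != null; n = n.Parent) parts.Add(n.Code);
        parts.Reverse();
        return string.Join("/", parts);
    }
}
```

With a Dictionary<string, Node> named nodes, adding is nodes.Add(n.Code, n), an existence check is nodes.ContainsKey(code), and retrieval is nodes[code]. Once a node is retrieved, NodePath.Of(node) shows where it sits in the hierarchy.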
If you can change the order of the nodes, you can make a Binary Search Tree.
This SO answer references a library that you should be able to use?
I'll be honest; I'm having great difficulty understanding how Itay's suggestion doesn't make perfect sense.
Here is the requirement that you've stated:
So the Code is unique, I take it? Then there's nothing stopping you from putting all of your Node objects into a Dictionary<string, Node>.
In your comments to Itay's answer you say this:
If you mean you don't understand how the dictionary is going to know where your Node is in the data structure, that's because it doesn't. Does this matter? You haven't stated in your requirements that you want to know where the node is in the data structure; you only specified that you want to get the node. In order to do this, the dictionary only needs to know where the node is in memory, not in some completely separate data structure.
To provide a simplified example (and I apologize if I'm insulting your intelligence here, but bear with me as this may at least clarify the point for someone else), suppose you had a plain LinkedList<int> containing all unique integers. You then enumerate over this list and use it to construct a Dictionary<int, LinkedListNode<int>>, the idea being that you want to be able to quickly find a node based on its value.
Does the dictionary need to know where in the linked list each node is? Certainly not; it only needs to know where the node is in memory. Once you've found your node based on its value in O(1) using the dictionary, you can of course easily traverse the linked list forwards or backwards using the node itself, which happens to be aware (by design) of the linked list containing it.
It's the same with your hierarchical structure, only a bit more complex than a linked list. But the same principle applies.
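The linked-list analogy can be written out in a few lines. This is a sketch: the class and method names LinkedListIndex and Build are made up for illustration, while LinkedList<int>, LinkedListNode<int> and their First, Next, Previous and List members are the standard .NET ones.

```csharp
using System;
using System.Collections.Generic;

public static class LinkedListIndex
{
    // Maps each (unique) value to the list node that holds it.
    public static Dictionary<int, LinkedListNode<int>> Build(LinkedList<int> list)
    {
        var index = new Dictionary<int, LinkedListNode<int>>();
        for (var node = list.First; node != null; node = node.Next)
        {
            index.Add(node.Value, node);   // throws if a value repeats
        }
        return index;
    }
}
```

Given a list of 10, 20, 30, the entry index[20] is the node itself, so index[20].Previous and index[20].Next lead straight to its neighbours, and index[20].List is the containing list. The dictionary never stores a position.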
Why not use SortedSet<Node> to build a BST containing all your Node instances? The comparer would be based on Code, and the container would have to be scoped such that Code is unique across all members.
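A minimal sketch of this suggestion follows. CodeComparer, NodeSet, Build and Find are hypothetical names, and the lookup relies on SortedSet<T>.TryGetValue, which assumes .NET Core 2.0 or later, or .NET Framework 4.7.2 or later.

```csharp
using System;
using System.Collections.Generic;

public class Node
{
    public string Code { get; set; }
    public List<Node> Children { get; } = new List<Node>();
}

// Compares nodes by Code only, using ordinal string comparison.
public sealed class CodeComparer : IComparer<Node>
{
    public int Compare(Node x, Node y) => string.CompareOrdinal(x?.Code, y?.Code);
}

public static class NodeSet
{
    public static SortedSet<Node> Build(Node root)
    {
        var set = new SortedSet<Node>(new CodeComparer());
        AddRecursive(root, set);
        return set;
    }

    private static void AddRecursive(Node node, SortedSet<Node> set)
    {
        // SortedSet.Add returns false when an equal element already exists,
        // which here means two nodes share a Code.
        if (!set.Add(node)) throw new ArgumentException("Duplicate code: " + node.Code);
        foreach (var child in node.Children) AddRecursive(child, set);
    }

    // Looks up the stored node via a probe that carries only the code.
    public static Node Find(SortedSet<Node> set, string code) =>
        set.TryGetValue(new Node { Code = code }, out var found) ? found : null;
}
```

Lookups are O(log n) rather than the O(1) of a dictionary, so this mainly pays off when ordered enumeration by Code is also wanted.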