SortedList 与 SortedDictionary 与 Sort()

发布于 2024-08-17 07:46:34 字数 410 浏览 3 评论 0原文

这是诸如这个之类的问题的延续。

是否有调整性能的指导方针？我并不是指大 O 的增益，只是节省一些线性时间。

例如，预排序在 SortedList 或 SortedDictionary 上可以节省多少？

假设我有一个人员类别，有 3 个属性需要排序，其中之一是年龄。我应该首先按年龄对对象进行分类吗？

我应该首先对一个属性进行排序，然后使用生成的列表/字典对两个属性进行排序，依此类推？

想到的还有其他优化吗？

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

铁憨憨 2024-08-24 07:46:34

好吧，这在 SortedList 上很容易获胜。插入一个项目需要进行二分查找 (O(log(n)) 来找到插入点，然后使用 List.Insert (O(n)) 来插入项目。Insert() 占主导地位，填充列表需要 O(n) ^2)。如果输入项已经排序，则插入会折叠为 O(1)，但不会影响搜索，您不必担心 Oh 有多大，假设您可以承受双倍的存储要求，

那么它使用红黑树，然后可能需要重新平衡树。也需要 O(log(n))。因此，填充字典需要 O(nlog(n))。使用排序输入不会改变查找插入点或重新平衡的工作量，它仍然是 O(nlog(n))。现在哦很重要，插入排序的输入需要树不断重新平衡本身，如果输入是随机的，那么效果会更好，

所以用排序的输入填充 SortedList 和用未排序的输入填充 SortedDictionary 都是 O( 。 nlog(n))。忽略提供排序输入的成本，SortedList 的 Oh 比 SortedDictionary 的 Oh 小。这是由于 List 分配内存的方式而导致的实现细节。它只需要这样做 O(log(n)) 次，红黑树必须分配 O(n) 次。非常小哦顺便说一句。

值得注意的是，这两种方法都比不上简单地填充 List，然后调用 Sort()。这也是 O(nlog(n))。事实上，如果输入已经被意外排序，您可以绕过 Sort() 调用，这会降低到 O(n)。成本分析现在需要转向对输入进行排序所需的工作。很难绕过 Sort() 的基本复杂性，O(nlog(n))。它可能不容易看到，您可能会按照 SQL 查询等方式对输入进行排序。只是需要更长的时间才能完成。

使用 SortedList 或 SortedDictonary 的目的是在插入后保持集合排序。如果您只担心填充而不担心变异，那么您不应该使用这些集合。

Well, it's an easy win on SortedList. Inserting an item requires a binary search (O(log(n)) to find the insertion point, then a List.Insert (O(n)) to insert the item. The Insert() dominates, populating the list requires O(n^2). If the input items are already sorted then the Insert collapses to O(1) but doesn't affect the search. Populating is now O(nlog(n)). You don't worry how big the Oh is, sorting first is always more efficient. Assuming you can afford the doubled storage requirement.

SortedDictionary is different, it uses a red-black tree. Finding the insertion point requires O(log(n)). Rebalancing the tree might be required afterwards, that also takes O(log(n)). Populating the dictionary thus takes O(nlog(n)). Using sorted input does not change the effort to find the insertion point or rebalancing, it is still O(nlog(n)). Now the Oh matters though, inserting sorted input requires the tree to constant rebalance itself. It works better if the input is random, you don't want sorted input.

So populating SortedList with sorted input and populating SortedDictionary with unsorted input is both O(nlog(n)). Ignoring the cost of providing sorted input, the Oh of SortedList is smaller than the Oh of SortedDictionary. That's an implementation detail due to the way List allocates memory. It only has to do so O(log(n)) times, a red-black tree has to allocate O(n) times. Very small Oh btw.

Notable is that neither one compares favorably over simply populating a List, then calling Sort(). That's also O(nlog(n)). In fact, if input is already accidentally sorted you can bypass the Sort() call, this collapses to O(n). The cost analysis now needs to move to the effort it takes to get the input sorted. It is hard to bypass the fundamental complexity of Sort(), O(nlog(n)). It might not be readily visible, you might get the input sorted by, say, a SQL query. It will just take longer to complete.

The point of using either SortedList or SortedDictonary is to keep the collection sorted after inserts. If you only worry about populating but not mutating then you shouldn't use those collections.

回复收藏 0 原文

~没有更多了~