Faster alternative to .Distinct()
I'm making a video game where performance is critical.
I'm using the .Distinct() extension method to get unique values from a List.
Is there a faster way to do so? (Even if it means having many more lines of code.)
.Distinct is an O(n) call. You can't get any faster than that.
However, you should make sure that your GetHashCode (and, to a lesser extent, Equals) is as fast as possible. Depending on your scenario, you may be able to replace the List<T> with a HashSet<T>, which will prevent duplicates from being inserted in the first place (and also has O(1) insertion). However, always profile your code before jumping to conclusions about what needs to be faster.
Does it have to be a List?
Would it be possible to switch from a List to a HashSet? A HashSet prevents objects from being inserted into the collection more than once in the first place, so the Distinct is already done.
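A minimal sketch of that swap, using int elements as a stand-in for the game's actual type:

```csharp
using System;
using System.Collections.Generic;

public static class HashSetDemo
{
    public static void Main()
    {
        // HashSet<T>.Add is O(1) amortized and returns false when the
        // item is already present, so duplicates never enter the set.
        var seen = new HashSet<int>();
        foreach (var id in new[] { 1, 2, 2, 3, 3, 3 })
        {
            seen.Add(id);
        }
        Console.WriteLine(seen.Count); // prints 3 -- already distinct, no Distinct() pass needed
    }
}
```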
If you can do the distinct in place, you can do it very quickly and with zero allocations by first using Array.Sort and then:
You will then have to keep track of the now-smaller size of the array, or use Array.Resize (but that will allocate a new array).
Alternatively, if you take this same approach with a List<T>, you can call RemoveRange at the end to resize it without allocating. This ends up being significantly quicker.
The other posters are probably correct, though, that you can achieve this goal some other way, such as using a HashSet in the first place, or keeping parallel collections where one always contains only the distinct elements, offsetting small costs on insert/remove so that no time at all is required to get the distinct set.
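The original answer's code snippet did not survive; the in-place pass it describes might look roughly like this. The compaction loop and the List<T> variant are my own sketch of the stated approach (sort, compact unique elements to the front, then trim):

```csharp
using System;
using System.Collections.Generic;

public static class InPlaceDistinct
{
    // Sorts the array so equal elements become adjacent, then compacts
    // unique elements to the front. No allocations; the caller must
    // remember the returned count as the new logical size.
    public static int SortedDistinct(int[] items)
    {
        if (items.Length == 0) return 0;
        Array.Sort(items);
        int write = 1;
        for (int read = 1; read < items.Length; read++)
        {
            if (items[read] != items[write - 1])
                items[write++] = items[read];
        }
        return write;
    }

    public static void Main()
    {
        var data = new[] { 5, 1, 3, 1, 5, 2 };
        int n = SortedDistinct(data);
        Console.WriteLine(n); // prints 4; data[0..n) is 1,2,3,5

        // List<T> variant: same compaction, then RemoveRange trims the
        // tail without allocating a new backing array.
        var list = new List<int> { 5, 1, 3, 1, 5, 2 };
        list.Sort();
        int w = 1;
        for (int r = 1; r < list.Count; r++)
        {
            if (list[r] != list[w - 1]) list[w++] = list[r];
        }
        list.RemoveRange(w, list.Count - w);
        Console.WriteLine(list.Count); // prints 4
    }
}
```

Note this changes the element order (everything ends up sorted), which Distinct() does not; that trade-off is what buys the zero-allocation path.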