What is the most efficient way to create a distinct list of items using .NET?
I have a large list of values (100-200 character strings) and I need to return a distinct listing of them. What is the most efficient way to do this using .NET? The 2 ways that I can think of are:
- Use the Distinct() method of the IEnumerable class
- Use a Dictionary
If the Dictionary approach is faster in raw terms, consider a trade-off decision around maintainability of code.
Comments (3)
I would expect Enumerable.Distinct to be about as fast as using a dictionary if you're only doing it once. If you want to be able to add/remove values and keep the distinct-ness, you could build a HashSet<string> (which is basically what I expect Distinct is doing under the hood, but Distinct() will obviously return new values as it finds them, maintaining order).

In fact, just using a HashSet<string> will be a pretty good (and simple) solution if you don't mind the ordering being messed up. It's simpler than using a Dictionary, and conceptually cleaner as well (as you don't really want to map keys to values).

(As ever, I would suggest finding the most readable solution first, and benchmarking it: if it's "fast enough" then go with that. If you want to use this as part of another query, then Distinct may well be the most readable way. Otherwise, I'd suggest HashSet.)
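As a rough illustration of the HashSet suggestion above, here is a minimal sketch. The class and method names (HashSetSketch, ToDistinctSet) are made up for this example and do not come from the answer; the only library call relied on is the HashSet<T> constructor that accepts an IEnumerable<T>.

```csharp
using System;
using System.Collections.Generic;

public static class HashSetSketch
{
    // Hypothetical helper: copies the input into a set.
    // Duplicates are dropped as they are inserted, and the
    // enumeration order of the resulting set is not guaranteed.
    public static HashSet<string> ToDistinctSet(IEnumerable<string> values)
    {
        return new HashSet<string>(values);
    }
}
```

Because the result is a live set, later calls to Add keep it distinct: Add returns false when the value is already present, and Remove takes a value out again.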
I would personally go with the Distinct() method provided by LINQ. It's far easier to read and maintain. Whilst using LINQ will be slower than using a dictionary, the difference will be small (in the case you've listed), and you'd be better off spending time optimizing database queries or web service calls.
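To show what "easier to read" might look like when Distinct() sits inside a larger query, here is a small sketch. The trimming, filtering and sorting steps are assumptions added purely for illustration, and DistinctQuerySketch and CleanDistinct are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctQuerySketch
{
    // Hypothetical query: trim each value, drop empty ones,
    // de-duplicate, then sort so the output order is predictable.
    public static List<string> CleanDistinct(IEnumerable<string> values)
    {
        return values
            .Select(v => v.Trim())
            .Where(v => v.Length > 0)
            .Distinct()
            .OrderBy(v => v, StringComparer.Ordinal)
            .ToList();
    }
}
```

The de-duplication is a single step in the chain, which is the readability argument made above. Distinct() also has an overload that takes an IEqualityComparer<T>, for example StringComparer.OrdinalIgnoreCase, if case-insensitive matching is wanted.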
I would suggest using profiling here. Generate a list with sample items, run it through both approaches, say, 1M times, and measure the time each one takes.
If readability is a concern, create a GetDistinctItems method and put your code inside it: voilà, self-documented code.
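A rough sketch of that measurement, using Stopwatch, might look like the following. GetDistinctItems is the name suggested above; everything else (GetDistinctItemsViaDictionary, Time, Run, the sample data and the iteration count) is an assumption chosen for illustration. A dedicated benchmarking tool would give more trustworthy numbers than this simple loop.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class DistinctBenchmark
{
    // The readable version: intent is carried by the method name.
    public static List<string> GetDistinctItems(IEnumerable<string> items)
    {
        return items.Distinct().ToList();
    }

    // The dictionary version from the question, for comparison.
    public static List<string> GetDistinctItemsViaDictionary(IEnumerable<string> items)
    {
        var seen = new Dictionary<string, bool>();
        foreach (var item in items)
        {
            seen[item] = true; // the value is unused; only the keys matter
        }
        return seen.Keys.ToList();
    }

    // Runs the action repeatedly and returns the total elapsed time.
    public static TimeSpan Time(Func<List<string>> action, int iterations)
    {
        action(); // warm-up call so JIT compilation is not measured
        var stopwatch = Stopwatch.StartNew();
        for (var i = 0; i < iterations; i++)
        {
            action();
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    // Call this from your own entry point.
    public static void Run()
    {
        // Assumed sample data: 10,000 strings, 1,000 of them distinct.
        var sample = Enumerable.Range(0, 10000)
            .Select(i => "item-" + (i % 1000))
            .ToList();
        const int iterations = 1000;

        Console.WriteLine("Distinct:   " + Time(() => GetDistinctItems(sample), iterations));
        Console.WriteLine("Dictionary: " + Time(() => GetDistinctItemsViaDictionary(sample), iterations));
    }
}
```

Replace the sample data with values shaped like your real input before drawing conclusions, since string length and the ratio of duplicates both affect the timings.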