Finding the difference with a lambda expression
With the following data
string[] data = { "a", "a", "b" };
I'd very much like to find duplicates and get this result:
a
I tried the following code:
var a = data.Distinct().ToList();
var b = a.Except(a).ToList();
Obviously this didn't work. I can see what is happening above, but I'm not sure how to fix it.
3 Answers
When runtime is no problem, you could use
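a brute-force query along these lines (a sketch only and an assumption; it simply counts how often each element occurs):

// Requires: using System.Linq;
string[] data = { "a", "a", "b" };
var duplicates = data.Where(x => data.Count(y => y == x) > 1)
                     .Distinct()
                     .ToList(); // duplicates contains "a"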
Good old O(n²) =)
Edit: Now for a better solution. =)
If you define a new extension method like
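the following (a sketch; the Duplicates name and the GroupBy-based body are assumptions):

using System.Collections.Generic;
using System.Linq;

public static class EnumerableExtensions
{
    // Returns each value that occurs more than once in the source sequence.
    public static IEnumerable<T> Duplicates<T>(this IEnumerable<T> source)
    {
        return source.GroupBy(x => x)
                     .Where(g => g.Count() > 1)
                     .Select(g => g.Key);
    }
}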
you can use
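it like this (with the hypothetical Duplicates method above):

var b = data.Duplicates().ToList(); // b contains a single "a"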
Use the group by stuff; the performance of these methods is reasonably good. The only concern is the big memory overhead if you are working with large data sets.
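A sketch of that group-by approach in query syntax (an assumption about the intended query):

// Requires: using System.Linq; data is the array from the question.
var duplicateKeys = from s in data
                    group s by s into g
                    where g.Count() > 1
                    select g.Key; // yields "a"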
Or, if you prefer extension methods:
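The same grouping written with the GroupBy/Where/Select extension methods (again a sketch under the same assumptions):

var duplicateKeys = data.GroupBy(s => s)
                        .Where(g => g.Count() > 1)
                        .Select(g => g.Key);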
Where Count() == 1, those are your distinct items, and where Count() > 1, those are your duplicate items. Since LINQ is kind of lazy, if you don't want to re-evaluate your computation you can do this:
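For instance (a sketch; materialising the grouping with ToList is an assumption about how to avoid re-evaluation):

// Evaluate the grouping once; later queries over it reuse the stored groups.
var groups = data.GroupBy(s => s).ToList();
var distinctItems  = groups.Where(g => g.Count() == 1).Select(g => g.Key);
var duplicateItems = groups.Where(g => g.Count() > 1).Select(g => g.Key);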
When creating the grouping, a set of sets will be created. Assuming that it's a set with O(1) insertion, the running time of the group-by approach is O(n). The incurred cost for each operation is somewhat high, but it should equate to near-linear performance.
Sort the data, iterate through it, and remember the last item. When the current item is the same as the last, it's a duplicate. This can easily be implemented either iteratively or using a lambda expression, in O(n*log(n)) time.
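A minimal sketch of that sort-and-scan idea (variable names are illustrative):

// Requires: using System.Collections.Generic; using System.Linq;
var sorted = data.OrderBy(s => s).ToArray(); // O(n log n) sort
var duplicates = new List<string>();
for (int i = 1; i < sorted.Length; i++)
{
    // An item equal to its predecessor is a duplicate; the second check reports each value only once.
    if (sorted[i] == sorted[i - 1] &&
        (duplicates.Count == 0 || duplicates[duplicates.Count - 1] != sorted[i]))
    {
        duplicates.Add(sorted[i]);
    }
}
// duplicates contains "a"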