如何在 C# 中有效地从一个巨大列表中减去另一个列表
我有一个很长的 ID(整数)列表,表示当前数据库中的所有项目:
var idList = GetAllIds();
我还有另一个巨大的通用列表,其中包含要添加到数据库中的项目:
List<T> itemsToAdd;
现在,我想从通用列表中删除所有项目列出其 Id 已在 idList 中的列表。 目前 idList 是一个简单的数组,我像这样减去列表:
itemsToAdd.RemoveAll(e => idList.Contains(e.Id));
我很确定它会快得多,所以我应该对这两个集合使用什么数据类型以及减去它们的最有效的做法是什么?
谢谢你!
I have a very long list of Ids (integers) that represents all the items that are currently in my database:
var idList = GetAllIds();
I also have another huge generic list with items to add to the database:
List<T> itemsToAdd;
Now, I would like to remove all items from the generic list whose Id is already in the idList.
Currently idList is a simple array and I subtract the lists like this:
itemsToAdd.RemoveAll(e => idList.Contains(e.Id));
I am pretty sure that it could be a lot faster, so what datatypes should I use for both collections and what is the most efficient practice to subtract them?
Thank you!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
LINQ 可以提供帮助:
您的代码速度很慢,因为
List.Contains
是O(n)
。因此,您的总成本为O(itemsToAdd.Count*idList.Count)
。您可以将 idList 制作成具有
O(1)
.Contains
的HashSet
。或者只需使用 Linq.Except
扩展方法即可为您完成此操作。请注意,
.Except
还将删除左侧的所有重复项。即 newint[]{1,1,2}.Except(new int[]{2})
将仅产生{1}
并且第二个 1 被删除。但我认为对于你的情况来说这没有问题,因为 ID 通常是唯一的。LINQ could help:
Your code is slow because
List<T>.Contains
isO(n)
. So your total cost isO(itemsToAdd.Count*idList.Count)
.You can make idList into a
HashSet<T>
which hasO(1)
.Contains
. Or just use the Linq.Except
extension method which does it for you.Note that
.Except
will also remove all duplicates from the left side. i.e. newint[]{1,1,2}.Except(new int[]{2})
will result in just{1}
and the second 1 was removed. But I assume it's no problem in your case because IDs are typically unique.暂时将
idList
转换为HashSet
并使用相同的方法,即:它应该快得多
Transform temporarily
idList
to anHashSet<T>
and use the same method i.e.:it should be much faster
假设以下前提成立:
idList
和itemsToAdd
可能不包含重复的值,您可以使用 HashSet 这样:
根据文档 ISet.ExceptWith 方法非常有效:
在您的情况下,
n
是idList
中的项目数。Assuming the following premises are true:
idList
anditemsToAdd
may not contain duplicate valuesyou could use a HashSet<T> this way:
According to the documentation the ISet<T>.ExceptWith method is pretty efficient:
In your case
n
is the number of items inidList
.您应该使用两个
HashSet
。请注意,它们是唯一且无序的。
You should use two
HashSet<int>
s.Note that they're unique and unordered.