This is a bit of a noob question - I'm still fairly new to C# and generics and completely new to predicates, delegates and lambda expressions...
I have a class 'Enquiries' which contains a generic list of another class called 'Vehicles'. I'm building up the code to add/edit/delete Vehicles from the parent Enquiry. And at the moment, I'm specifically looking at deletions.
From what I've read so far, it appears that I can use Vehicles.RemoveAll() to delete an item with a particular VehicleID or all items with a particular EnquiryID. My problem is understanding how to feed .RemoveAll the right predicate - the examples I have seen are too simplistic (or perhaps I am too simplistic given my lack of knowledge of predicates, delegates and lambda expressions).
So if I had a List<Vehicle> Vehicles where each Vehicle had an EnquiryID, how would I use Vehicles.RemoveAll() to remove all vehicles for a given EnquiryID?
I understand there are several approaches to this so I'd be keen to hear the differences between approaches - as much as I need to get something working, this is also a learning exercise.
As a supplementary question, is a generic list the best repository for these objects? My first inclination was towards a Collection, but it appears I am out of date. Certainly generics seem to be preferred, but I'm curious as to other alternatives.
Comments (5)
The RemoveAll() method accepts a Predicate<T> delegate (nothing new so far). A predicate points to a method that simply returns true or false. RemoveAll will remove from the collection all the T instances for which the predicate returns true.
C# 3.0 lets the developer pass a predicate to the RemoveAll method (and not only this one…) in several ways. You can use:
Lambda expressions
Anonymous methods
Normal methods
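As a rough illustration of those three options, here is a minimal sketch. The Vehicle class, the PredicateStyles class and the fixed enquiry ID 42 are assumptions made up for this example; only List<T>.RemoveAll and Predicate<T> come from the framework.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal Vehicle type; only the two IDs from the question are modelled.
public class Vehicle
{
    public int VehicleID { get; set; }
    public int EnquiryID { get; set; }
}

public static class PredicateStyles
{
    // 1. Lambda expression: the shortest form; enquiryId is captured from the enclosing method.
    public static int RemoveWithLambda(List<Vehicle> vehicles, int enquiryId)
    {
        return vehicles.RemoveAll(v => v.EnquiryID == enquiryId);
    }

    // 2. Anonymous method (C# 2.0 syntax): same behaviour, more ceremony.
    public static int RemoveWithAnonymousMethod(List<Vehicle> vehicles, int enquiryId)
    {
        return vehicles.RemoveAll(delegate(Vehicle v) { return v.EnquiryID == enquiryId; });
    }

    // 3. Normal method: any named method whose signature matches Predicate<Vehicle>.
    // A named method cannot capture a local variable, so the ID is fixed here for illustration.
    private const int EnquiryToRemove = 42;

    private static bool BelongsToEnquiryToRemove(Vehicle v)
    {
        return v.EnquiryID == EnquiryToRemove;
    }

    public static int RemoveWithNamedMethod(List<Vehicle> vehicles)
    {
        return vehicles.RemoveAll(BelongsToEnquiryToRemove);
    }
}
```

In all three cases RemoveAll returns the number of elements it removed, which is handy for checking that the predicate matched what you expected.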
A predicate in T is a delegate that takes in a T and returns a bool. List<T>.RemoveAll will remove all elements in a list where calling the predicate returns true. The easiest way to supply a simple predicate is usually a lambda expression, but you can also use anonymous methods or actual methods.
Calling Vehicles.RemoveAll(vehicle => vehicle.EnquiryID == enquiryId) should work (where enquiryId is the id you need to match against).
What this does is pass each vehicle in the list into the lambda predicate, evaluating the predicate. If the predicate returns true (i.e. vehicle.EnquiryID == enquiryId), then the current vehicle will be removed from the list.
If you know the types of the objects in your collections, then using the generic collections is a better approach. It avoids casting when retrieving objects from the collections, and can also avoid boxing if the items in the collection are value types (boxing can cause performance issues).
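To put that in the context of the question, here is a small sketch of how the parent class might wrap the call. The Enquiry and Vehicle shapes and the method names are assumptions for illustration, not the asker's actual classes. The CollectionContrast class is likewise hypothetical and only shows the cast (and unboxing) that the non-generic ArrayList forces compared with List<int>.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical minimal types standing in for the asker's classes.
public class Vehicle
{
    public int VehicleID { get; set; }
    public int EnquiryID { get; set; }
}

public class Enquiry
{
    public List<Vehicle> Vehicles { get; } = new List<Vehicle>();

    // Removes every vehicle for the given enquiry and returns how many were removed.
    public int RemoveVehiclesForEnquiry(int enquiryId)
    {
        return Vehicles.RemoveAll(vehicle => vehicle.EnquiryID == enquiryId);
    }

    // Removes the vehicle with the given ID; returns true if something was removed.
    public bool RemoveVehicle(int vehicleId)
    {
        return Vehicles.RemoveAll(vehicle => vehicle.VehicleID == vehicleId) > 0;
    }
}

public static class CollectionContrast
{
    // Non-generic collection: every int is boxed on the way in and must be cast back out.
    public static int SumNonGeneric(ArrayList numbers)
    {
        int total = 0;
        foreach (object boxed in numbers)
        {
            total += (int)boxed; // explicit cast, which also unboxes
        }
        return total;
    }

    // Generic collection: no cast and no boxing.
    public static int SumGeneric(List<int> numbers)
    {
        int total = 0;
        foreach (int n in numbers)
        {
            total += n;
        }
        return total;
    }
}
```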
A little bit off topic, but say I want to remove all 2s from a list. Here's a very elegant way to do that.
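The snippet that belonged here is not shown. Going by the critique further down (a while() loop with list.Contains() as the condition and Remove() in the body), it was presumably close to the following sketch; the class and method names are made up for illustration.

```csharp
using System.Collections.Generic;

public static class RemoveTwosWithContains
{
    // Keeps removing the first 2 until none is left.
    public static void RemoveAllTwos(List<int> list)
    {
        // Contains scans the list; Remove then scans it again and shifts the tail down.
        while (list.Contains(2))
        {
            list.Remove(2);
        }
    }
}
```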
With predicate:
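This snippet is missing as well. Judging from the critique below, which mentions list.Any(predicate), list.First(predicate) and list.Remove(), it probably looked roughly like this sketch; again, the class and method names are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RemoveTwosWithPredicate
{
    // Removes matching elements one at a time until Any() finds no more.
    public static void RemoveAllMatching(List<int> list, Func<int, bool> predicate)
    {
        while (list.Any(predicate))
        {
            // First() rescans to find the match, then Remove() scans once more to delete it.
            list.Remove(list.First(predicate));
        }
    }
}
```

A call such as RemoveTwosWithPredicate.RemoveAllMatching(list, x => x == 2) would then strip the 2s.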
+1 only to encourage you to leave your answer here for learning purposes. You're also right about it being off-topic, but I won't ding you for that because there is significant value in leaving your examples here, again, strictly for learning purposes. I'm posting this response as an edit because posting it as a series of comments would be unruly.
Though your examples are short & compact, neither is elegant in terms of efficiency; both are O(n²), and the second does even more redundant work per removal than the first. Algorithmic efficiency of O(n²) is bad and should be avoided whenever possible, especially in general-purpose code, unless you know n will always be very small. Some might fling out their "premature optimization is the root of all evil" battle axes, but they do so naïvely because they do not truly understand the consequences of quadratic growth since they've never coded algorithms that have to process large datasets. As a result, their small-dataset-handling algorithms just run generally slower than they could, and they have no idea that they could run faster. The difference between an efficient algorithm and an inefficient algorithm is often subtle, but the performance difference can be dramatic. The key to understanding the performance of your algorithm is to understand the performance characteristics of the primitives you choose to use.
In your first example, list.Contains() and Remove() are both O(n), so a while() loop with one in the condition & the other in the body is O(n²); well, technically O(m*n), but it approaches O(n²) as the number of elements being removed (m) approaches the length of the list (n).
Your second example is worse still, though only by a constant factor, because every time you call Remove(), you also call Any(predicate) and First(predicate), each of which is also O(n). Think about it: Any(predicate) loops over the list looking for any element for which predicate() returns true. Once it finds the first such element, it returns true. In the body of the while() loop, you then call list.First(predicate), which loops over the list a second time looking for the same element that had already been found by list.Any(predicate). Once First() has found it, it returns that element, which is passed to list.Remove(), which loops over the list a third time to yet once again find that same element that was previously found by Any() and First(), in order to finally remove it. Once removed, the whole process starts over at the beginning with a slightly shorter list, doing all the looping over and over and over again starting at the beginning every time until finally no more elements matching the predicate remain. The three passes run one after another rather than nested inside each other, so each removal costs O(n), and the performance of your second example is O(m*n), or O(n²) as m approaches n, with three passes per removal where your first example makes two.
Your best bet for removing all items from a list that match some predicate is to use the generic list's own List<T>.RemoveAll(predicate) method, which is O(n) as long as your predicate is O(1). A for() loop technique that passes over the list only once, calling list.RemoveAt() for each element to be removed, may seem to be O(n) since it appears to pass over the list only once. Such a solution is more efficient than your first example, but only by a constant factor, which in terms of algorithmic efficiency is negligible. Even a for() loop implementation is O(m*n), since each call to RemoveAt() is O(n): it has to shift every later element down by one. Since the for() loop itself is O(n), and it calls RemoveAt() m times, the for() loop's growth is O(n²) as m approaches n.
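For comparison, here is a minimal sketch of the two approaches discussed above. The RemovalComparison class and its method names are hypothetical; List<T>.RemoveAt and List<T>.RemoveAll are the real framework methods.

```csharp
using System;
using System.Collections.Generic;

public static class RemovalComparison
{
    // Single visible pass, but each RemoveAt shifts every later element down by one,
    // so the total cost is O(m*n) for m removals from a list of length n.
    public static int RemoveWithForLoop(List<int> list, Predicate<int> match)
    {
        int removed = 0;
        for (int i = 0; i < list.Count; )
        {
            if (match(list[i]))
            {
                list.RemoveAt(i); // the next element slides into index i, so do not advance
                removed++;
            }
            else
            {
                i++;
            }
        }
        return removed;
    }

    // O(n) for an O(1) predicate.
    public static int RemoveWithRemoveAll(List<int> list, Predicate<int> match)
    {
        return list.RemoveAll(match);
    }
}
```

Both methods leave the list in the same state; the difference only shows up in running time as the list and the number of matches grow.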
I wanted to address something none of the answers have so far:
Assuming VehicleID is unique as the name suggests, a list is a terribly inefficient way to store them when you get a lot of vehicles, as removal (and other methods like Find) is still O(n). Have a look at a HashSet<Vehicle> instead; it has O(1) removal (and other methods), provided Vehicle overrides Equals and GetHashCode based on VehicleID.
Removing all vehicles with a specific EnquiryID still requires iterating over all elements this way, so you could consider a GetHashCode that returns the EnquiryID instead, depending on which operation you do more often. This has the downside of a lot of collisions if a lot of Vehicles share the same EnquiryID though.
In this case, a better alternative is to make a Dictionary<int, List<Vehicle>> that maps EnquiryIDs to Vehicles and keep that up to date when adding/removing vehicles. Removing these vehicles from a HashSet is then an O(m) operation, where m is the number of vehicles with a specific EnquiryID.
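One way this could be put together is sketched below. The VehicleStore class, its member names and the immutable Vehicle constructor are assumptions for illustration. The sketch also assumes that a vehicle's IDs never change while it is stored, and that callers pass back a vehicle carrying the same EnquiryID it was added with.

```csharp
using System.Collections.Generic;

// Hypothetical Vehicle whose identity is its VehicleID.
public class Vehicle
{
    public int VehicleID { get; }
    public int EnquiryID { get; }

    public Vehicle(int vehicleId, int enquiryId)
    {
        VehicleID = vehicleId;
        EnquiryID = enquiryId;
    }

    public override bool Equals(object obj)
    {
        Vehicle other = obj as Vehicle;
        return other != null && other.VehicleID == VehicleID;
    }

    public override int GetHashCode()
    {
        return VehicleID;
    }
}

public class VehicleStore
{
    private readonly HashSet<Vehicle> vehicles = new HashSet<Vehicle>();
    private readonly Dictionary<int, List<Vehicle>> byEnquiry = new Dictionary<int, List<Vehicle>>();

    public int Count { get { return vehicles.Count; } }

    // Adds to the set and to the per-enquiry index; returns false for a duplicate VehicleID.
    public bool Add(Vehicle vehicle)
    {
        if (!vehicles.Add(vehicle)) return false;
        List<Vehicle> group;
        if (!byEnquiry.TryGetValue(vehicle.EnquiryID, out group))
        {
            group = new List<Vehicle>();
            byEnquiry[vehicle.EnquiryID] = group;
        }
        group.Add(vehicle);
        return true;
    }

    // Removes a single vehicle and keeps the index in step.
    public bool Remove(Vehicle vehicle)
    {
        if (!vehicles.Remove(vehicle)) return false;
        List<Vehicle> group;
        if (byEnquiry.TryGetValue(vehicle.EnquiryID, out group))
        {
            group.Remove(vehicle); // linear in the size of this enquiry's group only
            if (group.Count == 0) byEnquiry.Remove(vehicle.EnquiryID);
        }
        return true;
    }

    // Removes every vehicle of one enquiry: O(m) set removals, m = vehicles in that enquiry.
    public int RemoveEnquiry(int enquiryId)
    {
        List<Vehicle> group;
        if (!byEnquiry.TryGetValue(enquiryId, out group)) return 0;
        foreach (Vehicle vehicle in group)
        {
            vehicles.Remove(vehicle);
        }
        byEnquiry.Remove(enquiryId);
        return group.Count;
    }
}
```

The cost of this design is the extra bookkeeping on every add and remove, so it mainly pays off when removals by EnquiryID are frequent and the collection is large.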