When to use each of T[], List<T>, and IEnumerable<T>?
I usually find myself doing something like:
string[] things = arrayReturningMethod();
int index = things.ToList<string>().FindIndex((s) => s.Equals("FOO"));
//do something with index
return things.Distinct(); //which returns an IEnumerable<string>
and I find all this mix-up of types and interfaces a bit confusing, and it tickles my potential-performance-problem antennae (which I ignore until proven right, of course).
Is this idiomatic and proper C# or is there a better alternative to avoid casting back and forth to access the proper methods to work with the data?
EDIT:
The question is actually twofold:
When is it proper to use either the IEnumerable interface or an array or a list (or any other IEnumerable implementing type) directly (when accepting parameters)?
Should you freely move between IEnumerables (implementation unknown) and lists, between IEnumerables and arrays, and between arrays and Lists, or is that non-idiomatic (there are better ways to do it), non-performant (not typically relevant, but it might be in some cases), or just plain ugly (unmaintainable, unreadable)?
Comments (7)
In regards to performance...
In regards to idioms...
In general, IEnumerable is useful for public properties, function parameters, and often for return values, but only if you know that you're going to be using the values sequentially.
For instance, if a function PrintValues were written as PrintValues(List<T> values), it could only deal with List values, so a caller holding, say, a T[] would first have to convert it. Likewise if the function were PrintValues(T[] values). But if it were PrintValues(IEnumerable<T> values), it could deal with Lists, T[]s, stacks, hashtables, dictionaries, strings, sets, etc.: any collection that implements IEnumerable, which is practically every collection.
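A minimal sketch of the PrintValues idea described above. The helper JoinValues and the sample data are assumptions made for illustration; only the PrintValues(IEnumerable<T>) signature comes from the answer.

```csharp
using System;
using System.Collections.Generic;

int[] array = { 1, 2, 3 };
var list = new List<int> { 1, 2, 3 };
var stack = new Stack<int>(new[] { 1, 2, 3 });

// One method, many collection types: no conversion needed at the call site.
PrintValues(array);
PrintValues(list);
PrintValues(stack);   // a Stack<T> enumerates from the top: 3, 2, 1
PrintValues("abc");   // string implements IEnumerable<char>

// Accepts any sequence, so callers are free to choose their own collection type.
static void PrintValues<T>(IEnumerable<T> values)
{
    Console.WriteLine(JoinValues(values));
}

// Hypothetical helper, split out so the formatting can be checked without the console.
static string JoinValues<T>(IEnumerable<T> values)
{
    return string.Join(", ", values);
}
```

Had the parameter been List<T>, the array, stack, and string calls would each need a ToList() first.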
In regards to internal use...
Also, note that casting is different from using ToArray() or ToList(): the latter involves copying the values, which is indeed a performance and memory hit if you have a lot of elements. The former simply says "A dog is an animal, so like any animal, it can eat" (upcast) or "This animal happens to be a dog, so it can bark" (downcast). Likewise, all Lists and T[]s are IEnumerables, but only some IEnumerables are Lists or T[]s.
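The difference between casting and copying can be sketched as follows. The variable names are illustrative assumptions; the calls themselves (ToList, the as operator, object.ReferenceEquals) are standard.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var list = new List<string> { "a", "b" };

// Casting to the interface: no copy, the same object seen through IEnumerable<string>.
IEnumerable<string> sequence = list;
bool sameObject = object.ReferenceEquals(sequence, list);

// Casting back: works only because the object really is a List<string>.
List<string> backAgain = sequence as List<string>;

// An array is also an IEnumerable<string>, but it is not a List<string>.
IEnumerable<string> fromArray = new[] { "x", "y" };
List<string> notAList = fromArray as List<string>;   // null

// ToList() allocates a new list and copies every element into it.
List<string> copy = sequence.ToList();
copy.Add("c");                                        // the original list is unaffected

Console.WriteLine($"same object: {sameObject}, original: {list.Count}, copy: {copy.Count}");
```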
A good rule of thumb is to always use IEnumerable (when declaring your variables, method parameters, method return types, properties, etc.) unless you have a good reason not to. It is by far the most type-compatible with other (especially extension) methods.
Well, you've got two apples and an orange that you are comparing.
The two apples are the array and the List.
An array in C# is a C-style array that has garbage collection built in. The upside of using arrays is that they have very little overhead, assuming you don't need to move things around. The downside is that an array has a fixed size, so it is not efficient when you are adding or removing things: each change in length means allocating a new array and copying the elements.
A List is a C#-style dynamic array (similar to the vector<> class in C++). There is slightly more overhead, but it is more efficient when you add and remove elements a lot, because it keeps spare capacity in its backing array and only reallocates when that capacity runs out.
The best comparison I could give is saying that arrays are to Lists as strings are to StringBuilders.
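A rough sketch of the practical difference between the two "apples", using made-up sample data: growing an array means replacing it, while a List handles growth internally.

```csharp
using System;
using System.Collections.Generic;

int[] numbers = { 1, 2, 3 };
int[] original = numbers;

// An array's length is fixed; "appending" allocates a new array and copies the elements.
Array.Resize(ref numbers, numbers.Length + 1);
numbers[3] = 4;
bool replaced = !object.ReferenceEquals(original, numbers);

// A List<T> exposes Add/Remove and manages its own storage.
var list = new List<int> { 1, 2, 3 };
list.Add(4);
list.Remove(2);   // removes the first occurrence of the value 2

Console.WriteLine($"array replaced: {replaced}, old length: {original.Length}, new length: {numbers.Length}");
Console.WriteLine(string.Join(", ", list));
```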
The orange is 'IEnumerable'. This is not a datatype, but rather it is an interface. When a class implements the IEnumerable interface, it allows that object to be used in a foreach() loop.
When you return the list (as you did in your example), you are not converting the list to an IEnumerable. A List already is an IEnumerable object.
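To illustrate the "orange": anything that implements IEnumerable can be used in foreach, whether it is an array, a List, or a compiler-generated iterator. Countdown and Sum below are hypothetical names chosen for this sketch.

```csharp
using System;
using System.Collections.Generic;

int[] array = { 1, 2, 3 };
var list = new List<int> { 1, 2, 3 };

// Assigning to the interface is not a conversion: the same objects, viewed as sequences.
IEnumerable<int> fromArray = array;
IEnumerable<int> fromList = list;

Console.WriteLine(Sum(fromArray));
Console.WriteLine(Sum(fromList));
Console.WriteLine(Sum(Countdown(3)));

// foreach only requires that the argument is enumerable.
static int Sum(IEnumerable<int> values)
{
    int total = 0;
    foreach (int value in values)
    {
        total += value;
    }
    return total;
}

// An iterator method: the compiler generates a type that implements IEnumerable<int>.
static IEnumerable<int> Countdown(int from)
{
    for (int i = from; i > 0; i--)
    {
        yield return i;
    }
}
```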
EDIT: When to convert between the two:
It depends on the application. There is very little that can be done with an array that cannot be done with a List, so I would generally recommend the List. Probably the best thing to do is to make a design decision that you are going to use one or the other, that way you don't have to switch between the two. If you rely on an external library, abstract it away to maintain consistent usage.
Hope this clears a little bit of the fog.
Looks to me like the problem is that you haven't bothered learning how to search an array. Hint: Array.IndexOf or Array.BinarySearch, depending on whether the array is sorted.
You're right that converting to a list is a bad idea: it wastes space and time and makes the code less readable. Also, blindly upcasting to IEnumerable slows matters down and completely prevents the use of certain algorithms (such as binary search).
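A short sketch of the two array searches mentioned in this answer, applied to data shaped like the question's. The sample values are assumptions; the ordinal comparer is chosen here only so that the sort and the search agree on ordering.

```csharp
using System;

string[] things = { "BAR", "FOO", "BAZ" };

// Linear search directly on the array: no ToList() needed.
int index = Array.IndexOf(things, "FOO");      // 1
int missing = Array.IndexOf(things, "QUX");    // -1 when the value is absent

// Binary search requires a sorted array; sort a copy with the same comparer used to search.
string[] sorted = (string[])things.Clone();
Array.Sort(sorted, StringComparer.Ordinal);    // BAR, BAZ, FOO
int sortedIndex = Array.BinarySearch(sorted, "FOO", StringComparer.Ordinal);   // 2

Console.WriteLine($"{index} {missing} {sortedIndex}");
```

Array.FindIndex(things, s => s.Equals("FOO")) is the closest array counterpart to the List FindIndex call in the question.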
I try to avoid rapidly jumping between data types if it can be avoided.
Situations like the one you describe differ enough from each other that there can be no dogmatic rule about converting your types; however, it is generally good practice to select a data structure that provides, as closely as possible, the interface you need without having to copy elements needlessly into new data structures.
When to use what?
I would suggest returning the most specific type, and taking in the most flexible type.
Like this:
That way you can call methods on List<T> for whatever you get back from the method, in addition to treating it as an IEnumerable. On the inputs, be as flexible as possible, so you don't dictate to the users of your method what kind of collection to create.
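One way this advice could look in code, as a sketch only: WithoutDuplicates is a hypothetical method invented for illustration, not something from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string[] things = { "FOO", "BAR", "FOO" };

// The caller passes an array; any other sequence would work just as well.
List<string> unique = WithoutDuplicates(things);

// Because the return type is List<string>, List methods are available directly...
int index = unique.FindIndex(s => s.Equals("BAR"));

// ...and the result can still be treated as an IEnumerable<string>.
IEnumerable<string> asSequence = unique;

Console.WriteLine($"{unique.Count} unique items, BAR found: {index >= 0}");

// Takes the most flexible input type and returns the most specific type.
static List<string> WithoutDuplicates(IEnumerable<string> items)
{
    return items.Distinct().ToList();
}
```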
You're right to ignore the 'performance problem' antennae until you actually have a performance problem. Most performance problems come from doing too much I/O or too much locking or doing one of them wrong, and none of these apply to this question.
My general approach is:
Corollary to #4: don't be afraid of ToList(). ToList() is your friend. It forces the IEnumerable<T> to evaluate right then (useful for when you're stacking several where clauses). Don't go nuts with it, but feel free to call it once you've built up your full where clause before you do the foreach over it (or the like).
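The point about ToList() forcing evaluation can be sketched like this. The counter and the sample numbers are assumptions added to make the deferred execution visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int evaluations = 0;
var numbers = new List<int> { 1, 2, 3, 4, 5, 6 };

// Stacking Where clauses only builds a query; no predicate has run yet.
IEnumerable<int> query = numbers
    .Where(n => { evaluations++; return n % 2 == 0; })
    .Where(n => n > 2);
int beforeToList = evaluations;      // 0

// ToList() runs the query once and stores the results.
List<int> results = query.ToList();  // 4, 6
int afterToList = evaluations;       // 6: the first predicate ran once per element

// Enumerating the stored list does not touch the predicates again.
foreach (int n in results)
{
    Console.WriteLine(n);
}
int afterForeach = evaluations;      // still 6

// Enumerating the raw query again re-runs the whole pipeline.
int count = query.Count();
int afterRequery = evaluations;      // 12
```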
Of course, this is just a rough guideline. Just please try to follow the same pattern in the same codebase -- code styles that jump around make it harder for maintenance coders to get into your frame of mind.