When should you use each of T[], List<T>, and IEnumerable<T>?

Posted 2024-09-13 06:54:16

I usually find myself doing something like:

string[] things = arrayReturningMethod();
int index = things.ToList<string>().FindIndex(s => s.Equals("FOO"));
//do something with index
return things.Distinct(); //which returns an IEnumerable<string>

and I find all this mixup of types/interface a bit confusing and it tickles my potential performance problem antennae (which I ignore until proven right, of course).

Is this idiomatic and proper C# or is there a better alternative to avoid casting back and forth to access the proper methods to work with the data?

EDIT:
The question is actually twofold:

  • When is it proper to use either the IEnumerable interface or an array or a list (or any other IEnumerable implementing type) directly (when accepting parameters)?

  • Should you move freely between IEnumerables (implementation unknown), lists, and arrays, or is that non-idiomatic (there are better ways to do it), non-performant (not typically relevant, but it might be in some cases), or just plain ugly (unmaintainable, unreadable)?

Comments (7)

神魇的王 2024-09-20 06:54:16

In regards to performance...

  • Converting from List<T> to T[] involves copying all the data from the original list to a newly allocated array.
  • Converting from T[] to List<T> also involves copying all the data from the original array to a newly allocated List<T>.
  • Converting from either List<T> or T[] to IEnumerable<T> is an upcast (an implicit reference conversion). No data is copied, and the cost is negligible.
  • Converting from IEnumerable<T> to List<T> with a cast is a downcast, which adds a runtime type check costing a few CPU cycles.
  • Converting from IEnumerable<T> to T[] with a cast is also a downcast.
  • You can't cast an IEnumerable<T> to T[] or List<T> unless the object actually is a T[] or List<T> respectively to begin with. You can use the ToArray or ToList functions, but those will also result in a copy being made.
  • Accessing all the values in order from start to end in a T[] can, in a straightforward loop, typically be optimized down to simple pointer arithmetic -- which makes it the fastest of them all.
  • Accessing all the values in order from start to end in a List<T> involves a check on each access to make sure that the index is within the list's bounds, and then the actual access to the backing array.
  • Accessing all the values in an IEnumerable<T> involves creating an enumerator object, calling its MoveNext() method, which advances to the next element, and then reading the Current property, which gives you the actual value and puts it in the variable that you specified in your foreach statement. Generally, this isn't as bad as it sounds.
  • Accessing an arbitrary value in an IEnumerable<T> involves starting at the beginning and calling MoveNext() as many times as you need to get to that value. Generally, this is as bad as it sounds.
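
To make the last two points concrete, here is a rough sketch of what sequential and arbitrary access look like when written out by hand. The class and method names (EnumerationSketch, SumByHand, ElementAtByHand) are made up for illustration; note that the enumerator method in .NET is called MoveNext().

```csharp
using System;
using System.Collections.Generic;

public static class EnumerationSketch
{
    // Sums a sequence roughly the way a foreach loop does under the hood:
    // get an enumerator, call MoveNext() until it returns false, read Current.
    public static int SumByHand(IEnumerable<int> values)
    {
        int total = 0;
        using (IEnumerator<int> e = values.GetEnumerator())
        {
            while (e.MoveNext())
            {
                total += e.Current;
            }
        }
        return total;
    }

    // Reaching element number `index` of a bare IEnumerable<T> means walking
    // past every element before it; there is no indexer to jump straight there.
    public static int ElementAtByHand(IEnumerable<int> values, int index)
    {
        if (index < 0) throw new ArgumentOutOfRangeException(nameof(index));
        using (IEnumerator<int> e = values.GetEnumerator())
        {
            for (int i = 0; i <= index; i++)
            {
                if (!e.MoveNext()) throw new ArgumentOutOfRangeException(nameof(index));
            }
            return e.Current;
        }
    }
}
```

The first method costs one pass no matter what; the second costs a partial pass for every lookup, which is why repeated random access over a plain IEnumerable<T> adds up quickly.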

In regards to idioms...

In general, IEnumerable is useful for public properties, function parameters, and often for return values -- and only if you know that you're going to be using the values sequentially.

For instance, suppose you had a function PrintValues. If it was written as PrintValues(List<T> values), it would only be able to deal with List<T> values, so a caller holding, say, a T[] would first have to convert it. The same goes for PrintValues(T[] values). But if it was PrintValues(IEnumerable<T> values), it would be able to deal with Lists, T[]s, stacks, hashtables, dictionaries, strings, sets, etc -- any collection that implements IEnumerable, which is practically every collection.
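
A minimal sketch of that PrintValues idea follows. The TextWriter parameter is an addition of mine so the output can be captured; the answer only names the method and its parameter type.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PrintSketch
{
    // Accepting IEnumerable<T> lets callers pass any sequence without converting it first.
    // The `output` parameter is hypothetical; pass Console.Out to print to the console.
    public static void PrintValues<T>(IEnumerable<T> values, TextWriter output)
    {
        foreach (T value in values)
        {
            output.WriteLine(value);
        }
    }
}
```

With this signature, PrintSketch.PrintValues(new[] { 1, 2 }, Console.Out), PrintSketch.PrintValues(new List<string> { "a" }, Console.Out) and PrintSketch.PrintValues(new HashSet<double> { 1.5 }, Console.Out) all compile without any ToList() or ToArray() at the call site.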

In regards to internal use...

  • Use a List only if you're not sure how many items will need to be in it.
  • Use a T[] if you know how many items will need to be in it, but need to access the values in an arbitrary order.
  • Stick with the IEnumerable if that's what you've been given and you just need to use it sequentially. Many functions will return IEnumerables. If you do need to access values from an IEnumerable in an arbitrary order, use ToArray().

Also, note that casting is different from using ToArray() or ToList() -- the latter involves copying the values, which is indeed a performance and memory hit if you have a lot of elements. The former simply is to say that "A dog is an animal, so like any animal, it can eat" (upcast) or "This animal happens to be a dog, so it can bark" (downcast). Likewise, all Lists and T[]s are IEnumerables, but only some IEnumerables are Lists or T[]s.
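
The difference between casting and copying can be checked with reference equality. This is only a sketch; the class and method names are invented for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CastVersusCopySketch
{
    // Viewing a List<T> through the interface creates no new object:
    // both variables refer to the same list.
    public static bool InterfaceViewIsSameObject(List<int> list)
    {
        IEnumerable<int> sequence = list;
        return object.ReferenceEquals(sequence, list);
    }

    // Going back to the concrete type only works if the object really is a List<T>.
    public static bool IsReallyAList(IEnumerable<int> sequence)
    {
        return sequence is List<int>;
    }

    // ToList() allocates a new list and copies the elements into it.
    public static bool ToListMakesACopy(List<int> list)
    {
        List<int> copy = list.ToList();
        return !object.ReferenceEquals(copy, list) && copy.SequenceEqual(list);
    }
}
```

Passing an int[] to IsReallyAList returns false, which is the case where a hard cast to List<int> would throw InvalidCastException.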

离不开的别离 2024-09-20 06:54:16

A good rule of thumb is to always use IEnumerable (when declaring your variables/method parameters/method return types/properties/etc.) unless you have a good reason not to. It is by far the most type-compatible with other (especially extension) methods.

定格我的天空 2024-09-20 06:54:16

Well, you've got two apples and an orange that you are comparing.

The two apples are the array and the List.

  • An array in C# is a C-style array that has garbage collection built in. The upside of using them is that they have very little overhead, assuming you don't need to move things around. The downside is that their length is fixed, so adding or removing elements means allocating a new array and copying the contents over.

  • A List is a C#-style dynamic array (similar to the vector<> class in C++). There is more overhead, but it is more efficient when you add and remove items a lot: it is backed by an array with spare capacity, so most appends need no reallocation. Inserting or removing in the middle still shifts the elements after that position.
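
A small sketch of the practical difference when appending, using hypothetical helper names: with an array the caller has to allocate and copy, while List<T> handles growth internally.

```csharp
using System;
using System.Collections.Generic;

public static class GrowthSketch
{
    // An array's length is fixed, so "adding" an element means
    // allocating a larger array and copying the old contents across.
    public static int[] AppendToArray(int[] source, int value)
    {
        int[] result = new int[source.Length + 1];
        Array.Copy(source, result, source.Length);
        result[source.Length] = value;
        return result;
    }

    // List<T>.Add grows the list in place; the same object is returned.
    public static List<int> AppendToList(List<int> source, int value)
    {
        source.Add(value);
        return source;
    }
}
```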

The best comparison I could give is saying that arrays are to Lists as strings are to StringBuilders.

The orange is 'IEnumerable'. This is not a concrete collection type, but rather an interface. When a class implements the IEnumerable interface, it allows that object to be used in a foreach() loop.

When you return the list (as you did in your example), you were not converting the list to an IEnumerable. A list already is an IEnumerable object.

EDIT: When to convert between the two:

It depends on the application. There is very little that can be done with an array that cannot be done with a List, so I would generally recommend the List. Probably the best thing to do is to make a design decision that you are going to use one or the other, that way you don't have to switch between the two. If you rely on an external library, abstract it away to maintain consistent usage.

Hope this clears a little bit of the fog.

新一帅帅 2024-09-20 06:54:16

Looks to me like the problem is that you haven't bothered learning how to search an array. Hint: Array.IndexOf or Array.BinarySearch depending on whether the array is sorted.

You're right that converting to a list is a bad idea: it wastes space and time and makes the code less readable. Also, blindly upcasting to IEnumerable slows matters down and also completely prevents use of certain algorithms (such as binary search).
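
As a sketch of what that looks like, the static Array methods below search a string[] directly, with no ToList() copy. The wrapper class and method names are only for illustration, and the ordinal comparer is an assumption about how the array was sorted.

```csharp
using System;

public static class ArraySearchSketch
{
    // Linear search; works on any array and returns -1 when the value is absent.
    public static int FindUnsorted(string[] things, string target)
    {
        return Array.IndexOf(things, target);
    }

    // Binary search; only valid if the array is already sorted with the same comparer.
    // Returns a negative number when the value is absent.
    public static int FindSorted(string[] sortedThings, string target)
    {
        return Array.BinarySearch(sortedThings, target, StringComparer.Ordinal);
    }

    // Predicate-based search, the array counterpart of List<T>.FindIndex.
    public static int FindByPredicate(string[] things, Predicate<string> match)
    {
        return Array.FindIndex(things, match);
    }
}
```

For the code in the question, Array.IndexOf(things, "FOO") or Array.FindIndex(things, s => s.Equals("FOO")) would give the index without the intermediate list.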

天暗了我发光 2024-09-20 06:54:16

I try to avoid rapidly jumping between data types if it can be avoided.

Situations like the one you describe differ enough from each other that no dogmatic rule about transforming your types will fit them all; however, it is generally good practice to select a data structure that provides, as closely as possible, the interface you need without having to copy elements needlessly to new data structures.

再见回来 2024-09-20 06:54:16

When to use what?

I would suggest returning the most specific type, and taking in the most flexible type.

Like this:

public int[] DoSomething(IEnumerable<int> inputs)
{
    //...
}

public List<int> DoSomethingElse(IList<int> inputs)
{
    //...
}

That way you can call methods on List<T> for whatever you get back from the method, in addition to treating it as an IEnumerable. On the inputs, be as flexible as possible, so you don't dictate to the users of your method what kind of collection to create.
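
Here is one way the pattern might look with a body filled in. KeepEven is a made-up example, not something from the answer above.

```csharp
using System.Collections.Generic;

public static class SignatureSketch
{
    // Takes the most general input type and returns a concrete List<int>,
    // so callers get Count, indexing and Add on the result without converting.
    public static List<int> KeepEven(IEnumerable<int> inputs)
    {
        var result = new List<int>();
        foreach (int value in inputs)
        {
            if (value % 2 == 0)
            {
                result.Add(value);
            }
        }
        return result;
    }
}
```

A caller can hand in an int[], a List<int>, a HashSet<int> or a LINQ query, and still write SignatureSketch.KeepEven(source).Count or index into the result directly.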

溺孤伤于心 2024-09-20 06:54:16

You're right to ignore the 'performance problem' antennae until you actually have a performance problem. Most performance problems come from doing too much I/O or too much locking or doing one of them wrong, and none of these apply to this question.

My general approach is:

  1. Use T[] for 'static' or 'snapshot'-style information. Use for things where calling .Add() wouldn't make sense anyway, and you don't need the extra methods List<T> gives you.
  2. Accept IEnumerable<T> if you don't really care what you're given and don't need a constant-time .Length/.Count.
  3. Only return IEnumerable<T> when you're doing simple manipulations of an input IEnumerable<T> or when you specifically want to make use of the yield syntax to do your work lazily.
  4. In all other cases, use List<T>. It's just too flexible to pass up.

Corollary to #4: don't be afraid of ToList(). ToList() is your friend. It forces the IEnumerable<T> to evaluate right then (useful for when you're stacking several where clauses). Don't go nuts with it, but feel free to call it once you've built up your full where clause before you do the foreach over it (or the like).
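
To see why forcing evaluation matters, the sketch below counts how often a Where predicate runs when the same query is enumerated twice, with and without ToList(). The class and method names are invented for the demonstration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredSketch
{
    // Returns how many times the predicate ran after enumerating the query twice.
    public static int PredicateCalls(int[] source, bool materialize)
    {
        int calls = 0;
        IEnumerable<int> query = source.Where(x => { calls++; return x > 1; });

        if (materialize)
        {
            // Runs the predicate once per element, right now, and stores the results.
            query = query.ToList();
        }

        int seen = 0;
        foreach (int item in query) seen++; // first pass
        foreach (int item in query) seen++; // second pass
        return calls;
    }
}
```

Without ToList() the predicate runs again on every pass, so a three-element source gives six calls; with ToList() it runs three times in total, and later passes just read the stored list.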

Of course, this is just a rough guideline. Just please try to follow the same pattern in the same codebase -- code styles that jump around make it harder for maintenance coders to get into your frame of mind.
