Scala: Read and save all elements of an Iterable
I have an Iterable[T] that is really a stream of unknown length, and I want to read it all and save it into something that is still an instance of Iterable. I really do have to read it and save it; I can't do it lazily. The original Iterable can have at least a few thousand elements. What's the most efficient/best/canonical way? Should I use an ArrayBuffer, a List, or a Vector?
Suppose xs is my Iterable. I can think of these possibilities:
    xs.toArray.toIterable // Ugh?
    xs.toList // Fast?
    xs.copyToBuffer(anArrayBuffer)
    Vector(xs: _*) // There's no toVector, sadly. Is this construct as efficient?
EDIT: I see from the questions that I should be more specific. Here's a strawman example:
    def f(xs: Iterable[SomeType]): Unit = { // xs might be a stream, though I can't be sure
      val allOfXS = <xs all read in at once>
      g(allOfXS)
      h(allOfXS) // Both g() and h() take an Iterable[SomeType]
    }
This is easy. A few thousand elements is nothing, so it hardly matters unless it's in a really tight loop. So the flippant answer is: use whatever you feel is most elegant.
But, okay, let's suppose that this is actually in some tight loop, and you can predict, or have benchmarked your code enough to know, that this is performance-limiting.
Your best performance for an immutable solution will likely be a Vector, used like so:
    Vector() ++ xs
In my hands, this can copy a 10k-element iterable about 4k-5k times per second. List is about half the speed.
If you're willing to try a mutable solution under the hood, xs.toArray.toIterable usually takes the cake at about 10k copies per second. ArrayBuffer is about the same speed as List.
If you actually know the size of the target (i.e. size is O(1), or you know it from somewhere else), you can shave another 20-30% off the execution time by allocating just the right size and writing a while loop.
If the elements are actually primitives, you can gain a factor of 10 by writing your own specialized Iterable-like thing that acts on arrays and converts to regular collections via the underlying array.
Bottom line: for a great blend of power, speed, and flexibility, use Vector() ++ xs in most situations. xs.toIndexedSeq defaults to the same thing, with the benefit that if xs is already a Vector it takes no time at all (and it chains nicely without parentheses), and the drawback that you are relying on a convention, not a specification of behavior (and it takes 1-3 more characters to type).
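To make the recommendation concrete, here is a minimal sketch of the question's f() built on Vector() ++ xs, plus the known-size while-loop variant mentioned above. The element type Int, the bodies of g() and h(), and the names MaterializeOnce and copyKnownSize are assumptions made only for illustration; none of them come from the original post.

```scala
object MaterializeOnce {
  // Hypothetical stand-ins for the question's g() and h().
  def g(xs: Iterable[Int]): Int = xs.sum
  def h(xs: Iterable[Int]): Int = xs.size

  // Read xs fully into a strict, immutable Vector, then reuse that copy.
  def f(xs: Iterable[Int]): (Int, Int) = {
    val allOfXs: Vector[Int] = Vector() ++ xs
    (g(allOfXs), h(allOfXs))
  }

  // Variant for when the element count n is known up front (assumed exact):
  // fill a preallocated array with a while loop.
  def copyKnownSize(xs: Iterable[Int], n: Int): Iterable[Int] = {
    val arr = new Array[Int](n)
    val it = xs.iterator
    var i = 0
    while (i < n && it.hasNext) {
      arr(i) = it.next()
      i += 1
    }
    arr // exposed as an Iterable through the standard implicit array wrapper
  }
}
```

Both g() and h() see the same fully read copy, so the source is not consulted again after the Vector is built. The array variant hands back a view over a mutable array, so it trades the immutability of the Vector for speed.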
How about Stream.force?
This is hard. An Iterable's methods are defined in terms of its iterator, but that gets overridden by subtraits. For instance, IndexedSeq methods are usually defined in terms of apply.
There is the question of why you want to copy the Iterable, but I suppose you might be guarding against the possibility of it being mutable. If you do not want to copy it, then you need to rephrase your question.
If you are going to copy it, and you want to be sure all elements are copied strictly, you could use .toList. That will not copy a List, but a List does not need to be copied. For anything else, it will produce a new copy.