为什么scala标准库中没有不可变数组?
Scala 具有各种不可变序列,如 List、Vector 等。我很惊讶地发现没有由简单数组支持的不可变索引序列的实现(Vector 对于我的需求来说似乎太复杂了)。
这有设计原因吗?我在邮件列表上找不到很好的解释。
您是否有关于性能与数组接近相同的不可变索引序列的建议?我正在考虑 scalaz 的 ImmutableArray,但它在 scala trunk 方面存在一些问题。
谢谢
Scala has all sorts sorts of immutable sequences like List, Vector,etc. I have been surprised to find no implementation of immutable indexed sequence backed by a simple array (Vector seems way too complicated for my needs).
Is there a design reason for this? I could not find a good explanation on the mailing list.
Do you have a recommendation for an immutable indexed sequence that has close to the same performances as an array? I am considering scalaz's ImmutableArray, but it has some issues with scala trunk for example.
Thank you
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(8)
那么,我们首先来区分一下接口和类。接口是API的设计,而类是API的实现。
Scala 中的接口具有相同的名称和不同的包,以区分不变性:
Seq
、immutable.Seq
、mutable.Seq
。另一方面,这些类通常不共享名称。
List
是不可变序列,而ListBuffer
是可变序列。也有例外,例如HashSet
,但这只是实现方面的巧合。现在,
Array
不是 Scala 集合的一部分,而是一个 Java 类,但它的包装器WrappedArray
清楚地显示了它会出现的位置:作为一个可变类。WrappedArray
实现的接口是IndexedSeq
,它存在可变和不可变特征。immutable.IndexedSeq
有一些实现类,包括WrappedString
。然而,实现它的通用类是Vector
。该类在可变端占据的位置与 Array 类占据的位置相同。现在,使用
Vector
并不比使用Array
更复杂,所以我不知道你为什么说它复杂。也许你认为它在内部做了太多的事情,在这种情况下你就错了。所有设计良好的不可变类都是持久性,因为使用不可变集合意味着创建它的新副本,所以它们必须为此进行优化,这正是
Vector
所做的。So, let's first make a distinction between interface and class. The interface is an API design, while the class is the implementation of such API.
The interfaces in Scala have the same name and different package to distinguish with regards to immutability:
Seq
,immutable.Seq
,mutable.Seq
.The classes, on the other hand, usually don't share a name. A
List
is an immutable sequence, while aListBuffer
is a mutable sequence. There are exceptions, likeHashSet
, but that's just a coincidence with regards to implementation.Now, and
Array
is not part of Scala's collection, being a Java class, but its wrapperWrappedArray
shows clearly where it would show up: as a mutable class.The interface implemented by
WrappedArray
isIndexedSeq
, which exists are both mutable and immutable traits.The
immutable.IndexedSeq
has a few implementing classes, including theWrappedString
. The general use class implementing it, however, is theVector
. That class occupies the same position anArray
class would occupy in the mutable side.Now, there's no more complexity in using a
Vector
than using anArray
, so I don't know why you call it complicated.Perhaps you think it does too much internally, in which case you'd be wrong. All well designed immutable classes are persistent, because using an immutable collection means creating new copies of it, so they have to be optimized for that, which is exactly what
Vector
does.主要是因为 Scala 中没有任何数组。您所看到的是 java 的数组,其中包含一些方法,可帮助它们适应集合 API。
其他任何东西都不会是数组,因为它具有不会遭受类型擦除或破坏方差的独特属性。它只是另一种带有索引和值的类型。 Scala 确实有这个功能,它被称为
IndexedSeq
,如果您需要将其作为数组传递给某些第 3 方 API,那么您只需使用.toArray
Mostly because there are no arrays whatsoever in Scala. What you're seeing is java's arrays pimped with a few methods that help them fit into the collection API.
Anything else wouldn't be an array, with it's unique property of not suffering type erasure, or the broken variance. It would just be another type with indexes and values. Scala does have that, it's called
IndexedSeq
, and if you need to pass it as an array to some 3rd party API then you can just use.toArray
Scala 2.13 添加了
ArraySeq
< /a>,这是一个由数组支持的不可变序列。Scala 2.13 has added
ArraySeq
, which is an immutable sequence backed by an array.Scala 3 现在有 IArray,一个不可变数组。
它作为不透明类型别名实现,没有运行时开销。
Scala 3 now has IArray, an Immutable Array.
It is implemented as an Opaque Type Alias, with no runtime overhead.
scala
Array
类的要点是提供一种机制来访问 Java 数组的能力(但没有 Java 允许数组的糟糕的设计决策)在其类型系统内是协变的)。 Java 数组是可变的,因此 scala 标准库中的数组也是可变的。假设库中还有另一个类
immutable.Array
,但编译器也使用 Java 数组作为底层结构(为了效率/速度)。然后将编译并运行以下代码:也就是说,该数组实际上是可变的。
The point of the scala
Array
class is to provide a mechanism to access the abilities of Java arrays (but without Java's awful design decision of allowing arrays to be covariant within its type system). Java arrays are mutable, hence so are those in the scala standard library.Suppose there were also another class
immutable.Array
in the library but that the compiler were also to use a Java array as the underlying structure (for efficiency/speed). The following code would then compile and run:That is, the array would really be mutable.
数组的问题在于它们具有固定的大小。没有向数组添加元素或从中删除元素的操作。
您可以保留一个您认为足够长的数组作为后备存储,“浪费”您不使用的内存,跟踪上次使用的索引,并在需要额外空间时复制到更大的数组。复制显然是
O(N)
。更改单个元素也是
O(N)
,因为您需要复制整个数组。不存在结构共享,这是高性能函数数据结构的关键。您还可以为“溢出”元素分配一个额外的数组,并以某种方式跟踪您的数组。那时您就开始重新发明 Vector。
简而言之,由于不适合结构共享,数组的不可变外观对于添加元素、删除元素和更改元素等大多数常见操作具有糟糕的运行时性能特征。
这只留下了固定大小的固定内容数据载体的用例,并且该用例相对较少。大多数用途更好地与
List
、Stream
或Vector
配合使用The problem with Arrays is that they have a fixed size. There is no operation to add an element to an array, or remove one from it.
You can keep an array that you guess will be long enough as a backing store, "wasting" the memory you're not using, keep track of the last used index, and copy to a larger array if you need the extra space. That copying is
O(N)
obviously.Changing a single element is also
O(N)
as you will need to copy over the entire array. There is no structural sharing, which is the lynchpin of performant functional datastructures.You could also allocate an extra array for the "overflowing" elements, and somehow keep track of your arrays. At that point you're on your way of re-inventing Vector.
In short, due to their unsuitablility for structural sharing, immutable facades for arrays have terrible runtime performance characteristics for most common operations like adding an element, removing an element, and changing an element.
That only leaves the use-case of a fixed size fixed content data-carrier, and that use-case is relatively rare. Most uses better served with
List
,Stream
orVector
您可以简单地使用 Array[T].toIndexSeq 将 Array[T] 转换为 ArraySeq[T],其类型为 immutable.IndexedSeq[T]。
(Scala 2.13.0之后)
You can simply use Array[T].toIndexSeq to convert Array[T] to ArraySeq[T], which is of type immutable.IndexedSeq[T].
(after Scala 2.13.0)
您可以将数组转换为序列。
该数组将隐式转换为 WrappedArray。并且由于类型是Seq,因此将不再进行更新操作。
You could cast your array into a sequence.
The array will be implicitly converted to a WrappedArray. And as the type is Seq, update operations will no longer be available.