How to define the concept of capacity in an ArrayList?
I understand that capacity is the number of elements or available spaces in an ArrayList that may or may not hold a value referencing an object. I am trying to understand more about the concept of capacity.
So I have three questions:
1) What are some good ways to define what capacity represents from a memory standpoint?
...the (contiguous?) memory allocated to the ArrayList?
...the ArrayList's memory footprint on the (heap?)?
2) Then if the above is true, changing capacity requires some manner of memory management overhead?
3) Does anyone have an example where #2 was or could be a performance concern? Aside from maybe a large number of large ArrayLists having their capacities continually adjusted?
ArrayList is implemented as a thin wrapper around a plain Object[] array held in a private field. The capacity is the length of that array.
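Now, if your capacity is 10 and you're adding the 11th element, ArrayList will allocate a new, larger array, copy the existing elements into it, and use the new array as its backing store from then on.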
So if you start off with a small capacity, ArrayList will end up creating a bunch of arrays and copying stuff around for you as you keep adding elements, which isn't good.
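To make the allocate-and-copy step concrete, here is a minimal sketch of an array-backed list. SimpleArrayList and its methods are hypothetical names for illustration, not the real java.util.ArrayList source, and the roughly 1.5x growth factor is an assumption modeled on what OpenJDK does; the API documentation does not promise any particular factor.

```java
import java.util.Arrays;

/** Hypothetical, simplified array-backed list; not the real java.util.ArrayList. */
class SimpleArrayList {
    private Object[] elements; // backing array; its length is the capacity
    private int size;          // number of elements actually stored

    SimpleArrayList(int initialCapacity) {
        elements = new Object[initialCapacity];
    }

    int size() { return size; }

    int capacity() { return elements.length; }

    void add(Object value) {
        if (size == elements.length) {
            // Assumed growth policy: about 1.5x, and always at least one extra slot.
            int newCapacity = Math.max(size + 1, elements.length + (elements.length >> 1));
            // Allocates a new array and copies every existing element into it.
            elements = Arrays.copyOf(elements, newCapacity);
        }
        elements[size++] = value;
    }

    Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return elements[index];
    }
}
```

Under this assumed policy, a list created with capacity 10 keeps the same array for the first 10 adds, and the 11th add reallocates to 15 slots.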
On the other hand, if you specify a capacity of 1,000,000 and add only 3 elements to the ArrayList, that is also wasteful: the backing array reserves room for a million references that will never be used.
Rule of thumb: if you know the capacity, specify it. If you aren't sure but know the upper bound, specify that. If you just aren't sure, use the defaults.
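The rule of thumb above might look like this in practice. This is only a sketch: CapacityHints and its methods are made-up names, while ArrayList(int), the no-argument constructor, and trimToSize() are the standard java.util.ArrayList API. Note that ArrayList does not expose its capacity, so the effect is visible only in allocation behavior, not in the list's contents.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper class illustrating the three cases of the rule of thumb. */
class CapacityHints {
    // Known size: reserve the space once, so the list never reallocates while filling.
    static List<Integer> squares(int count) {
        List<Integer> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            result.add(i * i);
        }
        return result;
    }

    // Known upper bound only: reserve for the bound, then release the unused slots.
    static ArrayList<String> nonEmpty(List<String> candidates) {
        ArrayList<String> kept = new ArrayList<>(candidates.size());
        for (String s : candidates) {
            if (!s.isEmpty()) {
                kept.add(s);
            }
        }
        kept.trimToSize(); // shrinks the backing array to the current size
        return kept;
    }

    // Unknown size: let the default constructor and growth policy handle it.
    static List<String> unknownSize() {
        return new ArrayList<>();
    }
}
```

Calling trimToSize() is optional; it trades one extra copy for a smaller footprint, which is mostly worthwhile for lists that live a long time.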
Capacity is as you described it -- the contiguous memory allocated to an ArrayList for storage of values. ArrayList stores all values in an array, and automatically resizes the array for you. This incurs memory management overhead when resizing.
If I remember correctly, OpenJDK's ArrayList grows its backing array from capacity N to roughly 1.5N (N + N/2) when you try to add one more element than the capacity can take. The API documentation does not specify the exact policy, so other implementations may differ. Inserting at a specific position is done with add(index, element), and it does not allow positions beyond the current size: an index greater than size() throws an IndexOutOfBoundsException.

Here is an example to help you think about how it works. Picture each space between the | characters as a cell in the backing array:

| | |
size = 0 (contains no elements), capacity = 2 (can contain 2 elements).

|a| |
size = 1 (contains 1 element), capacity = 2 (can contain 2 elements).

|a|b|
size = 2, capacity = 2. Adding another element:

|a|b|c|
size increased by 1, capacity increased to 3 (2 + 2/2).

This can be expensive with large arrays, as allocating a large contiguous memory region can require a bit of work (as opposed to a LinkedList, which allocates many small pieces of memory) because the JVM needs to search for an appropriate location, and may need to ask the OS for more memory. It is also expensive to copy a large number of values from one place to another, which would be done once such a region was found.
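To get a feel for how much copying repeated growth causes, the following sketch simulates appends without allocating any real arrays. GrowthCost and simulate are hypothetical names, and the roughly 1.5x growth step is an assumption about the implementation, not something the ArrayList API guarantees.

```java
/** Hypothetical simulation of reallocation cost under an assumed ~1.5x growth policy. */
class GrowthCost {
    /** Returns {number of reallocations, total elements copied} for n appends. */
    static long[] simulate(int initialCapacity, int n) {
        long capacity = initialCapacity;
        long reallocations = 0;
        long copied = 0;
        for (long size = 0; size < n; size++) {
            if (size == capacity) {
                // A full array means every existing element moves to the new array.
                copied += size;
                // Assumed policy: about 1.5x, and always at least one extra slot.
                capacity = Math.max(size + 1, capacity + (capacity >> 1));
                reallocations++;
            }
        }
        return new long[] { reallocations, copied };
    }

    public static void main(String[] args) {
        long[] grown = simulate(10, 1_000_000);
        long[] presized = simulate(1_000_000, 1_000_000);
        System.out.println("grown:    " + grown[0] + " reallocations, " + grown[1] + " copies");
        System.out.println("presized: " + presized[0] + " reallocations, " + presized[1] + " copies");
    }
}
```

Under this assumption, a million appends starting from capacity 10 trigger a few dozen reallocations, and the total number of copied elements stays within a small constant multiple of the final size. A list presized to a million elements never reallocates at all.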
My rule of thumb is this: If you know the capacity you will require, use an ArrayList because there will only be one allocation and access is very fast. If you do not know your required capacity, use a LinkedList because adding a new value always takes the same amount of work, and there is no copying involved.
Yes, an ArrayList is backed by an array, so the capacity represents the internal array size.
Yes, the larger the array capacity, the larger the ArrayList's memory footprint.
It does. When the list outgrows its capacity, a larger array is allocated and the contents are copied into it. The previous array is then discarded and becomes eligible for garbage collection.
Yes, if you create the ArrayList with an initial capacity of 1 (for instance) and your list grows way beyond that. If you know up front the number of elements to store, you had better request an initial capacity of that size.
However, I think this should be low on your list of priorities. While array copying may happen very often, it has been optimized since the early days of Java and should not be a concern. Choosing the right algorithm matters more, I think. Remember: premature optimization is the root of all evil.
See also: When to use LinkedList over ArrayList