How do you define the concept of capacity in an ArrayList?

Posted on 2024-10-25 18:48:34

I understand that capacity is the number of elements or available spaces in an ArrayList that may or may not hold a value referencing an object. I am trying to understand more about the concept of capacity.

So I have three questions:

1) What are some good ways to define what capacity represents from a memory standpoint?

...the (contiguous?) memory allocated to the ArrayList?

...the ArrayList's memory footprint on the (heap?)?

2) Then if the above is true, changing capacity requires some manner of memory management overhead?

3) Anyone have an example where #2 was or could be a performance concern? Aside from maybe a large number of large ArrayLists having their capacities continually adjusted?

Comments (4)

毁虫ゝ 2024-11-01 18:48:34
  1. The class is called ArrayList because it's based on an array. The capacity is the size of the array, which requires a block of contiguous heap memory. However, note that the array itself contains only references to the elements, which are separate objects on the heap.
  2. Increasing the capacity requires allocating a new, larger array and copying all the references from the old array to the new one, after which the old one becomes eligible for garbage collection.
  3. You've cited the main case where performance could be a concern. In practice, I've never seen it actually become a problem, since the element objects usually take up much more memory (and possibly CPU time) than the list.
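
The allocate-and-copy step from point 2 can be sketched with a tiny array-backed list. This is only an illustration, not the JDK source: the class name `TinyList`, its fields, and the roughly 1.5x growth step are assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical, simplified array-backed list; not the real java.util.ArrayList.
class TinyList {
    private Object[] elements;   // backing array: its length is the capacity
    private int size;            // number of slots actually in use

    TinyList(int initialCapacity) {
        elements = new Object[initialCapacity];
    }

    void add(Object value) {
        if (size == elements.length) {
            // Assumed growth policy: about 1.5x, and at least one extra slot.
            int newCapacity = Math.max(size + 1, elements.length + (elements.length >> 1));
            // Allocates a new array and copies the references; the old array
            // becomes unreachable and eligible for garbage collection.
            elements = Arrays.copyOf(elements, newCapacity);
        }
        elements[size++] = value;
    }

    int size() { return size; }
    int capacity() { return elements.length; }

    public static void main(String[] args) {
        TinyList list = new TinyList(10);
        for (int i = 0; i < 11; i++) {
            list.add("item " + i);
        }
        System.out.println("size = " + list.size() + ", capacity = " + list.capacity());
    }
}
```

Only the array of references is reallocated here; the element objects themselves stay where they are on the heap.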
命比纸薄 2024-11-01 18:48:34

ArrayList is implemented like this:

class ArrayList {
  private Object[] elements;
}

The capacity is the size of that array.

Now, if your capacity is 10 and you're adding the 11th element, ArrayList will do roughly this:

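int oldCapacity = elements.length;
// Grow by about 1.5x using integer arithmetic (an array length must be an int).
Object[] newElements = new Object[oldCapacity + (oldCapacity >> 1)];
// arraycopy takes source, source offset, destination, destination offset, length.
System.arraycopy(this.elements, 0, newElements, 0, oldCapacity);
this.elements = newElements;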

So if you start off with a small capacity, ArrayList will end up creating a bunch of arrays and copying stuff around for you as you keep adding elements, which isn't good.
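
To get a feel for how many arrays that is, here is a rough sketch that counts the reallocations needed to reach a target size. The helper `countResizes` and the class name are hypothetical, and the roughly 1.5x growth step is an assumption rather than a guarantee of any particular JDK.

```java
// Hypothetical helper: counts how many times a backing array would be
// reallocated while a list grows from initialCapacity to hold targetSize
// elements, assuming capacity grows by about 1.5x each time.
class ResizeCounter {
    static int countResizes(int initialCapacity, int targetSize) {
        long capacity = initialCapacity;   // long avoids int overflow for huge targets
        int resizes = 0;
        while (capacity < targetSize) {
            // Grow by about 1.5x, and by at least one slot so tiny capacities still grow.
            capacity = Math.max(capacity + 1, capacity + (capacity >> 1));
            resizes++;
        }
        return resizes;
    }

    public static void main(String[] args) {
        int target = 1_000_000;
        System.out.println("start at 10:        " + countResizes(10, target) + " reallocations");
        System.out.println("start at 1,000,000: " + countResizes(target, target) + " reallocations");
    }
}
```

Because the capacity grows geometrically, the count rises only logarithmically with the target size, but each reallocation still copies every reference stored so far.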

On the other hand, if you specify a capacity of 1,000,000 and add only 3 elements to ArrayList, that also is kinda bad.

Rule of thumb: if you know the capacity, specify it. If you aren't sure but know the upper bound, specify that. If you just aren't sure, use the defaults.
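
As a small sketch of that rule of thumb, the standard `ArrayList` API offers a capacity constructor, `ensureCapacity`, and `trimToSize`. The class name `PresizeExample`, the `squares` method, and the sizes used are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class PresizeExample {
    // Known size up front: ask for that capacity once.
    static List<Integer> squares(int n) {
        List<Integer> result = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            result.add(i * i);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> known = squares(1_000);

        // Size learned later: grow the backing array once before a bulk add.
        ArrayList<Integer> later = new ArrayList<>();   // default capacity
        later.ensureCapacity(known.size());
        later.addAll(known);

        // Over-allocated: release the unused slots once the list stops growing.
        ArrayList<String> oversized = new ArrayList<>(1_000_000);
        oversized.add("a");
        oversized.add("b");
        oversized.add("c");
        oversized.trimToSize();

        System.out.println(known.size() + " " + later.size() + " " + oversized.size());
    }
}
```

Note that capacity is not observable through the public API; `size()` only reports how many elements are stored.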

很酷又爱笑 2024-11-01 18:48:34

Capacity is as you described it -- the contiguous memory allocated to an ArrayList for storage of values. ArrayList stores all values in an array, and automatically resizes the array for you. This incurs memory management overhead when resizing.

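If I remember correctly, Java replaces an ArrayList's backing array with a larger one when you try to add one more element than the capacity can take. The exact growth step is an implementation detail: current OpenJDK versions grow the array by roughly half its size (N to about 1.5N), so the N to 2N + 2 step used in the illustration below is only a convenient stand-in. You cannot insert at a position beyond the end of the list: add(index, element) throws an IndexOutOfBoundsException when index is greater than size(), regardless of capacity.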

Here is an example to help you think about how it works. Picture each space between the |s as a cell in the backing array:

| | |

size = 0 (contains no elements), capacity = 2 (can contain 2 elements).

|1| |

size = 1 (contains 1 element), capacity = 2 (can contain 2 elements).

|1|2|

size = 2, capacity = 2. Adding another element:

|1|2|3| | | |

size increased by 1, capacity increased to 6 (2 * 2 + 2). This can be expensive with large arrays, as allocating a large contiguous memory region can require a bit of work (as opposed to a LinkedList, which allocates many small pieces of memory) because the JVM needs to search for an appropriate location, and may need to ask the OS for more memory. It is also expensive to copy a large number of values from one place to another, which would be done once such a region was found.
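
The diagram can be reproduced with a small model. This is a sketch only: `DiagramList` is a hypothetical class, and the 2N + 2 growth step is the figure assumed in this answer, not something java.util.ArrayList is guaranteed to use.

```java
// Hypothetical model of the diagram above; the 2N + 2 growth step is the
// answer's assumption, not what java.util.ArrayList is guaranteed to do.
class DiagramList {
    private int[] cells;
    private int size;

    DiagramList(int capacity) {
        cells = new int[capacity];
    }

    void add(int value) {
        if (size == cells.length) {
            // Allocate the larger region, then copy the existing values across.
            int[] bigger = new int[2 * cells.length + 2];
            System.arraycopy(cells, 0, bigger, 0, size);
            cells = bigger;
        }
        cells[size++] = value;
    }

    // Renders used cells as their value and unused cells as a space.
    String render() {
        StringBuilder sb = new StringBuilder("|");
        for (int i = 0; i < cells.length; i++) {
            sb.append(i < size ? String.valueOf(cells[i]) : " ").append('|');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        DiagramList list = new DiagramList(2);
        System.out.println(list.render());   // | | |
        list.add(1);
        list.add(2);
        System.out.println(list.render());   // |1|2|
        list.add(3);
        System.out.println(list.render());   // |1|2|3| | | |
    }
}
```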

My rule of thumb is this: If you know the capacity you will require, use an ArrayList because there will only be one allocation and access is very fast. If you do not know your required capacity, use a LinkedList because adding a new value always takes the same amount of work, and there is no copying involved.

囚你心 2024-11-01 18:48:34

1) What are some good ways to define what capacity represents from a memory standpoint?

...the (contiguous?) memory allocated to the ArrayList?

Yes, an ArrayList is backed by an array, so the capacity represents the size of that internal array.

...the ArrayList's memory footprint on the (heap?)?

Yes, the larger the array capacity, the larger the ArrayList's footprint.

2) Then if the above is true, changing capacity requires some manner of memory management overhead?

It does. When the list outgrows its capacity, a larger array is allocated and the contents are copied into it. The previous array is then discarded and becomes eligible for garbage collection.

3) Anyone have an example where #2 was or could be a performance concern? Aside from maybe a large number of large ArrayLists having their capacities continually adjusted?

Yes, if you create the ArrayList with an initial capacity of 1 (for instance) and your list grows way beyond that. If you know the number of elements to store up front, you had better request an initial capacity of that size.

However, I think this should be low on your list of priorities. Array copying may happen very often, but it has been optimized since the early days of Java and should not be a concern. Choosing the right algorithm matters more, I think. Remember: premature optimization is the root of all evil.
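
One way to see why the copying tends to matter little is to total the work it causes. The following is a rough simulation, not a benchmark: `CopyCost` and `totalCopies` are hypothetical names, and the roughly 1.5x growth step is an assumption.

```java
// Hypothetical simulation: appends n elements to an array that grows by
// about 1.5x and totals how many element references get copied on the way.
class CopyCost {
    static long totalCopies(int n, int initialCapacity) {
        long capacity = initialCapacity;
        long copied = 0;
        for (long size = 0; size < n; size++) {
            if (size == capacity) {
                copied += size;   // a resize copies every existing reference
                capacity = Math.max(capacity + 1, capacity + (capacity >> 1));
            }
        }
        return copied;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        long copied = totalCopies(n, 10);
        System.out.printf("%d appends, %d references copied (%.2f per append)%n",
                n, copied, (double) copied / n);
    }
}
```

Under this assumption the total number of copied references stays proportional to the number of appends, so the average cost per append remains a small constant even though individual resizes are expensive.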

See also: When to use LinkedList over ArrayList
