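Speeding up a linked list?
I'm a student and new to Java. I'm looking at the difference in speed between two Java collections, LinkedList and ArrayList. I know that an ArrayList is much faster at looking up a value by index and at storing a value at a given index. My question is:
How can I make a linked list faster, if that is possible at all?
Thanks for any help.
兹马希尔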
Comments (8)
When talking about speed, perhaps you mean complexity. Retrieving an element by index is O(1) for an ArrayList (and for arrays), while for a LinkedList it is O(n). Inserting at a given position in a LinkedList is also O(n), because the position has to be found first. Appending to the end of an ArrayList is amortized O(1), although inserting in the middle means shifting the later elements. None of this can be changed; it follows from how the structures are defined.
O(n) means that in order to insert an object at a given position, or retrieve it, you must traverse, in the worst case, all n items in the list. Hence n operations. For an ArrayList, retrieval by index is a single operation.
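To make the "n operations" point concrete, here is a minimal sketch of a hand-written singly linked list that counts how many pointers it follows. CountingLinkedList and HopCountDemo are made-up names for illustration, and this is not how java.util.LinkedList is implemented internally (that class is doubly linked and starts from the nearer end).

```java
class HopCountDemo {
    /** Hypothetical teaching class: a singly linked list that counts pointer hops. */
    static final class CountingLinkedList<T> {
        private static final class Node<T> {
            final T value;
            Node<T> next;
            Node(T value) { this.value = value; }
        }

        private Node<T> head;
        private Node<T> tail;
        private int size;
        long hops; // how many next-pointers get() has followed so far

        void add(T value) {
            Node<T> node = new Node<>(value);
            if (head == null) {
                head = node;
            } else {
                tail.next = node;
            }
            tail = node;
            size++;
        }

        T get(int index) {
            if (index < 0 || index >= size) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            Node<T> current = head;
            for (int i = 0; i < index; i++) {
                current = current.next; // one hop per position
                hops++;
            }
            return current.value;
        }
    }

    /** Returns the number of hops needed to read element `index` of an n-element list. */
    static long hopsToReach(int n, int index) {
        CountingLinkedList<Integer> list = new CountingLinkedList<>();
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        list.get(index);
        return list.hops;
    }

    public static void main(String[] args) {
        System.out.println(hopsToReach(1000, 0));   // prints 0
        System.out.println(hopsToReach(1000, 999)); // prints 999
    }
}
```

Reading the last element of a 1000-element list costs 999 hops here, whereas an array-backed list computes the slot directly from the index.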
You probably can't. You don't know the location of each element (nor the size, although that can be tracked). To find element 100 in a linked list, you need to start with item 1, follow its link to item 2, and so on until you reach item 100. This makes inserting at a given position in the list a tedious job.
There are many alternatives, depending on your exact goals. You can use B-trees or similar methods to split the large linked list into smaller ones. Or use hash-based collections if you want to find items quickly. Or use simple arrays. But if you want a list that performs like an ArrayList, why not use an ArrayList?
You can split off regions which are linked to the main linked list. This gives you entry points directly inside the list, so you don't have to walk up to them. See the subList method here: http://download.oracle.com/javase/1.4.2/docs/api/java/util/AbstractList.html. This is useful if, say, you have a number of 'sentences' made out of words. You can use a separate linked list to iterate over the sentences, which are sublists of the main linked list.
You can also use a ListIterator when adding, removing, or accessing elements. This helps greatly with the speed of sequential access. See the listIterator method for this, and the ListIterator class: http://download.oracle.com/javase/1.4.2/docs/api/java/util/ListIterator.html.
The speed of a linked list could be improved by using skip lists: http://igoro.com/archive/skip-lists-are-fascinating/
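If you want to try the skip-list idea without writing one, the JDK ships a skip-list-based sorted set, java.util.concurrent.ConcurrentSkipListSet. Note that it is a sorted set rather than a List, so it only fits when you want elements kept in order and have no duplicates. A small sketch (SkipListDemo and build are made-up names):

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListSet;

class SkipListDemo {
    /** Builds a skip-list-backed sorted set from the given values. */
    static ConcurrentSkipListSet<Integer> build(List<Integer> values) {
        return new ConcurrentSkipListSet<>(values);
    }

    public static void main(String[] args) {
        ConcurrentSkipListSet<Integer> set = build(Arrays.asList(50, 10, 40, 20, 30));
        System.out.println(set);              // prints [10, 20, 30, 40, 50]
        System.out.println(set.contains(40)); // prints true; expected O(log n) on average
        System.out.println(set.ceiling(25));  // prints 30, the smallest element >= 25
    }
}
```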
A linked list uses pointers to walk through the items. For example, if you ask for the 5th item, the runtime starts from the first item and follows each pointer until it reaches the 5th item.
There is really not much you can do about it. A linked list may not be a good choice if you need fast access to items. There are some optimizations, such as a circular linked list or a doubly linked list that lets you walk the list in both directions, but whether they help really depends on the business logic and the application requirements.
My advice is to avoid linked lists if they do not match your needs; changing to a different data structure might be the best approach.
As a general rule, data structures are designed to do certain things well. LinkedLists are designed to be faster than ArrayLists at inserting and removing elements, and about the same as ArrayLists at iterating across the list in order. When you change the way a LinkedList works, it is no longer a true LinkedList, so there is not really any way to modify one to be faster at something and still have a LinkedList.
You'll need to examine the way you're using this particular collection and decide whether a LinkedList is really the best data structure for your purposes. If you share with us how you're using it, and why you need it to be faster, then we can advise you on which data structure you ought to consider using.
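As one illustration of what a LinkedList is good at, here is a small sketch that works only at the two ends of the list, where java.util.LinkedList inserts and removes in constant time. EndsDemo and rotate are made-up names, and the example assumes a queue-like usage pattern, which may or may not match yours.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class EndsDemo {
    /** Moves the first element to the back `times` times; each step is O(1) on a LinkedList. */
    static List<String> rotate(List<String> items, int times) {
        LinkedList<String> list = new LinkedList<>(items);
        for (int i = 0; i < times && !list.isEmpty(); i++) {
            list.addLast(list.removeFirst()); // no traversal, no shifting of elements
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(rotate(Arrays.asList("a", "b", "c", "d"), 1)); // prints [b, c, d, a]
    }
}
```

Removing the first element of an ArrayList, by contrast, shifts every remaining element down by one slot.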
Lots of people smarter than you or me have looked at the implementation of the Java collection classes. If there were an optimization to be made, they would have found it and already made it.
Since the collection classes are pretty much as optimized as they can be, our primary task should be to choose the correct one.
When choosing your collection type, don't forget about things like HashSet. If order doesn't matter, and you don't need to put duplicates in the collection, then a HashSet may be appropriate.
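A quick sketch of the HashSet case, assuming you only care whether a value is present and not about order or duplicates (HashSetDemo and countDistinct are made-up names):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class HashSetDemo {
    /** Counts how many distinct values `items` contains. */
    static int countDistinct(List<String> items) {
        Set<String> seen = new HashSet<>(items); // duplicates collapse automatically
        return seen.size();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("list", "set", "list", "map", "set");
        System.out.println(countDistinct(words));                // prints 3
        System.out.println(new HashSet<>(words).contains("map")); // prints true
    }
}
```

A membership test on a HashSet is a hash lookup, whereas contains on a LinkedList or ArrayList scans the elements one by one.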
The standard Java collection types (indeed, all data structures implemented in any language!) represent compromises between various "measures".
So for instance:
The various performance measures are typically determined by the maths of the various data structures. For example, if you have a chain of nodes, the only way to get the i-th node is to step through them from the beginning. This involves following i pointers.
Sometimes you can modify the data structures to improve one aspect of the performance. But this typically comes at the cost of some other aspect of the performance. (For example, you could add a separate index to make indexing of a linked list faster. But the cost of maintaining the index on insertion / deletion would mean that you'd probably be better off using an ArrayList.)
In some cases the integration / reuse requirements have a significant impact on performance.
For example, it is theoretically possible to optimize a linked list's space usage by adding a next field to the list element type, combining the element and node objects and saving 16 or so bytes per list entry. However, this would make the list type less general (the member/element class would need to implement a specific interface), and it has the restriction that an element can belong to at most one list at any time. These restrictions are so limiting that this approach is rarely used in Java.
For a second example, consider the problem of inserting at a given position in a linked list. For the LinkedList class, this is normally an O(N) operation, because you have to step through the list to find the position. In theory, if an application could find and remember a position, it should be able to perform the insertion at that position in O(1). Unfortunately, the List API provides no way to "remember" a position.
While neither of these examples is a fundamental roadblock to a developer "doing his own thing", they illustrate that using general data structure APIs and general implementations of those APIs has performance implications, and therefore represents a trade-off between performance and ease of use.
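The closest thing to a remembered position that I know of in the standard library is a ListIterator: while you hold one, its add method inserts at the cursor without walking from the head again. The cursor becomes invalid if the list is structurally modified through anything other than that iterator, so this is only a partial answer. A minimal sketch (CursorInsertDemo and insertAfterEach are made-up names):

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

class CursorInsertDemo {
    /** Inserts `extra` right after every element equal to `marker`, in a single pass. */
    static List<String> insertAfterEach(List<String> items, String marker, String extra) {
        LinkedList<String> list = new LinkedList<>(items);
        ListIterator<String> cursor = list.listIterator();
        while (cursor.hasNext()) {
            if (cursor.next().equals(marker)) {
                cursor.add(extra); // inserted at the cursor; the new element is not revisited
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<String> result = insertAfterEach(Arrays.asList("a", "b", "a", "c"), "a", "X");
        System.out.println(result); // prints [a, X, b, a, X, c]
    }
}
```

Doing the same with list.add(index, extra) inside an indexed loop would walk the list again for every insertion.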