Java:当不需要并发时使用ConcurrentSkipList*的开销是多少?
我需要一个以迭代为主的场景中的排序列表(与插入/删除相比,根本不是随机获取)。出于这个原因,我考虑使用跳跃列表与树相比(迭代器应该更快)。
问题是java6只有一个跳跃列表的并发实现,所以我猜测在非并发场景中使用它是否有意义,或者开销是否使其成为一个错误的决定。
据我所知,ConcurrentSkipList* 基本上是基于 CAS 的无锁实现,因此它们不应该带来(太多)开销,但我想听听其他人的意见。
编辑:
经过一些微基准测试(在不同大小的 TreeSet、LinkedList、ConcurrentSkipList 和 ArrayList 上运行多次迭代)表明存在相当大的开销。 ConcurrentSkipList 确实将元素存储在内部的链接列表中,因此它在迭代上比 LinkedList 慢的唯一原因是由于上述开销。
I need a sorted list in a scenario dominated by iteration (compared to insert/remove, not random get at all). For this reason I thought about using a skip list compared to a tree (the iterator should be faster).
The problem is that java6 only has a concurrent implementation of a skip list, so I was guessing whether it makes sense to use it in a non-concurrent scenario or if the overhead makes it a wrong decision.
For what I know ConcurrentSkipList* are basically lock-free implementations based on CAS, so they should not carry (much) overhead, but I wanted to hear somebody else's opinion.
EDIT:
After some micro-benchmarking (running iteration multiple times on different-sized TreeSet, LinkedList, ConcurrentSkipList and ArrayList) shows that there's quite an overhead. ConcurrentSkipList does store the elements in a linked list inside, so the only reason why it would be slower on iteration than a LinkedList would be due to the aforementioned overhead.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
如果不需要线程安全,我会说完全跳过
package java.util.concurrent
。我的意思是,您看过 ConcurrentSkipListMap 的源代码吗? :-) 当我看到它时我总是微笑。这是 3000 行的一些我在 Java 中见过的最疯狂、最可怕但同时又最美丽的代码。 (感谢 Doug Lea 和他的同事将所有并发实用程序与集合框架完美集成!)话虽如此,在现代 CPU 上,代码和算法复杂性甚至不再那么重要。通常更重要的是将要迭代的数据放在内存中,以便 CPU 缓存可以更好地完成其工作。
听起来不错。如果您确实需要在迭代中压缩每一滴性能,您也可以尝试直接迭代原始数组。每次更改时重新填充它,例如通过调用 TreeSet.toArray() 或生成它,然后使用 Arrays.sort(T[], Comparator)< /代码>。但收益可能很小(如果 JIT 做得很好,收益甚至为零),因此可能不值得为此带来不便。
If thread-safety's not required I'd say to skip
package java.util.concurrent
altogether.I mean, have you seen the source code for ConcurrentSkipListMap? :-) I always have to smile when I look at it. It's 3000 lines of some of the most insane, scary, and at the same time beautiful code I've ever seen in Java. (Kudos to Doug Lea and co. for getting all the concurrency utils integrated so nicely with the collections framework!) Having said that, on modern CPUs the code and algorithmic complexity won't even matter so much. What usually makes more difference is having the data to iterate co-located in memory, so that the CPU cache can do its job better.
Sounds good. If you really need to squeeze every drop of performance out of iteration you could also try iterating a raw array directly. Repopulate it upon each change, e.g. by calling
TreeSet.toArray()
or generating it then sorting it in-place usingArrays.sort(T[], Comparator<? super T>)
. But the gain could be tiny (or even nothing if the JIT does its job well) so it might not be worth the inconvenience.根据在我公司使用的典型生产硬件上使用 Open JDK 6 进行的测量,您可以预期跳表映射上的所有添加和查询操作所花费的时间大约是树形映射上相同操作的两倍。
示例:
63 usec 与 31 usec 来创建和添加 200 个条目。
在 200 个元素的映射上,get() 需要 145 纳秒,而 get() 需要 77 纳秒。
对于较小和较大的尺寸,该比率不会发生太大变化。
(此基准测试的代码最终将被共享,以便您可以查看并自己运行它;抱歉,我们还没有准备好这样做。)
As measured using Open JDK 6 on typical production hardware my company uses, you can expect all add and query operations on a skip-list map to take roughly double the time as the same operation on a tree map.
examples:
63 usec vs 31 usec to create and add 200 entries.
145 ns vs 77 ns for get() on that 200-element map.
And the ratio doesn't change all that much for smaller and larger sizes.
(The code for this benchmark will eventually be shared so you can review it and run it yourself; sorry we're not ready to do that yet.)
好吧,您可以使用很多其他结构来执行跳跃列表,它存在于 Concurrent 包中,因为并发数据结构要复杂得多,并且使用并发跳跃列表的成本比使用其他并发数据结构来模拟跳跃列表要低。
在单线程世界中是不同的:您可以使用排序集、二叉树或自定义数据结构,它们的性能比并发跳跃列表更好。
迭代树列表或跳跃列表的复杂度始终为 O(n),但如果您使用链表或数组列表,则会遇到插入问题:在正确的位置插入一个项目(排序的链表)对于二叉树或跳表,插入的复杂度将为 O(n),而不是 O(log n)。
你可以在TreeMap.keySet()中迭代来按顺序获取所有插入的键,并且不会那么慢。
还有 TreeSet 类,这可能就是您所需要的,但由于它只是 TreeMap 的包装,因此直接使用 TreeMap 会更快。
Well you can use a lot of other structures to do the skip list, it exists in Concurrent package because concurrent data structures are a lot more complicated and because using a concurrent skip list would cost less than using other concurrent data structures to mimic a skip list.
In a single thread world is different: you can use a sorted set, a binary tree or your custom data structure that would perform better than concurrent skip list.
The complexity in iterating a tree list or a skip list will be always O(n), but if you instead use a linked list or an array list, you have the problem with insertion: to insert an item in the right position (sorted linked list) the complexity of insertion will be O(n) instead of O(log n) for a binary tree or for a skip list.
You can iterate in TreeMap.keySet() to obtain all inserted keys in order and it will not be so slow.
There is also the TreeSet class, that probably is what you need, but since it is just a wrapper to TreeMap, the direct use of TreeMap would be faster.
如果没有并发性,使用平衡二叉搜索树通常会更有效。在 Java 中,这将是一个
TreeMap
。
跳跃列表通常保留用于并发编程,因为它们易于实现,并且在多线程应用程序中速度很快。
Without concurrency, it is usually more efficient to use a balanced binary search tree. In Java, this would be a
TreeMap
.Skip lists are generally reserved for concurrent programming because of their ease in implementation the speed in multithreaded applications.
您似乎对这里的权衡有很好的把握,所以我怀疑有人能给您一个明确的、原则性的答案。幸运的是,这非常容易测试。
我首先创建一个简单的
Iterator
,它在随机生成的字符串的有限列表上无限循环。 (也就是说:在初始化时,它从 c< 池中生成一个由 a 个长度为 b 的随机字符串组成的数组_strings
/em> 不同的字符。第一次调用next()
返回_strings[0]
,下一次调用返回_strings[1]
...第 (n+1) 次调用返回再次_strings[0]
。)此迭代器返回的字符串是我在对SortedSet.add(...)
和的所有调用中使用的字符串。 SortedSetremove(...)
。然后,我编写了一个测试方法,该方法接受空的
SortedSet
并循环 d 次。在每次迭代中,它都会添加 e 元素,然后删除 f 元素,然后迭代整个集合。 (作为健全性检查,它通过使用add()
和remove()
的返回值来跟踪集合的大小,并且当迭代整个集合时,它确保它找到预期数量的元素。大多数情况下,我这样做只是为了在循环体内会有一些东西。)我认为我不需要解释我的
main(...)
方法执行此操作。 :-)我尝试了各种参数的不同值,发现有时
ConcurrentSkipListSet
表现更好,有时TreeSet
表现更好,但区别在于永远不会超过两倍。 时,ConcurrentSkipListSet
表现更好ConcurrentSkipListSet
的处理能力比TreeSet
更好具有更大的集合,和/或更昂贵的字符串比较。尝试专门设计的特定值来区分这两个因素,我的印象是ConcurrentSkipListSet
处理较大集合的情况明显优于TreeSet
,并且稍稍不太与更昂贵的字符串比较。 (这基本上就是您所期望的;TreeSet
的二叉树方法旨在进行绝对最小可能的比较次数。)add(...)
和remove(...)
次数很少。 (这正是您的预测。)确切的转换点取决于 a、b 和 c,但为第一个近似值,当 e + f 小于 10 左右时,ConcurrentSkipListSet
表现更好,TreeSet
当e + f 超过 20 左右。当然,这是在一台可能与您的计算机完全不同的机器上,使用的 JDK 与您的计算机完全不同,并使用非常人造的数据,可能与您的数据完全不同。我建议您运行自己的测试。由于
Tree*
和ConcurrentSkipList*
都实现了Sorted*
,因此您应该可以毫无困难地尝试这两种方式的代码并查看发现的内容。我的理解是这将取决于机器。在某些系统上,无锁实现可能是不可能的,在这种情况下,这些类将不得不使用锁。 (但由于您实际上并不是多线程,即使是锁也可能不会那么昂贵。当然,同步有开销,但其主要成本是锁争用和强制单线程。这对您来说不是问题。)
You seem to have a good grasp of the trade-off here, so I doubt anyone can give you a definitive, principled answer. Fortunately, this is pretty straightforward to test.
I started by creating a simple
Iterator<String>
that loops indefinitely over a finite list of randomly generated strings. (That is: on initialization, it generates an array_strings
of a random strings of length b out of a pool of c distinct characters. The first call tonext()
returns_strings[0]
, the next call returns_strings[1]
… the (n+1)th call returns_strings[0]
again.) The strings returned by this iterator were what I used in all calls toSortedSet<String>.add(...)
andSortedSet<String>remove(...)
.I then wrote a test method that accepts an empty
SortedSet<String>
and loops d times. On each iteration, it adds e elements, then removes f elements, then iterates over the entire set. (As a sanity-check, it keeps track of the set's size by using the return values ofadd()
andremove()
, and when iterates over the entire set, it makes sure it finds the expected number of elements. Mostly I did that just so there would be something in the body of the loop.)I don't think I need to explain what my
main(...)
method does. :-)I tried various values for the various parameters, and I found that sometimes
ConcurrentSkipListSet<String>
performed better, and sometimesTreeSet<String>
did, but the difference was never much more than twofold. In general,ConcurrentSkipListSet<String>
performed better when:ConcurrentSkipListSet<String>
copes somewhat better thanTreeSet<String>
with larger sets, and/or with more-expensive string-comparisons. Trying specific values designed to tease apart these two factors, I got the impression thatConcurrentSkipListSet<String>
coped noticeably better thanTreeSet<String>
with larger sets, and slightly less well with more-expensive string-comparisons. (That's basically what you'd expect;TreeSet<String>
's binary-tree approach aims to do the absolute minimum possible number of comparisons.)add(...)
s andremove(...)
s only a small number of times per iteration. (This is exactly what you predicted.) The exact turn-over point depended on a, b, and c, but to a first approximation,ConcurrentSkipListSet<String>
performed better when e + f was less than around 10, andTreeSet<String>
performed better when e + f was more than around 20.Of course, this was on a machine that may look nothing like yours, using a JDK that may look nothing like yours, and using very artificial data that might look nothing like yours. I'd recommend that you run your own tests. Since
Tree*
andConcurrentSkipList*
both implementSorted*
, you should have no difficulty trying your code both ways and seeing what you find.My understanding is that this will depend on the machine. On some systems a lock-free implementation may not be possible, in which case these classes will have to use locks. (But since you're not actually multi-threading, even locks may not be all that expensive. Synchronization has overhead, of course, but its main cost is lock contention and forced single-threading. That isn't an issue for you. Again, I think you'll just have to test and see how the two versions perform.)
如前所述,与 TreeMap 相比,SkipList 的开销很大,并且 TreeMap 迭代器不太适合您的用例,因为它只是重复调用 successor() 方法,结果非常慢。
因此,一种比前两种方法要快得多的替代方法是编写自己的 TreeMap 迭代器。实际上,我会完全放弃 TreeMap,因为 3000 行代码比您可能需要的代码有点庞大,只需使用您需要的方法编写一个干净的 AVL 树实现即可。基本的 AVL 逻辑只是任何语言的几百行代码 然后添加最适合您的情况的迭代器。
As noted SkipList has a lot of overhead compared to TreeMap and the TreeMap iterator isn't well suited to your use case because it just repeatedly calls the method successor() which turns out to be very slow.
So one alternative that will be significantly faster than the previous two is to write your own TreeMap iterator. Actually, I would dump TreeMap altogether since 3000 lines of code is a bit bulkier than you probably need and just write a clean AVL tree implementation with the methods you need. The basic AVL logic is just a few hundred lines of code in any language then add the iterator that works best in your case.