Java：当不需要并发时使用ConcurrentSkipList*的开销是多少？

发布于 2024-12-13 07:21:26 字数 403 浏览 2 评论 0原文

我需要一个以迭代为主的场景中的排序列表（与插入/删除相比，根本不是随机获取）。出于这个原因，我考虑使用跳跃列表与树相比（迭代器应该更快）。

问题是java6只有一个跳跃列表的并发实现，所以我猜测在非并发场景中使用它是否有意义，或者开销是否使其成为一个错误的决定。

据我所知，ConcurrentSkipList* 基本上是基于 CAS 的无锁实现，因此它们不应该带来（太多）开销，但我想听听其他人的意见。

编辑：

经过一些微基准测试（在不同大小的 TreeSet、LinkedList、ConcurrentSkipList 和 ArrayList 上运行多次迭代）表明存在相当大的开销。 ConcurrentSkipList 确实将元素存储在内部的链接列表中，因此它在迭代上比 LinkedList 慢的唯一原因是由于上述开销。

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

回眸一遍 2024-12-20 07:21:26

如果不需要线程安全，我会说完全跳过package java.util.concurrent。

有趣的是，有时 ConcurrentSkipList 在相同输入上比 TreeSet 慢，我还没有弄清楚原因。

我的意思是，您看过 ConcurrentSkipListMap 的源代码吗？ :-) 当我看到它时我总是微笑。这是 3000 行的一些我在 Java 中见过的最疯狂、最可怕但同时又最美丽的代码。（感谢 Doug Lea 和他的同事将所有并发实用程序与集合框架完美集成！）话虽如此，在现代 CPU 上，代码和算法复杂性甚至不再那么重要。通常更重要的是将要迭代的数据放在内存中，以便 CPU 缓存可以更好地完成其工作。

所以最后我将用一个新的 addSorted() 方法包装 ArrayList，该方法将排序插入到 ArrayList 中。

听起来不错。如果您确实需要在迭代中压缩每一滴性能，您也可以尝试直接迭代原始数组。每次更改时重新填充它，例如通过调用 TreeSet.toArray() 或生成它，然后使用 Arrays.sort(T[], Comparator)< /代码>。但收益可能很小（如果 JIT 做得很好，收益甚至为零），因此可能不值得为此带来不便。

回复收藏 0 原文

不交电费瞎发啥光 2024-12-20 07:21:26

根据在我公司使用的典型生产硬件上使用 Open JDK 6 进行的测量，您可以预期跳表映射上的所有添加和查询操作所花费的时间大约是树形映射上相同操作的两倍。

示例：

63 usec 与 31 usec 来创建和添加 200 个条目。
在 200 个元素的映射上，get() 需要 145 纳秒，而 get() 需要 77 纳秒。

对于较小和较大的尺寸，该比率不会发生太大变化。

（此基准测试的代码最终将被共享，以便您可以查看并自己运行它；抱歉，我们还没有准备好这样做。）

回复收藏 0 原文

蓝礼 2024-12-20 07:21:26

好吧，您可以使用很多其他结构来执行跳跃列表，它存在于 Concurrent 包中，因为并发数据结构要复杂得多，并且使用并发跳跃列表的成本比使用其他并发数据结构来模拟跳跃列表要低。

在单线程世界中是不同的：您可以使用排序集、二叉树或自定义数据结构，它们的性能比并发跳跃列表更好。

迭代树列表或跳跃列表的复杂度始终为 O(n)，但如果您使用链表或数组列表，则会遇到插入问题：在正确的位置插入一个项目（排序的链表）对于二叉树或跳表，插入的复杂度将为 O(n)，而不是 O(log n)。

你可以在TreeMap.keySet()中迭代来按顺序获取所有插入的键，并且不会那么慢。

还有 TreeSet 类，这可能就是您所需要的，但由于它只是 TreeMap 的包装，因此直接使用 TreeMap 会更快。

回复收藏 0 原文

机场等船 2024-12-20 07:21:26

如果没有并发性，使用平衡二叉搜索树通常会更有效。在 Java 中，这将是一个 TreeMap。

跳跃列表通常保留用于并发编程，因为它们易于实现，并且在多线程应用程序中速度很快。

回复收藏 0 原文

枯叶蝶 2024-12-20 07:21:26

您似乎对这里的权衡有很好的把握，所以我怀疑有人能给您一个明确的、原则性的答案。幸运的是，这非常容易测试。

我首先创建一个简单的 Iterator，它在随机生成的字符串的有限列表上无限循环。（也就是说：在初始化时，它从 c< 池中生成一个由 a 个长度为 b 的随机字符串组成的数组 _strings /em> 不同的字符。第一次调用 next() 返回 _strings[0]，下一次调用返回 _strings[1] ...第 (n+1) 次调用返回再次 _strings[0]。）此迭代器返回的字符串是我在对 SortedSet.add(...) 和 的所有调用中使用的字符串。 SortedSetremove(...)。

然后，我编写了一个测试方法，该方法接受空的 SortedSet 并循环 d 次。在每次迭代中，它都会添加 e 元素，然后删除 f 元素，然后迭代整个集合。（作为健全性检查，它通过使用 add() 和 remove() 的返回值来跟踪集合的大小，并且当迭代整个集合时，它确保它找到预期数量的元素。大多数情况下，我这样做只是为了在循环体内会有一些东西。）

我认为我不需要解释我的 main(...) 方法执行此操作。 :-)

我尝试了各种参数的不同值，发现有时 ConcurrentSkipListSet 表现更好，有时 TreeSet 表现更好，但区别在于永远不会超过两倍。时，ConcurrentSkipListSet 表现更好

一般来说，当a、b 和/或 c 相对较大。（我的意思是，在我测试的范围内。我的 a 的范围是 1000 到 10000，我的 b 的范围是 3 到 10，我的 c em> 从 10 到 80。总的来说，结果集大小的范围从大约 450 到恰好 10000，模式为 666 和 6666，因为我通常使用e=2‎f。）这表明 ConcurrentSkipListSet 的处理能力比 TreeSet 更好具有更大的集合，和/或更昂贵的字符串比较。尝试专门设计的特定值来区分这两个因素，我的印象是 ConcurrentSkipListSet 处理较大集合的情况明显优于 TreeSet，并且稍稍不太与更昂贵的字符串比较。（这基本上就是您所期望的；TreeSet 的二叉树方法旨在进行绝对最小可能的比较次数。）
e 和 f 很小；也就是说，当我每次迭代时仅调用 add(...) 和 remove(...) 次数很少。（这正是您的预测。）确切的转换点取决于 a、b 和 c，但为第一个近似值，当 e + f 小于 10 左右时，ConcurrentSkipListSet 表现更好，TreeSet当e + f 超过 20 左右。

当然，这是在一台可能与您的计算机完全不同的机器上，使用的 JDK 与您的计算机完全不同，并使用非常人造的数据，可能与您的数据完全不同。我建议您运行自己的测试。由于 Tree* 和 ConcurrentSkipList* 都实现了 Sorted*，因此您应该可以毫无困难地尝试这两种方式的代码并查看发现的内容。

据我所知，ConcurrentSkipList* 基本上是基于 CAS 的无锁实现，因此它们不应该承担（太多）开销，[...]

我的理解是这将取决于机器。在某些系统上，无锁实现可能是不可能的，在这种情况下，这些类将不得不使用锁。（但由于您实际上并不是多线程，即使是锁也可能不会那么昂贵。当然，同步有开销，但其主要成本是锁争用和强制单线程。这对您来说不是问题。）

You seem to have a good grasp of the trade-off here, so I doubt anyone can give you a definitive, principled answer. Fortunately, this is pretty straightforward to test.

I started by creating a simple Iterator<String> that loops indefinitely over a finite list of randomly generated strings. (That is: on initialization, it generates an array _strings of a random strings of length b out of a pool of c distinct characters. The first call to next() returns _strings[0], the next call returns _strings[1] … the (n+1)th call returns _strings[0] again.) The strings returned by this iterator were what I used in all calls to SortedSet<String>.add(...) and SortedSet<String>remove(...).

I then wrote a test method that accepts an empty SortedSet<String> and loops d times. On each iteration, it adds e elements, then removes f elements, then iterates over the entire set. (As a sanity-check, it keeps track of the set's size by using the return values of add() and remove(), and when iterates over the entire set, it makes sure it finds the expected number of elements. Mostly I did that just so there would be something in the body of the loop.)

I don't think I need to explain what my main(...) method does. :-)

I tried various values for the various parameters, and I found that sometimes ConcurrentSkipListSet<String> performed better, and sometimes TreeSet<String> did, but the difference was never much more than twofold. In general, ConcurrentSkipListSet<String> performed better when:

a, b, and/or c were relatively large. (I mean, within the ranges I tested. My a's ranged from 1000 to 10000, my b's from 3 to 10, my c's from 10 to 80. Overall, the resulting set-sizes ranged from around 450 to exactly 10000, with modes of 666 and 6666 because I usually used e=2‎f.) This suggests that ConcurrentSkipListSet<String> copes somewhat better than TreeSet<String> with larger sets, and/or with more-expensive string-comparisons. Trying specific values designed to tease apart these two factors, I got the impression that ConcurrentSkipListSet<String> coped noticeably better than TreeSet<String> with larger sets, and slightly less well with more-expensive string-comparisons. (That's basically what you'd expect; TreeSet<String>'s binary-tree approach aims to do the absolute minimum possible number of comparisons.)
e and f were small; that is, when I called add(...)s and remove(...)s only a small number of times per iteration. (This is exactly what you predicted.) The exact turn-over point depended on a, b, and c, but to a first approximation, ConcurrentSkipListSet<String> performed better when e + f was less than around 10, and TreeSet<String> performed better when e + f was more than around 20.

Of course, this was on a machine that may look nothing like yours, using a JDK that may look nothing like yours, and using very artificial data that might look nothing like yours. I'd recommend that you run your own tests. Since Tree* and ConcurrentSkipList* both implement Sorted*, you should have no difficulty trying your code both ways and seeing what you find.

For what I know ConcurrentSkipList* are basically lock-free implementations based on CAS, so they should not carry (much) overhead, […]

My understanding is that this will depend on the machine. On some systems a lock-free implementation may not be possible, in which case these classes will have to use locks. (But since you're not actually multi-threading, even locks may not be all that expensive. Synchronization has overhead, of course, but its main cost is lock contention and forced single-threading. That isn't an issue for you. Again, I think you'll just have to test and see how the two versions perform.)

回复收藏 0 原文

无敌元气妹 2024-12-20 07:21:26

如前所述，与 TreeMap 相比，SkipList 的开销很大，并且 TreeMap 迭代器不太适合您的用例，因为它只是重复调用 successor() 方法，结果非常慢。
因此，一种比前两种方法要快得多的替代方法是编写自己的 TreeMap 迭代器。实际上，我会完全放弃 TreeMap，因为 3000 行代码比您可能需要的代码有点庞大，只需使用您需要的方法编写一个干净的 AVL 树实现即可。基本的 AVL 逻辑只是任何语言的几百行代码然后添加最适合您的情况的迭代器。

回复收藏 0 原文

~没有更多了~