Efficient insertion into a collection of long values

Posted 2024-11-18 15:18:55

I'm collecting metrics for a piece of code and want to store a collection of time differences (primitive long values) for later analysis.

Inserting into this collection should be as cheap as possible, so that the measurement adds the least possible overhead to the results.

I first tested a ConcurrentLinkedQueue<Long>. This gave the worst performance (probably due to boxing/unboxing).

I've currently settled on a synchronized gnu.trove.TLongArrayList, which is almost 7 times faster for a data set of 5 million longs.

Any recommendations for other collection libraries that may be good candidates to benchmark for this use case would be gratefully received. I took a look at the Guava API but couldn't find anything suitable.


Comments (5)

z祗昰~ 2024-11-25 15:18:55

One way to improve performance is to cut the size of the data type. If you can reduce it to an int, that would help. (The difference between two calls to nanoTime() is often less than 2 billion nanoseconds, i.e. about 2 seconds, so it fits in an int.)

You can also give the collection a good initial capacity, especially if you know roughly how many values you are likely to have.

If you know the maximum number of values you will record, you can use an int[] together with a counter, in case the maximum is not reached. This will be faster than using an object.
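To make the int[] plus counter idea concrete, here is a minimal sketch. The class name IntSampleBuffer and its methods are made up for illustration and do not come from any library. It assumes a single recording thread and that every elapsed time fits in an int; Math.toIntExact throws rather than silently truncating if one does not.

```java
import java.util.Arrays;

/** Fixed-capacity recorder for elapsed times that fit in an int (hypothetical helper). */
final class IntSampleBuffer {
    private final int[] samples;
    private int count; // number of slots filled so far

    IntSampleBuffer(int maxSamples) {
        samples = new int[maxSamples];
    }

    /** Stores one elapsed time in nanoseconds; returns false once the buffer is full. */
    boolean record(long elapsedNanos) {
        if (count == samples.length) {
            return false;
        }
        // toIntExact throws ArithmeticException if the value does not fit in an int
        samples[count++] = Math.toIntExact(elapsedNanos);
        return true;
    }

    int size() {
        return count;
    }

    /** Copies out only the slots that were actually filled. */
    int[] snapshot() {
        return Arrays.copyOf(samples, count);
    }

    public static void main(String[] args) {
        IntSampleBuffer buffer = new IntSampleBuffer(1_000_000);
        for (int i = 0; i < 1_000; i++) {
            long start = System.nanoTime();
            Math.sqrt(i); // stand-in for the code being measured
            buffer.record(System.nanoTime() - start);
        }
        System.out.println("recorded " + buffer.size() + " samples");
    }
}
```

This is not thread-safe as written. With several recording threads it would need either synchronization or one buffer per thread.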

溺深海 2024-11-25 15:18:55

There's a new version of Trove in the pipeline (the latest is 3.0.0-RC2). This page says that Trove 3 is 10% to 20% faster than Trove 2.

Unfortunately:

  • Trove 3 makes breaking API changes.
  • The online javadocs are not available yet.
  • You can't get it from Maven Central yet. (You can't even get Trove 2.1.0 ... tsk, tsk.)

_蜘蛛 2024-11-25 15:18:55

You should try fastutil. Depending on the scenario, fastutil may be faster than trove4j.

鸵鸟症 2024-11-25 15:18:55

I'm not sure if your situation allows this, but have you considered saving your data in a separate, unsynchronized data structure for each thread? Something like a ThreadLocal containing a TLongArrayList. This would remove the synchronization overhead.
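A rough sketch of the per-thread idea, to show how the pieces might fit together. To keep it dependency-free, a tiny hand-rolled LongBuffer stands in for TLongArrayList; that substitution, the class name PerThreadRecorder, and the shared queue used to find all buffers afterwards are assumptions of this sketch and not part of the original suggestion.

```java
import java.util.Arrays;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

final class PerThreadRecorder {
    /** Minimal growable long buffer (stand-in for TLongArrayList); deliberately unsynchronized. */
    static final class LongBuffer {
        private long[] data = new long[1024];
        private int size;

        void add(long value) {
            if (size == data.length) {
                data = Arrays.copyOf(data, size * 2); // double the capacity when full
            }
            data[size++] = value;
        }

        int size() {
            return size;
        }

        long get(int index) {
            return data[index];
        }
    }

    // Every buffer ever created, so the results can be merged after the run.
    private final Queue<LongBuffer> allBuffers = new ConcurrentLinkedQueue<>();

    // Each thread lazily gets its own buffer; the queue is touched once per thread.
    private final ThreadLocal<LongBuffer> local = ThreadLocal.withInitial(() -> {
        LongBuffer buffer = new LongBuffer();
        allBuffers.add(buffer);
        return buffer;
    });

    /** Hot path: no locking, only a ThreadLocal lookup and an array write. */
    void record(long value) {
        local.get().add(value);
    }

    /** Call only after all recording threads have finished, e.g. after Thread.join. */
    long totalCount() {
        long total = 0;
        for (LongBuffer buffer : allBuffers) {
            total += buffer.size();
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        PerThreadRecorder recorder = new PerThreadRecorder();
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 100_000; i++) {
                    long start = System.nanoTime();
                    recorder.record(System.nanoTime() - start);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println("total samples: " + recorder.totalCount());
    }
}
```

The buffers are only safe to read once the writer threads are done; joining the threads before calling totalCount() gives the necessary visibility guarantee.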

何必那么矫情 2024-11-25 15:18:55

If you know the size of the collection ahead of time, you could use a single unsynchronized long[] array combined with an AtomicInteger counter to get the next insert position.
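Something along these lines, as a minimal sketch. AtomicSlotRecorder and its methods are hypothetical names, and the capacity check before the increment is an addition of this sketch so that the counter cannot keep growing once the array is full.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

final class AtomicSlotRecorder {
    private final long[] samples;
    private final AtomicInteger next = new AtomicInteger();

    AtomicSlotRecorder(int capacity) {
        samples = new long[capacity];
    }

    /** Claims a unique slot and writes into it; returns false once the array is full. */
    boolean record(long value) {
        if (next.get() >= samples.length) {
            return false; // already full, leave the counter alone
        }
        int slot = next.getAndIncrement(); // each caller gets a distinct index
        if (slot >= samples.length) {
            return false; // another thread took the last slot in the meantime
        }
        samples[slot] = value;
        return true;
    }

    /** Number of filled slots; meaningful once the writers have finished. */
    int size() {
        return Math.min(next.get(), samples.length);
    }

    long[] snapshot() {
        return Arrays.copyOf(samples, size());
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicSlotRecorder recorder = new AtomicSlotRecorder(400_000);
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 100_000; i++) {
                    long start = System.nanoTime();
                    recorder.record(System.nanoTime() - start);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println("recorded " + recorder.size() + " samples");
    }
}
```

The array writes themselves are plain (non-volatile) writes, so the contents should only be read after the recording threads have been joined. All threads also contend on the same counter, which is worth including when benchmarking this against the per-thread approach.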
