Java profiling: java.lang.Object.hashCode takes half of the CPU time but is never explicitly called

Posted on 2024-09-07 03:19:04

I have benchmarked my multithreaded program using -agentlib:hprof=cpu=samples
and was surprised to find the following line in the results:

rank   self  accum   count trace method
   1 52.88% 52.88%    8486 300050 java.lang.Object.hashCode

I never explicitly call hashCode() in my program.
What could be the reason for this? How can I find the source of this "wasted" time and tell whether it is normal?

Thanks,
David

伴我老 2024-09-14 03:19:04

Most likely you are making very heavy use of a Map such as a HashMap.

HashMap uses hashCode() to distribute objects among its buckets. If you store many objects in this data structure, it is very important that your equals() and hashCode() methods are properly implemented.

See: Effective Java Item 8: Always override hashCode when you override equals
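
As a minimal sketch of what "properly implemented" can look like: the class `GridPoint` and its fields are hypothetical and exist only for illustration. The point is that two equal instances return the same hash code, so a HashMap lookup with a distinct but equal key succeeds.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key class, used only for illustration.
final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof GridPoint)) return false;
        GridPoint p = (GridPoint) other;
        return x == p.x && y == p.y;
    }

    // Built from the same fields as equals(), so equal points share a hash code.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

class GridPointDemo {
    // Stores a value under one key and looks it up with a different, equal instance.
    static String lookup() {
        Map<GridPoint, String> labels = new HashMap<>();
        labels.put(new GridPoint(1, 2), "a");
        return labels.get(new GridPoint(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(lookup());
    }
}
```

If `hashCode()` were left at the `Object` default while `equals()` was overridden, the second `GridPoint` would most likely land in a different bucket and the lookup would return null.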

葬シ愛 2024-09-14 03:19:04

One thing you should do is check the matching stack trace to see who is calling it; chances are it is indeed HashMap.

But beyond this, I have noticed that hprof tends to vastly overestimate the time spent in hashCode(), and I would really like to know how and why. This is based on knowing the rough performance profile of the code: I have seen 50% CPU use reported (by sampling) where it is all but certain that the method cannot take that long. The implementation of hashCode() just returns an int field, and the method is final (on a final class).
So it is basically a profiler artifact of some sort... I just have no idea how or why, or how to get rid of it.
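
Independently of the profiler, one rough way to see who calls hashCode() is to record the immediate caller from inside a temporary override. This is only a sketch under assumptions: `TracedKey` is a hypothetical class standing in for one of your own types, it needs Java 8 or later, and capturing a stack trace on every call is slow, so it is meant for a debugging run rather than for production.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical debugging aid: a key class that records who calls its hashCode().
final class TracedKey {
    // Caller ("class.method") mapped to the number of hashCode() calls seen from it.
    static final Map<String, AtomicLong> CALLERS = new ConcurrentHashMap<>();

    @Override
    public int hashCode() {
        StackTraceElement[] stack = new Throwable().getStackTrace();
        // stack[0] is this method; stack[1] is the immediate caller.
        if (stack.length > 1) {
            String caller = stack[1].getClassName() + "." + stack[1].getMethodName();
            CALLERS.computeIfAbsent(caller, k -> new AtomicLong()).incrementAndGet();
        }
        // Keep the default identity-based hash code.
        return super.hashCode();
    }
}

class TracedKeyDemo {
    // Puts 100 fresh keys into a HashMap and returns the recorded callers.
    static Map<String, AtomicLong> run() {
        TracedKey.CALLERS.clear();
        Map<TracedKey, Integer> map = new HashMap<>();
        for (int i = 0; i < 100; i++) {
            map.put(new TracedKey(), i);
        }
        return TracedKey.CALLERS;
    }

    public static void main(String[] args) {
        run().forEach((caller, count) -> System.out.println(caller + " -> " + count.get()));
    }
}
```

If the recorded callers are all inside java.util.HashMap, that supports the guess above; anything else points at another collection or library doing the hashing.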

送舟行 2024-09-14 03:19:04

You are most probably right. I can actually give up the random access capabilities (is that what you call it?), and I don't care about the order of the objects. I just need to be able to add objects and then iterate over all of them. Also, this is indeed a set (I don't need the same object more than once), but I will also never attempt to add an object more than once... Should I use a list instead (although I don't care about the ordering)? What is the most efficient data structure for such a set?

A HashSet is implemented as a HashMap that maps the key to itself, so switching to a HashSet won't make much difference, performance-wise.

The other alternatives are a TreeSet, or (assuming that your application will never try to insert a duplicate) one of the List classes. If your application is such that a List will work, then an ArrayList or LinkedList will be more efficient than either a HashSet or TreeSet.

However, there is something very fishy about your application spending 50% of its time in hashCode methods. Unless the hash tables are resized, the hashCode method should only be called once per set or map operation. So either there is a lot of map/set resizing going on, or you are doing a huge number of set add operations. (AFAIK, the Object hashCode method is cheap, so the cost of each call should not be an issue.)
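
To make the comparison concrete, here is a small sketch that counts hashCode() calls while adding elements to a HashSet and to an ArrayList and then iterating once. The `Item` class and the counter are hypothetical instrumentation, not part of the original program; the exact HashSet count may depend on the JDK version.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.concurrent.atomic.AtomicLong;

class HashCodeCountDemo {
    // Total number of hashCode() calls observed since the last reset.
    static final AtomicLong CALLS = new AtomicLong();

    // Hypothetical element type that counts hashCode() invocations.
    static final class Item {
        @Override
        public int hashCode() {
            CALLS.incrementAndGet();
            return super.hashCode();
        }
    }

    // Adds n fresh items, iterates over them once, and returns the number of hashCode() calls.
    static long countCalls(Collection<Item> target, int n) {
        CALLS.set(0);
        for (int i = 0; i < n; i++) {
            target.add(new Item());
        }
        int seen = 0;
        for (Item item : target) {
            if (item != null) seen++;
        }
        return CALLS.get();
    }

    public static void main(String[] args) {
        System.out.println("HashSet:   " + countCalls(new HashSet<>(), 1000));
        System.out.println("ArrayList: " + countCalls(new ArrayList<>(), 1000));
    }
}
```

The HashSet run should report at least one call per add, while the ArrayList run should report none, which is why a list is the cheaper choice when you only add and iterate.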

EDIT

Is nextInt() really expensive? Any alternatives?

No, it is not expensive. Take a look at the code. The Random class (and the nextInt() method) does make use of an AtomicLong to make it thread-safe, and you might save a few cycles if you coded a non-thread-safe version. The source code is in your JDK installation directory... take a look.
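
If contention on a shared Random does turn out to matter in a multithreaded program, one option that avoids writing your own generator is java.util.concurrent.ThreadLocalRandom (available since Java 7). This is a suggestion beyond the original answer, and the class and method names below are hypothetical; a minimal sketch of the two styles side by side:

```java
import java.util.Random;
import java.util.concurrent.ThreadLocalRandom;

class RandomDemo {
    // Shared instance: thread-safe, but every thread updates the same atomic seed.
    static final Random SHARED = new Random();

    // Draws from the shared generator.
    static int sharedNext(int bound) {
        return SHARED.nextInt(bound);
    }

    // Draws from the calling thread's own generator, so threads do not contend on a seed.
    static int threadLocalNext(int bound) {
        return ThreadLocalRandom.current().nextInt(bound);
    }

    public static void main(String[] args) {
        System.out.println(sharedNext(100));
        System.out.println(threadLocalNext(100));
    }
}
```

Whether this makes a measurable difference depends on how often the threads call nextInt(); it is worth measuring before changing anything.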
