HashMap put() performance problem
I have a HashMap in my application. The map lives in a singleton, and access to update or read it is protected using synchronized methods.
My problem occurs when testing with large numbers (20,000+) of concurrent threads. When threads are writing to the map using put(), I get an OutOfMemory exception.
Read operations are fine (I can simulate 1,000,000+ threads) without any issue.
Any recommendations on how I can make my HashMap more performant for writes? Or could this be a limitation of my approach of storing so much data in memory?
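Since no code was posted, here is a minimal sketch of the kind of singleton being described; the class name, key type, and method names are assumptions:

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical reconstruction of the questioner's setup: a singleton
    // holding a HashMap, with synchronized accessors guarding it.
    public class DataStore {
        private static final DataStore INSTANCE = new DataStore();
        private final Map<String, Object> map = new HashMap<String, Object>();

        private DataStore() {}

        public static DataStore getInstance() {
            return INSTANCE;
        }

        // Every reader and writer contends for the same monitor.
        public synchronized void put(String key, Object value) {
            map.put(key, value);
        }

        public synchronized Object get(String key) {
            return map.get(key);
        }
    }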
8 Answers
I suspect you're running out of memory due to the sheer number of threads rather than the map itself. The OutOfMemoryError message should tell you whether it's the heap or thread creation that failed ("unable to create new native thread"). Each thread in Java uses about 256-512 KB for its stack, which is allocated from native memory outside the heap. So 20,000 threads * 256 KB ≈ 5 GB, which is far beyond what a default JVM configuration can accommodate.
You should limit the number of threads to a few hundred at most. Take a look at Java 5/6's concurrent package, in particular ThreadPoolExecutor.
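For example, a fixed-size pool keeps the stack footprint bounded no matter how many tasks are submitted; the pool size and the task body below are placeholders:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class PoolDemo {
        public static void main(String[] args) {
            // A fixed pool of a few hundred threads instead of 20,000+;
            // queued tasks wait their turn rather than each getting a stack.
            ExecutorService pool = Executors.newFixedThreadPool(200);
            for (int i = 0; i < 20000; i++) {
                pool.submit(new Runnable() {
                    public void run() {
                        // do the put() into the shared map here
                    }
                });
            }
            pool.shutdown();
        }
    }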
Sounds like your problem is memory, not performance.
Try writing the least recently accessed keys and values that share a hash code to a file and clearing them from memory.
When a lookup addresses a hash code whose entries are stored in a file, write the next least recently used hash code's keys and values out to a file, clear them from memory, and then read the requested file back into memory.
Consider multiple levels of hashmaps (each with different keys) to improve the performance of this.
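A rough sketch of that spill-to-disk idea, simplified to per-entry eviction using LinkedHashMap's access-order mode; the class name is made up and the file handling is left as a stub:

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Evicts the least recently accessed entry once a size limit is hit.
    // A real implementation would serialize the evicted entry to a file
    // and read it back in on a cache miss.
    public class SpillingCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxEntries;

        public SpillingCache(int maxEntries) {
            super(16, 0.75f, true); // true = access order, giving LRU behaviour
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            if (size() > maxEntries) {
                spillToDisk(eldest.getKey(), eldest.getValue());
                return true; // drop it from memory
            }
            return false;
        }

        private void spillToDisk(K key, V value) {
            // stub: write the entry to a file, e.g. keyed by its hash code
        }
    }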
Have you tried a ConcurrentHashMap? Under the right coding circumstances you won't need any synchronization of your own. There are multiple striped locks internally to reduce contention, and many nice compound atomic operations like putIfAbsent that may allow you to drop external locks entirely.
As for memory, I suspect you really are storing a lot in the JVM. Use a monitoring tool like VisualVM to check it out, or add more memory to the JVM allocation. Consider a cache like EHCache, which automatically overflows to disk, internally uses a ConcurrentHashMap, and has all kinds of nice bounding options.
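A minimal sketch of what the singleton might look like rewritten around a ConcurrentHashMap; the field and key types are assumptions:

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    public class Store {
        private final ConcurrentMap<String, Object> map =
                new ConcurrentHashMap<String, Object>();

        // No synchronized needed: the map handles thread safety internally.
        public void put(String key, Object value) {
            map.put(key, value);
        }

        // Atomic check-then-insert, no external lock required.
        public Object putIfAbsent(String key, Object value) {
            return map.putIfAbsent(key, value);
        }

        public Object get(String key) {
            return map.get(key);
        }
    }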
If you are using JDK 1.5+, ConcurrentHashMap is a good choice. It's effective.
See: What's the difference between ConcurrentHashMap and Collections.synchronizedMap(Map)?
Also, I think put() may allocate new memory inside the map and is more time-consuming, while get() is not, so more threads will be blocked in put().
Also, optimize the hashCode() method of your key class. This is important, as hash code calculation is an intensive operation in your case. If the key object is immutable, calculate the hash code just once, save it in a member field, and return it directly from hashCode().
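A sketch of that caching trick for an immutable key class; the fields shown here are just examples:

    public final class Key {
        private final String name;
        private final int id;
        private final int cachedHash; // computed once; the fields never change

        public Key(String name, int id) {
            this.name = name;
            this.id = id;
            this.cachedHash = 31 * name.hashCode() + id;
        }

        @Override
        public int hashCode() {
            return cachedHash; // no recomputation on every map operation
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return id == k.id && name.equals(k.name);
        }
    }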
If you would like to keep your current implementation, you might also want to consider changing the amount of memory allocated to the application by changing the -Xms and -Xmx parameters passed to Java. Many other parameters exist as well. It may be necessary to do this regardless of the implementation used.
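For example, an invocation along these lines; the sizes are arbitrary placeholders and depend on how much data you actually hold:

    java -Xms512m -Xmx2048m -jar yourapp.jar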
You can use a ConcurrentHashMap instead; it has several advantages over a regular map.
I am not sure whether you are using Java 5, since it is only available from version 5 onwards.
Also, I would say: think once more about whether your logic really requires synchronization on read operations. If it doesn't (for example, because the map is fully populated before the readers start), you can remove it and save some performance; if the map is still being written to, a read-write lock is a safer option, as sketched below.
If you really are hitting a low-memory issue, you can run the JVM with the additional memory options mentioned above. Give it a try. :)
Make the hashCode() method of your keys efficient. You can rely on other APIs such as Pojomatic to do that.
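If reads vastly outnumber writes but you stay with a plain HashMap, a ReentrantReadWriteLock lets readers proceed in parallel while writers get exclusive access. This is a sketch under those assumptions, not a drop-in for the poster's class:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.locks.ReadWriteLock;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    public class RwStore {
        private final Map<String, Object> map = new HashMap<String, Object>();
        private final ReadWriteLock lock = new ReentrantReadWriteLock();

        // Many readers may hold the read lock at the same time.
        public Object get(String key) {
            lock.readLock().lock();
            try {
                return map.get(key);
            } finally {
                lock.readLock().unlock();
            }
        }

        // Writers block readers and other writers.
        public void put(String key, Object value) {
            lock.writeLock().lock();
            try {
                map.put(key, value);
            } finally {
                lock.writeLock().unlock();
            }
        }
    }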
As for the last part of your question:
Any recommendations on how I can make my HashMap more performant for writes? This may also be a limitation with my approach of storing so much data in memory?
I use a tool to look at what the application is doing. It can take heap and thread dumps, and it has a monitor that displays CPU usage, loaded classes, threads, heap, and perm gen. It's called Java VisualVM and it's part of JDK 1.6; the executable is in the bin folder of the JDK. I'm going to use it for tracking down some threading issues in our code.
HTH,
James
The OutOfMemoryError can be caused by the large number of objects stored, not by the large number of threads, and an OOME is not a performance problem.
BTW, you can use ConcurrentHashMap for fast concurrent reads and writes, instead of one global lock.