所以我需要一个二维的ConcurrentHashMap。
它必须尽可能快,因为我将非常频繁地添加和更新其值。 它位于多线程应用程序中,因此选择使用 ConcurrentHashMap 而不仅仅是 HashMap。
“x”和“y”索引都是具有已知范围(0 到 40,000,000)的整数。
我需要知道的是:实现这一点的最有效方法是什么,以便尽可能快? 最明显的途径是创建一个文字二维哈希图:
ConcurrentHashMap>> 或者我可以创建一个具有两个属性 x 和 y的
私有类“IntPair”,并将其用作键...但如果我这样做,最有效的方法是什么 equals( )
和 hashcode()
? 我最终会分配太多新的 IntPair 吗? 我可以为我分配的每个 x/y 保留一组 IntPair,然后使用纯粹自反的 equals() 以便我只检查完全相同的对象实例吗?
更新:
现在我已经仔细研究了 Integer.valueOf(int),它使用的特定缓存模型在这里没有意义,因为我正在处理一个具有不可预测条目的非常稀疏的矩阵。 我确实需要缓存所有使用的 IntPair,而不是预先指定的子集。
直观上,在我看来,在大地图中查找 IntPair 来查看我是否已经创建了它,实际上与在大“2-D”中查找它或多或少相同无论如何,ConcurrentHashMap 不是吗? 所以看来这里的解决方案实际上是每次查找密钥时都使用 new IntPair(x,y) 。 是的?
So I need a 2-dimensional ConcurrentHashMap
.
It has to be as blazing fast as possible, as I'm going to be adding to and updating its values extremely frequently. It's in a multithreaded application, hence the choice to use ConcurrentHashMap instead of just HashMap.
Both the "x" and "y" indices are integers with a known range (0 through 40,000,000).
What I need to know is: What's the most efficient way to implement this so it'll be as speedy as possible? The most obvious route is to do a literal 2-D hashmap:
ConcurrentHashMap<Integer, ConcurrentHashMap<Integer, ValueObj>> foo;
Or I could make a private class "IntPair" with two properties x and y, and use that as a key... though if I do that, what's the most efficient way to do equals()
and hashcode()
? and will I wind up allocating too many new IntPair
s? Could I keep a set of IntPair
s for each x/y I've assigned, and then use a purely reflexive equals() such that I'm just checking for the exact same object instance?
Update:
Now that I've taken a closer look at Integer.valueOf(int), the specific caching model it uses wouldn't make sense here, since I'm dealing with a very sparse matrix with unpredictable entries. I really need to be caching all those IntPairs which are used, not a prespecified subset.
Intuitively, it seems to me that looking up an IntPair in a big map to see if I've already created it would, in fact, be more-or-less the same as just looking it up in the big "2-D" ConcurrentHashMap anyway, wouldn't it? So it seems the solution here is really to just use new IntPair(x,y)
each time I look up a key. Yes?
发布评论
评论(4)
这取决于 40,000,000 x 40,000,000 矩阵中 (x,y) 点的稀疏程度。 我的猜测是,无论如何,矩阵都会非常稀疏,因此创建大量 ConcurrentHashMap 的成本将会很高。
相比之下,您的(不可变的)
IntPair
建议似乎更具吸引力。 正如您所建议的,您甚至可以缓存其中一些对以提高性能(请参阅Integer.valueOf(int)
查看如何使用静态嵌套类和静态工厂方法来实现)。 由于哈希码始终是必需的,因此您可以在构造函数中预先计算它并将其保存为最终字段。 要计算相等,您可以使用缓存中对象的身份相等性,否则您需要单独比较 x 和 y。编辑:这是 源代码 (OpenJDK)
Integer.valueOf( int)
。It depends on how sparse your (x,y) points are, in the 40,000,000 x 40,000,000 matrix. My guess is that the matrix is going to be quite sparse anyway, so creating a lot of
ConcurrentHashMap
s is going to be expensive.Your (immutable)
IntPair
suggestion seems more attractive in comparison. As you've suggested, you can even cache some of these pairs to improve performance (seeInteger.valueOf(int)
to see how this can be implemented using a static nested class and a static factory method). Since the hashcode will always be required, you can pre-compute it in the constructor and save it as a final field. To compute equals, you could use the identity equality for objects in the cache, otherwise you'll need to compare x and y individually.EDIT: Here's the source code (OpenJDK) for
Integer.valueOf(int)
.ConcurrentHashMap
非常大,因此您可能不需要它们的集合。生命周期短的对象实际上分配起来非常快。 无论如何,您是否必须创建
Integers
?您可以实习坐标对象,但仅查找的成本可能与创建它们相当。
Integer
的真正优势在于,当您保留大量实例一段时间后,相同的实例会被共享。如果性能确实是一个大问题,您可以编写(或使用)一个将长整型映射到引用的映射类型对象。 如果看到自定义地图也具有与坐标系相关的功能(例如查找最近的或在某个范围内),我不会感到惊讶。
ConcurrentHashMap
is quite large, so you probably don't want a collection of them.Short lived objects are actually very fast to allocate. Are you going to have to create the
Integers
anyway?You could intern the coordinate objects, but the cost for just a lookup would probably be comparable to creating them anyway. The real win with
Integer
is that the same instances are shared when you keep around lots of them for some time.If performance is really a huge issue, you could write (or use) a map-type object that maps longs to references. I wouldn't be surprised to see custom maps out there which also have functionality associated with coordinate systems (like finding nearest or within a range).
回应扎克,是的,矩阵将非常稀疏。
我查看了您链接的页面,毫无疑问 Integer.valueOf(int) 的功能是理想的。 如果我在
IntPair
类中开发了类似的静态方法,我是否可以假设我可以定义equals()
来仅检查严格的自反相等?也就是说,我在该页面中没有看到它解释如何使用静态嵌套类和静态工厂方法来实现该功能......我是否只是错过了它? 我怎么做?
谢谢!
In response to Zach, Yes, the matrix will be very sparse.
I looked at the page you linked, and without a doubt the functionality of Integer.valueOf(int) would be ideal. If I developed a similar static method within my
IntPair
class, can I assume that I could defineequals()
to only check for strict reflexive equality?That said, I don't see in that page where it explains how to implement that functionality using a static nested class and static factory method.... am I just missing it somehow? How do I do that?
Thanks!
我基于标准 Java HashMap 制作了一个 Int2DMap 实现。 我发现它比使用 IntPair 作为密钥更快。 但是,它需要同步。
I've made a Int2DMap implementation based on the standard Java HashMap. I find it is faster than using an IntPair as key. However it will need to be synchronized.