Concurrency at the data-row level

Posted 2024-12-08 14:39:04


I have a HashMap and want to synchronize each row/entry separately in order to maximize concurrency, so that many threads can access the HashMap at the same time but no two threads can access the same row/entry at the same time.

I did the following in my code but I'm not sure if it's correct or not:

/* Lock/synchronize the data to this key, (skey is a key of type String) */
synchronized (aHashMap.get(skey)) {

    /* write the data (data is Integer) */
    aHashMap.put(skey, data);

}


GRAY°灰色天空 2024-12-15 14:39:04

The appropriate solution depends very much on your particular problem. If all your threads can update any of the entries in the Map, then the first thing to try is ConcurrentHashMap.

In this case, the operation you described would be replaced with:

data = ... compute ...
aHashMap.replace(skey, data);
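One detail that matters when swapping put for replace: on a ConcurrentMap, replace(key, value) only updates a key that already has a mapping and is otherwise a no-op, while put always writes. A minimal sketch of the difference, where the class name ReplaceVsPut and the keys "a" and "b" are made up for illustration:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class ReplaceVsPut {
    // Builds a small map to show how replace() and put() differ.
    static ConcurrentMap<String, Integer> demo() {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();
        map.replace("a", 1); // no-op: "a" has no mapping yet
        map.put("b", 2);     // inserts "b"
        map.replace("b", 3); // updates "b" because it is already mapped
        return map;
    }
}
```

If the entry may not exist yet, put (or putIfAbsent for the first write) is the call to reach for instead.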

Using ConcurrentHashMap solves the data race, but one problem remains. If another thread updates the same key at the same time, one of the two updates will be lost. If you are ok with this, great. Otherwise, you can:

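// declared outside the loop body so it is in scope for the while condition
boolean success;
do {
  oldData = aHashMap.get(skey);
  data = ... compute (maybe based on oldData) ... 
  // replaces only if skey is still mapped to oldData
  success = aHashMap.replace(skey, oldData, data);
} while (!success);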

In this case, replace will only succeed if the data hasn't changed in the meantime (and the replace itself is atomic). If it fails, the do-while loop tries again, recomputing from the updated value.
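To make the retry loop concrete, here is a minimal runnable sketch, assuming Integer values and a key that is already present. The names CasRetryDemo, addTo, run and the key "row" are illustrative, not part of the original code:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class CasRetryDemo {
    // Adds delta to the value stored under key, retrying when another thread got there first.
    // Assumes the key is already present in the map.
    static void addTo(ConcurrentMap<String, Integer> map, String key, int delta) {
        boolean success;
        do {
            Integer oldData = map.get(key);
            Integer data = oldData + delta;            // pure computation, no side effects
            success = map.replace(key, oldData, data); // succeeds only if the value still equals oldData
        } while (!success);
    }

    // Starts `threads` workers that each increment the same entry `perThread` times.
    static int run(int threads, int perThread) {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("row", 0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    addTo(map, "row", 1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return map.get("row");
    }
}
```

Because every failed replace is retried, no increment is lost even when all workers hit the same entry.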

Also, be careful not to have any side effects between the map get and the replace. That computation should only create a brand new "data" object. If you update the "oldData" object or some other shared data, you will get unexpected results, because the computation may run more than once.

If you do have side effects, one approach is to make a key-level lock like this:

synchronized(skey) {
  data = ... compute ... 
  aHashMap.replace(skey, data);
}

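Even in this case, ConcurrentHashMap is still needed. Also, this will not stop some other code from updating that key in the map: all code that updates the key needs to lock on it. Keep in mind that synchronized locks on the object itself, not on its value, so every thread has to use the very same String instance for a given key; two equal but distinct String objects are two different locks.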
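One way to get a stable lock per key is to keep a dedicated lock object for each key in a second concurrent map, so that equal keys always resolve to the same monitor. This is a sketch under assumptions, not the approach from the answer itself: the class KeyLocks and its methods lockFor, update and get are hypothetical names, and it assumes Java 8+ for computeIfAbsent.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class KeyLocks {
    private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();
    private final ConcurrentMap<String, Integer> data = new ConcurrentHashMap<>();

    // Returns the single lock object for this key, creating it atomically on first use.
    Object lockFor(String key) {
        return locks.computeIfAbsent(key, k -> new Object());
    }

    // Read-modify-write guarded by the per-key lock; other keys are not blocked.
    void update(String key, int delta) {
        synchronized (lockFor(key)) {
            Integer old = data.get(key);
            data.put(key, (old == null ? 0 : old) + delta);
        }
    }

    Integer get(String key) {
        return data.get(key);
    }
}
```

The trade-off is that the lock map grows with the number of distinct keys and is never cleaned up in this sketch, which may or may not matter for your key set.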

Also, this will not be thread-safe if you update oldData in "... compute ..." and the values are not unique within the map, since the same value object could then be reached through two different keys. If you do want to update oldData there, guard it with another synchronized block.

If this does the trick and you're content with the performance, look no further.

If the threads only update values and do not change the keys, then you might try converting your pairs to objects and using something other than a Map. For example, you could split the set of objects into several sets and then feed them to your threads. Or maybe use ParallelArray. But I might be digressing here... :)


魔法唧唧 2024-12-15 14:39:04

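You really should use the ConcurrentHashMap class that is already available.

Your solution is flawed: as soon as another thread puts an item into the map and causes the hash map to expand, you may lose updates. It also obviously depends on all users of the hash map honoring the lock, and if someone uses that object to lock something else, you will run into a whole heap of problems.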


余生一个溪 2024-12-15 14:39:04

The problem with your approach is that you are replacing the lock object. This means every thread that attempts an update could be locking on a different object, so the lock has no effect.

I would use ConcurrentHashMap as others have suggested. Your operation replaces the value, so locking on it, or on any other object, doesn't add anything here.

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// ConcurrentMap is an interface, so instantiate ConcurrentHashMap (skey is a String in the question)
ConcurrentMap<String, Value> map = new ConcurrentHashMap<String, Value>();

// thread safe write of the data. No locks required.
map.put(skey, data);

EDIT:

If you get() the value and want to update a mutable value in place, you can:

Value value = map.get(skey);
synchronized(value) {
    value.changeValue();
}

In this case there is no need to put the same value back into the map. Value needs its own synchronization or Lock because it is not thread safe on its own.
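A runnable sketch of that pattern, assuming a deliberately non-thread-safe Value class with a plain int field. Value is only a placeholder in the snippet above, so the count field, the count() accessor, the class name MutableValueDemo and the key "row" are all made up here for illustration:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class MutableValueDemo {
    // Hypothetical mutable value type; not thread safe on its own.
    static final class Value {
        private int count;
        void changeValue() { count++; }
        int count() { return count; }
    }

    // Starts `threads` workers that each mutate the same Value `perThread` times.
    static int run(int threads, int perThread) {
        ConcurrentMap<String, Value> map = new ConcurrentHashMap<>();
        map.put("row", new Value());
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    Value value = map.get("row");
                    synchronized (value) { // lock the entry's value, not the whole map
                        value.changeValue();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        Value result = map.get("row");
        synchronized (result) {
            return result.count();
        }
    }
}
```

This works only because the Value object under a key is never swapped out; once values get replaced, threads can end up locking different objects, which is the problem described at the top of this answer.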

If you want to "update" an immutable value you have to use a loop to keep trying the update. This assumes there are no side effects of doing this.

while(true) {
   Value value = map.get(skey);
   Value value2 = compute(value);
   if(map.replace(skey, value, value2)) break;
}

This loop keeps iterating until it successfully replaces the value it expected to replace. Given that you will have many more keys (hundreds to millions) than cores (4-24), this loop will rarely run more than once, but it will try again when it has to.
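As an alternative that is not part of the answer above: assuming Java 8 or later, ConcurrentHashMap offers compute and merge, which perform the read-modify-write for one key atomically and so take the place of the hand-written retry loop. A minimal sketch with Integer counters, where the class name MergeDemo and the key "row" are illustrative:

```java
import java.util.concurrent.ConcurrentHashMap;

class MergeDemo {
    // Starts `threads` workers that each add 1 to the same entry `perThread` times.
    static int run(int threads, int perThread) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    // Inserts 1 if the key is absent, otherwise stores old + 1, atomically per key.
                    map.merge("row", 1, Integer::sum);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return map.get("row");
    }
}
```

The same caveat about side effects applies: the function passed to merge or compute should be short and should not touch other shared state.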
