C# .NET parallel-extensions task-parallel-library

在 Parallel.ForEach 中使用哈希表？

发布于 2024-08-10 11:37:28 字数 772 浏览 12 评论 0原文

我有一个 Parallel.ForEach 循环在体内运行密集操作。

该操作可以使用哈希表来存储值，并且可以重用于其他连续的循环项。我在密集操作完成后添加到Hashtable中，下一个循环项可以在Hashtable中查找并重用该对象，而不用再次运行密集操作。

但是，因为我使用 Parallel.ForEach 存在不安全问题，导致 Hashtable.Add 和 ContainsKey(key) 调用不同步，因为它们可能并行运行。引入锁可能会导致性能问题。

下面是示例代码：

Hashtable myTable = new Hashtable;
Parallel.ForEach(items, (item, loopState) =>
{
    // If exists in myTable use it, else add to hashtable
    if(myTable.ContainsKey(item.Key))
    {
       myObj = myTable[item.Key];
    }
    else
    {
       myObj = SomeIntensiveOperation();
       myTable.Add(item.Key, myObj); // Issue is here : breaks with exc during runtime
    }
    // Do something with myObj
    // some code here
}

TPL 库中必须有一些 API、属性设置可以处理这种情况。有没有？

原文

I have a Parallel.ForEach loop running an intensive operation inside the body.

The operation can use a Hashtable to store the values, and can be reused for other consecutive loop items. I add to the Hashtable after the intensive operation is complete, the next loop item can look up in the Hashtable and reuse the object, instead of running the intensive operation again.

However, because I am using Parallel.ForEach there is a unsafe issue, causing the Hashtable.Add and the ContainsKey(key) calls go out of sync, as they might be running in parallel. Introducing locks may cause perf issues.

Here's the sample code:

Hashtable myTable = new Hashtable;
Parallel.ForEach(items, (item, loopState) =>
{
    // If exists in myTable use it, else add to hashtable
    if(myTable.ContainsKey(item.Key))
    {
       myObj = myTable[item.Key];
    }
    else
    {
       myObj = SomeIntensiveOperation();
       myTable.Add(item.Key, myObj); // Issue is here : breaks with exc during runtime
    }
    // Do something with myObj
    // some code here
}

There must be some API, Property setting inside TPL library, that could handle this scenario. Is there?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

空城旧梦 2024-08-17 11:37:28

您正在寻找 System.Collections.Concurrent。 ConcurrentDictionary。新的并发集合使用显着改进的锁定机制，并且应该在并行算法中表现出色。

编辑：结果可能如下所示：

ConcurrentDictionary<T,K> cache = ...;
Parallel.ForEach(items, (item, loopState) =>
{
    K value;
    if (!cache.TryGetValue(item.Key, out value))
    {
        value = SomeIntensiveOperation();
        cache.TryAdd(item.Key, value);
    }

    // Do something with value
} );

警告：如果items中的元素并非全部具有唯一的item.Key，则< code>SomeIntentialOperation 可能会对该键调用两次。在示例中，密钥未传递给 SomeIntectiveOperation，但这意味着“Do Something with value”代码可以执行 key/valueA 和 key/valueB 对，并且只会存储一个结果在缓存中（也不一定是由 SomeIntectiveOperation 计算的第一个）。如果这是一个问题，您需要一个并行的惰性工厂来处理这个问题。另外，出于显而易见的原因，SomeIntectiveOperation 应该是线程安全的。

You're looking for System.Collections.Concurrent.ConcurrentDictionary<TKey, TValue>. The new concurrent collections use significantly improved locking mechanisms and should perform excellectly in parallel algorithms.

Edit: The result might look like this:

ConcurrentDictionary<T,K> cache = ...;
Parallel.ForEach(items, (item, loopState) =>
{
    K value;
    if (!cache.TryGetValue(item.Key, out value))
    {
        value = SomeIntensiveOperation();
        cache.TryAdd(item.Key, value);
    }

    // Do something with value
} );

Word of warning: if the elements in items do not all have unique item.Key, then SomeIntensiveOperation could get called twice for that key. In the example, the key isn't passed to SomeIntensiveOperation, but it means that the "Do something with value" code could execute key/valueA and key/valueB pairs, and only one result would get stored in the cache (not necessarily the first one computed by SomeIntensiveOperation either). You'd need a parallel lazy factory to handle this if it's a problem. Also, for obvious reasons SomeIntensiveOperation should be thread safe.

回复收藏 0 原文

趁年轻赶紧闹 2024-08-17 11:37:28

检查 System.Collections.Concurrent 命名空间我认为您需要 ConcurrentDictionary

回复收藏 0 原文

や三分注定 2024-08-17 11:37:28

使用 ReaderWriterLock，这对于持续时间短、读取次数多、写入次数少的工作具有良好的性能。您的问题似乎符合此规范。

所有读取操作都将快速运行并且无锁，唯一会被阻止的时间是发生写入时，并且该写入仅与将某些内容推入哈希表所需的时间相同。

MSDN 上的 ReaderWriterLockSlim

我想我会扔掉一些代码...

ReaderWriterLockSlim cacheLock = new ReaderWriterLockSlim();
Hashtable myTable = new Hashtable();
Parallel.ForEach(items, (item, loopState) =>
{
    cacheLock.EnterReadLock();
    MyObject myObj = myTable.TryGet(item.Key);
    cacheLock.ExitReadLock();

    // If the object isn't cached, calculate it and cache it
    if(myObj == null)
    {
       myObj = SomeIntensiveOperation();
       cacheLock.EnterWriteLock();
       try
       {
           myTable.Add(item.Key, myObj);
       }
       finally
       {
           cacheLock.ExitWriteLock();
       }           
    }
    // Do something with myObj
    // some code here
}

static object TryGet(this Hashtable table, object key)
{
    if(table.Contains(key))
        return table[key]
    else
        return null;
}

Use a ReaderWriterLock, this has good performance for work that has many reads and few writes that are of a short duration. Your problem seems to fit this specification.

All read operations will run quickly and lock free, the only time anyone will be blocked is when a write is happening, and that write is only as long as it takes to shove something in a Hashtable.

ReaderWriterLockSlim on MSDN

I guess I'll throw down some code...

ReaderWriterLockSlim cacheLock = new ReaderWriterLockSlim();
Hashtable myTable = new Hashtable();
Parallel.ForEach(items, (item, loopState) =>
{
    cacheLock.EnterReadLock();
    MyObject myObj = myTable.TryGet(item.Key);
    cacheLock.ExitReadLock();

    // If the object isn't cached, calculate it and cache it
    if(myObj == null)
    {
       myObj = SomeIntensiveOperation();
       cacheLock.EnterWriteLock();
       try
       {
           myTable.Add(item.Key, myObj);
       }
       finally
       {
           cacheLock.ExitWriteLock();
       }           
    }
    // Do something with myObj
    // some code here
}

static object TryGet(this Hashtable table, object key)
{
    if(table.Contains(key))
        return table[key]
    else
        return null;
}

回复收藏 0 原文