C# - Passing data from ThreadPool threads back to the main thread

Posted on 2024-10-03 22:03:27

Current implementation: waits until parallelCount values are collected, uses the ThreadPool to process them, waits until all threads complete, then collects another set of values, and so on.

Code:

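// Requires: using System.Threading;
private static int parallelCount = 5;
private int taskIndex;
// Holds one batch of inputs; sized to match the batch size
private object[] paramObjects = new object[parallelCount];

// Each ThreadPool thread should access only one item of the array, 
// release object when done, to be used by another thread
private object[] reusableObjects = new object[parallelCount];

// Assumed declaration: the original post does not show how resetEvent is defined
private ManualResetEvent resetEvent = new ManualResetEvent(false);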

private void MultiThreadedGenerate(object paramObject)
{
    paramObjects[taskIndex] = paramObject;
    taskIndex++;

    if (taskIndex == parallelCount)
    { 
        MultiThreadedGenerate();

        // Reset
        taskIndex = 0;
    }
}

/*
 * Called when 'paramObjects' array gets filled
 */
private void MultiThreadedGenerate()
{
    // Arrays expose Length, not Count
    int remainingToGenerate = paramObjects.Length;

    resetEvent.Reset();

    for (int i = 0; i < paramObjects.Length; i++)
    {
        ThreadPool.QueueUserWorkItem(delegate(object obj)
        {
            try
            {
                int currentIndex = (int) obj;       

                Generate(currentIndex, paramObjects[currentIndex], reusableObjects[currentIndex]);
            }
            finally
            {
                if (Interlocked.Decrement(ref remainingToGenerate) == 0)
                {
                    resetEvent.Set();
                }
            }
        }, i);
    }

    resetEvent.WaitOne();    
}

I've seen significant performance improvements with this approach; however, there are a number of issues to consider:

[1] Collecting values in paramObjects and synchronizing with resetEvent could be avoided, as there is no dependency between the threads (or between the current set of values and the next). I'm only doing this to manage access to reusableObjects: when a set of paramObjects is done processing, I know that all objects in reusableObjects are free, so taskIndex is reset and each new task in the next set of values has its own unique reusable object to work with.

[2] There is no real connection between the size of reusableObjects and the number of threads the ThreadPool uses. I might initialize reusableObjects with 10 objects while, due to some limitation, the ThreadPool can run only 3 threads for my MultiThreadedGenerate() method; in that case I'm wasting memory.
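
To illustrate point [2], the ThreadPool's worker-thread limits can be queried at runtime; they are set independently of whatever size is chosen for reusableObjects. This is only a minimal sketch: the class and method names (ThreadPoolInfo, WorkerThreadLimits) are made up for illustration, and the actual number of threads running at any moment may be anywhere between the two limits.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of the original post.
public static class ThreadPoolInfo
{
    // Returns the ThreadPool's current worker-thread limits as (min, max).
    public static Tuple<int, int> WorkerThreadLimits()
    {
        int minWorkers, minIo, maxWorkers, maxIo;
        ThreadPool.GetMinThreads(out minWorkers, out minIo);
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        return Tuple.Create(minWorkers, maxWorkers);
    }
}
```

Calling ThreadPoolInfo.WorkerThreadLimits() and comparing the result with parallelCount shows how far apart the two numbers can be on a given machine.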

So, after getting rid of paramObjects, how can the above code be refined so that as soon as a thread completes its job, it returns the taskIndex (or the reusable object) it used and no longer needs, making it available for the next value? Also, the code should create a reusable object and add it to some collection only when there is demand for it. Is using a Queue here a good idea?
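
One possible shape for this, offered as a sketch rather than a definitive answer, is a small pool backed by a thread-safe queue: each work item takes a free object (creating one only if none is available), uses it, and puts it back when done. The names ReusablePool, Rent, Return, and PoolDemo are hypothetical, the Generate stub stands in for the real method (without the index parameter), and a ConcurrentQueue is assumed in place of a plain Queue because several threads touch it at once.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical pool: hands out reusable objects, creating one only when none is free.
public sealed class ReusablePool<T>
{
    private readonly ConcurrentQueue<T> free = new ConcurrentQueue<T>();
    private readonly Func<T> factory;
    private int created;

    public ReusablePool(Func<T> factory) { this.factory = factory; }

    // Number of objects created so far.
    public int Created { get { return Volatile.Read(ref created); } }

    // Takes a free object, or creates a new one if every existing object is in use.
    public T Rent()
    {
        T item;
        if (free.TryDequeue(out item)) return item;
        Interlocked.Increment(ref created);
        return factory();
    }

    // Makes the object available to the next work item.
    public void Return(T item) { free.Enqueue(item); }
}

public static class PoolDemo
{
    // Queues one work item per value, waits for all of them,
    // and returns how many reusable objects had to be created.
    public static int Run(object[] values)
    {
        var pool = new ReusablePool<object>(() => new object());
        using (var done = new CountdownEvent(values.Length))
        {
            foreach (object value in values)
            {
                ThreadPool.QueueUserWorkItem(state =>
                {
                    object reusable = pool.Rent();
                    try
                    {
                        Generate(state, reusable);
                    }
                    finally
                    {
                        pool.Return(reusable); // free the object before signalling
                        done.Signal();
                    }
                }, value);
            }
            done.Wait(); // the caller blocks only once, not once per batch
        }
        return pool.Created;
    }

    // Placeholder for the real Generate method from the post.
    private static void Generate(object param, object reusable)
    {
        Thread.SpinWait(1000);
    }
}
```

With this arrangement the number of objects created never exceeds the number of work items running at the same time, so it follows the ThreadPool's actual concurrency instead of a fixed parallelCount, and no batch has to wait for its slowest member.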

Thank you.
