Processing a list of items with a limited number of threads

Posted on 2024-10-19 21:31:58

Basically, I want to process a list of items in multiple threads instead of one at a time.
I only want a limited number of threads going at a time.
Does this approach make sense? Is using a global variable for the thread count the only option? (pseudo-code below)

foreach item in list
    while thread_count >= thread_max
        sleep
    loop
    start_thread item
    thread_count++
next

function start_thread(item)
    do_something_to item
    thread_count--
end function


Comments (2)

橘和柠 2024-10-26 21:31:58

My first thought was to use PLINQ for this and specify a max degree of parallelism.

I'm actually changing my answer on this one, because I realized you just want to process a raw list directly and you're not doing any other filtering or mapping (Where/Select). In this particular case it would be better to use Parallel.ForEach and specify the MaxDegreeOfParallelism via ParallelOptions, like so:

 // Parallel and ParallelOptions live in System.Threading.Tasks
 int myMaxDegreeOfParallelism = 4; // read this from config maybe

 Parallel.ForEach(
    list,
    new ParallelOptions
    {
        MaxDegreeOfParallelism = myMaxDegreeOfParallelism
    },
    item =>
    {
        // ... your work here ...
    });

Now, keep in mind that when you specify a max like this, you prevent the TPL from using any more resources even if they're available. So if this ran on an 8-core machine, it would never run more than 4 iterations at once and therefore never utilize more than 4 cores. Conversely, just because you specified 4 doesn't mean 4 are guaranteed to execute simultaneously at any given time. It all depends on several heuristics that the TPL uses to schedule the work as well as it can.
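For the case mentioned above where filtering or mapping (Where/Select) is involved, a PLINQ version might look roughly like the sketch below. The class name PlinqSketch, the method SquareEvens and the even/square work are hypothetical placeholders and not part of the original answer; WithDegreeOfParallelism is the PLINQ counterpart of MaxDegreeOfParallelism.

```csharp
using System;
using System.Linq;

public static class PlinqSketch
{
    // Hypothetical work: keep the even numbers and square them,
    // using at most maxDegreeOfParallelism concurrent tasks.
    public static int[] SquareEvens(int[] items, int maxDegreeOfParallelism)
    {
        return items
            .AsParallel()
            .AsOrdered()                                      // keep results in input order
            .WithDegreeOfParallelism(maxDegreeOfParallelism)  // upper bound, not a guarantee
            .Where(x => x % 2 == 0)
            .Select(x => x * x)
            .ToArray();
    }
}
```

Calling PlinqSketch.SquareEvens(new[] { 1, 2, 3, 4, 5, 6 }, 4) returns the squares of 2, 4 and 6 in that order. As with Parallel.ForEach, the value is only a cap on concurrency.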

回忆凄美了谁 2024-10-26 21:31:58

It makes sense, but I hope you're aware that this isn't the usual way to do it unless you have very specific performance reasons or are stuck on .NET 3.5. Normally you would use Parallel.ForEach over the elements in the list, and rely on the partitioner to divide up the work into appropriate chunks.

Even if you didn't have the TPL, it would be more idiomatic to divide up all the work and hand each thread a big chunk of work at once, rather than doling it out piecemeal at the moment a thread finishes. The only reason to do it your way is if you expected the amount of time a given work item takes to be more or less unpredictable, so you couldn't divide up the work well in advance.
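A minimal sketch of that "one big chunk per thread" idea without the TPL, assuming plain System.Threading.Thread objects. The names ChunkedWorkers, ProcessInChunks and doSomethingTo are made up for illustration, and contiguous equal-sized chunks are just one possible way to split the list.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ChunkedWorkers
{
    // Splits items into at most threadMax contiguous chunks and
    // runs exactly one thread per chunk, then waits for all of them.
    public static void ProcessInChunks<T>(IList<T> items, int threadMax, Action<T> doSomethingTo)
    {
        if (threadMax < 1) throw new ArgumentOutOfRangeException(nameof(threadMax));

        int chunkSize = (items.Count + threadMax - 1) / threadMax; // ceiling division
        var workers = new List<Thread>();

        for (int start = 0; start < items.Count; start += chunkSize)
        {
            int from = start;                                   // per-chunk copies for the closure
            int to = Math.Min(start + chunkSize, items.Count);
            var worker = new Thread(() =>
            {
                for (int i = from; i < to; i++) doSomethingTo(items[i]);
            });
            workers.Add(worker);
            worker.Start();
        }

        foreach (var thread in workers) thread.Join();          // wait for every chunk to finish
    }
}
```

No shared counter is needed here because the number of threads is fixed up front. The trade-off is the one described above: if some items take much longer than others, one chunk can finish late while the other threads sit idle.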

(Also, you could just keep references to the threads and check how many are still working and how many are completed. That would do away with the variable.)
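As a rough sketch of that last suggestion, the loop from the question could track the Thread objects themselves instead of a global counter. ThrottledThreads, ProcessAll and doSomethingTo are hypothetical names, and the 10 ms polling interval is an arbitrary assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class ThrottledThreads
{
    // Starts one thread per item, but never lets more than threadMax run at once.
    // Only the calling thread touches the list, so no shared counter or lock is needed.
    public static void ProcessAll<T>(IEnumerable<T> items, int threadMax, Action<T> doSomethingTo)
    {
        var threads = new List<Thread>();

        foreach (var item in items)
        {
            // Poll until fewer than threadMax started threads are still running.
            while (threads.Count(t => t.IsAlive) >= threadMax)
                Thread.Sleep(10);

            threads.RemoveAll(t => !t.IsAlive);   // drop references to finished threads

            var current = item;                   // copy for the closure
            var worker = new Thread(() => doSomethingTo(current));
            threads.Add(worker);
            worker.Start();
        }

        foreach (var thread in threads) thread.Join(); // wait for the stragglers
    }
}
```

This keeps the shape of the original pseudo-code while avoiding unsynchronized increments and decrements of thread_count. It still polls with Sleep, so it is meant as an illustration of the idea rather than a replacement for Parallel.ForEach.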
