Intel TBB vs Boost

Posted on 2024-12-01 01:45:54

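In my new application I have the flexibility to choose which multithreading library to use. So far I have been using pthreads. Now I want to explore cross-platform libraries, and I have zeroed in on TBB and Boost. I don't understand what the benefit of TBB over Boost is. While trying to find out the advantages of TBB over Boost, I came across this excerpt from the TBB wiki: "Instead the library abstracts access to the multiple processors by allowing the operations to be treated as "tasks", which are allocated to individual cores dynamically by the library's run-time engine, and by automating efficient use of the cache. A TBB program creates, synchronizes and destroys graphs of dependent tasks according to algorithms,"

But does a threading library even need to worry about the allocation of threads to cores? Isn't this the job of the operating system? So what is the real benefit of using TBB over Boost?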

妳是的陽光 2024-12-08 01:45:54

But does a threading library even need to worry about the allocation of threads to cores? Isn't this the job of the operating system? So what is the real benefit of using TBB over Boost?

You are right, a threading library usually should not care about mapping threads to cores, and TBB does not. TBB operates with tasks, not threads. Its scheduler utilizes all cores by allocating a pool of worker threads and letting them dynamically select which tasks to run. This is the main advantage over Boost, with which you have to map the available work to threads manually. On top of that, TBB offers high-level constructs such as parallel_for and parallel_pipeline that express the most common parallel patterns and hide all manipulation of tasks.
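To make "mapping work to threads manually" concrete, here is a minimal sketch of what that usually looks like without a task scheduler. It uses std::thread from the C++11 standard library as a stand-in for boost::thread (the interfaces are very similar); static_parallel_for and its parameters are hypothetical names invented for this illustration and are not part of Boost or TBB.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: runs body(i) for every i in [0, n), giving each thread
// one contiguous chunk. The split is decided up front and never rebalanced.
template <typename Body>
void static_parallel_for(std::size_t n, unsigned num_threads, Body body)
{
    assert(num_threads > 0);
    const std::size_t chunk = (n + num_threads - 1) / num_threads; // ceil(n / num_threads)
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t)
    {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break; // no work left for the remaining threads
        workers.emplace_back([=] {
            for (std::size_t i = begin; i < end; ++i) body(i);
        });
    }
    for (std::thread& w : workers) w.join();
}
```

A call might look like static_parallel_for(data.size(), 4, [&](std::size_t i) { data[i] *= 2; });. Note that you have to pick the thread count and the chunking yourself, and if one chunk turns out to be more expensive than the others, the remaining threads finish early and sit idle.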

As an example of parallel_for, take a piece of code that calculates points of the Mandelbrot fractal (taken from http://warp.povusers.org/Mandelbrot/, variable initialization omitted):

for(unsigned y=0; y<ImageHeight; ++y)
{
    double c_im = MaxIm - y*Im_factor;
    for(unsigned x=0; x<ImageWidth; ++x)
    {
        double c_re = MinRe + x*Re_factor;

        double Z_re = c_re, Z_im = c_im;
        bool isInside = true;
        for(unsigned n=0; n<MaxIterations; ++n)
        {
            double Z_re2 = Z_re*Z_re, Z_im2 = Z_im*Z_im;
            if(Z_re2 + Z_im2 > 4)
            {
                isInside = false;
                break;
            }
            Z_im = 2*Z_re*Z_im + c_im;
            Z_re = Z_re2 - Z_im2 + c_re;
        }
        if(isInside) { putpixel(x, y); }
    }
}

Now to make it parallel with TBB, all you need is to convert the outermost loop into tbb::parallel_for (I use a C++11 lambda for brevity):

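// Both bounds must have the same type for parallel_for's template deduction.
tbb::parallel_for(0u, static_cast<unsigned>(ImageHeight), [=](unsigned y)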
{
    // the rest of code is exactly the same
    double c_im = MaxIm - y*Im_factor;
    for(unsigned x=0; x<ImageWidth; ++x)
    {
        ...
        // if putpixel() is not thread safe, a lock might be needed
        if(isInside) { putpixel(x, y); }
    }
});

TBB will automatically distribute all loop iterations over the available cores (you don't need to care how many there are) and dynamically balance the load, so that if some thread has more work to do, the other threads don't just wait for it but help out, maximizing CPU utilization. Try implementing it with raw threads, and you will feel the difference :)
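For comparison, here is a rough sketch of what a hand-rolled version with raw threads could look like if you want at least some dynamic balancing. It is only an illustration of the idea and not how TBB's scheduler is implemented; dynamic_parallel_rows and is_inside are hypothetical names, and std::thread stands in for whichever thread API you use.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: every worker repeatedly claims the next unprocessed row
// from a shared atomic counter, so a thread that finishes a cheap row
// immediately picks up another one instead of idling.
template <typename RowBody>
void dynamic_parallel_rows(unsigned row_count, unsigned num_threads, RowBody row_body)
{
    assert(num_threads > 0);
    std::atomic<unsigned> next_row{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t)
    {
        // Capturing by reference is safe here: all threads are joined
        // before this function returns.
        workers.emplace_back([&] {
            for (;;)
            {
                const unsigned y = next_row.fetch_add(1); // claim one row
                if (y >= row_count) break;                // nothing left
                row_body(y);
            }
        });
    }
    for (std::thread& w : workers) w.join();
}

// Membership test for one point, using the same iteration as the loop above.
bool is_inside(double c_re, double c_im, unsigned max_iterations)
{
    double Z_re = c_re, Z_im = c_im;
    for (unsigned n = 0; n < max_iterations; ++n)
    {
        const double Z_re2 = Z_re * Z_re, Z_im2 = Z_im * Z_im;
        if (Z_re2 + Z_im2 > 4) return false;
        Z_im = 2 * Z_re * Z_im + c_im;
        Z_re = Z_re2 - Z_im2 + c_re;
    }
    return true;
}
```

One way to use it is to have row_body call is_inside for each x and store the result in a pre-sized std::vector<char> (one element per pixel), then draw from that buffer on a single thread afterwards, which avoids locking around putpixel(). Even this small version forces you to choose the thread count, manage thread lifetimes, and reason about shared state, all of which parallel_for hides.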


甜中书 2024-12-08 01:45:54

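Intel TBB brings its own thread pool/scheduler and execution model (including things like the parallel_for construct), while Boost only has basic thread management functions (creating threads and synchronization primitives, that's it). Writing a good thread pool using Boost is possible, but difficult; TBB already comes with a highly optimized one. So it depends entirely on your requirements: if all you need is "portable pthreads", use Boost; if you need more, use Intel TBB.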
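To give a feel for what writing a pool yourself involves, below is a deliberately minimal sketch. It uses the C++11 standard library counterparts of Boost's thread, mutex, and condition variable; SimpleThreadPool is a hypothetical class written for this illustration, not something Boost or TBB provides.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool: a fixed set of workers pulling jobs from a single
// mutex-protected queue. No work stealing, no exception handling, no way to
// wait for an individual job; a good pool needs considerably more than this.
class SimpleThreadPool
{
public:
    explicit SimpleThreadPool(std::size_t num_threads)
    {
        for (std::size_t i = 0; i < num_threads; ++i)
            workers_.emplace_back([this] { worker_loop(); });
    }

    // Lets the workers drain the remaining jobs, then joins them.
    ~SimpleThreadPool()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& w : workers_) w.join();
    }

    SimpleThreadPool(const SimpleThreadPool&) = delete;
    SimpleThreadPool& operator=(const SimpleThreadPool&) = delete;

    void submit(std::function<void()> job)
    {
        assert(job); // an empty std::function cannot be called
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        wake_.notify_one();
    }

private:
    void worker_loop()
    {
        for (;;)
        {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return; // stopping and nothing left to run
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job(); // run outside the lock so other workers can dequeue
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};
```

Every worker contends on the same mutex, so this design scales poorly with many small jobs. Closing that gap is the "difficult" part, and it is the work a ready-made scheduler saves you.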
