Should a program with C/C++ threads run faster than the serial version?

Posted on 2024-12-10 14:14:47

I am learning about threads in C/C++. I was trying the examples for
the dot product of two vectors cited in the Pthreads Overview.

I ran both the serial and the threaded version of the code and found that the
serial version was faster than the threaded version. I thought it should be the
opposite.

I am running on a single CPU.

Comments (5)

命比纸薄 2024-12-17 14:14:47

The code you linked to has a few issues that you need to keep in mind:

  1. The serial version is doing an inner product of two length-100 vectors. The parallel version is doing an inner product of two length-400 vectors. You can see that in the array allocation (e.g., a = (double*) malloc (NUMTHRDS*VECLEN*sizeof(double)); -- NUMTHRDS is set to four and VECLEN 100 in the code). Therefore, the parallel program is doing four times the amount of work, but with four threads, so the naive assumption is that the serial and parallel programs will have the same run time.
  2. The parallel code is demonstrating a mutex for thread synchronization. This may force threads to wait on one another as the code runs.
  3. The code is using four threads. If your CPU does not have four or more hardware threads, then you cannot expect it to scale.
  4. There is an overhead in creating the threads, and with such a small problem it is probably a significant factor.

久伴你 2024-12-17 14:14:47

If there's only a single processor, then it's not much of a surprise.

The threaded code still has to execute the same amount of work as the serial code, and it has the additional burden of context switching to slow it down.

Multi-threaded code will show a speed up if parallelization is possible and there are multiple cores to share the work.

情泪▽动烟 2024-12-17 14:14:47

In general, there is no reason a threaded program should run faster than a serial program, or vice versa.

On a single CPU (or core), you'd have the overhead of switching from one thread to another.

Even on a multiple CPU (core) system, there is the overhead of coordinating threads.

If a problem is embarrassingly parallel, it should be relatively easier to solve it faster with multiple threads on multiple cores, but it is still possible to solve it in a way that performs poorly.

With GUI applications, having one thread respond to user actions while another thread carries on computations makes the system more responsive.

See also Amdahl's Law:

The speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program. For example, if a program needs 20 hours using a single processor core, and a particular portion of 1 hour cannot be parallelized, while the remaining promising portion of 19 hours (95%) can be parallelized, then regardless of how many processors we devote to a parallelized execution of this program, the minimum execution time cannot be less than that critical 1 hour. Hence the speedup is limited to at most 20×.

抚笙 2024-12-17 14:14:47

It will depend on your design. I did not take a look at your code, but parallel programming has its place in accelerating programs.

In some cases, threads are used just to keep the application responsive (which is fine). In others, you will use them to boost performance, but then you need to plan the parallel processing carefully and run it on a capable CPU.

Claudio M. Souza Junior

苏璃陌 2024-12-17 14:14:47

There are a bunch of cases where threading doesn't improve performance.

  • If your tasks are memory-bound, adding threads usually won't help. All you'll be doing is increasing contention for bus time and cache slots. As your threads push each other's data out of the cache, it increases the chances of a cache miss, which will slow things down -- even more so if you have more threads fighting over the bus.

  • If it's CPU-bound, the performance gain will be very dependent on how many cores you have. Once the number of threads exceeds the number of available cores, you won't gain anything.

  • If your tasks are working on the same data, you'll need synchronization between them. That synchronization will mean serializing some sections of code, and if the tasks are too closely related, you end up running one thread at a time anyway.

In each of these cases, you also have to figure in the costs inherent in the use of threads (context switching, extra resources, etc). If the costs exceed the gains, then you're very likely to actually slow down as a result of threading.
