Why does my parallel program show this behavior?

Posted 2025-02-13 21:15:57

I am trying to run my parallel program on a machine with 32 cores and 4 thread contexts. Why does my speedup graph have so many zigzags?

Image File

Comments (2)

春庭雪 2025-02-20 21:15:57

It is not possible to say without doing a detailed analysis of your application ... and the platform that you are running it on. In general, there are all sorts of things that can cause "jittery" performance measurements.

However, the overall curve is consistent with typical speedup behavior in parallel code. Increasing the number of workers gives you close to linear speedup until you reach a threshold ... that typically relates to the number of available cores. The performance can go no higher. Indeed, if you keep increasing the number of workers, various resource contention effects can cause the overall performance to drop.
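
For reference, the quantities usually plotted on such a graph can be written as follows. This is the standard textbook definition, not something taken from the asker's code. The symbols $T(n)$ (measured wall-clock time with $n$ workers) and $P$ (number of available cores) are introduced here only for illustration.

```latex
S(n) = \frac{T(1)}{T(n)}, \qquad E(n) = \frac{S(n)}{n}
```

In the ideal case $S(n) \approx n$ while $n \le P$, and the curve flattens near $S(n) \approx P$ once $n$ exceeds $P$. Contention can then push it lower, which matches the shape described above.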

(I have no idea what "fastflow" vs "threads" means on your graph, or whether that is relevant to your question.)

姐不稀罕 2025-02-20 21:15:57

The drop around 32 is normal if you have 32 compute units.

If you have more active threads than there are compute units, you will often see the threads move from one core to another, and this causes cache misses.
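
To see which points on the graph fall into that regime, it can help to label each worker count by how it compares with the hardware. The following is a minimal Python sketch, not the asker's setup: `logical_cpus` and `classify_worker_count` are hypothetical helper names, and the number of physical cores is passed in by hand because the standard library only reports logical CPUs (hardware threads).

```python
import os
from typing import Optional


def logical_cpus() -> int:
    """Number of logical CPUs (hardware threads) this process may run on."""
    if hasattr(os, "sched_getaffinity"):
        # Available on Linux; honours the process affinity mask.
        return len(os.sched_getaffinity(0))
    # Portable fallback; cpu_count() can return None.
    return os.cpu_count() or 1


def classify_worker_count(workers: int,
                          physical_cores: int,
                          logical: Optional[int] = None) -> str:
    """Label a worker count relative to the cores and hardware threads.

    physical_cores is supplied by the caller (an assumption of this sketch).
    """
    if logical is None:
        logical = logical_cpus()
    if workers <= physical_cores:
        return "at most one worker per core"
    if workers <= logical:
        return "workers share cores via hardware threads"
    return "oversubscribed"


if __name__ == "__main__":
    # Illustrative numbers only: 32 cores, assumed 4 hardware threads each.
    for n in (16, 32, 64, 128, 200):
        print(n, classify_worker_count(n, physical_cores=32, logical=128))
```

Only the last category corresponds to "more active threads than compute units". Whether the middle category helps or hurts depends on the workload, so it is worth marking those points separately on the graph.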

You are doing the right thing, though: Always measure. Do not just assume that an optimization will make your code faster. Instead show it by measuring the speedup.
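
One common way to make such measurements less jittery is to repeat each run several times and plot the median instead of a single sample. The Python sketch below illustrates that idea under stated assumptions: `run` is a hypothetical callable that launches the program with a given number of workers, and `collect_timings`, `speedup_table` and `relative_spread` are names made up for this example. It is not the method used to produce the graph in the question.

```python
import statistics
import time
from typing import Callable, Dict, Iterable, List


def collect_timings(run: Callable[[int], None],
                    worker_counts: Iterable[int],
                    repeats: int = 5) -> Dict[int, List[float]]:
    """Time run(n) `repeats` times for every worker count n."""
    samples: Dict[int, List[float]] = {}
    for n in worker_counts:
        samples[n] = []
        for _ in range(repeats):
            start = time.perf_counter()
            run(n)  # hypothetical hook: runs the program with n workers
            samples[n].append(time.perf_counter() - start)
    return samples


def speedup_table(samples: Dict[int, List[float]]) -> Dict[int, float]:
    """Speedup per worker count, relative to the median 1-worker time."""
    baseline = statistics.median(samples[1])
    return {n: baseline / statistics.median(ts)
            for n, ts in sorted(samples.items())}


def relative_spread(samples: Dict[int, List[float]]) -> Dict[int, float]:
    """(max - min) / median per worker count; large values flag noisy points."""
    return {n: (max(ts) - min(ts)) / statistics.median(ts)
            for n, ts in samples.items()}
```

If the zigzags shrink when medians are plotted, they were mostly measurement noise. If they stay at the same worker counts across repeats, they are more likely a property of the program or the machine.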
