Performance drop with OpenMP on x64
I noticed some very strange behavior while running my code built for x64 with VS 2008 in Release mode.
I was looking for a performance improvement, so I switched my project to the x64 platform (the original project was written as a Win32 console application).
I did gain some improvement, but I also noticed something odd during the run. I use OpenMP pragmas to parallelize "for" loops. When I run the Win32 build on a quad-core i5 processor, I see 100% CPU load and 4 threads. That is fine.
But when I switch to x64 (in Project Properties->Configuration Manager->...), the load drops after 3-5 minutes from 100% to 75%, 50%, or even 25%, although there are still 4 (!) threads running according to Resource Monitor.
How is it possible to get only 25% of the total CPU capacity with all 4 threads? Each thread is supposed to run on its own core.
P.S. The OS is Windows 7 x64 and the compiler is VS 2008.
Thanks in advance! Any suggestions are much appreciated!
A.K.
Comments (1)
Solved:
I think I know the answer to my question: only 25% CPU load while all 4 threads are still alive means that 3 of the 4 cores have finished their work and are waiting for the last core to finish its work. That core is probably stuck in a long computation (there is an integral calculation, and if the integral does not converge, it reduces the step size and continues calculating).
I do not know this for sure; it is my guess.
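If that guess is right, the pattern is load imbalance: a "parallel for" ends with an implicit barrier, so threads that run out of iterations sit idle until the slowest one finishes. Below is a minimal sketch of one common way to reduce that, assuming the integrals are independent of each other. The names trapezoid, integrate_adaptive, and integrate_all, the integrand sin(k*x), and the tolerance and cap values are all hypothetical stand-ins, not the code from the question. The only OpenMP feature used is the schedule(dynamic, 1) clause, which hands out iterations one at a time instead of in fixed per-thread blocks.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Trapezoid estimate of the integral of sin(k*x) over [0, 1] with n intervals.
// (Hypothetical integrand, chosen because larger k needs more intervals.)
static double trapezoid(double k, int n) {
    const double h = 1.0 / n;
    double sum = 0.5 * (std::sin(0.0) + std::sin(k));
    for (int i = 1; i < n; ++i) {
        sum += std::sin(k * i * h);
    }
    return sum * h;
}

// Hypothetical adaptive integrator: doubles the interval count (halves the
// step) until two successive estimates agree. The cap on the interval count
// keeps a non-converging case from running indefinitely.
double integrate_adaptive(double k, double tol = 1e-9,
                          int max_intervals = 1 << 20) {
    assert(tol > 0.0 && max_intervals >= 8);
    int n = 4;
    double prev = trapezoid(k, n);
    while (n < max_intervals) {
        n *= 2;
        const double cur = trapezoid(k, n);
        if (std::fabs(cur - prev) < tol) {
            return cur;
        }
        prev = cur;
    }
    return prev;  // best estimate reached before hitting the cap
}

// Evaluates one integral per parameter. Iterations have very different
// costs, so they are handed out one at a time rather than in equal blocks.
std::vector<double> integrate_all(const std::vector<double>& ks) {
    std::vector<double> out(ks.size());
    // OpenMP 2.0 (the version VS 2008 implements) requires a signed index.
    const int count = static_cast<int>(ks.size());
    #pragma omp parallel for schedule(dynamic, 1)
    for (int i = 0; i < count; ++i) {
        out[i] = integrate_adaptive(ks[i]);
    }
    return out;
}
```

Calling integrate_all with a vector of parameters returns one result per entry; without OpenMP enabled the pragma is ignored and the loop runs serially with the same results. Dynamic scheduling cannot help if a single iteration dominates the run time, which is why the sketch also caps the refinement; whether either measure fits the original program is an assumption.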