Performance degrades with more than 2 threads on Xeon X5355
I have a problem that seems strange to me, though it may not be strange to some of you.
I am writing an application that uses Boost threads, with Boost barriers to synchronize them. I have two machines to test the application on.
Machine 1 is a Core 2 Duo (T8300) machine (Windows XP Professional, 4GB RAM), where I get the following performance figures:
Number of threads: 1, TPS: 21
Number of threads: 2, TPS: 35 (66% improvement)
Increasing the number of threads further decreases the TPS, but that is understandable as the machine has only two cores.
Machine 2 has 2 quad-core (Xeon X5355) CPUs (Windows Server 2003, 4GB RAM), giving 8 effective cores.
Number of threads: 1, TPS: 21
Number of threads: 2, TPS: 27 (28% improvement)
Number of threads: 4, TPS: 25
Number of threads: 8, TPS: 24
As you can see, performance degrades beyond 2 threads (even though the machine has 8 cores). If the program had a bottleneck, I would expect it to show up with 2 threads as well.
Any ideas or explanations? Does the OS play a role in performance? The Core 2 Duo (2.4GHz) seems to scale better than the Xeon X5355 (2.66GHz), even though the Xeon has the higher clock speed.
Thank you
-Zoolii
Comments (3)
The clock speed and the operating system have less to do with it than the way your code is written. Things to check might include:
One tool at your disposal when analyzing software bottlenecks is the simple thread dump. Taking a few dumps over the course of a run should show where threads spend their time waiting. You may be able to use that output to reevaluate your code.
Adding more CPUs does not always equate to better performance; locking and contention can severely degrade performance. Factors to consider are:
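To make the point about locking and contention concrete, here is a minimal sketch, not the original poster's code. It uses `std::thread` and `std::mutex` in place of Boost threads, and the names `count_with_shared_lock`, `count_with_local_totals` and `seconds_taken` are made up for illustration. Both functions do the same amount of counting; the first takes one shared mutex on every step, the second touches shared state only once per thread.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical workload: every thread updates ONE counter under ONE mutex,
// so the threads spend most of their time waiting for each other.
std::uint64_t count_with_shared_lock(unsigned threads, std::uint64_t per_thread) {
    assert(threads > 0);
    std::mutex m;
    std::uint64_t total = 0;
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&m, &total, per_thread] {
            for (std::uint64_t i = 0; i < per_thread; ++i) {
                std::lock_guard<std::mutex> guard(m);  // contended on every step
                ++total;
            }
        });
    }
    for (auto& th : pool) th.join();
    return total;
}

// Same work, but each thread accumulates privately and publishes its
// result once, so there is no lock on the hot path.
std::uint64_t count_with_local_totals(unsigned threads, std::uint64_t per_thread) {
    assert(threads > 0);
    std::vector<std::uint64_t> partial(threads, 0);
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&partial, t, per_thread] {
            std::uint64_t local = 0;
            for (std::uint64_t i = 0; i < per_thread; ++i) ++local;
            partial[t] = local;  // single write to this thread's own slot
        });
    }
    for (auto& th : pool) th.join();
    std::uint64_t total = 0;
    for (std::uint64_t p : partial) total += p;
    return total;
}

// Small timing helper: returns the wall-clock seconds a callable took.
template <typename F>
double seconds_taken(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Wrapping each function in `seconds_taken` for 1, 2, 4 and 8 threads gives a rough picture of how each variant scales on a given machine. If the real application shows the same pattern as the shared-lock variant, a lock or barrier on the hot path is a likely suspect, though only profiling the actual code can confirm that.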
Based on experience, it could be that Intel's policy is 2 threads or dual-process only on that processor; that only pthreads can be used with that version of the operating system; that the two processors were designed to conform to different laws with different provisions or allowances; that its own thread process is not allowed; or that more than n threads are being backed out by the processor, and the processing of the error messages reporting this is slowing down the throughput of the two cores and may lead to the deactivation of cores 3 and 4.