Since you said "heavily multi-threaded", I'd say that more cores are preferable. Higher clock speeds will mostly just mean faster context switching between threads. If your algorithms are parallelized, more cores will give you the greater boost.
If your application is currently CPU-bound, and if it is easy to increase the number of parallel threads that are doing computational work (i.e. there are minimal dependencies between them), then you will benefit from increasing the number of cores.
If neither of these is true (for instance, if most of your threads are for handling disk, network or user IO), you won't see much benefit either way.
If your application scales well, you can assume your processing power for comparison purposes is speed * cores when comparing machines of the same architecture.
Based on those assumptions, your throughput is likely to be proportional to
4 * 2.53 = 10.12
6 * 2.4 = 14.4
The 6-core system could have up to 40% higher throughput.
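The same back-of-the-envelope estimate as a quick Python sketch. The function name `relative_throughput` is made up for illustration, and I'm assuming the clock figures are in GHz and that the workload scales perfectly across cores on the same architecture:

```python
def relative_throughput(cores: int, clock_ghz: float) -> float:
    """Crude throughput proxy: cores * clock speed.

    Only meaningful when comparing CPUs of the same architecture on a
    workload that scales well across all cores (an assumption).
    """
    return cores * clock_ghz


quad = relative_throughput(4, 2.53)
hexa = relative_throughput(6, 2.4)

print(f"4 cores @ 2.53: {quad:.2f}")
print(f"6 cores @ 2.40: {hexa:.2f}")
print(f"6-core advantage: {(hexa / quad - 1) * 100:.0f}%")
```

The advantage works out to roughly 42%, which is where the "up to 40%" figure comes from.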
However, for comparing CPUs of similar but not identical architecture, I suggest you look at SPEC_int_rate or SPEC_fp_rate for your CPUs. (Only use the latter if your workload is floating-point intensive; if in doubt, it's not ;)
It depends on the nature of the tasks. If the tasks are serial in nature ("task 1 must be accomplished before task 2, task 2 must be accomplished before task 3," etc.) then processor speed is going to win; if the tasks can be executed in parallel ("tasks 1, 2, and 3 do not use data between them, and task 4 collates the results") then core count is more important.
In other words, it depends entirely on what the processors are being used for.
2.53/2.4 = 1.05
6/4 = 1.5
If the threads are relatively independent of each other - which is definitely the case in a server app where there's just a bunch of threads in a pool, and each of them is handed the next request from a client - then the answer is indeed obvious: about 5% more clock speed versus 50% more cores.
I also want to add that three more factors should be taken into account when doing a similar analysis.
Is the task memory intensive? I/O intensive? Programs that spend a lot of their time waiting for memory/disk/network accesses to finish do not benefit from higher frequency, since the processor is not the bottleneck. They do, however, benefit from parallelism, since more requests can be fired in parallel and some of the latency can be hidden.
As someone already pointed out, take application scalability into account. If the app only has 4 threads of concurrent execution, 6 cores won't help but higher frequency may.
The amount of serial code (inherent, or accidental due to contention for critical sections or load imbalance) can mean that higher frequency is better than more cores. When running serial code, the speed of the single core running the single thread is what matters. This follows from Amdahl's law.
Execution time with N cores = Time in serial code + (Time in parallel code as 1 thread / N)
Now if the parallel part is already small, then increasing N will only help a little. Say the parallel part takes 10s when run as 1 thread and the serial part takes 5s.
With 1 core: 15s
With 2 cores: 10s
With 4 cores: 7.5s
With 5 cores: 7s
With 6 cores: 6.67s
Now instead, if we had 4 cores with 5% higher frequency,
Total time will be 7.5/1.05 = 7.14s. Thus, 6 cores still come out faster (6.67s vs 7.14s).
Note: I assumed frequency translates perfectly into performance. This is often not the case because of memory and I/O accesses, but it is fine for this analysis.
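If you want to plug in your own numbers, here is a minimal Python sketch of the formula above. The function name `execution_time` and the `speedup` parameter are my own, and it keeps the same simplifying assumption that clock speed translates perfectly into performance:

```python
def execution_time(serial_s: float, parallel_s: float, cores: int,
                   speedup: float = 1.0) -> float:
    """Amdahl-style estimate: serial time + parallel time / cores.

    `speedup` is a clock-speed factor (1.05 = 5% faster) and is assumed
    to scale both parts perfectly, which real hardware rarely does.
    """
    return (serial_s + parallel_s / cores) / speedup


# 5s of serial code, 10s of parallelizable code (measured on 1 thread)
for n in (1, 2, 4, 5, 6):
    print(f"{n} cores: {execution_time(5, 10, n):.2f}s")

print(f"4 cores, 5% faster clock: {execution_time(5, 10, 4, 1.05):.2f}s")
```

Try making the serial part larger (say 20s instead of 5s) and the faster 4-core configuration starts to win, which is the point about serial code above.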
By the way, one interesting side note: part of the reason your 6 cores look better is that the two configurations are not really comparable, since their power and cooling requirements (from a running-cost standpoint) are not the same. The 6-core will cost more to buy, more to run, more to cool, etc., and hence it is likely to provide higher performance.
@Joonas: Server apps can have critical sections. MySQL, for example, is full of them, and threads become effectively serial when they contend for the same critical section.
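To illustrate that last point, here is a toy Python sketch (not how MySQL actually locks anything; `run_workers` is a made-up helper and `time.sleep` stands in for real work). When every worker has to hold the same lock, the total time grows with the number of threads, no matter how many cores you have:

```python
import threading
import time


def run_workers(n_threads: int, hold_s: float, use_lock: bool) -> float:
    """Run n_threads workers that each 'work' for hold_s seconds.

    time.sleep stands in for real work. With use_lock=True the work
    happens inside one shared critical section, so workers take turns.
    """
    lock = threading.Lock()

    def worker() -> None:
        if use_lock:
            with lock:          # only one thread at a time gets past here
                time.sleep(hold_s)
        else:
            time.sleep(hold_s)  # independent work, can overlap

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


contended = run_workers(4, 0.05, use_lock=True)
independent = run_workers(4, 0.05, use_lock=False)
print(f"contended:   {contended:.2f}s")
print(f"independent: {independent:.2f}s")
```

The contended run should take roughly four times as long as the independent one, because the four workers end up running one after another.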