What are good reasons these days to set thread affinity rather than leave it to the OS?
Searching answers here for "thread affinity", I see a lot of interest in doing it but little justification for it, save possibly getting stable QueryPerformanceCounter results.
Assuming a modern OS and a modern 2-4 socket workstation/server class machine with modern 4-6 core CPUs, what good reasons would anyone have for thinking they know better than their OS's scheduler? Are there any real-world situations where taking more control of thread affinity is the right thing to do? What sort of performance benefits can be demonstrated?
The last time I saw a really good case for setting thread affinity somewhere (as in, it was backed up by concrete results showing genuine and significant improvements in system performance), it was some obscure thing to do with Win2K device drivers. But I haven't seen anything like that in years so when someone tells me they need to control thread affinity (but not why) these days I am deeply sceptical... but curious to be shown otherwise.
The primary reason is code that depends heavily on caching. The OS scheduler doesn't necessarily take that into account to the degree you might like.
I use it to assign threads to cores; for example, in a simulation you might do the physics entirely on one core and let the rest of the computation run on another. It makes sense to be able to control this if you're in a tightly controlled environment where you know the hardware.
Of course, configuring this needs to be done per system, so by default I let the OS decide the cores on which to run, but keep the option of restricting core usage.
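As a rough illustration of that "OS decides by default, restriction is optional" setup, here is a minimal sketch. It assumes Linux, where Python exposes `os.sched_setaffinity`; the helper name `apply_affinity` and its `cpus` parameter are made up for this example and are not from the answer above.

```python
import os
from typing import Iterable, Optional, Set


def apply_affinity(cpus: Optional[Iterable[int]] = None) -> Set[int]:
    """Optionally restrict the caller to `cpus`; None leaves it to the OS.

    Hypothetical helper for illustration. Returns the set of CPUs the
    caller is allowed to run on after the call.
    """
    if not hasattr(os, "sched_setaffinity"):
        # Assumption: no Linux-style affinity calls on this platform,
        # so report every CPU and change nothing.
        return set(range(os.cpu_count() or 1))
    if cpus is not None:
        # pid 0 means "the caller"; an empty or invalid set raises OSError.
        os.sched_setaffinity(0, set(cpus))
    return set(os.sched_getaffinity(0))
```

A per-system config file could then hold the core list, with a missing entry mapping to `None` so that the scheduler keeps full control.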
In the OS kernel, and sometimes in kernel-mode drivers, you need to perform the same action on every CPU (e.g. update a system register). You can do that in a loop in a single thread, changing the affinity on each iteration.
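The loop described above can be sketched in user space. This is only an analogy, not kernel or driver code: it assumes Linux and Python's `os.sched_getaffinity` / `os.sched_setaffinity`, and `run_on_each_cpu` and `action` are hypothetical names.

```python
import os
from typing import Callable, List


def run_on_each_cpu(action: Callable[[int], None]) -> List[int]:
    """Call `action(cpu)` once per allowed CPU, pinned to that CPU each time."""
    original = os.sched_getaffinity(0)      # CPUs the caller may use right now
    visited: List[int] = []
    try:
        for cpu in sorted(original):
            os.sched_setaffinity(0, {cpu})  # restrict the caller to this one CPU
            action(cpu)                     # runs while pinned to `cpu`
            visited.append(cpu)
    finally:
        os.sched_setaffinity(0, original)   # always restore the original mask
    return visited
```

The `finally` block matters here: without it, an exception in `action` would leave the thread stuck on whichever CPU the loop had reached.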
For desktops it's quite unnecessary.
But I can see some applications where it would help. For example, a CPU's cache works best when the app running on that CPU doesn't change.
Another possibility is you have a critical task: you give it an entire CPU, and the other tasks use the rest of the CPUs.
Or the opposite: you have some low-priority tasks, you put them all on one CPU, then leave the others free for more important tasks (using process priority will give you most of this benefit without setting affinity, but I can imagine some memory-heavy cases where it wouldn't).
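Both layouts come down to splitting the available CPUs into two disjoint groups. A minimal sketch of that split, with `partition_cpus` and `reserved` as hypothetical names chosen for this example:

```python
from typing import Iterable, Set, Tuple


def partition_cpus(cpus: Iterable[int], reserved: int = 1) -> Tuple[Set[int], Set[int]]:
    """Split `cpus` into (reserved group, everything else).

    The reserved group can host the critical task, or, for the opposite
    layout, all of the low-priority tasks.
    """
    ordered = sorted(set(cpus))
    if not 0 < reserved < len(ordered):
        raise ValueError("need at least one CPU on each side of the split")
    return set(ordered[:reserved]), set(ordered[reserved:])
```

On Linux each group could then be passed to `os.sched_setaffinity` for the matching process. Note that pinning the critical task alone does not keep other work off its CPU; everything else has to be confined to the second group as well.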
I would agree it's best to leave it to the OS to figure this out in most situations. However, the most common reason to go for thread affinity, as far as I have seen, is when you need good cache locality. In multi-CPU systems, when a particular CPU caches something for itself and the same thing has also been cached on some other CPU, I believe it can automatically get invalidated on the other CPU. So if a particular thread keeps changing the CPU it executes on, the cache hit rate will be too low. In this case I guess it makes sense for the programmer to be the better judge of CPU affinities.
I also think the above point by Ariel about making sure a critical task constantly gets a CPU without throttling other low priority processes also makes sense.