Threads and cores

Posted on 2024-09-02 18:12:49

Suppose I have X cores on my machine and I start X threads. Let's assume, for the sake of argument, that each thread is completely separate in terms of the memory, disk, etc. that it uses. Will the OS know to send each thread to its own core, or will it time-slice multiple threads on one core?
What the question boils down to is this: if I have X cores and my program must do independent calculations, should I start X threads, and will each of them be sent to its own core? Or is the presumption that because I have X cores I can start X threads completely wrong? I'm thinking it is.
This is with C#.

Comments (5)

一影成城 2024-09-09 18:12:49

I'm going to say no...

The .NET team introduced the Task Parallel Library (TPL) to explicitly delegate thread execution so that it makes use of multiple cores. Windows Vista didn't have much intelligence built in to support the OS delegating threads to multiple cores. I'm not surprised to see this improvement in .NET Framework 4.0, considering that Windows 7 has much improved support for multiple cores.
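As a rough illustration of the TPL approach mentioned above, here is a minimal sketch that runs independent calculations with `Parallel.For` and caps concurrency at `Environment.ProcessorCount`. The `Work` function and the job count are hypothetical stand-ins, not anything from the original answer.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TplSketch
{
    // Hypothetical stand-in for one independent, CPU-bound calculation.
    public static long Work(int seed)
    {
        long acc = seed;
        for (int i = 0; i < 1_000_000; i++)
            acc = (acc * 31 + i) % 1_000_003;
        return acc;
    }

    // Runs `jobs` independent calculations and returns the sum of their results.
    public static long RunAll(int jobs)
    {
        var options = new ParallelOptions
        {
            // Cap concurrency at the number of logical processors.
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };
        long total = 0;
        Parallel.For(0, jobs, options, i =>
        {
            long result = Work(i);
            // The only shared state is the running total, updated atomically.
            Interlocked.Add(ref total, result);
        });
        return total;
    }

    public static void Main()
    {
        Console.WriteLine($"Logical processors: {Environment.ProcessorCount}");
        Console.WriteLine($"Total: {RunAll(64)}");
    }
}
```

Note that the TPL only decides how many worker threads are busy; which core each one runs on is still up to the OS scheduler.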

阳光①夏 2024-09-09 18:12:49

It would entirely depend on how much work each thread is going to do. If you were to start up 4 threads on a 4-core machine and simply run a tight loop then it is most likely going to consume 100% of total CPU time.

On the wider question of whether, given k threads and k cores, the OS will automatically schedule threads 0 to k-1 on cores 0 to k-1: this cannot be guaranteed. In general, once a thread is about to be scheduled to run, it will be allocated to the next available CPU. However, the OS will, I believe, be intelligent and try to reuse the core that the thread previously ran on, given that thread-local data is likely to be cached on that core. That said, in today's world of shared processor caches, this isn't a prerequisite for good thread scheduling.

You can influence affinity for a given core by setting a processor affinity mask (in .NET, the `ProcessorAffinity` property on `Process` or `ProcessThread`; in the Win32 API, `SetThreadAffinityMask`). However, I tend to shy away from doing this, because the OS is generally pretty good at scheduling your threads sensibly.
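For illustration only, here is a minimal sketch of restricting affinity from managed code. It uses `Process.ProcessorAffinity`, which applies to the whole process rather than to a single thread and is supported on Windows and Linux only. The `LowestCore` helper is a hypothetical name, and the `OperatingSystem.IsWindows()` / `IsLinux()` checks assume .NET 5 or later.

```csharp
using System;
using System.Diagnostics;

public static class AffinitySketch
{
    // Returns a mask containing only the lowest set bit of `mask`,
    // i.e. the lowest-numbered logical processor the mask allows.
    public static long LowestCore(long mask)
    {
        return mask & -mask;
    }

    public static void Main()
    {
        // Process.ProcessorAffinity is only supported on Windows and Linux.
        if (!(OperatingSystem.IsWindows() || OperatingSystem.IsLinux()))
        {
            Console.WriteLine("Processor affinity is not supported on this platform.");
            return;
        }

        using Process self = Process.GetCurrentProcess();
        long current = (long)self.ProcessorAffinity;
        Console.WriteLine($"Current affinity mask: 0x{current:X}");

        long single = LowestCore(current);
        if (single != 0)
        {
            // Pin the whole process to one logical processor it is already allowed to use.
            self.ProcessorAffinity = (IntPtr)single;
            Console.WriteLine($"New affinity mask: 0x{single:X}");
        }
    }
}
```

Pinning like this removes the scheduler's freedom to move work to an idle core, which is one reason to leave affinity alone unless measurements say otherwise.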

CAUTION

There are some interesting memory-access issues across multiple threads that can cause threads to stall each other even where there is no locking involved.

Let's say that you have a large array of values and you want n threads to operate on them. You must ensure that each thread accesses data that is on a separate cache line from the data accessed by other threads; otherwise the cores keep invalidating each other's copy of the line, an effect usually called false sharing. This is a low-level issue that .NET programmers, unlike those who grew up on C++ or lower-level platforms, are not used to dealing with.
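The cache-line point can be sketched roughly as follows. Each thread increments its own counter in a shared array; with a stride of 1 the counters sit next to each other, while with a stride of one cache line's worth of `long` slots they are at least 64 bytes apart. The 64-byte line size is an assumption (common, but not universal), and `Count` is a hypothetical helper written for this example.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class FalseSharingSketch
{
    // Assumed cache-line size in bytes; 64 is common but not universal.
    const int CacheLineBytes = 64;
    // Number of long slots that span one assumed cache line.
    public const int PaddedStride = CacheLineBytes / sizeof(long);

    // Each thread increments its own slot; `stride` controls how far apart the slots are.
    public static long[] Count(int threads, int iterations, int stride)
    {
        var slots = new long[threads * stride];
        var workers = new Thread[threads];
        for (int t = 0; t < threads; t++)
        {
            int index = t * stride; // slot touched only by this thread
            workers[t] = new Thread(() =>
            {
                for (int i = 0; i < iterations; i++)
                    slots[index]++;
            });
            workers[t].Start();
        }
        foreach (var w in workers) w.Join();

        var result = new long[threads];
        for (int t = 0; t < threads; t++) result[t] = slots[t * stride];
        return result;
    }

    public static void Main()
    {
        int threads = Environment.ProcessorCount;
        foreach (int stride in new[] { 1, PaddedStride })
        {
            var sw = Stopwatch.StartNew();
            Count(threads, 50_000_000, stride);
            Console.WriteLine($"stride {stride}: {sw.ElapsedMilliseconds} ms");
        }
    }
}
```

Both layouts produce the same counts, since no slot is shared; only the timing may differ, and by how much depends on the hardware and on what the JIT does with the loop.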

The problem is excellently demonstrated in this article from MSDN magazine. It makes for fascinating reading.

把人绕傻吧 2024-09-09 18:12:49

I guess this might depend on the platform and OS. In my experience, with a C++ console application on Linux, using X threads on X cores is exactly the right thing to do if you need to squeeze as much performance as possible out of a machine. However, note that any concurrent task (including a GUI) will eat into the CPU time available to your program. On a dedicated server without a GUI, I had each core 99-100% used exclusively by my program.
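Since the question is about C#, here is a minimal sketch of the same "X threads on X cores" idea there: one `Thread` per logical processor, each doing an independent share of a calculation. The sum-of-squares workload and the `SumOfSquares` name are hypothetical choices for this example.

```csharp
using System;
using System.Threading;

public static class OneThreadPerCore
{
    // Sums i*i for i in [0, n), splitting the range across `threads` worker threads.
    public static long SumOfSquares(int n, int threads)
    {
        var partial = new long[threads];
        var workers = new Thread[threads];
        for (int t = 0; t < threads; t++)
        {
            int id = t; // copy the loop variable so each closure sees its own value
            workers[t] = new Thread(() =>
            {
                long local = 0; // accumulate locally, publish once at the end
                for (int i = id; i < n; i += threads)
                    local += (long)i * i;
                partial[id] = local;
            });
            workers[t].Start();
        }
        foreach (var w in workers) w.Join();

        long total = 0;
        foreach (long p in partial) total += p;
        return total;
    }

    public static void Main()
    {
        // Environment.ProcessorCount reports logical processors, not physical cores.
        int cores = Environment.ProcessorCount;
        Console.WriteLine($"Using {cores} threads");
        Console.WriteLine(SumOfSquares(100_000_000, cores));
    }
}
```

Nothing here tells the OS where to run each thread; the sketch only makes sure there is as much runnable work as there are logical processors.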

双马尾 2024-09-09 18:12:49

Since C# uses native threads, I feel I can comment, even though my experience is mostly with Java (on Windows). In general, the OS will try to balance the load, so if you max out a core with a computationally intensive task on one thread, few other threads will be scheduled on that core.

I recently wrote some CPU-intensive multi-threaded code using a task framework, where the work is broken into small tasks and fed to N queues. Each queue is owned by a thread. I got a roughly linear speed-up as I increased the number of threads from 1 to X, where X was the number of cores.

So in general, the answer is yes: you can expect the OS to do the right thing, especially as the number of threads increases and approaches the number of cores.
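The queue-per-thread arrangement described above could look roughly like this in C#. This is a sketch under assumptions, not the framework the answer refers to: tasks are dealt round-robin into N queues before any worker starts, each queue is drained only by its owning thread, and `SmallTask` is a hypothetical unit of work.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

public static class QueuePerThreadSketch
{
    // Hypothetical small CPU-bound task.
    static long SmallTask(int seed)
    {
        long acc = seed;
        for (int i = 0; i < 200_000; i++)
            acc = (acc * 31 + i) % 1_000_003;
        return acc;
    }

    // Runs `taskCount` tasks on `threads` workers, one private queue per worker.
    public static long Run(int taskCount, int threads)
    {
        var queues = new Queue<int>[threads];
        for (int q = 0; q < threads; q++) queues[q] = new Queue<int>();
        // Deal the tasks round-robin before any worker starts.
        for (int i = 0; i < taskCount; i++) queues[i % threads].Enqueue(i);

        var partial = new long[threads];
        var workers = new Thread[threads];
        for (int t = 0; t < threads; t++)
        {
            int id = t;
            workers[t] = new Thread(() =>
            {
                long local = 0;
                // Only the owning thread touches its queue, so no locking is needed.
                while (queues[id].Count > 0)
                    local += SmallTask(queues[id].Dequeue());
                partial[id] = local;
            });
            workers[t].Start();
        }
        foreach (var w in workers) w.Join();

        long total = 0;
        foreach (long p in partial) total += p;
        return total;
    }

    public static void Main()
    {
        // Time the same workload with 1..X threads, X being the logical processor count.
        for (int threads = 1; threads <= Environment.ProcessorCount; threads++)
        {
            var sw = Stopwatch.StartNew();
            Run(2000, threads);
            Console.WriteLine($"{threads} thread(s): {sw.ElapsedMilliseconds} ms");
        }
    }
}
```

Whether the timings scale close to linearly depends on the machine and on whatever else is running, so treat the output as a measurement to inspect rather than a guaranteed result.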

雾里花 2024-09-09 18:12:49

Typically it's up to the OS scheduler to assign tasks to the execution cores.
Let N be the number of tasks you want to run and X be the number of execution cores.

If N < X, your machine's resources will not be fully used unless you have other tasks running.
If N >= X, the OS will make a best effort to load-balance the threads across all available cores.
In reality, you can't guarantee that all tasks will run on separate cores unless you enforce affinity on each task thread.
As a matter of fact, an older OS that doesn't understand SMT processors can be fooled into allocating multiple tasks to a single physical core while other cores sit idle.
