What are the differences between a "coroutine" and a "thread"?
First read: Concurrency vs Parallelism - What is the difference?
Short answer: With threads, the operating system switches running threads preemptively according to its scheduler, which is an algorithm in the operating system kernel. With coroutines, the programmer and programming language determine when to switch coroutines; in other words, tasks are cooperatively multitasked by pausing and resuming functions at set points, typically (but not necessarily) within a single thread.
Long answer: In contrast to threads, which are pre-emptively scheduled by the operating system, coroutine switches are cooperative, meaning the programmer (and possibly the programming language and its runtime) controls when a switch will happen.
A language that supports native threads can execute its user threads on the operating system's threads (kernel threads). Every process has at least one kernel thread. Kernel threads are like processes, except that they share the memory space of their owning process with all the other threads in that process. A process "owns" all its assigned resources, like memory, file handles, sockets, device handles, etc., and these resources are all shared among its kernel threads.
The operating system scheduler is the part of the kernel that runs each thread for a certain amount of time (on a single-processor machine). The scheduler allocates time (timeslicing) to each thread, and if the thread isn't finished within that time, the scheduler pre-empts it (interrupts it and switches to another thread). Multiple threads can run in parallel on a multi-processor machine, as each thread can be (but doesn't necessarily have to be) scheduled onto a separate processor.
On a single processor machine, threads are timesliced and preempted (switched between) quickly (on Linux the default timeslice is 100ms) which makes them concurrent. However, they can't be run in parallel (simultaneously), since a single-core processor can only run one thing at a time.
Coroutines and/or generators can be used to implement cooperative functions. Instead of being run on kernel threads and scheduled by the operating system, they run in a single thread until they yield or finish, yielding to other functions as determined by the programmer. Languages with generators, such as Python and ECMAScript 6, can be used to build coroutines. Async/await (seen in C#, Python, ECMAscript 7, Rust) is an abstraction built on top of generator functions that yield futures/promises.
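For instance, here is a minimal sketch of async/await-style coroutines (assuming Python 3 and the standard asyncio module; the worker/main function names are made up for illustration). Both coroutines make progress concurrently, but only at the explicit await suspension points, and everything runs on a single OS thread:

```python
import asyncio

async def worker(name, delay):
    # 'await' is the explicit suspension point: the coroutine yields
    # control here so the event loop can resume another coroutine.
    print(f"{name}: started")
    await asyncio.sleep(delay)   # stands in for waiting on I/O
    print(f"{name}: finished")

async def main():
    # Interleave the two coroutines on one thread.
    await asyncio.gather(worker("A", 0.2), worker("B", 0.1))

asyncio.run(main())
```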
In some contexts, coroutines may refer to stackful functions while generators may refer to stackless functions.
Fibers, lightweight threads, and green threads are other names for coroutines or coroutine-like things. They may sometimes look (typically on purpose) more like operating system threads in the programming language, but they do not run in parallel like real threads and work instead like coroutines. (There may be more specific technical particularities or differences among these concepts depending on the language or implementation.)
For example, Java had "green threads"; these were threads that were scheduled by the Java virtual machine (JVM) instead of natively on the underlying operating system's kernel threads. They did not run in parallel or take advantage of multiple processors/cores--since that would require a native thread! Since they were not scheduled by the OS, they were more like coroutines than kernel threads. Green threads are what Java used until native threads were introduced in Java 1.2.
Threads consume resources. In the JVM, each thread has its own stack, typically 1MB in size. 64k is the least amount of stack space allowed per thread in the JVM. The thread stack size can be configured on the command line for the JVM. Despite the name, threads are not free: each thread needs its own stack and thread-local storage (if any), and there is the cost of thread scheduling, context switching, and CPU cache invalidation. This is part of the reason why coroutines have become popular for performance-critical, highly concurrent applications.
Mac OS will only allow a process to allocate about 2000 threads, and Linux allocates an 8MB stack per thread and will only allow as many threads as will fit in physical RAM.
Hence, threads are the heaviest weight (in terms of memory usage and context-switching time), then coroutines, and finally generators are the lightest weight.
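To make the per-thread stack cost concrete, here is a small sketch (in Python rather than Java; threading.stack_size is a real standard-library call, while the 256 KiB figure is just an arbitrary example) showing that each OS thread reserves its own stack, which can be tuned before the threads are created:

```python
import threading

# Reserve a smaller stack for threads created after this call.
# threading.stack_size() takes a size in bytes; many platforms
# require at least 32 KiB and a multiple of the page size.
threading.stack_size(256 * 1024)

def task():
    print(f"{threading.current_thread().name} running")

threads = [threading.Thread(target=task, name=f"t{i}") for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```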
About 7 years late, but the answers here are missing some context on co-routines vs threads. Why are coroutines receiving so much attention lately, and when would I use them compared to threads?
First of all, if coroutines run concurrently (never in parallel), why would anyone prefer them over threads?
The answer is that coroutines can provide a very high level of concurrency with very little overhead. Generally, in a threaded environment you have at most 30-50 threads before the overhead wasted on actually scheduling these threads (by the system scheduler) significantly cuts into the amount of time the threads spend doing useful work.
Ok, so with threads you can have parallelism, but not too much parallelism; isn't that still better than a co-routine running in a single thread? Well, not necessarily. Remember, a co-routine can still provide concurrency without scheduler overhead - it simply manages the context switching itself.
For example, if you have a routine doing some work and it performs an operation you know will block for some time (e.g. a network request), with a co-routine you can immediately switch to another routine without the overhead of involving the system scheduler in the decision - yes, you the programmer must specify when co-routines can switch.
With a lot of routines doing very small bits of work and voluntarily switching between each other, you've reached a level of efficiency no scheduler could ever hope to achieve. You can now have thousands of coroutines working together as opposed to tens of threads.
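As a rough sketch of that scale (assuming Python's asyncio; the 10,000-task count and the fetch name are just illustrative), thousands of coroutines are cheap because each one is only a suspended function, not an OS thread:

```python
import asyncio

async def fetch(i):
    # Stands in for a blocking network request; the coroutine suspends
    # here and the event loop runs the other tasks in the meantime.
    await asyncio.sleep(0.1)
    return i

async def main():
    # Ten thousand concurrent "requests" on a single OS thread -- far
    # more than you could reasonably create as native threads.
    results = await asyncio.gather(*(fetch(i) for i in range(10_000)))
    print(len(results), "done")

asyncio.run(main())
```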
Because your routines now switch between each other at pre-determined points, you can also avoid locking on shared data structures (because you would never tell your code to switch to another coroutine in the middle of a critical section).
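A small sketch of that point (again assuming Python's asyncio): a task can only be switched out at an await, so a plain read-modify-write on shared data needs no lock as long as no await sits inside the critical section:

```python
import asyncio

counter = 0

async def bump(n):
    global counter
    for _ in range(n):
        # No lock needed: there is no 'await' between the read and the
        # write, so no other coroutine can run in the middle of it.
        counter += 1
    await asyncio.sleep(0)  # a deliberate, safe point to yield control

async def main():
    await asyncio.gather(*(bump(1_000) for _ in range(100)))
    print(counter)  # always 100000

asyncio.run(main())
```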
Another benefit is the much lower memory usage. With the threaded model, each thread needs to allocate its own stack, so your memory usage grows linearly with the number of threads you have. With co-routines, the number of routines you have doesn't have a direct relationship with your memory usage.
And finally, co-routines are receiving a lot of attention because in some programming languages (such as Python) your threads cannot run in parallel anyway - they run concurrently just like coroutines, but without the low memory footprint or the freedom from scheduler overhead that coroutines enjoy.
Coroutines are a form of sequential processing: only one is executing at any given time (just like subroutines AKA procedures AKA functions -- they just pass the baton among each other more fluidly).
Threads are (at least conceptually) a form of concurrent processing: multiple threads may be executing at any given time. (Traditionally, on single-CPU, single-core machines, that concurrency was simulated with some help from the OS -- nowadays, since so many machines are multi-CPU and/or multi-core, threads will de facto be executing simultaneously, not just "conceptually").
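A minimal contrast in Python (just a sketch; the output interleaving will vary from run to run): the two threads below are both runnable at once, and the OS scheduler decides when each gets the CPU and may preempt either at an arbitrary point, with no baton-passing in the code itself:

```python
import threading
import time

def spin(name):
    for i in range(3):
        print(f"{name}: {i}")
        time.sleep(0.01)  # the OS may suspend/resume us at any point

# Both threads are eligible to run at the same time; nothing in the
# code decides when control moves from one to the other.
t1 = threading.Thread(target=spin, args=("thread-1",))
t2 = threading.Thread(target=spin, args=("thread-2",))
t1.start(); t2.start()
t1.join(); t2.join()
```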
12 years late to the discussion, but a coroutine has the explanation in its name: coroutine can be decomposed into Co and Routine.
A routine in this context is just a sequence of operations/actions, and executing/processing a routine means the operations get executed one by one, in exactly the order specified.
Co stands for cooperation. A co-routine is asked to (or better, expected to) willingly suspend its execution to give other co-routines a chance to execute too. So a co-routine is about (willingly) sharing CPU resources so that others can use the same resource one is using oneself.
A thread, on the other hand, does not need to suspend its own execution. Being suspended is forced on the thread by the underlying hardware and OS. It is also done in a way that is mostly transparent to the thread, as it does not get notified and its state is not altered, but saved and later restored when the thread is allowed to continue.
One thing that is not true is that co-routines can never be executed concurrently and that race conditions cannot occur. It depends on the system the co-routines are running on, and it is easily possible to imagine co-routines for which that does not hold.
It does not matter how the co-routines suspend themselves. Back in Windows 3.1, int 03 was woven into programs (or had to be placed in there), and in C# we add yield.
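In Python the same idea can be sketched with plain generators, where yield is the point at which a routine willingly hands control back to a trivial round-robin "scheduler" (the routine and run_round_robin names here are made up for illustration):

```python
def routine(name, steps):
    for i in range(steps):
        print(f"{name}: step {i}")
        yield  # voluntary suspension: hand control back to the caller

def run_round_robin(routines):
    # A trivial cooperative scheduler: resume each routine in turn
    # until all of them have run to completion.
    while routines:
        current = routines.pop(0)
        try:
            next(current)
            routines.append(current)
        except StopIteration:
            pass

run_round_robin([routine("A", 3), routine("B", 2)])
```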
In a word: preemption. Coroutines act like jugglers that keep handing off to each other at well-rehearsed points. Threads (true threads) can be interrupted at almost any point and then resumed later. Of course, this brings with it all sorts of resource conflict issues, hence Python's infamous GIL - the Global Interpreter Lock.
Many thread implementations are actually more like coroutines.
It depends on the language you're using. For example, in Lua they are the same thing (the variable type of a coroutine is called "thread").
Usually, though, coroutines implement voluntary yielding, where you the programmer decide where to "yield", i.e., give control to another routine.
Threads instead are automatically managed (stopped and started) by the OS, and they can even run at the same time on multicore CPUs.
A coroutine is a cooperative function that runs asynchronously in the system, while a thread is a lightweight process that runs in parallel on the system.
A thread can share the memory and resources of its parent process.
In the operating system, a thread has a limited amount of memory of its own, while a co-routine needs hardly any, because co-routines execute on a thread. Co-routines are therefore cheaper to use compared to threads.