What are the differences between a "coroutine" and a "thread"?
First read: Concurrency vs Parallelism - What is the difference?
Short answer: With threads, the operating system switches running threads preemptively according to its scheduler, which is an algorithm in the operating system kernel. With coroutines, the programmer and programming language determine when to switch coroutines; in other words, tasks are cooperatively multitasked by pausing and resuming functions at set points, typically (but not necessarily) within a single thread.
Long answer: In contrast to threads, which are pre-emptively scheduled by the operating system, coroutine switches are cooperative, meaning the programmer (and possibly the programming language and its runtime) controls when a switch will happen.
A language that supports native threads can execute its user threads on the operating system's threads (kernel threads). Every process has at least one kernel thread. Kernel threads are like processes, except that they share the memory space of their owning process with all the other threads in that process. A process "owns" all its assigned resources, like memory, file handles, sockets, device handles, etc., and these resources are all shared among its kernel threads.
The operating system scheduler is the part of the kernel that runs each thread for a certain amount of time (on a single-processor machine). The scheduler allocates time (timeslicing) to each thread, and if the thread isn't finished within that time, the scheduler pre-empts it (interrupts it and switches to another thread). Multiple threads can run in parallel on a multi-processor machine, as each thread can be (but doesn't necessarily have to be) scheduled onto a separate processor.
On a single processor machine, threads are timesliced and preempted (switched between) quickly (on Linux the default timeslice is 100ms) which makes them concurrent. However, they can't be run in parallel (simultaneously), since a single-core processor can only run one thing at a time.
Coroutines and/or generators can be used to implement cooperative functions. Instead of being run on kernel threads and scheduled by the operating system, they run in a single thread until they yield or finish, yielding to other functions as determined by the programmer. Languages with generators, such as Python and ECMAScript 6, can be used to build coroutines. Async/await (seen in C#, Python, ECMAscript 7, Rust) is an abstraction built on top of generator functions that yield futures/promises.
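For instance, here is a minimal sketch of async/await-style coroutines (assuming Python 3 and the standard asyncio module; the worker/main function names are made up for illustration). Both coroutines make progress concurrently, but only at the explicit await suspension points, and everything runs on a single OS thread:

```python
import asyncio

async def worker(name, delay):
    # 'await' is the explicit suspension point: the coroutine yields
    # control here so the event loop can resume another coroutine.
    print(f"{name}: started")
    await asyncio.sleep(delay)   # stands in for waiting on I/O
    print(f"{name}: finished")

async def main():
    # Interleave the two coroutines on one thread.
    await asyncio.gather(worker("A", 0.2), worker("B", 0.1))

asyncio.run(main())
```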
In some contexts, coroutines may refer to stackful functions while generators may refer to stackless functions.
Fibers, lightweight threads, and green threads are other names for coroutines or coroutine-like things. They may sometimes look (typically on purpose) more like operating system threads in the programming language, but they do not run in parallel like real threads and work instead like coroutines. (There may be more specific technical particularities or differences among these concepts depending on the language or implementation.)
For example, Java had "green threads"; these were threads that were scheduled by the Java virtual machine (JVM) instead of natively on the underlying operating system's kernel threads. They did not run in parallel or take advantage of multiple processors/cores--since that would require a native thread! Since they were not scheduled by the OS, they were more like coroutines than kernel threads. Green threads are what Java used until native threads were introduced in Java 1.2.
Threads consume resources. In the JVM, each thread has its own stack, typically 1MB in size. 64k is the least amount of stack space allowed per thread in the JVM. The thread stack size can be configured on the command line for the JVM. Despite the name, threads are not free: each thread needs its own stack and thread-local storage (if any), and there is the cost of thread scheduling, context switching, and CPU cache invalidation. This is part of the reason why coroutines have become popular for performance-critical, highly concurrent applications.
Mac OS will only allow a process to allocate about 2000 threads, and Linux allocates an 8MB stack per thread and will only allow as many threads as will fit in physical RAM.
Hence, threads are the heaviest weight (in terms of memory usage and context-switching time), then coroutines, and finally generators are the lightest weight.
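To make the per-thread stack cost concrete, here is a small sketch (in Python rather than Java; threading.stack_size is a real standard-library call, while the 256 KiB figure is just an arbitrary example) showing that each OS thread reserves its own stack, which can be tuned before the threads are created:

```python
import threading

# Reserve a smaller stack for threads created after this call.
# threading.stack_size() takes a size in bytes; many platforms
# require at least 32 KiB and a multiple of the page size.
threading.stack_size(256 * 1024)

def task():
    print(f"{threading.current_thread().name} running")

threads = [threading.Thread(target=task, name=f"t{i}") for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```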
About 7 years late, but the answers here are missing some context on co-routines vs threads. Why are coroutines receiving so much attention lately, and when would I use them compared to threads?
First of all, if coroutines run concurrently (never in parallel), why would anyone prefer them over threads?
The answer is that coroutines can provide a very high level of concurrency with very little overhead. Generally, in a threaded environment you have at most 30-50 threads before the overhead wasted on actually scheduling these threads (by the system scheduler) significantly cuts into the amount of time the threads spend doing useful work.
Ok, so with threads you can have parallelism, but not too much parallelism; isn't that still better than a co-routine running in a single thread? Well, not necessarily. Remember, a co-routine can still provide concurrency without scheduler overhead - it simply manages the context switching itself.
For example, if you have a routine doing some work and it performs an operation you know will block for some time (e.g. a network request), with a co-routine you can immediately switch to another routine without the overhead of involving the system scheduler in the decision - yes, you the programmer must specify when co-routines can switch.
With a lot of routines doing very small bits of work and voluntarily switching between each other, you've reached a level of efficiency no scheduler could ever hope to achieve. You can now have thousands of coroutines working together as opposed to tens of threads.
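As a rough sketch of that scale (assuming Python's asyncio; the 10,000-task count and the fetch name are just illustrative), thousands of coroutines are cheap because each one is only a suspended function, not an OS thread:

```python
import asyncio

async def fetch(i):
    # Stands in for a blocking network request; the coroutine suspends
    # here and the event loop runs the other tasks in the meantime.
    await asyncio.sleep(0.1)
    return i

async def main():
    # Ten thousand concurrent "requests" on a single OS thread -- far
    # more than you could reasonably create as native threads.
    results = await asyncio.gather(*(fetch(i) for i in range(10_000)))
    print(len(results), "done")

asyncio.run(main())
```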
Because your routines now switch between each other at pre-determined points, you can also avoid locking on shared data structures (because you would never tell your code to switch to another coroutine in the middle of a critical section).
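A small sketch of that point (again assuming Python's asyncio): a task can only be switched out at an await, so a plain read-modify-write on shared data needs no lock as long as no await sits inside the critical section:

```python
import asyncio

counter = 0

async def bump(n):
    global counter
    for _ in range(n):
        # No lock needed: there is no 'await' between the read and the
        # write, so no other coroutine can run in the middle of it.
        counter += 1
    await asyncio.sleep(0)  # a deliberate, safe point to yield control

async def main():
    await asyncio.gather(*(bump(1_000) for _ in range(100)))
    print(counter)  # always 100000

asyncio.run(main())
```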
Another benefit is the much lower memory usage. With the threaded model, each thread needs to allocate its own stack, so your memory usage grows linearly with the number of threads you have. With co-routines, the number of routines you have doesn't have a direct relationship with your memory usage.
And finally, co-routines are receiving a lot of attention because in some programming languages (such as Python) your threads cannot run in parallel anyway - they run concurrently just like coroutines, but without the low memory footprint or the freedom from scheduler overhead that coroutines enjoy.
Coroutines are a form of sequential processing: only one is executing at any given time (just like subroutines AKA procedures AKA functions -- they just pass the baton among each other more fluidly).
Threads are (at least conceptually) a form of concurrent processing: multiple threads may be executing at any given time. (Traditionally, on single-CPU, single-core machines, that concurrency was simulated with some help from the OS -- nowadays, since so many machines are multi-CPU and/or multi-core, threads will de facto be executing simultaneously, not just "conceptually").
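A minimal contrast in Python (just a sketch; the output interleaving will vary from run to run): the two threads below are both runnable at once, and the OS scheduler decides when each gets the CPU and may preempt either at an arbitrary point, with no baton-passing in the code itself:

```python
import threading
import time

def spin(name):
    for i in range(3):
        print(f"{name}: {i}")
        time.sleep(0.01)  # the OS may suspend/resume us at any point

# Both threads are eligible to run at the same time; nothing in the
# code decides when control moves from one to the other.
t1 = threading.Thread(target=spin, args=("thread-1",))
t2 = threading.Thread(target=spin, args=("thread-2",))
t1.start(); t2.start()
t1.join(); t2.join()
```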
12 years late to the discussion, but a coroutine has the explanation in its name: coroutine can be decomposed into Co and Routine.
A routine in this context is just a sequence of operations/actions, and executing/processing a routine means the operations get executed one by one, in exactly the order specified.
Co stands for cooperation. A co-routine is asked to (or better, expected to) willingly suspend its execution to give other co-routines a chance to execute too. So a co-routine is about (willingly) sharing CPU resources so that others can use the same resource one is using oneself.
A thread, on the other hand, does not need to suspend its own execution. Being suspended is forced on the thread by the underlying hardware and OS. It is also done in a way that is mostly transparent to the thread, as it does not get notified and its state is not altered, but saved and later restored when the thread is allowed to continue.
One thing that is not true is that co-routines can never be executed concurrently and that race conditions cannot occur. It depends on the system the co-routines are running on, and it is easily possible to imagine co-routines for which that does not hold.
It does not matter how the co-routines suspend themselves. Back in Windows 3.1, int 03 was woven into programs (or had to be placed in there), and in C# we add yield.
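In Python the same idea can be sketched with plain generators, where yield is the point at which a routine willingly hands control back to a trivial round-robin "scheduler" (the routine and run_round_robin names here are made up for illustration):

```python
def routine(name, steps):
    for i in range(steps):
        print(f"{name}: step {i}")
        yield  # voluntary suspension: hand control back to the caller

def run_round_robin(routines):
    # A trivial cooperative scheduler: resume each routine in turn
    # until all of them have run to completion.
    while routines:
        current = routines.pop(0)
        try:
            next(current)
            routines.append(current)
        except StopIteration:
            pass

run_round_robin([routine("A", 3), routine("B", 2)])
```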
In a word: preemption. Coroutines act like jugglers that keep handing off to each other at well-rehearsed points. Threads (true threads) can be interrupted at almost any point and then resumed later. Of course, this brings with it all sorts of resource conflict issues, hence Python's infamous GIL - the Global Interpreter Lock.
Many thread implementations are actually more like coroutines.
It depends on the language you're using. For example, in Lua they are the same thing (the variable type of a coroutine is called "thread").
Usually, though, coroutines implement voluntary yielding, where you the programmer decide where to "yield", i.e., give control to another routine.
Threads instead are automatically managed (stopped and started) by the OS, and they can even run at the same time on multicore CPUs.
A coroutine is a cooperative function that runs asynchronously in the system, while a thread is a lightweight process that runs in parallel on the system.
A thread can share the memory and resources of its parent process.
In the operating system, a thread has a limited amount of memory of its own, while a co-routine needs hardly any, because co-routines execute on a thread. Co-routines are therefore cheaper to use compared to threads.