What is the difference between a thread and a fiber?
What is the difference between a thread and a fiber? I've heard of fibers from Ruby, and I've read that they're available in other languages. Could somebody explain to me in simple terms what the difference between a thread and a fiber is?
In the simplest terms, threads are generally considered to be preemptive (although this may not always be true, depending on the operating system), while fibers are considered to be lightweight, cooperative threads. Both are separate execution paths for your application.
With threads: the current execution path may be interrupted or preempted at any time (note: this statement is a generalization and may not always hold true depending on OS/threading package/etc.). This means that for threads, data integrity is a big issue because one thread may be stopped in the middle of updating a chunk of data, leaving the integrity of the data in a bad or incomplete state. This also means that the operating system can take advantage of multiple CPUs and CPU cores by running more than one thread at the same time and leaving it up to the developer to guard data access.
With fibers: the current execution path is only interrupted when the fiber yields execution (same note as above). This means that fibers always start and stop in well-defined places, so data integrity is much less of an issue. Also, because fibers are often managed in the user space, expensive context switches and CPU state changes need not be made, making changing from one fiber to the next extremely efficient. On the other hand, since no two fibers can run at exactly the same time, just using fibers alone will not take advantage of multiple CPUs or multiple CPU cores.
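To make the "only interrupted when the fiber yields" point concrete, here is a minimal sketch that uses Python generators as a stand-in for fibers. This is an assumption made for illustration: generators are not full fibers (they have no separate stack), and the names fiber and run_round_robin are hypothetical. The scheduler is an ordinary user-space loop, so every switch happens at a yield and the interleaving is fully deterministic.

```python
from collections import deque

def fiber(name, steps, log):
    """A generator standing in for a fiber: it runs until it yields."""
    for i in range(steps):
        log.append(f"{name}:{i}")  # do one unit of work
        yield                      # hand control back to the scheduler

def run_round_robin(fibers):
    """Resume each fiber in turn until all of them have finished."""
    ready = deque(fibers)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run until the fiber's next yield
        except StopIteration:
            continue               # fiber finished; drop it
        ready.append(current)      # still alive; requeue at the back

log = []
run_round_robin([fiber("a", 2, log), fiber("b", 3, log)])
print(log)  # ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

Everything here runs on a single OS thread, which is also why this approach alone cannot use more than one core.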
Threads use pre-emptive scheduling, whereas fibers use cooperative scheduling.
With a thread, the control flow can be interrupted at any time, and another thread can take over. With multiple processors or cores, several threads can run at the same time (true parallelism, for example on multi-core or SMT hardware). As a result, you have to be very careful about concurrent data access, and protect your data with mutexes, semaphores, condition variables, and so on. It is often very tricky to get right.
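As a rough illustration of guarding shared data, the sketch below has four threads increment one counter under a mutex, using Python's standard threading module. The names counter and add_many are made up for the example. In CPython the interpreter lock limits real parallelism, but a thread can still be preempted in the middle of a read-modify-write, so the lock is still needed.

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    """Increment the shared counter n times, holding the lock for each update."""
    global counter
    for _ in range(n):
        with lock:         # without this, a preempted update could be lost
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()               # wait for every worker to finish

print(counter)  # 40000
```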
With a fiber, control only switches when you tell it to, typically with a function call named something like yield(). This makes concurrent data access easier, since you don't have to worry about atomicity of data structures or mutexes. As long as you don't yield, there's no danger of being preempted and having another fiber try to read or modify the data you're working with. As a result, though, if your fiber gets into an infinite loop, no other fiber can run, since you're not yielding.
You can also mix threads and fibers, which gives rise to the problems faced by both. Not recommended, but it can sometimes be the right thing to do if done carefully.
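The claim that nothing else runs between two yields can be sketched with Python generators standing in for fibers (an assumption; the names transfer and auditor are hypothetical). One fiber moves money between two accounts in two steps without yielding in between, and a second fiber checks the total. Because switches only happen at yield, the auditor never observes the half-finished update, and no lock is involved.

```python
def transfer(accounts, amount, rounds):
    """Move money from a to b; the two updates are never separated by a yield."""
    for _ in range(rounds):
        accounts["a"] -= amount
        accounts["b"] += amount    # still no yield: no other fiber has run yet
        yield                      # the only point where control can switch

def auditor(accounts, total, seen, rounds):
    """Record whether the invariant a + b == total holds each time we run."""
    for _ in range(rounds):
        seen.append(accounts["a"] + accounts["b"] == total)
        yield

accounts = {"a": 100, "b": 0}
seen = []
fibers = [transfer(accounts, 10, 5), auditor(accounts, 100, seen, 5)]

# Tiny scheduler: keep resuming every live fiber until all have finished.
while fibers:
    for f in list(fibers):
        try:
            next(f)
        except StopIteration:
            fibers.remove(f)

print(all(seen), accounts)  # True {'a': 50, 'b': 50}
```

If transfer yielded between its two updates, the auditor could see an inconsistent total, which is the cooperative equivalent of a race.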
First, I would recommend reading this explanation of the difference between processes and threads as background material.
Once you've read that, it's pretty straightforward. Threads can be implemented in the kernel, in user space, or as a mix of the two. Fibers are basically threads implemented in user space.
In section 11.4 "Processes and Threads in Windows Vista" in Modern Operating Systems, Tanenbaum comments:
In Win32, a fiber is a sort of user-managed thread. A fiber has its own stack, its own instruction pointer, and so on, but fibers are not scheduled by the OS: you have to call SwitchToFiber explicitly. Threads, by contrast, are preemptively scheduled by the operating system. So, roughly speaking, a fiber is a thread that is managed at the application/runtime level rather than being a true OS thread.
The consequences are that fibers are cheaper and that the application has more control over scheduling. This can be important if the app creates a lot of concurrent tasks, and/or wants to closely optimise when they run. For example, a database server might choose to use fibers rather than threads.
(There may be other usages for the same term; as noted, this is the Win32 definition.)
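To illustrate what application-level control over scheduling might look like, here is a small sketch in Python rather than Win32 (it does not use SwitchToFiber; generators stand in for fibers, and task and run_by_priority are hypothetical names). The application decides which task to resume next, here by a simple priority number, something an OS scheduler would not know about.

```python
import heapq

def task(name, steps, log):
    """A generator standing in for a fiber; it yields after each unit of work."""
    for _ in range(steps):
        log.append(name)
        yield

def run_by_priority(tasks):
    """tasks: list of (priority, generator); a lower number runs first."""
    # The insertion index breaks ties so generators are never compared.
    heap = [(prio, order, gen) for order, (prio, gen) in enumerate(tasks)]
    heapq.heapify(heap)
    while heap:
        prio, order, gen = heapq.heappop(heap)
        try:
            next(gen)                          # explicit switch into the task
        except StopIteration:
            continue                           # task finished
        heapq.heappush(heap, (prio, order, gen))

log = []
run_by_priority([(2, task("background", 2, log)), (1, task("urgent", 2, log))])
print(log)  # ['urgent', 'urgent', 'background', 'background']
```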
Note that in addition to Threads and Fibers, Windows 7 introduces User-Mode Scheduling:
More information about threads, fibers and UMS is available by watching Dave Probert: Inside Windows 7 - User Mode Scheduler (UMS).
Threads generally rely on the kernel to interrupt the running thread so that it or another thread can run (better known as preemptive multitasking), whereas fibers use cooperative multitasking, in which the fiber itself gives up its running time so that other fibers can run.
Some useful links explaining it better than I probably did are:
Threads were originally created as lightweight processes. In a similar fashion, fibers are lightweight threads that rely (simplistically) on the fibers themselves to schedule each other by yielding control.
I guess the next step will be strands where you have to send them a signal every time you want them to execute an instruction (not unlike my 5yo son :-). In the old days (and even now on some embedded platforms), all threads were fibers, there was no pre-emption and you had to write your threads to behave nicely.
Threads are scheduled by the OS (pre-emptive). A thread may be stopped or resumed at any time by the OS, but fibers more or less manage themselves (co-operative) and yield to each other. That is, the programmer controls when fibers do their processing and when that processing switches to another fiber.
The Win32 definition of a fiber is in fact the definition of a "green thread" established at Sun Microsystems. There is no need to spend the term "fiber" on what is just another kind of thread, i.e., a thread executing in user space under the control of user code or a thread library.
To clarify the argument look at the following comments:
We should assume that processes are made of threads and that threads should be made of fibers. With that logic in mind, using fibers for other sorts of threads is wrong.