什么是“大多数并发垃圾收集器”?
我知道停止世界、增量、并行、并发、(软/硬)实时垃圾收集器的概念。但我无法理解大部分并发 GC。和并发GC有什么不同吗?有什么区别?为什么它被称为大部分?
I know concepts of stop-the-world, incremental, parallel, concurrent, (soft/hard) realtime garbage collectors. But I can't understanding mostly-concurrent GC. Is it different with concurrent GC? What's the difference? Why it is called mostly?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
与许多其他主题一样,垃圾收集笼罩在术语模糊的迷雾中。伯姆因以非传统方式使用传统术语而臭名昭著,但我们应该原谅他,因为他在传统含义尚未僵化的时代开创了该领域! :-)
据我了解,stop-the-world GC 是指在 GC 周期的整个持续时间内(例如标记整个堆时)挂起所有变异器线程的算法。例如,.NET Server GC 会执行此操作,并因此导致 300 毫秒的巨大暂停时间。增量GC 在每个次要GC 周期执行一点主要GC 工作,例如OCaml GC 中的“主要切片”。并行意味着GC使用多个线程来加速垃圾收集过程。并发GC是指GC与mutators同时运行,例如.NET工作站GC。实时很难定义,最初意味着有界的最大暂停时间,但现在也意味着最小变异器利用率(MMU),以避免GC从不允许变异器运行而长时间暂停变异器的病态问题!根据 Richard Jones 的新书,即时 GC 一次不会挂起多个变体(即不存在停止世界阶段),尽管我怀疑他的意思是变体是彼此独立挂起的。最后,大多数并发 GC 会同时挂起所有赋值器线程,但仅挂起很短的时间,而不是任意长的 GC 周期。因此,在 GC 运行时,变异器大部分时间都可以自由运行,因此被称为“大部分并发”GC。
“大部分并发”的分类很重要,因为大多数(全部?)主要 GC 都属于这一类,因为它在暂停时间和吞吐量之间提供了良好的权衡。例如,.NET 工作站 GC 在拍摄全局根快照时会挂起所有变异器线程,但在标记和清除时会恢复它们。
Like so many other subjects, garbage collection is shrouded in a fog of terminological ambiguity. Boehm is particularly infamous for using conventional terms in unconventional ways but we should forgive him because he was pioneering the field at a time when the conventional meanings had not yet been ossified! :-)
As I understand it, stop-the-world GC refers to an algorithm that suspends all mutator threads for the entire duration of a GC cycle, e.g. when marking the entire heap. For example, the .NET Server GC does this and incurs huge 300ms pause times as a consequence. Incremental GCs perform a little bit of major GC work at each minor GC cycle, e.g. "major slices" in OCaml's GC. Parallel means the GC uses multiple threads to speedup the process of collecting garbage. Concurrent GC means the GC runs at the same time as the mutators, e.g. .NET workstation GC. Real-time is difficult to define, originally meant bounded maximum pause times but now also means minimum mutator utilization (MMU), to avoid the pathological problem of a GC that never pauses a mutator for long by never allowing it to run! According to Richard Jones' new book, an on-the-fly GC never suspends more than one mutator at a time (i.e. there is no stop-the-world phase) although I suspect he meant that mutators are suspended independently of each other. Finally, a mostly-concurrent GC is one that does suspend all mutator threads simultaneously but only for a short period of time and not for an arbitrarily-long GC cycle. Therefore, the mutators are allowed to run freely most of the time while the GC is running and, hence, it is called "mostly concurrent" GC.
The classification of "mostly concurrent" is important because most (all?) major GCs fall into this category because it provides a good trade-off between pause times and throughput. For example, the .NET workstation GC suspends all mutator threads when taking a snapshot of the global roots but resumes them while it marks and sweeps.
您可以在论文“大部分并行垃圾”中找到易于理解的描述Collection”,作者:Bohem、Demers 和 Shenker(ACM SIGPLAN '91 编程语言设计和实现会议记录,SIGPLANNotices 26, 6(1991 年 6 月),第 157-164 页)。
他们写道:
当他们提到跟踪垃圾收集器时,他们指的是启动的收集器从指定的“根节点”(通常是程序的寄存器)开始,并跟随指向每个可到达对象的指针。一切无法到达的东西都是垃圾。
简而言之,大部分并行的收集器并行完成大部分工作,然后停止程序的执行以更正收集器运行时程序所做的任何更改。因此,它是“大部分是并行的”。
You can find an accessible description in the paper "Mostly Parallel Garbage Collection" by Bohem, Demers, and Shenker (Proceedings of the ACM SIGPLAN '91 Conference on Programming Language Design and Implementation, SIGPLAN Notices 26, 6 (June 1991), pp. 157-164).
They write:
When they refer to tracing garbage collectors, they are referring to collectors that start from designated "root nodes" (usually the program's registers) and follow pointers to every reachable object. Everything unreachable is garbage.
In short, a mostly-parallel collector does the bulk of the work in parallel, then halts the program's execution to correct any changes that the program made while the collector was running. Hence, it is "mostly parallel."