关于 C++ 中垃圾回收的标记-清除(惰性方法)?
我知道引用计数器技术,但从未听说过标记清除技术,直到今天,当我阅读《编程语言概念》一书时。
据书中所述:
垃圾收集的原始标记-清除过程的操作如下:运行时系统根据请求分配存储单元,并根据需要断开指针与单元的连接,而不考虑存储回收(允许垃圾累积),直到分配完所有可用的存储单元。细胞。此时,开始进行标记清除过程,收集堆中漂浮的所有垃圾。为了促进该过程,每个堆单元都有一个额外的指示位或字段供收集算法使用。
根据我有限的理解,C++ 库中的智能指针使用引用计数技术。我想知道 C++ 中有没有使用这种智能指针实现的库?由于这本书纯粹是理论性的,我无法想象如何实现。一个证明这个想法的例子将非常有价值。如果我错了,请纠正我。
谢谢,
I know reference counter technique but never heard of mark-sweep technique until today, when reading the book named "Concepts of programming language".
According to the book:
The original mark-sweep process of garbage collection operates as follow: The runtime system allocates storage cells as requested and disconnects pointers from cells as necessary, without regard of storage reclamation ( allowing garbage to accumulate), until it has allocated all available cells. At this point, a mark-sweep process is begun to gather all the garbage left floating-around in the heap. To facilitate the process, every heap cells has an extra indicator bit or field that is used by the collection algorithm.
From my limited understanding, smart-pointers in C++ libraries use reference counting technique. I wonder is there any library in C++ using this kind of implementation for smart-pointers? And since the book is purely theoretical, I could not visualize how the implementation is done. An example to demonstrate this idea would be greatly valuable. Please correct me if I'm wrong.
Thanks,
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
在C++中使用垃圾收集有一个困难,那就是识别什么是指针,什么不是。
如果您可以调整编译器为每个对象类型提供此信息,那么您就完成了,但是如果您不能,那么您需要使用保守的方法:即扫描内存,搜索任何可能看起来像指针。这里还存在“位填充”的困难,即人们将位填充到指针中(高位在 64 位中大多未使用)或对两个不同的指针进行异或以“节省空间”。
现在,在 C++0x 中,标准委员会引入了标准 ABI 来帮助实现垃圾收集。在 n3225 中,您可以在20.9.11 指针安全 [util.dynamic.safety] 中找到它。当然,这是假设人们将为他们的类型实现这些函数:
实现时,它将授权您将任何垃圾收集方案(定义这些函数)插入您的应用程序中。当然,需要做一些工作才能在需要的地方实际提供这些操作。一种解决方案可能是简单地覆盖
new
和delete
但它不考虑指针算术......最后,垃圾收集有很多策略:引用计数(使用 Cycle检测算法)和标记和清除是主要的不同系统,但它们有不同的风格(是否为生成,是否为复制/压缩,...)。
There is one difficulty to using garbage collection in C++, it's to identify what is pointer and what is not.
If you can tweak a compiler to provide this information for each and every object type, then you're done, but if you cannot, then you need to use conservative approach: that is scanning the memory searching for any pattern that may look like a pointer. There is also the difficulty of "bit stuffing" here, where people stuff bits into pointers (the higher bits are mostly unused in 64 bits) or XOR two different pointers to "save space".
Now, in C++0x the Standard Committee introduced a standard ABI to help implementing Garbage Collection. In n3225 you can find it at 20.9.11 Pointer safety [util.dynamic.safety]. This supposes that people will implement those functions for their types, of course:
When implemented, it will authorize you to plug any garbage collection scheme (defining those functions) into your application. It will of course require some work of course to actually provide those operations wherever they are needed. One solution could be to simply override
new
anddelete
but it does not account for pointer arithmetic...Finally, there are many strategies for Garbage Collection: Reference Counting (with Cycle Detection algorithms) and Mark And Sweep are the main different systems, but they come in various flavors (Generational or not, Copying/Compacting or not, ...).
尽管他们现在可能已经对其进行了升级,但 Mozilla Firefox 曾经使用一种混合方法,其中尽可能使用引用计数智能指针,并并行运行标记和清除垃圾收集器来清理引用循环。尽管我不完全确定,其他项目可能也采用了这种方法。
我看到 C++ 程序员避免这种类型的垃圾收集的主要原因是,这意味着对象析构函数将异步运行。这意味着,如果创建了任何占用重要资源(例如网络连接或物理硬件)的对象,则无法保证及时进行清理。此外,如果析构函数要访问共享资源,则必须非常小心地使用适当的同步,而在单线程、直接引用计数解决方案中,这是不必要的。
这种方法的另一个复杂性是 C++ 允许对指针进行原始算术运算,这使得垃圾收集器的实现变得非常复杂。保守地解决这个问题是可能的(例如,看看 Boehm GC),尽管它是构建此类系统的一个重大障碍。
Although they may have upgraded it by now, Mozilla Firefox used to use a hybrid approach in which reference-counted smart pointers were used when possible, with a mark-and-sweep garbage collector running in parallel to clean up reference cycles. It's possible other projects have adopted this approach, though I'm not fully sure.
The main reason that I could see C++ programmers avoiding this type of garbage collection is that it means that object destructors would run asynchronously. This means that if any objects were created that held on to important resources, such as network connections or physical hardware, the cleanup wouldn't be guaranteed to occur in a timely fashion. Moreover, the destructors would have to be very careful to use appropriate synchronization if they were to access shared resources, while in a single-threaded, straight reference-counting solution this wouldn't be necessary.
The other complexity of this approach is that C++ allows for raw arithmetic operations on pointers, which greatly complicates the implementation of any garbage collector. It's possible to conservatively solve this problem (look at the Boehm GC, for example), though it's a significant barrier to building a system of this sort.