Why is the destructor called when the CPython garbage collector is disabled?
I'm trying to understand the internals of the CPython garbage collector, specifically when the destructor is called. So far, the behavior is intuitive, but the following case trips me up:
- Disable the GC.
- Create an object, then remove a reference to it.
- The object is destroyed and the __del__ method is called.
I thought this would only happen if the garbage collector was enabled. Can someone explain why this happens? Is there a way to defer calling the destructor?

    import gc
    import unittest

    _destroyed = False

    class MyClass(object):
        def __del__(self):
            global _destroyed
            _destroyed = True

    class GarbageCollectionTest(unittest.TestCase):
        def testExplicitGarbageCollection(self):
            gc.disable()
            ref = MyClass()
            ref = None
            # The next test fails.
            # The object is automatically destroyed even with the collector turned off.
            self.assertFalse(_destroyed)
            gc.collect()
            self.assertTrue(_destroyed)

    if __name__ == '__main__':
        unittest.main()

Disclaimer: this code is not meant for production -- I've already noted that this is very implementation-specific and does not work on Jython.

Comments (3)

Python has both reference-counting garbage collection and cyclic garbage collection, and it's the latter that the gc module controls. Reference counting can't be disabled, and hence still happens when the cyclic garbage collector is switched off.

Since there are no references left to your object after ref = None, its __del__ method is called as a result of its reference count going to zero.

There's a clue in the documentation: "Since the collector supplements the reference counting already used in Python..." (my emphasis).

You can stop the first assertion from firing by making the object refer to itself, so that its reference count doesn't go to zero, for instance by giving it a constructor that stores a reference to self in one of its own attributes.

But if you do that, the second assertion will fire. That's because garbage cycles with __del__ methods don't get collected; see the documentation for gc.garbage. (This describes CPython before version 3.4. From 3.4 onward, PEP 442 lets the collector finalize and reclaim such cycles, so gc.collect() does run __del__ there and the second assertion passes.)

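To make the difference between the two mechanisms concrete, here is a minimal sketch, assuming CPython 3.4 or later. The Tracked class, its self_ref flag, and the demo function are hypothetical names invented for this illustration; only the gc calls come from the standard library. It records which finalizers have run at three points: after dropping a plain object, after dropping a self-referencing object, and after an explicit collection.

```python
import gc


class Tracked:
    """Records the name of every instance whose __del__ has run."""

    destroyed = []

    def __init__(self, name, self_ref=False):
        self.name = name
        if self_ref:
            # Hypothetical self-reference: creates a cycle, so the
            # reference count can never reach zero on its own.
            self.myself = self

    def __del__(self):
        Tracked.destroyed.append(self.name)


def demo():
    was_enabled = gc.isenabled()
    gc.disable()  # switches off only the cyclic collector
    try:
        Tracked.destroyed.clear()

        obj = Tracked("plain")
        obj = None  # last reference dropped: refcount hits zero
        after_plain = list(Tracked.destroyed)

        obj = Tracked("cyclic", self_ref=True)
        obj = None  # the cycle keeps the object alive
        after_cyclic = list(Tracked.destroyed)

        gc.collect()  # an explicit collection still works while disabled
        after_collect = list(Tracked.destroyed)
        return after_plain, after_cyclic, after_collect
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(demo())
```

The plain object is finalized as soon as its last reference goes away, whether or not the collector is enabled. The self-referencing one survives until gc.collect() is called. On interpreters older than 3.4 the last step would instead leave the object in gc.garbage, as the answer above describes.
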
The docs here (the original link was to a documentation section which up to Python 3.5 was here, and was later relocated) explain how what's called "the optional garbage collector" is actually a collector of cyclic garbage (the kind that reference counting wouldn't catch) (see also here). Reference counting is explained here, with a nod to its interplay with the cyclic gc.

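As a rough illustration of the reference counting that keeps running regardless of the gc module, the sketch below uses sys.getrefcount, which is CPython-specific. The Node class and the refcount_deltas function are hypothetical names for this example. Absolute counts vary between interpreter versions, so the sketch only compares counts against a baseline.

```python
import sys


class Node:
    pass


def refcount_deltas():
    node = Node()
    baseline = sys.getrefcount(node)

    alias = node  # a second name bound to the same object
    with_alias = sys.getrefcount(node)

    del alias  # unbinding the name gives the reference back
    after_del = sys.getrefcount(node)

    # Report changes relative to the baseline, not absolute counts.
    return with_alias - baseline, after_del - baseline


if __name__ == "__main__":
    print(refcount_deltas())
```

Binding another name raises the count by one and deleting it lowers the count again. When the count reaches zero the object is finalized immediately, which is what happens to the object in the question.
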
Depending on your definition of garbage collector, CPython has two garbage collectors: the reference counting one, and the other one.
The reference counter is always working and cannot be turned off, as it's quite a fast and lightweight one that does not significantly affect the run time of the system.
The other one (some variant of mark and sweep, I think) gets run every so often, and can be disabled. This is because it requires the interpreter to be paused while it is running, and this can happen at the wrong moment and consume quite a lot of CPU time.
The ability to disable it is there for those times when you expect to be doing something that's time critical, and the lack of this GC won't cause you any problems.
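
One way to apply this in practice is to pause the cyclic collector only around the time-critical section and restore its previous state afterwards. The following is a minimal sketch under that assumption. The names cyclic_gc_paused and run_time_critical are hypothetical helpers, not part of the gc module; only gc.isenabled, gc.disable, and gc.enable are standard library calls.

```python
import gc
from contextlib import contextmanager


@contextmanager
def cyclic_gc_paused():
    """Disable the cyclic collector for the duration of a with-block."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        # Re-enable only if it was on before, so nested use is safe.
        if was_enabled:
            gc.enable()


def run_time_critical(work):
    """Run work() with the cyclic collector paused.

    Returns the result together with the collector state seen inside
    the block, purely to make the effect observable.
    """
    with cyclic_gc_paused():
        return work(), gc.isenabled()


if __name__ == "__main__":
    print(run_time_critical(lambda: sum(range(10))))
```

Reference counting keeps reclaiming non-cyclic objects inside the block, so only cyclic garbage accumulates until the collector is switched back on.
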