Object with a circular reference is not garbage collected

Posted on 2024-12-23 17:12:26

I have a small, handy class that I use a lot in my code:

class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        self.__dict__ = self

The nice thing about it is that you can either access the attributes using dictionary key syntax or usual object style:

myStructure = Structure(name="My Structure")
print myStructure["name"]
print myStructure.name

Today I noticed that my application's memory consumption was increasing slightly in a situation where I expected it to drop. It seems that instances of the Structure class are not being garbage collected. Here is a small snippet that illustrates this:

import gc

class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        self.__dict__ = self

structures = [Structure(name="__{0}".format(str(value))) for value in range(4096)]
print "Structure name: ", structures[16].name
print "Structure name: ", structures[16]["name"]
del structures
gc.collect()
print "Structures count: ", len([obj for obj in gc.get_objects() if type(obj) is Structure])

With the following output:

Structure name:  __16
Structure name:  __16
Structures count:  4096

As you can see, the Structure instance count is still 4096.
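For readers who want to repeat the counting step on Python 3, here is a minimal sketch of the same check as a reusable helper. The names `count_instances` and `PlainStructure` are hypothetical and not part of the original post, and the example deliberately uses a dict subclass without the self-reference, so it only shows the measurement technique, not the leak itself.

```python
import gc


def count_instances(cls):
    """Force a full collection, then count live GC-tracked objects whose type is exactly cls."""
    gc.collect()
    return sum(1 for obj in gc.get_objects() if type(obj) is cls)


class PlainStructure(dict):
    """Hypothetical dict subclass WITHOUT the `self.__dict__ = self` assignment."""


structures = [PlainStructure(name="__{0}".format(value)) for value in range(16)]
alive_count = count_instances(PlainStructure)      # instances still referenced by the list
del structures
remaining_count = count_instances(PlainStructure)  # instances left after dropping the list

print("alive:", alive_count, "remaining:", remaining_count)
```

Swapping in the self-referencing class lets you compare behaviour across interpreter versions; the result reported above is specific to Python 2.7.2.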

I then commented out the line that creates the handy self-reference:

import gc

class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        # self.__dict__ = self

structures = [Structure(name="__{0}".format(str(value))) for value in range(4096)]
# print "Structure name: ", structures[16].name
print "Structure name: ", structures[16]["name"]
del structures
gc.collect()
print "Structures count: ", len([obj for obj in gc.get_objects() if type(obj) is Structure])

Now that the circular reference is removed, the output makes sense:

Structure name:  __16
Structures count:  0
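To make the circular reference concrete, the sketch below (written for Python 3, which is an assumption; the post itself uses Python 2.7.2) checks that the instance's attribute dictionary is the instance itself. That is exactly why the object always holds a reference to itself.

```python
class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        # The instance becomes its own attribute dictionary: a self-reference.
        self.__dict__ = self


s = Structure(name="My Structure")

# The attribute dictionary and the instance are one and the same object.
holds_itself = s.__dict__ is s

# Attribute writes therefore land in the mapping, and vice versa.
s.kind = "demo"
s["size"] = 3

print(holds_itself, s["kind"], s.size)
```

Because the object refers to itself, its reference count can never reach zero on its own, so only the cycle collector can reclaim it.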

I pushed the tests a bit further, using Meliae to analyze the memory consumption:

import gc
import pprint
from meliae import scanner
from meliae import loader

class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        self.__dict__ = self

structures = [Structure(name="__{0}".format(str(value))) for value in range(4096)]
print "Structure name: ", structures[16].name
print "Structure name: ", structures[16]["name"]
del structures
gc.collect()
print "Structures count: ", len([obj for obj in gc.get_objects() if type(obj) is Structure])

scanner.dump_all_objects("Test_001.json")
om = loader.load("Test_001.json")
summary = om.summarize()
print summary

structures = om.get_all("Structure")
if structures:
    pprint.pprint(structures[0].c)

This generates the following output:

Structure name:  __16
Structure name:  __16
Structures count:  4096
loading... line 5001, 5002 objs,   0.6 /   1.8 MiB read in 0.2s
loading... line 10002, 10003 objs,   1.1 /   1.8 MiB read in 0.3s
loading... line 15003, 15004 objs,   1.7 /   1.8 MiB read in 0.5s
loaded line 16405, 16406 objs,   1.8 /   1.8 MiB read in 0.5s        
checked        1 /    16406 collapsed        0    
checked    16405 /    16406 collapsed      157    
compute parents        0 /    16249        
compute parents    16248 /    16249        
set parents    16248 /    16249        
collapsed in 0.2s
Total 16249 objects, 58 types, Total size = 3.2MiB (3306183 bytes)
 Index   Count   %      Size   % Cum     Max Kind
     0    4096  25   1212416  36  36     296 Structure
     1     390   2    536976  16  52   49432 dict
     2    5135  31    417550  12  65   12479 str
     3      82   0    290976   8  74   12624 module
     4     235   1    212440   6  80     904 type
     5     947   5    121216   3  84     128 code
     6    1008   6    120960   3  88     120 function
     7    1048   6     83840   2  90      80 wrapper_descriptor
     8     654   4     47088   1  92      72 builtin_function_or_method
     9     562   3     40464   1  93      72 method_descriptor
    10     517   3     37008   1  94     216 tuple
    11     139   0     35832   1  95    2280 set
    12     351   2     30888   0  96      88 weakref
    13     186   1     23200   0  97    1664 list
    14      63   0     21672   0  97     344 WeakSet
    15      21   0     18984   0  98     904 ABCMeta
    16     197   1     14184   0  98      72 member_descriptor
    17     188   1     13536   0  99      72 getset_descriptor
    18     284   1      6816   0  99      24 int
    19      14   0      5296   0  99    2280 frozenset
[Structure(4312707312 296B 2refs 2par),
 type(4298634592 904B 4refs 100par 'Structure')]

The memory usage is 3.2MiB. Removing the self-referencing line leads to the following output:

Structure name:  __16
Structures count:  0
loading... line 5001, 5002 objs,   0.6 /   1.4 MiB read in 0.1s
loading... line 10002, 10003 objs,   1.1 /   1.4 MiB read in 0.3s
loaded line 12308, 12309 objs,   1.4 /   1.4 MiB read in 0.4s        
checked       12 /    12309 collapsed        0    
checked    12308 /    12309 collapsed      157    
compute parents        0 /    12152        
compute parents    12151 /    12152        
set parents    12151 /    12152        
collapsed in 0.1s
Total 12152 objects, 57 types, Total size = 2.0MiB (2093714 bytes)
 Index   Count   %      Size   % Cum     Max Kind
     0     390   3    536976  25  25   49432 dict
     1    5134  42    417497  19  45   12479 str
     2      82   0    290976  13  59   12624 module
     3     235   1    212440  10  69     904 type
     4     947   7    121216   5  75     128 code
     5    1008   8    120960   5  81     120 function
     6    1048   8     83840   4  85      80 wrapper_descriptor
     7     654   5     47088   2  87      72 builtin_function_or_method
     8     562   4     40464   1  89      72 method_descriptor
     9     517   4     37008   1  91     216 tuple
    10     139   1     35832   1  92    2280 set
    11     351   2     30888   1  94      88 weakref
    12     186   1     23200   1  95    1664 list
    13      63   0     21672   1  96     344 WeakSet
    14      21   0     18984   0  97     904 ABCMeta
    15     197   1     14184   0  98      72 member_descriptor
    16     188   1     13536   0  98      72 getset_descriptor
    17     284   2      6816   0  99      24 int
    18      14   0      5296   0  99    2280 frozenset
    19      22   0      2288   0  99     104 classobj

This confirms that the Structure instances have been destroyed and that memory usage dropped to 2.0MiB.

Any idea how I can ensure that instances of this class get properly garbage collected? All of this was run on Python 2.7.2 (Darwin), by the way.

Cheers,

Thomas

Comments (1)

无尽的现实 2024-12-30 17:12:26

You can implement your Structure class more straightforwardly by using __getattr__ and __setattr__ to route attribute access to the underlying dict.

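class Structure(dict):
    def __getattr__(self, k):
        # Only called when normal attribute lookup fails; fall back to the dict.
        try:
            return self[k]
        except KeyError:
            # Raise AttributeError so hasattr() and getattr() defaults keep working.
            raise AttributeError(k)
    def __setattr__(self, k, v):
        # Store every attribute as a dictionary item; no self-reference is created.
        self[k] = v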

Cycles are garbage collected in Python, but only periodically (unlike regular reference-counted objects, which are collected as soon as their reference count drops to 0).
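The difference between the two reclamation paths can be sketched with weak references. This assumes CPython, where reference counting frees acyclic objects immediately; `Node` is a hypothetical class used only for illustration. Automatic collection is disabled during the demo so that the cycle collector runs only when called explicitly.

```python
import gc
import weakref


class Node(object):
    """Hypothetical helper class; plain instances support weak references."""


gc.disable()  # keep the periodic cycle collector out of the way for the demo

# Case 1: no cycle. Dropping the last reference frees the object at once.
plain = Node()
plain_ref = weakref.ref(plain)
del plain
plain_freed_immediately = plain_ref() is None

# Case 2: a self-referencing object. Its reference count never reaches zero.
cyclic = Node()
cyclic.me = cyclic
cyclic_ref = weakref.ref(cyclic)
del cyclic
cyclic_alive_before_collect = cyclic_ref() is not None

gc.collect()  # the cycle collector finds and frees the unreachable cycle
cyclic_alive_after_collect = cyclic_ref() is not None

gc.enable()

print(plain_freed_immediately, cyclic_alive_before_collect, cyclic_alive_after_collect)
```

In a long-running program the collector is triggered by allocation thresholds rather than by an explicit call, so cyclic garbage can linger for a while even when everything works as designed.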

Avoiding the cycle (as the Structure class using __getattr__ and __setattr__ does) means you'll get better gc behavior. You may also want to look at collections.namedtuple as an alternative: it doesn't do exactly what you've implemented, but it may suit your needs.
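As a rough illustration of the namedtuple alternative, the sketch below reuses the `name` field from the question. The trade-offs are that fields are fixed when the type is defined, instances are immutable, and lookup by string key requires `_asdict()` rather than `obj["name"]`.

```python
from collections import namedtuple

# Fields are declared once, up front; instances carry no per-object __dict__.
Structure = namedtuple("Structure", ["name"])

my_structure = Structure(name="My Structure")

by_attribute = my_structure.name       # object-style access
by_index = my_structure[0]             # positional access, not key access
as_mapping = my_structure._asdict()    # mapping view for key-style lookups

# "Modification" returns a new instance; the original is unchanged.
renamed = my_structure._replace(name="Renamed")

print(by_attribute, by_index, as_mapping["name"], renamed.name)
```

Whether this fits depends on whether the code relies on adding arbitrary keys at run time, which a namedtuple does not allow.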
