Python: does the set class "leak" when items are removed, like dict?
I know that Python dicts will "leak" when items are removed (because the item's slot will be overwritten with the magic "removed" value)… But will the set class behave the same way? Is it safe to keep a set around, adding and removing stuff from it over time?

Edit: Alright, I've tried it out, and here's what I found:

>>> import gc
>>> gc.collect()
0
>>> nums = range(1000000)
>>> gc.collect()
0
### rsize: 20 megs
### A baseline measurement
>>> s = set(nums)
>>> gc.collect()
0
### rsize: 36 megs
>>> for n in nums: s.remove(n)
>>> gc.collect()
0
### rsize: 36 megs
### Memory usage doesn't drop after removing every item from the set…
>>> s = None
>>> gc.collect()
0
### rsize: 20 megs
### … but nulling the reference to the set *does* free the memory.
>>> s = set(nums)
>>> for n in nums: s.remove(n)
>>> for n in nums: s.add(n)
>>> gc.collect()
0
### rsize: 36 megs
### Removing then re-adding keys uses a constant amount of memory…
>>> for n in nums: s.remove(n)
>>> for n in nums: s.add(n+1000000)
>>> gc.collect()
0
### rsize: 47 megs
### … but adding new keys uses more memory.
Comments (2)
Yes, set is basically a hash table just like dict; the differences at the interface don't imply many differences "below" it. Once in a while, you should copy the set (myset = set(myset)), just as you should for a dict on which many additions and removals are regularly made over time.
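The copy idiom mentioned above, myset = set(myset), can be illustrated with a minimal sketch. It assumes CPython, where sys.getsizeof reports the size of the set object including its hash table but not the elements it refers to; the function name and the numbers are made up for illustration, and other interpreters may behave differently.

```python
import sys


def shrink_by_copy(n=100_000, keep=10):
    """Fill a set, remove most items, then rebuild it with set(s).

    Returns the reported size in bytes after removal and after the copy,
    plus the rebuilt set itself.
    """
    s = set(range(n))
    # Remove everything except the first `keep` items.
    for i in range(keep, n):
        s.remove(i)
    after_remove = sys.getsizeof(s)  # table is still sized for the old contents
    s = set(s)                       # rebuild into a table sized for what is left
    after_copy = sys.getsizeof(s)
    return after_remove, after_copy, s


after_remove, after_copy, s = shrink_by_copy()
print("after removing:", after_remove, "bytes")
print("after copying: ", after_copy, "bytes")
```

Note that this measures only the set's own table, not the process size (the rsize figures in the question), which also depends on the allocator and on the integer objects themselves.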
For questions like these it is often best to run a quick experiment like this one and see what happens:
What docs and people say and what behaviour actually is are often at odds. If this is important for you, test it. Don't rely on others.
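As one possible quick experiment of this kind (a sketch under assumptions, not the measurement from the question), the snippet below keeps a set at a constant length while cycling fresh keys through it and records every size that sys.getsizeof reports along the way. It assumes CPython; the helper name churn and its parameters are hypothetical.

```python
import sys


def churn(window=1_000, rounds=50_000):
    """Hold `window` items in a set while passing `rounds` new keys through it.

    Each round adds one never-seen key and removes the oldest one, so the
    length stays constant. Returns the final length and the sorted list of
    distinct sizes (in bytes) reported for the set during the run.
    """
    s = set(range(window))
    sizes = {sys.getsizeof(s)}
    for key in range(window, window + rounds):
        s.add(key)
        s.remove(key - window)  # the oldest key still in the set
        sizes.add(sys.getsizeof(s))
    return len(s), sorted(sizes)


length, sizes = churn()
print("final length:", length)
print("distinct sizes seen (bytes):", sizes)
# For comparison: a set that holds every key at the same time.
print("all keys at once:", sys.getsizeof(set(range(51_000))), "bytes")
```

If the churned set stays far below the "all keys at once" figure, the slots left behind by removed items are being reclaimed when the table is resized; if it does not, periodic copying is worth considering.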