GQL query object memory leak

Posted on 2024-12-10 09:39:22

I have a long-running background process that parses a few hundred thousand lines of a CSV file. The process leaks memory, which occasionally causes the task to hit its soft memory limit and terminate. I have narrowed the problem down to the following code:

class BaseModel(db.Model):
    _keyNamespace = 'MyApp.Models'

    @classmethod
    def get_by_item_id(cls, id):
        key = "%s_%d" % (cls._keyNamespace, id)
        item = CacheStrategy.get(key)
        if not item:
            query = cls.gql("WHERE Id = :1", id)
            item = query.get()
            del query

        return item

I've cut this down to the bare bones, but it still causes Query objects to remain in memory. A sample GC reference dump is included at the end of this post; it shows the Query and Query_Filter counts increasing by 200 after every 200-order batch step. If I get rid of the query call, this of course goes away.

My question is, WHY is this leaking Query references and how do I get it to honour the del and drop the query reference?

I've tried making this an instance method (no difference). Reference count trace below:

INFO     2011-10-17 16:29:39,158 orderparser.py:151] Putting a 200 unit batch of orders, 0.335000 seconds from start
DEBUG    2011-10-17 16:29:40,315 memleaker.py:20] Top Mem Leaks
DEBUG    2011-10-17 16:29:40,334 memleaker.py:22]     356306 Property
DEBUG    2011-10-17 16:29:40,334 memleaker.py:22]     356305 PropertyValue
DEBUG    2011-10-17 16:29:40,334 memleaker.py:22]      74410 Path
DEBUG    2011-10-17 16:29:40,334 memleaker.py:22]      74408 Path_Element
DEBUG    2011-10-17 16:29:40,334 memleaker.py:22]      45127 PropertyValue_ReferenceValue
DEBUG    2011-10-17 16:29:40,334 memleaker.py:22]      45127 PropertyValue_ReferenceValuePathElement
DEBUG    2011-10-17 16:29:40,334 memleaker.py:22]      43822 Reference
DEBUG    2011-10-17 16:29:40,335 memleaker.py:22]      30595 EntityProto
DEBUG    2011-10-17 16:29:40,335 memleaker.py:22]        320 ProtocolMessage
DEBUG    2011-10-17 16:29:40,335 memleaker.py:22]        217 Query
DEBUG    2011-10-17 16:29:40,335 memleaker.py:22]        209 Query_Filter
DEBUG    2011-10-17 16:29:40,335 memleaker.py:22]         55 NOT_PROVIDED
DEBUG    2011-10-17 16:29:40,335 memleaker.py:22]         34 Index_Property
DEBUG    2011-10-17 16:29:40,335 memleaker.py:22]         28 ExtendableProtocolMessage
DEBUG    2011-10-17 16:29:40,336 memleaker.py:22]         18 CompositeIndex
INFO     2011-10-17 16:29:40,644 orderparser.py:151] Putting a 200 unit batch of orders, 1.821000 seconds from start
DEBUG    2011-10-17 16:29:41,930 memleaker.py:20] Top Mem Leaks
DEBUG    2011-10-17 16:29:41,948 memleaker.py:22]     356506 Property
DEBUG    2011-10-17 16:29:41,948 memleaker.py:22]     356505 PropertyValue
DEBUG    2011-10-17 16:29:41,948 memleaker.py:22]      74410 Path
DEBUG    2011-10-17 16:29:41,948 memleaker.py:22]      74408 Path_Element
DEBUG    2011-10-17 16:29:41,948 memleaker.py:22]      45127 PropertyValue_ReferenceValue
DEBUG    2011-10-17 16:29:41,948 memleaker.py:22]      45127 PropertyValue_ReferenceValuePathElement
DEBUG    2011-10-17 16:29:41,948 memleaker.py:22]      43822 Reference
DEBUG    2011-10-17 16:29:41,951 memleaker.py:22]      30595 EntityProto
DEBUG    2011-10-17 16:29:41,951 memleaker.py:22]        417 Query
DEBUG    2011-10-17 16:29:41,951 memleaker.py:22]        409 Query_Filter
DEBUG    2011-10-17 16:29:41,951 memleaker.py:22]        320 ProtocolMessage
DEBUG    2011-10-17 16:29:41,951 memleaker.py:22]         55 NOT_PROVIDED
DEBUG    2011-10-17 16:29:41,951 memleaker.py:22]         34 Index_Property
DEBUG    2011-10-17 16:29:41,951 memleaker.py:22]         28 ExtendableProtocolMessage
DEBUG    2011-10-17 16:29:41,953 memleaker.py:22]         18 CompositeIndex
INFO     2011-10-17 16:29:42,276 orderparser.py:151] Putting a 200 unit batch of orders, 3.450000 seconds from start
DEBUG    2011-10-17 16:29:43,565 memleaker.py:20] Top Mem Leaks
DEBUG    2011-10-17 16:29:43,585 memleaker.py:22]     356706 Property
DEBUG    2011-10-17 16:29:43,585 memleaker.py:22]     356705 PropertyValue
DEBUG    2011-10-17 16:29:43,585 memleaker.py:22]      74410 Path
DEBUG    2011-10-17 16:29:43,585 memleaker.py:22]      74408 Path_Element
DEBUG    2011-10-17 16:29:43,585 memleaker.py:22]      45127 PropertyValue_ReferenceValue
DEBUG    2011-10-17 16:29:43,585 memleaker.py:22]      45127 PropertyValue_ReferenceValuePathElement
DEBUG    2011-10-17 16:29:43,585 memleaker.py:22]      43822 Reference
DEBUG    2011-10-17 16:29:43,586 memleaker.py:22]      30595 EntityProto
DEBUG    2011-10-17 16:29:43,586 memleaker.py:22]        617 Query
DEBUG    2011-10-17 16:29:43,586 memleaker.py:22]        609 Query_Filter
DEBUG    2011-10-17 16:29:43,586 memleaker.py:22]        320 ProtocolMessage
DEBUG    2011-10-17 16:29:43,586 memleaker.py:22]         55 NOT_PROVIDED
DEBUG    2011-10-17 16:29:43,586 memleaker.py:22]         34 Index_Property
DEBUG    2011-10-17 16:29:43,586 memleaker.py:22]         28 ExtendableProtocolMessage
DEBUG    2011-10-17 16:29:43,588 memleaker.py:22]         18 CompositeIndex

Comments (1)

寄与心 2024-12-17 09:39:22

I'm unable to reproduce this using your refcount code and a trivial snippet below (on shell.appspot.com or a fresh app):

from google.appengine.ext import db
import sys
import types

def get_refcounts():
    # Python 2 only: types.ClassType matches old-style classes.
    d = {}
    # collect all classes
    for m in sys.modules.values():
        for sym in dir(m):
            o = getattr(m, sym)
            if type(o) is types.ClassType:
                d[o] = sys.getrefcount(o)
    # sort by refcount, highest first
    pairs = [(count, cls) for cls, count in d.items()]
    pairs.sort(reverse=True)
    return pairs

def print_top(num=15):
    print 'Top Mem Leaks'
    for n, c in get_refcounts()[:num]:
        print '%10d %s' % (n, c.__name__)

class TestModel(db.Model):
    id = db.IntegerProperty()


# Snapshot before running the query.
print_top()

q = TestModel.gql("WHERE id = :1", 1)
item = q.get()
del q

# Snapshot after the query; the Query counts should be unchanged.
print_top()

It seems likely that something in your environment is holding references to the queries that have been executed. Are you using appstats or another development or debugging tool? Can you create a minimal reproduction case that exhibits the behaviour you observed?
