How do I find the size of a db.Model instance in GAE Python before calling db.put()?
I'm writing an optimizer for my application so that db.put() is invoked as rarely as possible. I'm stuck on the following problem:
I have a number of classes derived from db.Model. Instances of those classes are stored in a list:
    class DBPutter:
        data = []  # list of instances

        def add(self, model):
            # HERE I WANT TO CHECK THAT self.data IS NOT EXCEEDING 1MB
            self.data.append(model)
            if len(self.data) == 1000:
                self.flush()  # actual call to db.put() using deferred
With this approach I receive a lot of RequestTooLargeError exceptions. How do I check that my data does not exceed 1 MB?
Pympler provides asizeof, and it should run on Python 2.5: http://code.google.com/p/pympler/
I think you're over-optimizing, though. If an instance is shut down before 1000 objects are in your putter, you could lose data. Also, I think using the deferred library with a large amount of data would result in at least two db.put() calls: one when the task is submitted (because the payload is over 10k), and one inside the task, actually writing your models.
As per the 1.4.0 release notes:
That said, using deferred for this is pointless: Task Queue payloads are limited to 10k, and if your deferred payload is bigger than that, it will create a datastore entity to store the payload in. As a result, it's doing a datastore operation anyway, so you may as well do it yourself.
If you're storing thousands of entities, though, you almost certainly want to be doing the whole process on the task queue in the first place, rather than in an interactive request.
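To get a feel for how quickly a batch outgrows the 10k limit mentioned above, one can measure its pickled size. This is only a rough sketch: it assumes the payload is serialized with pickle, it reads "10k" as 10 * 1024 bytes, and the helper names pickled_size and fits_in_task_payload are made up for illustration.

```python
import pickle

# "10k" from the answer above, interpreted here as 10 KiB (an assumption).
TASK_PAYLOAD_LIMIT = 10 * 1024


def pickled_size(obj):
    """Return the number of bytes obj occupies once pickled."""
    return len(pickle.dumps(obj, pickle.HIGHEST_PROTOCOL))


def fits_in_task_payload(obj, limit=TASK_PAYLOAD_LIMIT):
    """Hypothetical helper: True if the pickled object stays within the limit."""
    return pickled_size(obj) <= limit
```

A list of a thousand model instances will almost always fail this check, which is the case where the payload ends up in the datastore anyway.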
I don't work with GAE, but you could try to call sys.getsizeof on each of your models and verify that the sum is less than 1 MB.
Edit: See this ActiveState recipe for an alternative to sys.getsizeof, which should work in Python 2.5.
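The "sum the sizes and stay under 1 MB" idea could be sketched roughly as follows. This is not the asker's code: SizeLimitedBatcher, flush_fn and size_of are hypothetical names, and the size function is pluggable because sys.getsizeof (available from Python 2.6) reports shallow in-memory size, which is not the same as the size of the encoded request. On App Engine you would probably want to pass a function that measures the serialized entity instead.

```python
import sys


class SizeLimitedBatcher(object):
    """Collects items and flushes before an estimated byte budget is exceeded."""

    def __init__(self, flush_fn, max_bytes=1000000, max_items=1000,
                 size_of=sys.getsizeof):
        self.flush_fn = flush_fn    # called with the list of batched items
        self.max_bytes = max_bytes  # estimated byte budget per batch
        self.max_items = max_items  # item count cap per batch
        self.size_of = size_of      # size estimator; swap in a better one
        self.items = []
        self.total_bytes = 0

    def add(self, item):
        item_bytes = self.size_of(item)
        # Flush first if this item would push the batch over the budget.
        if self.items and self.total_bytes + item_bytes > self.max_bytes:
            self.flush()
        self.items.append(item)
        self.total_bytes += item_bytes
        if len(self.items) >= self.max_items:
            self.flush()

    def flush(self):
        if self.items:
            self.flush_fn(self.items)
        # Start a fresh list so the flushed batch is not mutated later.
        self.items = []
        self.total_bytes = 0
```

Usage would look like batcher = SizeLimitedBatcher(db.put) followed by batcher.add(model) for each instance, with a final batcher.flush() at the end of the request so that no pending items are left behind. A single item larger than max_bytes is still sent on its own, so the budget is an estimate rather than a guarantee.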