What is the simplest way to delete all Blobstore data?
What is the best way to remove all of the blobs from the blobstore? I'm using Python.
I have quite a lot of blobs and I'd like to delete them all. I'm
currently doing the following:
class deleteBlobs(webapp.RequestHandler):
    def get(self):
        all = blobstore.BlobInfo.all()
        more = (all.count() > 0)
        blobstore.delete(all)
        if more:
            taskqueue.add(url='/deleteBlobs', method='GET')
Which seems to be using tons of CPU and (as far as I can tell) doing
nothing useful.
Comments (2)
I use this approach:

My experience is that deleting more than 400 blobs at once will fail, so I let it reload for every 400. I tried blobstore.delete(query.fetch(400)), but I think there's a bug right now: nothing happened at all, and nothing was deleted.
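The code sample that originally followed "I use this approach:" appears to have been lost when the page was converted. Based only on the description above (fetch at most 400 blobs, delete them, and re-enqueue a task until none remain), a hypothetical reconstruction might look like the following sketch; the handler name and URL mirror the question's code, and everything else is an assumption:

```python
# Sketch only; requires the (Python 2) App Engine SDK.
from google.appengine.ext import blobstore, webapp
from google.appengine.api import taskqueue

BATCH_SIZE = 400  # deleting more than ~400 blobs at once reportedly fails


class deleteBlobs(webapp.RequestHandler):
    def get(self):
        blobs = blobstore.BlobInfo.all().fetch(BATCH_SIZE)
        if blobs:
            blobstore.delete([b.key() for b in blobs])
            # There may be more blobs left; chain a task for the next batch.
            taskqueue.add(url='/deleteBlobs', method='GET')
```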
You're passing the query object to the delete method, which will iterate over it, fetching it in batches, then submit a single enormous delete. This is inefficient because it requires multiple fetches, and it won't work if you have more results than you can fetch in the available time or with the available memory. The task will either complete once and not need chaining at all, or, more likely, fail repeatedly, since it can't fetch every blob at once.

Also, calling count executes the query just to determine the count, which is a waste of time since you're going to try fetching the results anyway. Instead, you should fetch results in batches using fetch, and delete each batch. Use cursors to set the next batch, avoiding the need for the query to iterate over all the 'tombstoned' records before finding the first live one; and ideally, delete multiple batches per task, using a timer to determine when you should stop and chain the next task.
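The strategy described above (batched fetches, a cursor to resume, a per-task timer, and chaining a follow-up task) can be sketched as a plain loop. In this sketch the App Engine specifics, which on the real platform would be BlobInfo.all().with_cursor(), blobstore.delete(), and taskqueue.add(), are abstracted behind injected callables (fetch_page, delete_batch, chain_task are names invented here), so the control flow itself is self-contained:

```python
import time


def delete_in_batches(fetch_page, delete_batch, chain_task,
                      batch_size=400, deadline=30.0, clock=time.time):
    """Delete everything a query matches, in batches, within a deadline.

    fetch_page(cursor, batch_size) -> (items, next_cursor); an empty
    items list means the query is exhausted.  delete_batch(items) does
    the actual delete; chain_task(cursor) re-enqueues the remaining
    work.  On App Engine these would wrap BlobInfo.all().with_cursor(),
    blobstore.delete(), and taskqueue.add() respectively.
    """
    start = clock()
    cursor = None
    while True:
        items, cursor = fetch_page(cursor, batch_size)
        if not items:
            return  # query exhausted: no need to chain another task
        delete_batch(items)
        if clock() - start > deadline:
            chain_task(cursor)  # out of time: hand off at the cursor
            return
```

Because the platform calls are injected, the loop can be exercised with in-memory fakes: a list standing in for the blobstore, an integer offset standing in for the cursor.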