Bulk updates in the Google App Engine Datastore

Posted 2024-12-17


What is the proper way to perform mass updates on entities in a Google App Engine Datastore? Can it be done without having to retrieve the entities?

For example, what would be the GAE equivalent to something like this in SQL:

UPDATE dbo.authors
SET    city = replace(city, 'Salt', 'Olympic')
WHERE  city LIKE 'Salt%';


3 Answers

挖鼻大婶 2024-12-24 06:50:05


There isn't a direct translation. The datastore really has no concept of updates; all you can do is overwrite old entities with a new entity at the same address (key). To change an entity, you must fetch it from the datastore, modify it locally, and then save it back.
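The read-modify-write cycle described above can be sketched in plain Python, using a dict as a stand-in for the datastore (the key `'author:1'` and the entity layout are invented for illustration):

```python
# Minimal sketch of the datastore's read-modify-write cycle.
# There is no in-place UPDATE: you read the entity out, change it
# locally, and overwrite the old entity at the same key.
store = {'author:1': {'city': 'Salt Lake City'}}  # stand-in datastore

entity = dict(store['author:1'])                            # get
entity['city'] = entity['city'].replace('Salt', 'Olympic')  # modify locally
store['author:1'] = entity                                  # put: overwrite at the same key

print(store['author:1']['city'])  # Olympic Lake City
```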

There's also no equivalent to the LIKE operator. While wildcard suffix matching is possible with some tricks, if you wanted to match '%Salt%' you'd have to read every single entity into memory and do the string comparison locally.
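The suffix-matching trick works because datastore string indexes are sorted: a query for `city >= 'Salt'` combined with `city < u'Salt\ufffd'` captures exactly the `'Salt%'` prefixes. A plain-Python sketch of the idea (the city list is invented):

```python
# Simulate the datastore prefix-range trick in plain Python.
# The real query would be:
#   query.filter('city >=', prefix).filter('city <', prefix + u'\ufffd')
def prefix_range(prefix):
    # Upper bound: the prefix followed by a very high code point, so
    # every string starting with `prefix` sorts inside [lo, hi).
    return prefix, prefix + u'\ufffd'

cities = ['Salt Lake City', 'Saltville', 'Salem', 'Boston']
lo, hi = prefix_range('Salt')
matches = [c for c in cities if lo <= c < hi]
print(matches)  # ['Salt Lake City', 'Saltville']
```

Note this only works for prefix (`'Salt%'`) matches; `'%Salt%'` still requires reading every entity.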

So it's not going to be quite as clean or efficient as SQL. This is a tradeoff with most distributed object stores, and the datastore is no exception.

That said, the mapper library is available to facilitate such batch updates. Follow the example and use something like this for your process function:

# Requires the App Engine mapreduce library for op.db.Put.
from mapreduce import operation as op

def process(entity):
  if entity.city.startswith('Salt'):
    entity.city = entity.city.replace('Salt', 'Olympic')
    yield op.db.Put(entity)

There are other alternatives besides the mapper. The most important optimization tip is to batch your updates; don't save back each updated entity individually. If you use the mapper and yield puts, this is handled automatically.
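If you batch by hand instead of using the mapper, a minimal chunking sketch looks like this (the chunk size of 500 reflects the datastore's per-call put limit at the time; `db.put(batch)` would be called on each chunk):

```python
# Sketch: group updated entities into chunks of up to 500 rather than
# saving each one individually; each chunk would go to one db.put() call.
def chunked(items, size=500):
    for i in range(0, len(items), size):
        yield items[i:i + size]

entities = list(range(1200))  # stand-ins for updated entities
batches = list(chunked(entities))
print([len(b) for b in batches])  # [500, 500, 200]
```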

奈何桥上唱咆哮 2024-12-24 06:50:05


No, it can't be done without retrieving the entities.

There's no such thing as a '1000 max record limit', but there is of course a timeout on any single request - and if you have large amounts of entities to modify, a simple iteration will probably fall foul of that. You could manage this by splitting it up into multiple operations and keeping track with a query cursor, or potentially by using the MapReduce framework.
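The cursor approach can be sketched with a hypothetical `fetch_page(cursor, limit)` helper standing in for `query.with_cursor(...)` / `query.cursor()`; each loop iteration models one request-sized chunk of work, resumable from the saved cursor:

```python
# Sketch of cursor-style resumable iteration. `fetch_page` is a made-up
# stand-in for fetching a page and returning the next cursor; on App
# Engine each chunk would run in its own request or task.
DATA = ['Salt-%d' % i for i in range(7)]  # invented entities

def fetch_page(cursor, limit):
    page = DATA[cursor:cursor + limit]
    return page, cursor + len(page)

processed, cursor = [], 0
while True:
    page, cursor = fetch_page(cursor, 3)  # one request-sized chunk
    if not page:
        break  # no more entities; in real code, stop re-queuing tasks
    processed.extend(c.replace('Salt', 'Olympic') for c in page)

print(len(processed))  # 7
```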

落叶缤纷 2024-12-24 06:50:05


You could use the Query class: http://code.google.com/appengine/docs/python/datastore/queryclass.html

 # Use a prefix range instead of LIKE 'Salt%', then save in one batch.
 query = authors.all().filter('city >=', 'Salt').filter('city <', u'Salt\ufffd')
 updated = [r for r in query]
 for record in updated:
   record.city = record.city.replace('Salt', 'Olympic')
 db.put(updated)  # one batch write instead of per-entity saves