JDO on Google App Engine: how to efficiently retrieve a subset of fields from a large number of records

Posted on 2024-12-04 21:02:55

I'm facing a small scalability problem. I'm using JDO to query my datastore.
I need to retrieve all the keys of a given entity kind (the keys are of type Long). Since the datastore holds 1,000,000 records of this kind, I need to fetch them very efficiently so that I can loop over the set in a background task.

What is the most efficient way to do this?

And what if I need not only the key, but also another field? Let's say I've got an entity called TPImage:

    Long idPic; //this is my key
    String title; //this is the field I want to retrieve together with the key
    ... // other properties

How can I retrieve both idPic and title in a single, efficient query?

Something like

    // pm is a javax.jdo.PersistenceManager; Query is an interface, so it is created via the manager
    Query q = pm.newQuery("select idPic, title from " + TPImage.class.getName());

but more efficient?

Thank you very much!

Bye
cghersi

Comments (2)

影子的影子 2024-12-11 21:02:55

The scaling problem you have is that you need all the keys, not that you can't fetch them efficiently enough. No matter what system you use, touching every key is always going to be at least O(n) in the number of entities.

Rather than trying to prefetch everything, you should do your work in batches, and use cursors to retrieve the next set of results efficiently.
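To make the batching idea concrete, here is a minimal sketch of a cursor-driven loop. It does not use the App Engine API: `KeySource`, `Page`, `inMemorySource` and `processAll` are hypothetical names, and the in-memory source only stands in for a datastore query so that the example runs on its own. On App Engine the `fetch` call would presumably be a JDO query limited to one batch, with the cursor taken from the previous result (for example through the JDO cursor helper class), but that wiring is an assumption and is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Sketch of cursor-style batching; every name here is hypothetical. */
final class CursorBatches {

    /** One page of keys plus an opaque cursor for the next page (null when exhausted). */
    static final class Page {
        final List<Long> keys;
        final String nextCursor;

        Page(List<Long> keys, String nextCursor) {
            this.keys = keys;
            this.nextCursor = nextCursor;
        }
    }

    /** Stand-in for a datastore query: fetch up to `limit` keys starting at `cursor`. */
    interface KeySource {
        Page fetch(String cursor, int limit);
    }

    /** In-memory source holding keys 1..total, used only so the sketch runs without a datastore. */
    static KeySource inMemorySource(final long total) {
        return (cursor, limit) -> {
            long next = (cursor == null) ? 1L : Long.parseLong(cursor);
            List<Long> keys = new ArrayList<>();
            while (next <= total && keys.size() < limit) {
                keys.add(next++);
            }
            // A null cursor signals that there is nothing left to fetch.
            return new Page(keys, next <= total ? Long.toString(next) : null);
        };
    }

    /** Hands every key to `worker`, one batch at a time; returns the number of batches fetched. */
    static int processAll(KeySource source, int batchSize, Consumer<Long> worker) {
        String cursor = null;
        int batches = 0;
        do {
            Page page = source.fetch(cursor, batchSize);
            if (page.keys.isEmpty()) {
                break;
            }
            for (Long key : page.keys) {
                worker.accept(key);
            }
            batches++;
            cursor = page.nextCursor; // resume point for the next batch
        } while (cursor != null);
        return batches;
    }

    public static void main(String[] args) {
        List<Long> seen = new ArrayList<>();
        int batches = processAll(inMemorySource(2500), 1000, seen::add);
        System.out.println(batches + " batches, " + seen.size() + " keys");
    }
}
```

Because the cursor is just a string, a background task could store it after each batch and pick up from there in the next invocation, rather than holding all 1,000,000 keys in memory at once.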

If you need a field from the model, you must retrieve the whole model instance - they're stored as serialized blobs, so there's no way to retrieve just one field.

回忆凄美了谁 2024-12-11 21:02:55

Your question has two parts. For the first part, getting keys only, you can specify that the query should return only keys by setting the keys_only parameter to True when you create it (this is the Python API).
See here:
http://code.google.com/appengine/docs/python/datastore/queryclass.html#Query

This will help somewhat, as you are not retrieving the entire entity. However, it will probably not help enough if you want to process all 1,000,000 entities at once. In that case, take Nick's advice and break up the work.
