GQL query optimization and table architecture
I've been working with Google App Engine and I'm running into slow performance with some of my data queries. I've read that designing an App Engine datastore requires a different mindset from working with SQL databases, and I'm not sure I'm doing this the best way. I have two questions to try to get on the right track:
Specifically:
I have a Foo type and a UserFoo type. Each UserFoo is an "instance" of a corresponding Foo and holds data specific to that instance. My Foo type has a fooCode property, which is a unique identifier, and I map each UserFoo to its Foo by their fooCode properties. I then operate on each Foo with code like so:
    foos = Foo.all().filter('bar =', bar)
    for foo in foos:
        userFoo = UserFoo.all().filter('userKey =', user).filter('fooCode =', foo.fooCode)
Note: I'm using fooCode instead of a reference key so that we can easily delete and re-add Foos without then having to remap all the corresponding UserFoos.
In General:
What are typical approaches to designing GAE datastore tables and best-practices for using them?
Comments (3)
This is the "staircase of gets" anti-pattern. The solution is ReferenceProperty pre-fetching.
The hitch is that you've decided not to use a ReferenceProperty. I would advise you to reconsider this choice.
Remember that an entity key is just an encoded representation of its path: the kind and name or ID of the entity and any of its ancestors. If you deleted and then re-created a Foo, it would only have a different key if it was given a different name or ID. If you have some way of giving the old and new entity the same fooCode, you could just as easily use the fooCode as the key name, which would allow a deleted and then re-added Foo to retain its original key.
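To make the key-name idea concrete, here is a minimal sketch in plain Python. It does not use the App Engine SDK: FakeDatastore, foo_key, user_foo_key, and the "user:fooCode" key-name scheme for UserFoo are all hypothetical stand-ins chosen for illustration. In the real db API the rough equivalents would be creating entities with key_name=... and fetching them in one batch with db.get(keys). The sketch only counts round trips to show why one batch get by key beats one query per Foo.

```python
# In-memory stand-in for the datastore. Every name here is hypothetical;
# it only models "a key is (kind, name)" and counts round trips.
class FakeDatastore:
    def __init__(self):
        self.entities = {}    # (kind, key_name) -> dict of properties
        self.round_trips = 0  # incremented once per query or batch get

    def put(self, key, **props):
        self.entities[key] = dict(props)

    def delete(self, key):
        self.entities.pop(key, None)

    def query(self, kind, **filters):
        """One round trip: entities of a kind matching all equality filters."""
        self.round_trips += 1
        return [
            (key, props) for key, props in sorted(self.entities.items())
            if key[0] == kind
            and all(props.get(name) == value for name, value in filters.items())
        ]

    def get_multi(self, keys):
        """One round trip: fetch many entities by key (None if missing)."""
        self.round_trips += 1
        return [self.entities.get(key) for key in keys]


def foo_key(foo_code):
    # fooCode doubles as the key name, so the key is stable across re-creation.
    return ("Foo", foo_code)


def user_foo_key(user, foo_code):
    # Assumed scheme: one UserFoo per (user, fooCode) pair.
    return ("UserFoo", "%s:%s" % (user, foo_code))


def staircase(ds, user, bar):
    """The question's pattern: one query for Foos, then one query per Foo."""
    results = []
    for _key, foo in ds.query("Foo", bar=bar):
        matches = ds.query("UserFoo", userKey=user, fooCode=foo["fooCode"])
        results.append(matches[0][1] if matches else None)
    return results


def batched(ds, user, bar):
    """One query for Foos, then a single batch get by computed key."""
    foos = ds.query("Foo", bar=bar)
    keys = [user_foo_key(user, key[1]) for key, _foo in foos]
    return ds.get_multi(keys)


ds = FakeDatastore()
for code in ("A", "B", "C"):
    ds.put(foo_key(code), fooCode=code, bar="x")
    ds.put(user_foo_key("alice", code), userKey="alice", fooCode=code, note="n" + code)

ds.round_trips = 0
slow = staircase(ds, "alice", "x")
slow_trips = ds.round_trips

ds.round_trips = 0
fast = batched(ds, "alice", "x")
fast_trips = ds.round_trips

# Deleting and re-adding a Foo under the same fooCode restores the same key,
# so nothing that refers to that key needs remapping.
ds.delete(foo_key("B"))
ds.put(foo_key("B"), fooCode="B", bar="x")
```

With three matching Foos, the staircase version makes four round trips and the batched version makes two, and the gap grows with the number of Foos.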
In General:
- Denormalize as much as possible.
- Refer to entities via their Key whenever possible; it is the fastest way to get data out of the datastore.
Specifically, you could likely get a dramatic increase in performance if you used ReferenceProperty to establish relationships rather than codes that go into filters. I am going to guess that querying for a UserFoo's Foo happens a good deal more often than removing and remapping Foos, yeah? In that case, the pragmatic and datastore-wise thing to do is to use ReferenceProperty.
Also, if the Foo-UserFoo relationship can be denormalized into a single entity, you remove the need for an entire series of queries altogether.
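As a rough illustration of the denormalization point, the sketch below uses plain Python dataclasses rather than real datastore models. It assumes that bar is the only Foo field the lookup filters on; the helper names (make_user_foo, user_foos_for_bar, propagate_bar) and the note field are hypothetical. In the db API, the single lookup would correspond to something like UserFoo.all().filter('userKey =', user).filter('bar =', bar).

```python
from dataclasses import dataclass


@dataclass
class Foo:
    fooCode: str
    bar: str


@dataclass
class UserFoo:
    userKey: str
    fooCode: str
    bar: str   # denormalized copy of Foo.bar (assumed to be the filter field)
    note: str  # hypothetical per-user data


def make_user_foo(user, foo, note):
    """Copy the Foo fields we filter on into the UserFoo at write time."""
    return UserFoo(userKey=user, fooCode=foo.fooCode, bar=foo.bar, note=note)


def user_foos_for_bar(user_foos, user, bar):
    """Stands in for a single datastore query with two equality filters."""
    return [uf for uf in user_foos if uf.userKey == user and uf.bar == bar]


def propagate_bar(user_foos, foo):
    """The cost of denormalizing: when Foo.bar changes, rewrite the copies."""
    for uf in user_foos:
        if uf.fooCode == foo.fooCode:
            uf.bar = foo.bar


foos = [Foo("A", "x"), Foo("B", "y"), Foo("C", "x")]
user_foos = [make_user_foo("alice", f, "note-" + f.fooCode) for f in foos]
user_foos.append(make_user_foo("bob", foos[0], "note-bob"))

# One lookup replaces "query the Foos, then query once per Foo".
codes = [uf.fooCode for uf in user_foos_for_bar(user_foos, "alice", "x")]

# Changing a Foo means updating every copy of the changed field.
foos[1].bar = "x"
propagate_bar(user_foos, foos[1])
codes_after = [uf.fooCode for uf in user_foos_for_bar(user_foos, "alice", "x")]
```

The trade-off is that reads become a single query while writes to Foo fan out to every UserFoo holding a copy, which tends to pay off when reads are far more frequent than Foo updates.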
I would suggest the following changes: