Google App Engine database indexes
I need to store an undirected graph in a Google App Engine database.
For optimization purposes, I am thinking of using database indexes.
In Google App Engine, is there any way to define which columns of a database table should be indexed?
I will need some optimization, since my app uses this stored undirected graph in content-based filtering for item recommendation. The recommender algorithm also updates the weights of some of the graph's edges.
If database indexes are not an option, please suggest another way to reduce query time on the graph table. I believe my algorithm performs more reads from the graph table than writes.
PS: I am using Python.
Comments (2)
Perhaps this will help: http://code.google.com/intl/sv-SE/appengine/docs/python/datastore/queriesandindexes.html#Defining_Indexes_With_Configuration
Are you actually seeing prohibitively slow queries? I'm guessing not; I suspect this is somewhat premature optimization. The App Engine datastore doesn't do any sorting, filtering, joins, or other meaningful operations in memory, so query times are generally fairly constant. In particular, query latency does not depend on the number of entities in your datastore, or even on the number of entities that match your query. It depends only on the number of results you ask for.
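As a rough illustration of why latency tracks the number of results rather than the size of the data, here is a toy model in plain Python (not App Engine code). It assumes an index is simply a sorted list of (value, key) rows and that a query is a binary search followed by a short sequential read. The names build_index, scan and make_edges, and the sample data, are made up for this sketch.

```python
import bisect


def build_index(entities, prop):
    """Sorted (value, key) rows: a stand-in for a single-property index."""
    return sorted((entity[prop], key) for key, entity in entities.items())


def scan(index, value, limit):
    """Return up to `limit` keys whose value equals `value`, plus rows read."""
    # Binary search to the first row for this value.
    # The 1-tuple (value,) sorts before every (value, key) row.
    start = bisect.bisect_left(index, (value,))
    pos = start
    keys = []
    while pos < len(index) and len(keys) < limit and index[pos][0] == value:
        keys.append(index[pos][1])
        pos += 1
    return keys, pos - start


def make_edges(count):
    """Hypothetical edge entities; vertex1 cycles through 100 vertex ids."""
    return {"e%06d" % i: {"vertex1": i % 100} for i in range(count)}


small_index = build_index(make_edges(1000), "vertex1")
large_index = build_index(make_edges(100000), "vertex1")

# Same query, same limit, against a small and a large data set.
small_keys, small_rows = scan(small_index, 7, limit=5)
large_keys, large_rows = scan(large_index, 7, limit=5)
print(small_rows, large_rows)
```

In this model the number of index rows read is bounded by the limit in both cases; only the binary search sees the larger data set, and its cost grows very slowly.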
On a related note, adding indexes to your datastore will not speed up existing queries. If a query needs a custom index, it won't degrade and run slower without it; the query simply won't run at all until you add the index.
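The "won't run at all" behaviour can be sketched with a toy planner in plain Python. This is not the SDK: needs_custom_index is a deliberately simplified rule of thumb (it ignores ancestor queries, key ordering and other cases covered in the docs), run_query is hypothetical, and the NeedIndexError class here is only a stand-in named after the SDK's error of the same name. The weight property is also an assumption, borrowed from the edge weights mentioned in the question.

```python
class NeedIndexError(Exception):
    """Stand-in for the error raised when a required index is missing."""


def needs_custom_index(equality_props=(), inequality_prop=None, sort_props=()):
    """Simplified rule of thumb, not the full set of datastore rules."""
    ordered = set(sort_props)
    if inequality_prop is not None:
        ordered.add(inequality_prop)
    if not ordered:
        return False  # equality filters only (or no filters at all)
    if not equality_props and len(ordered) == 1 and len(sort_props) <= 1:
        return False  # a single property, filtered by inequality and/or sorted
    return True


def run_query(custom_indexes, equality_props=(), inequality_prop=None, sort_props=()):
    """Refuse to run, rather than run slowly, when a custom index is missing."""
    if needs_custom_index(equality_props, inequality_prop, sort_props):
        order = list(sort_props)
        if inequality_prop is not None and inequality_prop not in order:
            order.insert(0, inequality_prop)
        required = tuple(sorted(equality_props)) + tuple(order)
        if required not in custom_indexes:
            raise NeedIndexError("no index on %r" % (required,))
    return True  # a real query would return results here


# Equality filters on both vertices: fine with no custom indexes defined.
run_query(set(), equality_props=("vertex1", "vertex2"))

# Equality filter plus a sort on another property: fails until the index exists.
try:
    run_query(set(), equality_props=("vertex1",), sort_props=("weight",))
except NeedIndexError as exc:
    print("query refused:", exc)

run_query({("vertex1", "weight")}, equality_props=("vertex1",), sort_props=("weight",))
```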
For the specific query you mention,
select * from edges where vertex1 == x and vertex2 == y
the datastore can run it without a custom index at all. See this section of the docs for more details.
In short, just run the queries you need, and don't think too much about indexes or try to optimize as if you were a DBA. It's not a relational database. :P
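One way to picture how an equality-only query on two properties can be answered from the built-in single-property indexes is a merge join over key lists that are already sorted. The sketch below is a toy in plain Python, not the datastore's actual implementation; build_property_index, merge_join, query_equal and the sample edges are all invented for illustration.

```python
def build_property_index(entities, prop):
    """Map each property value to a sorted list of entity keys."""
    index = {}
    for key in sorted(entities):
        index.setdefault(entities[key][prop], []).append(key)
    return index


def merge_join(keys_a, keys_b):
    """Intersect two sorted key lists with a two-pointer walk."""
    i = j = 0
    matches = []
    while i < len(keys_a) and j < len(keys_b):
        if keys_a[i] == keys_b[j]:
            matches.append(keys_a[i])
            i += 1
            j += 1
        elif keys_a[i] < keys_b[j]:
            i += 1
        else:
            j += 1
    return matches


def query_equal(indexes, filters):
    """Answer `prop == value` filters using only single-property indexes."""
    result = None
    for prop, value in filters.items():
        keys = indexes[prop].get(value, [])
        result = keys if result is None else merge_join(result, keys)
    return result or []


# Hypothetical edge entities keyed by an arbitrary id.
edges = {
    "e1": {"vertex1": "a", "vertex2": "b"},
    "e2": {"vertex1": "a", "vertex2": "c"},
    "e3": {"vertex1": "b", "vertex2": "c"},
    "e4": {"vertex1": "c", "vertex2": "b"},
}
indexes = {p: build_property_index(edges, p) for p in ("vertex1", "vertex2")}

found = query_equal(indexes, {"vertex1": "a", "vertex2": "b"})
print(found)
```

No index spanning both vertex1 and vertex2 is built anywhere in this sketch; the two per-property indexes are enough to find the matching edge.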