Are indexes really required in the datastore?

Posted 2024-10-12 14:10:23

I'm a bit confused by some of the GAE documentation. While I intend to add indexes to optimize the performance of my application, I wanted some clarification on whether they are only suggested for this purpose or are truly required.

Queries can't find property values that aren't indexed. This includes properties that are marked as not indexed, as well as properties with values of the long text value type (Text) or the long binary value type (Blob).

A query with a filter or sort order on a property will never match an entity whose value for the property is a Text or Blob, or which was written with that property marked as not indexed. Properties with such values behave as if the property is not set with regard to query filters and sort orders.

from http://code.google.com/appengine/docs/java/datastore/queries.html#Introduction_to_Indexes

The first paragraph leads me to believe that you simply cannot sort or filter on unindexed properties. However, the second paragraph makes me think that this limitation is only confined to Text or Blob properties or properties specifically annotated as unindexed.

I'm curious about the distinction because I have some numeric and string fields that I am currently sorting/filtering against in a production environment, and they are unindexed. These queries run in a background task that mostly doesn't care about performance (I would rather optimize for size/cost in this situation). Am I somehow just lucky that these are returning the right data?

Comments (4)

烟雨凡馨 2024-10-19 14:10:23

In the GAE datastore, single-property indexes are created automatically for every property except those that cannot be indexed: properties explicitly marked as unindexed, and Text or Blob properties.

The language in that doc, I suppose, is a tad confusing.

You only need to explicitly define indexes when you want to index by more than one property (say, for sorting by two different properties).
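To make that concrete, here is a rough, pure-Python toy model of the behavior described above. It is not the App Engine API: `ToyDatastore`, `put`, `filter_eq`, and `order_by` are made-up names for illustration, and it assumes each key is written only once. The idea it tries to capture is that queries read only from index rows, so a property that was never indexed simply has no rows to match.

```python
from collections import defaultdict


class ToyDatastore:
    """In-memory stand-in for illustration only; NOT the App Engine API."""

    def __init__(self, unindexed=()):
        self.unindexed = set(unindexed)   # property names marked as not indexed
        self.entities = {}                # key -> dict of property values
        self.index = defaultdict(list)    # property name -> [(value, key), ...]

    def put(self, key, **props):
        # Assumption: each key is written once, so no old index rows to clean up.
        self.entities[key] = props
        for name, value in props.items():
            if name not in self.unindexed:
                # A single-property index row is written automatically.
                self.index[name].append((value, key))

    def filter_eq(self, name, value):
        # Queries consult index rows only; they never scan the entities.
        return sorted(k for v, k in self.index[name] if v == value)

    def order_by(self, name):
        return [k for v, k in sorted(self.index[name])]


store = ToyDatastore(unindexed={"notes"})
store.put("a", score=3, notes="x")
store.put("b", score=1, notes="x")
store.put("c", notes="x")               # this entity has no score at all

print(store.filter_eq("score", 3))      # ['a']
print(store.order_by("score"))          # ['b', 'a']
print(store.filter_eq("notes", "x"))    # []
```

In this toy, filtering and sorting on `score` work without any explicit index definition, while the filter on the unindexed `notes` property matches nothing even though every entity stores that value. Entity `c` is also missing from the sorted result because it has no `score` row in the index.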

ゝ偶尔ゞ 2024-10-19 14:10:23

In GAE, unfortunately, if a property is marked as unindexed:

  num = db.IntegerProperty(required=True, indexed=False)

then it is impossible to include it in a custom index. This is counterproductive (most built-in indexes are never used by my code, but they take up a lot of space), but it is how GAE currently works.

Datastore Indexes - Unindexed properties:

Note: If a property appears in an index composed of multiple properties, then setting it to unindexed will prevent it from being indexed in the composed index.
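A small pure-Python sketch of what that note implies. This is a toy model, not App Engine code: `composite_index_rows` and its arguments are hypothetical names, and it assumes a composite index row is written only when every property in the index is present and indexed.

```python
def composite_index_rows(entities, index_props, unindexed):
    """Return the rows a toy composite index on index_props would hold.

    entities:    dict of key -> dict of property values
    index_props: tuple of property names making up the composite index
    unindexed:   set of property names marked as not indexed
    """
    rows = []
    for key, props in entities.items():
        # Skip the entity if any indexed-on property is missing or unindexed.
        if any(p in unindexed or p not in props for p in index_props):
            continue
        rows.append((tuple(props[p] for p in index_props), key))
    return sorted(rows)


entities = {
    "a": {"kind": "x", "num": 2},
    "b": {"kind": "x", "num": 1},
}

# With num indexed, both entities appear in the (kind, num) index.
print(composite_index_rows(entities, ("kind", "num"), unindexed=set()))
# With num marked unindexed, the composite index ends up empty.
print(composite_index_rows(entities, ("kind", "num"), unindexed={"num"}))
```

Under those assumptions, marking `num` as unindexed saves the built-in index rows but also empties every composite index that mentions `num`, which is the trade-off complained about above.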

烏雲後面有陽光 2024-10-19 14:10:23

Never add a property to a model without EXPLICITLY entering either indexed=True or indexed=False. Indexes take substantial resources: space, write-op costs, and added latency when doing put()s. We never, ever add a property without explicitly stating its indexed value, even if it is indexed=False. This saves costly oversights and forces one to always think a bit about whether or not to index. (You will at some point find yourself cursing the fact that you forgot to override the default of indexed=True.) GAE engineers would do a great service by not allowing this to default to True, imho. I would simply not provide a default if I were them. HTH. -stevep
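One way to enforce that habit in code, as a minimal sketch: wrap property construction in a helper whose `indexed` argument is keyword-only and has no default. The helper name `explicit` and the `FakeIntegerProperty` class are hypothetical; the fake class only stands in for `db.IntegerProperty` so the example runs outside App Engine.

```python
class FakeIntegerProperty:
    """Hypothetical stand-in for db.IntegerProperty so the sketch runs anywhere."""

    def __init__(self, indexed=True, required=False):
        self.indexed = indexed
        self.required = required


def explicit(property_cls, *, indexed, **kwargs):
    """Build a property, refusing to fall back to the library's default."""
    # 'indexed' is keyword-only with no default, so omitting it is a TypeError.
    return property_cls(indexed=indexed, **kwargs)


num = explicit(FakeIntegerProperty, indexed=False, required=True)
print(num.indexed)  # False

try:
    explicit(FakeIntegerProperty, required=True)  # forgot to choose
except TypeError as exc:
    print("rejected:", exc)
```

With the real library the call would presumably look like `explicit(db.IntegerProperty, indexed=False, required=True)`; a forgotten `indexed` then fails when the model module is imported rather than showing up later as an unexpected index.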

装纯掩盖桑 2024-10-19 14:10:23

You must define an index if you want to use two or more filters in a single query, e.g.:

  Foobar.all().filter('foo =', foo).filter('bar =', bar)

If you query with just one filter, there is no need to define an index; it is generated automatically.

For Blob and Text properties, you can't generate an index even if you specify one in index.yaml, and you can't filter on them, e.g.:

  from google.appengine.ext import db

  class Foobar(db.Model):
      content = db.TextProperty()

  Foobar.all().filter('content =', content)

The code above will fail, because a TextProperty can't be given an index and so can't be matched.
