MongoDB 子文档数组上的多个索引与单个索引？

发布于 2024-12-11 06:13:09 字数 2042 浏览 6 评论 0原文

想知道哪种技术对我需要跟踪的文档的各种时间戳进行索引是更有效的技术，请记住我的应用程序在写入方面相当繁重，但在阅读方面又足够繁重，如果没有索引，查询就会太慢。

是为每个时间戳设置一个字段并为每个字段建立索引，还是将时间戳及其关联类型存储在数组字段中，并为该数组的每个字段建立索引？

第一个选项，单独的字段，每个字段都有一个索引：

{
    "_id" : "...",
    "Field1.Timestamp" : '2011-01-01 01:00.000',
    "Field2.Timestamp" : '2011-01-01 01:00.000',
    "Field3.Timestamp" : '2011-01-01 01:00.000',
    "Field4.Timestamp" : '2011-01-01 01:00.000',
    "Field5.Timestamp" : '2011-01-01 01:00.000',
    "Field6.Timestamp" : '2011-01-01 01:00.000',
    "Field7.Timestamp" : '2011-01-01 01:00.000',
    "Field8.Timestamp" : '2011-01-01 01:00.000',
    "Field9.Timestamp" : '2011-01-01 01:00.000',
}

db.mycollection.ensureIndex({ "Field1.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field2.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field3.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field4.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field5.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field6.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field7.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field8.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field9.Timestamp" : 1 });

然后有一个时间戳及其状态的数组，只有一个索引

{
    "_id" : "...",
    "Timestamps" : [
        { "Type" : "Field1", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field2", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field3", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field4", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field5", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field6", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field7", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field8", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field9", "Timestamp" : '2011-01-01  01:00.000' },
    ]
}

db.mycollection.ensureIndex({ "Timestamps.Type" : 1, "Timestamps.Timestamp" : 1 });

我在这里离题了吗？或者哪一个是更好的方法

原文

Wondering which would be the more efficient technique for indexing my document's various timestamps that I need to keep track of, keeping in mind my application is fairly heavy on writing, but heavy enough on reading that without the indexes, the queries are too slow.

Is it better to have a field for each timestamp, and index each field, or store the timestamps and their associated type in an array field, and index each field of that array?

First option, separate fields, and an index for each:

{
    "_id" : "...",
    "Field1.Timestamp" : '2011-01-01 01:00.000',
    "Field2.Timestamp" : '2011-01-01 01:00.000',
    "Field3.Timestamp" : '2011-01-01 01:00.000',
    "Field4.Timestamp" : '2011-01-01 01:00.000',
    "Field5.Timestamp" : '2011-01-01 01:00.000',
    "Field6.Timestamp" : '2011-01-01 01:00.000',
    "Field7.Timestamp" : '2011-01-01 01:00.000',
    "Field8.Timestamp" : '2011-01-01 01:00.000',
    "Field9.Timestamp" : '2011-01-01 01:00.000',
}

db.mycollection.ensureIndex({ "Field1.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field2.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field3.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field4.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field5.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field6.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field7.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field8.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field9.Timestamp" : 1 });

Then there's an array of the timestamps and their status, with only a single index

{
    "_id" : "...",
    "Timestamps" : [
        { "Type" : "Field1", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field2", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field3", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field4", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field5", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field6", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field7", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field8", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field9", "Timestamp" : '2011-01-01  01:00.000' },
    ]
}

db.mycollection.ensureIndex({ "Timestamps.Type" : 1, "Timestamps.Timestamp" : 1 });

Am I way off the mark here? or which would be the better way

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

沧桑㈠ 2024-12-18 06:13:09

这基本上可以归结为，10 个大小为 N 的索引是否比一个大小为 N * 10 的索引更有效。如果您纯粹考虑读取，那么单独的索引应该总是更快。相关的 b 树遍历将检查较小的键集等。

不过，有几点需要考虑：

数组字段上的索引基本上单独索引每个数组元素。因此，在 B 树遍历期间，查找开销最多将是 1-2 个额外步骤，这对性能的影响可以忽略不计。换句话说，它们的速度几乎一样快。
拥有 10 个索引可能意味着每次更新/插入都需要更新多个索引（取决于您的索引是否共享一个字段或者您是否一次更新超过 1 个时间戳）。这是一个重要的性能考虑因素。
使用数组索引可以更轻松地添加附加时间戳（例如 Timestamp10）。
每个数据库可以使用的命名空间数量 (24k) 是有限制的，每个索引占用一个。如果您为每个字段创建单独的索引，这可能会成为一个问题。
最重要的是，数组索引更加简单，并且将简化您的代码，从而简化可维护性。考虑到有限的性能差异，我想说这是使用数组索引的最强烈动机。