MongoDB: get distinct records

Posted on 2024-10-18 14:57:20

I am using MongoDB, and I have a collection with documents in the following format.

{"id" : 1, "name" : "x", "ttm" : 23, "val" : 5}
{"id" : 1, "name" : "x", "ttm" : 34, "val" : 1}
{"id" : 1, "name" : "x", "ttm" : 24, "val" : 2}
{"id" : 2, "name" : "x", "ttm" : 56, "val" : 3}
{"id" : 2, "name" : "x", "ttm" : 76, "val" : 3}
{"id" : 3, "name" : "x", "ttm" : 54, "val" : 7}

On that collection I have queried to get records in descending order like this:

db.foo.find({"id" : {"$in" : [1,2,3]}}).sort({ttm : -1}).limit(3)

But this returns two records with the same id, and I want exactly one record per id.

Is this possible in MongoDB?

Comments (6)

最初的梦 2024-10-25 14:57:21

You want to use aggregation. You could do that like this:

db.test.aggregate([
    // Each object in the array is one pipeline stage.
    {
        $group: {
            originalId: {$first: '$_id'}, // Hold onto the original _id.
            _id: '$id', // Group on the non-unique 'id' field.
            val:  {$first: '$val'},
            name: {$first: '$name'},
            ttm:  {$first: '$ttm'}
        }
    }, {
        // This stage receives the output of the $group stage, where the
        // (originally) non-unique 'id' field is now the _id field.
        // Rename the fields back to their original names.
        $project: {
            _id : '$originalId', // Restore the original _id.
            id  : '$_id',        // Restore the original id.
            val : '$val',
            name: '$name',
            ttm : '$ttm'
        }
    }
])

This is very fast: about 90 ms on my test DB of 100,000 documents.

Example:

db.test.find()
// { "_id" : ObjectId("55fb595b241fee91ac4cd881"), "id" : 1, "name" : "x", "ttm" : 23, "val" : 5 }
// { "_id" : ObjectId("55fb596d241fee91ac4cd882"), "id" : 1, "name" : "x", "ttm" : 34, "val" : 1 }
// { "_id" : ObjectId("55fb59c8241fee91ac4cd883"), "id" : 1, "name" : "x", "ttm" : 24, "val" : 2 }
// { "_id" : ObjectId("55fb59d9241fee91ac4cd884"), "id" : 2, "name" : "x", "ttm" : 56, "val" : 3 }
// { "_id" : ObjectId("55fb59e7241fee91ac4cd885"), "id" : 2, "name" : "x", "ttm" : 76, "val" : 3 }
// { "_id" : ObjectId("55fb59f9241fee91ac4cd886"), "id" : 3, "name" : "x", "ttm" : 54, "val" : 7 }


db.test.aggregate(/* from first code snippet */)

// output
{
    "result" : [
        {
            "_id" : ObjectId("55fb59f9241fee91ac4cd886"),
            "val" : 7,
            "name" : "x",
            "ttm" : 54,
            "id" : 3
        },
        {
            "_id" : ObjectId("55fb59d9241fee91ac4cd884"),
            "val" : 3,
            "name" : "x",
            "ttm" : 56,
            "id" : 2
        },
        {
            "_id" : ObjectId("55fb595b241fee91ac4cd881"),
            "val" : 5,
            "name" : "x",
            "ttm" : 23,
            "id" : 1
        }
    ],
    "ok" : 1
}

PROS: Almost certainly the fastest method.

CONS: Involves use of the complicated Aggregation API. Also, it is tightly coupled to the original schema of the document. Though, it may be possible to generalize this.
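
One way the pipeline above might be generalized is to keep the whole document in each group rather than naming every field. The sketch below is not part of the original answer; it assumes a MongoDB version that supports the `$$ROOT` variable and the `$replaceRoot` stage, and `latestPerId` is a hypothetical variable name. It also sorts by `ttm` descending first, so that `$first` keeps the highest `ttm` per `id`, which is what the question's `sort` suggests is wanted.

```javascript
// Sketch only: assumes the mongo shell (or mongosh) and the `test` collection
// from the example above. `latestPerId` is a hypothetical name.
const latestPerId = [
    // Sort first so that $first sees the highest ttm within each group.
    { $sort: { ttm: -1 } },
    // Keep the whole first document of each group instead of listing fields.
    { $group: { _id: '$id', doc: { $first: '$$ROOT' } } },
    // Promote the saved document back to the top level.
    { $replaceRoot: { newRoot: '$doc' } }
];

// Only runs where a `db` handle exists, i.e. inside the shell.
if (typeof db !== 'undefined') {
    db.test.aggregate(latestPerId).forEach(printjson);
}
```

With the six sample documents this should keep ttm 34 for id 1, ttm 76 for id 2 and ttm 54 for id 3. The order in which the groups come back is not guaranteed, so add a final `$sort` stage if the order matters.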

寻找一个思念的角度 2024-10-25 14:57:21

I believe you can use aggregate like this:

collection.aggregate([{
    $group : {
        "_id" : "$id",
        "docs" : {
            $first : {
                "name" : "$name",
                "ttm" : "$ttm",
                "val" : "$val"
            }
        }
    }
}]);
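
Note that `$first` simply keeps whichever document reaches each group first, so the result depends on the order of the input. The plain JavaScript sketch below emulates that behaviour on the question's sample data, without a MongoDB server; `firstPerId` is a hypothetical helper and not a MongoDB API.

```javascript
// Plain JavaScript emulation; no MongoDB required.
// `firstPerId` is a hypothetical helper, not a MongoDB API.
const docs = [
    { id: 1, name: 'x', ttm: 23, val: 5 },
    { id: 1, name: 'x', ttm: 34, val: 1 },
    { id: 1, name: 'x', ttm: 24, val: 2 },
    { id: 2, name: 'x', ttm: 56, val: 3 },
    { id: 2, name: 'x', ttm: 76, val: 3 },
    { id: 3, name: 'x', ttm: 54, val: 7 }
];

// Keep the first document seen for each id, like $group with $first.
function firstPerId(input) {
    const seen = new Map();
    for (const doc of input) {
        if (!seen.has(doc.id)) {
            seen.set(doc.id, doc);
        }
    }
    return Array.from(seen.values());
}

// Without sorting, the pick follows input order.
const unsorted = firstPerId(docs).map(d => d.ttm);

// Sorting by ttm descending first makes the pick the highest ttm per id.
const byTtmDesc = docs.slice().sort((a, b) => b.ttm - a.ttm);
const sorted = firstPerId(byTtmDesc).map(d => d.ttm);

console.log(unsorted); // [ 23, 56, 54 ]
console.log(sorted);   // [ 76, 54, 34 ]
```

In a real pipeline, the equivalent of the second case is a `$sort` stage on `ttm` placed before the `$group` stage.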
所谓喜欢 2024-10-25 14:57:21

The issue is that you want to distill 3 matching records down to one without providing any logic in the query for how to choose between the matching results.

Your options are basically to specify aggregation logic of some kind (select the max or min value for each column, for example), or to run a select distinct query and only select the fields that you wish to be distinct.

querymongo.com does a good job of translating these distinct queries for you (from SQL to MongoDB).

For example, this SQL:

SELECT DISTINCT columnA FROM collection WHERE columnA > 5

is translated to this MongoDB command:

db.runCommand({
    "distinct": "collection",
    "query": {
        "columnA": {
            "$gt": 5
        }
    },
    "key": "columnA"
});
渔村楼浪 2024-10-25 14:57:21

If you want to write the distinct results to a file using JavaScript, this is how you do it:

var cursor = db.myColl.find({'fieldName': 'fieldValue'});

// ids that have already been printed
var Arr = [];

cursor.forEach(function (x) {
    var temp = x.id;
    // Print each id only the first time it is seen.
    if (Arr.indexOf(temp) === -1) {
        printjson(temp);
        Arr.push(temp);
    }
});
风情万种。 2024-10-25 14:57:21

Specify Query with distinct.
The following example returns the distinct values for the field sku, embedded in the item field, from the documents whose dept is equal to "A":

db.inventory.distinct( "item.sku", { dept: "A" } )

Reference: https://docs.mongodb.com/manual/reference/method/db.collection.distinct/

梦途 2024-10-25 14:57:20

MongoDB has a distinct command that can be used in conjunction with a query. However, I believe it only returns the list of distinct values for the specific key you name (in your case, you'd only get the id values back), so I'm not sure it will give you exactly what you want. If you need the whole documents, you may require MapReduce instead.

Documentation on distinct:
http://www.mongodb.org/display/DOCS/Aggregation#Aggregation-Distinct
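
To illustrate the point about only getting the id values back, here is a minimal sketch of distinct applied to the question's collection. It assumes the mongo shell's `db.collection.distinct(field, query)` method and the collection name `foo` from the question; `distinctIds` is a hypothetical helper.

```javascript
// Sketch: `distinctIds` is a hypothetical helper; `coll` is assumed to be a
// mongo shell collection such as db.foo.
function distinctIds(coll) {
    // Same filter as the question's find(); only the values of "id" come back,
    // not the documents themselves.
    return coll.distinct('id', { id: { $in: [1, 2, 3] } });
}

// Only runs where a `db` handle exists, i.e. inside the shell.
if (typeof db !== 'undefined') {
    printjson(distinctIds(db.foo));
}
```

For the sample data this should return just the values 1, 2 and 3, which is why it does not help when the other fields (name, ttm, val) are needed.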
