mongodb - retrieve array subset
What seemed a simple task turned out to be a challenge for me.
I have the following MongoDB structure:
{
(...)
"services": {
"TCP80": {
"data": [{
"status": 1,
"delay": 3.87,
"ts": 1308056460
},{
"status": 1,
"delay": 2.83,
"ts": 1308058080
},{
"status": 1,
"delay": 5.77,
"ts": 1308060720
}]
}
}}
Now, the following query returns the whole document:
{ 'services.TCP80.data.ts':{$gt:1308067020} }
I wonder: is it possible to receive only those "data" array entries that match the $gt criteria (a kind of reduced document)?
I was considering MapReduce, but could not find a single example of how to pass external arguments (the timestamp) to the Map() function. (This feature was added in 1.1.4: https://jira.mongodb.org/browse/SERVER-401)
There is also the alternative of writing a stored JS function, but since we are talking about large quantities of data, DB locks can't be tolerated here.
Most likely I'll have to redesign the structure to something 1-level deep, like:
{
status:1,delay:3.87,ts:1308056460,service:TCP80
},{
status:1,delay:2.83,ts:1308058080,service:TCP80
},{
status:1,delay:5.77,ts:1308060720,service:TCP80
}
but the DB will grow dramatically, since "service" is only one of many fields that would have to be added to each document.
Please advise!
Thanks in advance.
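For readers weighing the flat layout mentioned in the question, here is a minimal sketch of how the nested document could be converted into one document per measurement. It assumes every key under `services` holds a `data` array; `flattenServices` and the `measurements` collection name are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: turn one nested document into flat per-measurement
// documents of the form { status, delay, ts, service }.
function flattenServices(doc) {
  var flat = [];
  Object.keys(doc.services || {}).forEach(function (service) {
    var entries = doc.services[service].data || [];
    entries.forEach(function (entry) {
      flat.push({
        status: entry.status,
        delay: entry.delay,
        ts: entry.ts,
        service: service
      });
    });
  });
  return flat;
}

// The sample document from the question.
var nested = {
  services: {
    TCP80: {
      data: [
        { status: 1, delay: 3.87, ts: 1308056460 },
        { status: 1, delay: 2.83, ts: 1308058080 },
        { status: 1, delay: 5.77, ts: 1308060720 }
      ]
    }
  }
};

var flatDocs = flattenServices(nested);

// With one document per measurement, a plain query returns only the
// matching entries, e.g. in the mongo shell (collection name is assumed):
//   db.measurements.find({ service: 'TCP80', ts: { $gt: 1308058000 } })
```

The trade-off is the one the question already names: the service name is repeated in every flat document.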
Comments (4)
In version 2.1 with the aggregation framework you are now able to do this:
You can use custom criteria in line 2 to filter the parent documents. If you don't want to filter them, just leave line 2 out.
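The snippet this comment refers to is not preserved in the text, so the following is only a sketch of what such a pipeline might look like, not the commenter's original code. It assumes the document layout from the question; `buildPipeline` and the `hosts` collection name are hypothetical. The optional first `$match` stage plays the role of the parent filter described above.

```javascript
// Build an aggregation pipeline that keeps only the "data" entries newer
// than minTs. parentFilter is optional criteria on the parent documents.
function buildPipeline(minTs, parentFilter) {
  var pipeline = [];
  if (parentFilter) {
    // Optional: restrict which parent documents are processed at all.
    pipeline.push({ $match: parentFilter });
  }
  // Produce one output document per element of the array.
  pipeline.push({ $unwind: '$services.TCP80.data' });
  // Keep only the elements that satisfy the timestamp condition.
  pipeline.push({ $match: { 'services.TCP80.data.ts': { $gt: minTs } } });
  // Collect the surviving elements back into one array per parent.
  pipeline.push({
    $group: { _id: '$_id', data: { $push: '$services.TCP80.data' } }
  });
  return pipeline;
}

var pipeline = buildPipeline(1308058000);

// In the mongo shell (collection name "hosts" is assumed):
//   db.hosts.aggregate(pipeline)
```

For the sample document in the question, a threshold of 1308058000 would keep the last two entries, since only their `ts` values are greater than it.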
This is not currently supported. By default you will always receive the whole document/array unless you use field restrictions or the $slice operator. Currently these tools do not allow filtering the array elements based on the search criteria.
You should watch this request for a way to do this: https://jira.mongodb.org/browse/SERVER-828
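To illustrate the limitation described above: a `$slice` projection selects array elements by position, not by the query condition. The sketch below shows two projection documents and a small local helper that mimics what `$slice` selects; `applySlice` is purely illustrative and not a MongoDB API, and the `hosts` collection name is assumed.

```javascript
// Projection documents, as they would be passed as the second argument
// to find().
var lastTwo = { 'services.TCP80.data': { $slice: -2 } };           // last 2 elements
var skipOneTakeTwo = { 'services.TCP80.data': { $slice: [1, 2] } }; // skip 1, take 2

// Local illustration only (not a MongoDB API): which elements a $slice
// specification selects from an array.
function applySlice(array, spec) {
  if (Array.isArray(spec)) {
    var skip = spec[0];
    var limit = spec[1];
    var start = skip < 0 ? Math.max(array.length + skip, 0) : skip;
    return array.slice(start, start + limit);
  }
  return spec < 0 ? array.slice(spec) : array.slice(0, spec);
}

var tsValues = [1308056460, 1308058080, 1308060720];
var picked = applySlice(tsValues, -2);

// In the mongo shell (collection name "hosts" is assumed):
//   db.hosts.find({ 'services.TCP80.data.ts': { $gt: 1308058000 } }, lastTwo)
```

The projection always returns the last two entries of a matching document, whether or not those particular entries satisfy the `$gt` condition.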
I'm attempting to do something similar. I tried your suggestion of using the GROUP function, but I couldn't keep the embedded documents separate or was doing something incorrectly.
I needed to pull/get a subset of embedded documents by ID. Here's how I did it using Map/Reduce:
I've abstracted my collection name to 'parent' and its embedded documents to 'children'. I pass in two parameters: the parent document ID and an array of the embedded document IDs that I want to retrieve from the parent. Those parameters are passed in as the third parameter to the mapReduce function.
In the map function I find the parent document in the collection (which I'm pretty sure uses the _id index) and emit its id and children to the reduce function.
In the reduce function, I take the passed in document and loop through each of the children, collecting the ones with the desired ID. Looping through all the children is not ideal, but I don't know of another way to find by ID on an embedded document.
I also assume in the reduce function that there is only one document emitted, since I'm searching by ID. If you expect more than one parent_id to match, then you will have to loop through the values array in the reduce function.
I hope this helps someone out there, as I googled everywhere with no results. Hopefully we'll see a built-in feature from MongoDB soon, but until then I have to use this.
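The code for this comment is not preserved in the text, so what follows is a sketch of the general idea rather than the commenter's implementation. It departs from the description in two labeled ways: the parent is selected with the `query` option instead of inside the map function, and the filtering happens in map, because MongoDB does not call the reduce function for a key that has only a single value. The field name `id` on each child, the plain (non-ObjectId) child IDs, and the `runLocally` harness are assumptions made for illustration.

```javascript
// map: runs once per selected parent; `this` is the parent document.
// childIds is not defined here: MongoDB injects it as a global through
// the "scope" option of mapReduce.
function mapFn() {
  var wanted = this.children.filter(function (child) {
    return childIds.indexOf(child.id) !== -1;
  });
  emit(this._id, { children: wanted });
}

// reduce: only invoked when one key received several values; merges them.
function reduceFn(key, values) {
  var merged = [];
  values.forEach(function (value) {
    merged = merged.concat(value.children);
  });
  return { children: merged };
}

// Mongo shell usage (collection and variable names are assumed):
//   db.parent.mapReduce(mapFn, reduceFn, {
//     query: { _id: parentId },     // select the parent document
//     scope: { childIds: [2, 3] },  // external arguments visible in map
//     out: { inline: 1 }
//   });

// Tiny local harness, for illustration only: emulates scope and emit so
// the two functions can be tried without a database.
function runLocally(docs, scope) {
  var emitted = {};
  Object.keys(scope).forEach(function (name) {
    globalThis[name] = scope[name];
  });
  globalThis.emit = function (key, value) {
    (emitted[key] = emitted[key] || []).push(value);
  };
  docs.forEach(function (doc) { mapFn.call(doc); });
  var results = {};
  Object.keys(emitted).forEach(function (key) {
    var values = emitted[key];
    // Like MongoDB, skip reduce when a key has a single value.
    results[key] = values.length > 1 ? reduceFn(key, values) : values[0];
  });
  return results;
}

var result = runLocally(
  [{ _id: 'p1', children: [{ id: 1 }, { id: 2 }, { id: 3 }] }],
  { childIds: [2, 3] }
);
```

The same `scope` mechanism is one way to pass a timestamp threshold into map, which is what the original question asked about.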
Fadi, as for "keeping embedded documents separate" - group should handle this with no issues