MongoDB map-reduce across 2 collections

Posted on 2024-12-13 13:15:48

Let's say we have a user collection and a post collection. In the post collection, the vote field stores user names as keys.

db.user.insert({name:'a', age:12});
db.user.insert({name:'b', age:12});
db.user.insert({name:'c', age:22});
db.user.insert({name:'d', age:22});

db.post.insert({Title:'Title1', vote:['a']});
db.post.insert({Title:'Title2', vote:['a','b']});
db.post.insert({Title:'Title3', vote:['a','b','c']});
db.post.insert({Title:'Title4', vote:['a','b','c','d']});

We would like to group by post.Title and count the votes on each post per voter age.

> {_id:'Title1', value:{ ages:[{age:12, Count:1},{age:22, Count:0}]} }
> {_id:'Title2', value:{ ages:[{age:12, Count:2},{age:22, Count:0}]} }
> {_id:'Title3', value:{ ages:[{age:12, Count:2},{age:22, Count:1}]} }
> {_id:'Title4', value:{ ages:[{age:12, Count:2},{age:22, Count:2}]} }

I have searched but could not find a way to access 2 collections in a MongoDB map-reduce.
Could this be achieved with a re-reduce?

I know it would be much simpler to embed the user document in the post, but that is not a nice approach because the real user document has many properties. If we embed only a simplified version of the user document, it limits the dimensions available for analysis.

{Title:'Title1', vote:[{name:'a', age:12}]}

Comments (2)

梦情居士 2024-12-20 13:15:49

MongoDB does not have a multi-collection Map / Reduce. MongoDB does not have any JOIN syntax and may not be very good for ad-hoc joins. You will need to denormalize this data in some way.

You have a few options:

Option #1: Embed the age with the vote.

{Title:'Title1', vote:[{name:'a', age:12}]}
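With the age embedded in each vote, the per-age count needs only the post document itself. Here is a minimal sketch in plain JavaScript, assuming the embedded shape shown above; `embeddedPost` and `agesForPost` are hypothetical names used for illustration, not part of MongoDB.

```javascript
// A post with the age embedded in each vote, as in Option #1.
const embeddedPost = {
  Title: 'Title3',
  vote: [
    { name: 'a', age: 12 },
    { name: 'b', age: 12 },
    { name: 'c', age: 22 },
  ],
};

// Hypothetical helper: tally votes per age from the embedded vote documents.
function agesForPost(post) {
  const counts = {};
  for (const v of post.vote) {
    counts[v.age] = (counts[v.age] || 0) + 1;
  }
  // Integer-like keys come back in ascending order.
  return Object.keys(counts).map((age) => ({ age: Number(age), Count: counts[age] }));
}

console.log(JSON.stringify({ _id: embeddedPost.Title, value: { ages: agesForPost(embeddedPost) } }));
// {"_id":"Title3","value":{"ages":[{"age":12,"Count":2},{"age":22,"Count":1}]}}
```

Note that an age with no votes on a post does not appear in this output, because the post alone does not know which other ages exist.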

Option #2: Keep a counter of the ages

{Title:'Title1', vote:['a', 'b'], age: { "12" : 2, "22" : 0 }}
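Keeping the counter means every vote has to update two things at once: the vote list and the matching age bucket. The sketch below shows that bookkeeping on a plain object. `addVote` and `countedPost` are hypothetical names, and ignoring duplicate votes is an assumption of this sketch rather than something the answer specifies.

```javascript
// Hypothetical helper: record a vote and keep the per-age counter in step.
function addVote(post, user) {
  // Assumption: a user may vote only once per post.
  if (post.vote.includes(user.name)) return post;
  post.vote.push(user.name);
  const key = String(user.age); // counter keys are strings, as in the example
  post.age[key] = (post.age[key] || 0) + 1;
  return post;
}

const countedPost = { Title: 'Title3', vote: [], age: {} };
addVote(countedPost, { name: 'a', age: 12 });
addVote(countedPost, { name: 'b', age: 12 });
addVote(countedPost, { name: 'c', age: 22 });
console.log(JSON.stringify(countedPost));
// {"Title":"Title3","vote":["a","b","c"],"age":{"12":2,"22":1}}
```

On a real collection the same step would presumably be a single update that combines `$push` on `vote` with `$inc` on a field such as `age.12`, so that the list and the counter cannot drift apart.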

Option #3: Do a "manual" join

Your last option is to write a script that loops over both collections and merges the data correctly.

So you would loop over post and output a new collection holding each title and its list of votes. Then you would loop over that new collection and fill in the ages by looking up each user.
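The manual join described above can be sketched in plain JavaScript. This is only an illustration under stated assumptions: the two arrays stand in for the user and post collections from the question (in the shell they would come from something like `db.user.find()` and `db.post.find()`), and `countVotesByAge` is a hypothetical helper name.

```javascript
// In-memory stand-ins for the two collections in the question.
const users = [
  { name: 'a', age: 12 },
  { name: 'b', age: 12 },
  { name: 'c', age: 22 },
  { name: 'd', age: 22 },
];
const posts = [
  { Title: 'Title1', vote: ['a'] },
  { Title: 'Title2', vote: ['a', 'b'] },
  { Title: 'Title3', vote: ['a', 'b', 'c'] },
  { Title: 'Title4', vote: ['a', 'b', 'c', 'd'] },
];

// Hypothetical helper: join posts to users by name and count votes per age.
function countVotesByAge(posts, users) {
  // Build a name -> age lookup once instead of searching users per vote.
  const ageByName = new Map(users.map((u) => [u.name, u.age]));
  // Every distinct age, sorted, so ages with zero votes still appear.
  const allAges = [...new Set(users.map((u) => u.age))].sort((x, y) => x - y);

  return posts.map((post) => {
    const counts = new Map(allAges.map((age) => [age, 0]));
    for (const name of post.vote) {
      const age = ageByName.get(name);
      // Votes from unknown users are skipped.
      if (age !== undefined) counts.set(age, counts.get(age) + 1);
    }
    return {
      _id: post.Title,
      value: { ages: allAges.map((age) => ({ age, Count: counts.get(age) })) },
    };
  });
}

const joined = countVotesByAge(posts, users);
console.log(JSON.stringify(joined[2]));
// {"_id":"Title3","value":{"ages":[{"age":12,"Count":2},{"age":22,"Count":1}]}}
```

Building the lookup first keeps the join to one pass over each collection. Because the full set of ages comes from the user side, zero counts such as `{age:22, Count:0}` for Title1 show up as in the question's expected output.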

My suggestion

Go with #1 or #2.

小糖芽 2024-12-20 13:15:49

Instead of

{name:'a', age:12}

It is easier to add a new field to the user document and maintain it on each vote update. You can then use map-reduce on the user collection alone to analyze your data.

{name:'a', age:12, voteTitle:["Title1","Title2","Title3","Title4"]}
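Once each user carries a voteTitle list, a map-reduce over the user collection alone can produce the per-title age counts. The sketch below emulates that in plain JavaScript. `userDocs`, `mapUser`, `reduceCounts` and `runMapReduce` are hypothetical names, and the small driver only stands in for the database's own machinery. In MongoDB's mapReduce the map function reads the document from `this` and calls a global `emit`; here both are passed explicitly so the code runs anywhere.

```javascript
// User documents carrying the titles each user voted for.
const userDocs = [
  { name: 'a', age: 12, voteTitle: ['Title1', 'Title2', 'Title3', 'Title4'] },
  { name: 'b', age: 12, voteTitle: ['Title2', 'Title3', 'Title4'] },
  { name: 'c', age: 22, voteTitle: ['Title3', 'Title4'] },
  { name: 'd', age: 22, voteTitle: ['Title4'] },
];

// Map step: emit one (title, {age: 1}) pair per vote.
function mapUser(user, emit) {
  for (const title of user.voteTitle) {
    emit(title, { [user.age]: 1 });
  }
}

// Reduce step: merge partial per-age counts. The output has the same shape
// as the input values, so it can safely be reduced again.
function reduceCounts(key, values) {
  const total = {};
  for (const partial of values) {
    for (const age of Object.keys(partial)) {
      total[age] = (total[age] || 0) + partial[age];
    }
  }
  return total;
}

// Tiny in-memory driver: group emitted values by key, then reduce each group.
function runMapReduce(docs) {
  const groups = new Map();
  for (const doc of docs) {
    mapUser(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  return [...groups].map(([key, values]) => ({ _id: key, value: reduceCounts(key, values) }));
}

const result = runMapReduce(userDocs);
console.log(JSON.stringify(result[3]));
// {"_id":"Title4","value":{"12":2,"22":2}}
```

The reduce output uses the same `{age: count}` shape as the emitted values, which is what makes re-reducing partial results safe.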