In MongoDB, how do I query a collection of stories for only the stories whose voters array contains given usernames?

Posted on 2024-12-20 19:57:34

This question is largely a sanity check. I've organized my database into a collection of stories and a collection of users. Each story has an array of 'voters' who have voted on it. Each user also has an array of 'friends'. I want to search for only the stories that my friends have voted on, and also to sort those stories by the number of friends who voted on each one.

My initial thinking is this: index the voters field in the Story documents. Then run a map-reduce query over just the stories matched on this indexed voters field, using the 'friends' array from the user document, with a grouping function that counts how many times each story shows up. I'm not sure this is correct, and I'm also not sure it would scale. Thoughts and suggestions appreciated.
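
The counting step described above can be sketched in plain JavaScript over in-memory arrays. This is only an illustration of the map and reduce logic, not a real MongoDB map-reduce job. The sample data, the `_id` values, and the helper names (`mapStory`, `reduceCounts`, `countFriendVotes`) are assumptions; only the `voters` and `friends` arrays come from the question.

```javascript
// Hypothetical sample data; only the field names `voters` and `friends` come from the question.
const stories = [
  { _id: "s2", voters: ["bob", "eve"] },
  { _id: "s1", voters: ["ann", "bob", "cat"] },
  { _id: "s3", voters: ["eve"] },
];
const user = { name: "me", friends: ["ann", "bob", "dan"] };

// Map step: emit one [storyId, 1] pair for every voter who is a friend.
function mapStory(story, friendSet) {
  return story.voters
    .filter((voter) => friendSet.has(voter))
    .map(() => [story._id, 1]);
}

// Reduce step: sum the emitted values for one story.
function reduceCounts(values) {
  return values.reduce((sum, v) => sum + v, 0);
}

// Hypothetical helper: stories with at least one friend vote,
// sorted by friend votes in descending order.
function countFriendVotes(allStories, friends) {
  const friendSet = new Set(friends);
  const grouped = new Map();
  for (const story of allStories) {
    for (const [storyId, one] of mapStory(story, friendSet)) {
      if (!grouped.has(storyId)) grouped.set(storyId, []);
      grouped.get(storyId).push(one);
    }
  }
  return [...grouped.entries()]
    .map(([storyId, values]) => ({ storyId, friendVotes: reduceCounts(values) }))
    .sort((a, b) => b.friendVotes - a.friendVotes);
}

const ranked = countFriendVotes(stories, user.friends);
console.log(ranked);
```

Stories without any friend vote never emit a pair, so they drop out of the result without a separate filter.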

Comments (1)

东北女汉子 2024-12-27 19:57:34

I think you should use a background worker that runs your M/R query periodically and stores the results in a collection that you can then query very easily, e.g.

TopStories {
  "UserId" : ObjectId("..."),
  "List" : [
    { "TotalVotes" : 200,
      "FriendVotes" : 28,
      "StoryName" : "test",
      "StoryId" : ObjectId("...")
    },
    { /* etc. */ }
  ]
}

This is trivial to query, but not very flexible. A more flexible structure, avoiding an embedded list:

TopStory { 
   "UserId": ObjectId("..."),
   "StoryId" : ObjectId("..."),
   "StoryName" : "foo",
   "FriendVotes" : 28,
   "TotalVotes" : 200
   // etc.
}

The latter can be used to sort by the number of total votes as well, for example.
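
As a rough illustration of why the flat shape is more flexible, here is a small sketch that sorts the same rows by either counter. It runs over an in-memory array, not a real collection. The sample rows, the string IDs standing in for `ObjectId` values, and the helper name `topStoriesFor` are all assumptions.

```javascript
// Hypothetical precomputed rows in the flat TopStory shape;
// IDs are plain strings here instead of ObjectId values.
const topStoryRows = [
  { UserId: "u1", StoryId: "s1", StoryName: "foo", FriendVotes: 28, TotalVotes: 200 },
  { UserId: "u1", StoryId: "s2", StoryName: "bar", FriendVotes: 3, TotalVotes: 950 },
  { UserId: "u2", StoryId: "s1", StoryName: "foo", FriendVotes: 5, TotalVotes: 200 },
];

// Hypothetical helper: rows for one user, sorted in descending order
// by the chosen numeric field. filter() returns a new array, so the
// in-place sort does not reorder the source rows.
function topStoriesFor(rows, userId, sortField) {
  return rows
    .filter((row) => row.UserId === userId)
    .sort((a, b) => b[sortField] - a[sortField]);
}

const byFriends = topStoriesFor(topStoryRows, "u1", "FriendVotes").map((r) => r.StoryId);
const byTotal = topStoriesFor(topStoryRows, "u1", "TotalVotes").map((r) => r.StoryId);
console.log(byFriends, byTotal);
```

With one row per (user, story) pair, changing the sort order is just a different sort key. With the embedded `List` variant, the array would have to be re-sorted or unwound first.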

M/R used to be 'the big hammer', which should not be run in real time from a web frontend or anything similar. There were plans to improve this, but I don't know the current state of that, so I'd play it safe. I also believe this M/R job won't be very fast once your collections grow large: expect it to run on the order of tens of seconds, if not minutes, rather than milliseconds.
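
The split between a slow periodic job and a cheap read path could look roughly like the sketch below. Everything here is an in-memory stand-in: the arrays replace the real collections, a `Map` replaces the results collection, and the names `refreshTopStories`, `startWorker`, and `readTopStories` are hypothetical. The counting loop is a plain substitute for the actual M/R job.

```javascript
// In-memory stand-ins (assumptions): `stories` and `users` play the source
// collections, `topStoryCache` plays the precomputed results collection.
const stories = [
  { _id: "s1", name: "foo", voters: ["ann", "bob", "cat"] },
  { _id: "s2", name: "bar", voters: ["eve"] },
];
const users = [{ _id: "u1", friends: ["ann", "bob"] }];
const topStoryCache = new Map(); // key: UserId, value: array of TopStory-shaped rows

// Hypothetical job body: recompute every user's rows and swap them in per user.
function refreshTopStories() {
  for (const user of users) {
    const friendSet = new Set(user.friends);
    const rows = [];
    for (const story of stories) {
      const friendVotes = story.voters.filter((v) => friendSet.has(v)).length;
      if (friendVotes > 0) {
        rows.push({
          UserId: user._id,
          StoryId: story._id,
          StoryName: story.name,
          FriendVotes: friendVotes,
          TotalVotes: story.voters.length,
        });
      }
    }
    topStoryCache.set(user._id, rows);
  }
}

// Hypothetical scheduler: run once now, then every `intervalMs` milliseconds.
// The returned timer can be passed to clearInterval to stop the worker.
function startWorker(intervalMs) {
  refreshTopStories();
  return setInterval(refreshTopStories, intervalMs);
}

// The request path only reads precomputed rows; it never triggers the job.
function readTopStories(userId) {
  return topStoryCache.get(userId) || [];
}

refreshTopStories(); // one manual run, so the example prints without starting a timer
console.log(readTopStories("u1"));
```

Readers see results that are at most one interval old, which is the trade-off for keeping the expensive job off the request path.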
