Mongo Map/Reduce first time
First-time Map/Reduce user here, using MongoDB. I have a lot of page visit data which I'd like to make some sense of by using Map/Reduce. Below is basically what I want to do, but as a total beginner at Map/Reduce, I think this is beyond my knowledge!
- Go through all the pages with visits in the last 30 days, and where external = true.
- Then, for each page, find all visits.
- Group all visits by referral location.
- For each referral location, calculate how many visitors then went on to visit a page which has a certain "type" and also has a certain word in its "tags".
The database and collection are organised as:
$mongo->dbname->visits
A sample document is:
{"url": "www.example.com", "type": "a", "refer": {"external": true, "domain": "twitter.com", "url": "http://www.twitter.com/page"}, "page": "1235", "user": "1232", "time": 1234567890}
And then I want to find documents of type B with a certain tag.
{"url": "www.example.com", "type": "b", "page": "745", "user": "1232", "time": 1234567890, "tags": ["a", "b", "c"]}
I'm using the normal Mongo PHP extension, in case that makes a difference.
OK, I've come up with something that I think may do what you want. Note that this may not work exactly, since I'm not 100% sure of your schema: your examples show refer available in type a but not in type b, and I'm not sure whether that's an omission, given that you want to group by referrer. Anyway, here's what I've come up with.
The map function:
The reduce function:
So basically, this is how it works. The map function uses a key of refer.url (which I guessed based on your description), so the end result will look like an array of entries with _id equal to refer.url (it groups by URL). It then creates an object that has two objects under it (types and tags). The reason for this structure is so that map and reduce can emit objects of the same format. Other than that, I think it should be relatively self-explanatory (if you don't understand, I can try to explain more).
So let's implement this in PHP (assuming that $map and $reduce are strings containing the functions above, left out here for terseness):
Note that I haven't tested this. It's just what I've come up with based on my understanding of your schema and of Mongo and its Map/Reduce implementation.
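For illustration, here is a minimal sketch of a map/reduce pair matching the description above: the key is refer.url and the value holds a types object and a tags object. It is not the original code. It assumes that tags is stored as an array of strings and that every document to be counted carries refer.url (the sample type b document does not, so documents like it are skipped). The runLocally helper is a hypothetical in-memory stand-in for trying the functions out in Node.js and is not part of MongoDB. The 30-day and refer.external conditions would normally go into the command's query option rather than into map.

```javascript
// Map: MongoDB calls this once per document, with `this` bound to the document.
// Documents without a referrer URL are skipped, as there is nothing to group by.
function map() {
  if (!this.refer || !this.refer.url) {
    return;
  }
  var value = { types: {}, tags: {} };
  value.types[this.type] = 1;
  (this.tags || []).forEach(function (tag) {
    value.tags[tag] = 1;
  });
  emit(this.refer.url, value);
}

// Reduce: merges partial counts for one key. The result has the same shape as
// the values emitted by map, because reduce output may be reduced again.
function reduce(key, values) {
  var result = { types: {}, tags: {} };
  values.forEach(function (value) {
    ["types", "tags"].forEach(function (field) {
      Object.keys(value[field]).forEach(function (name) {
        result[field][name] = (result[field][name] || 0) + value[field][name];
      });
    });
  });
  return result;
}

// Hypothetical local harness (an assumption, only for experimenting outside
// the database): groups emitted values by key, then reduces each group.
function runLocally(docs) {
  var groups = {};
  // map() expects a global emit(), as it would find inside MongoDB.
  globalThis.emit = function (key, value) {
    (groups[key] = groups[key] || []).push(value);
  };
  docs.forEach(function (doc) {
    map.call(doc);
  });
  return Object.keys(groups).map(function (key) {
    return { _id: key, value: reduce(key, groups[key]) };
  });
}

// Sample data modelled on the documents in the question.
var sample = [
  { type: "a", page: "1235", refer: { external: true, url: "http://www.twitter.com/page" } },
  { type: "b", page: "745", tags: ["a", "b"], refer: { external: true, url: "http://www.twitter.com/page" } },
  { type: "b", page: "746", tags: ["c"] } // no refer, so map skips it
];
console.log(JSON.stringify(runLocally(sample), null, 2));
```

Keeping the emitted value and the reduce result in the same format matters because the server may call reduce several times for one key, feeding earlier results back in as input.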
Map/Reduce is already implemented in the Doctrine MongoDB ODM:
http://www.doctrine-project.org/docs/mongodb_odm/1.0/en/reference/map-reduce.html