Where is the MapReduce output parameter used?
This is a code example from this tutorial:
http://kylebanker.com/blog/2009/12/mongodb-map-reduce-basics/
He notes that "as of MongoDB v1.8, you must specify an output collection name." But I don't see where this is referred to or why it is needed.
# Running map-reduce from Ruby (irb), assuming
# that @comments references the comments collection

# Specify the map and reduce functions in JavaScript, as strings
>> map = "function() { emit(this.author, {votes: this.votes}); }"
>> reduce = "function(key, values) { " +
     "var sum = 0; " +
     "values.forEach(function(doc) { " +
     " sum += doc.votes; " +
     "}); " +
     "return {votes: sum}; " +
   "};"

# Pass those to the map_reduce helper method
>> @results = @comments.map_reduce(map, reduce, :out => "mr_results")

# Since this method returns an instantiated results collection,
# we just have to query that collection and iterate over the cursor.
>> @results.find().to_a
=> [{"_id" => "hwaet", "value"=>{"votes"=>21.0}},
    {"_id" => "kbanker", "value"=>{"votes"=>13.0}}
   ]
The new Map/Reduce output options are documented here.
The basic premise is that Map/Reduce originally wrote its output only to a temporary collection. That caused problems (why do all of that work just to have the result be temporary?), and features were later added for merging into, and re-reducing against, an existing output collection.
In particular, you can now run an M/R that effectively updates the output of a previous M/R (think of updating daily stats once an hour while processing only the last hour of data).
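To make the "update a previous M/R" idea concrete, here is a rough pure-Ruby simulation of re-reducing new results into existing output. It does not talk to MongoDB at all: `run_map_reduce`, `MAP`, `REDUCE`, and the sample documents are hypothetical stand-ins, and the `mr_results` Hash only plays the role of the output collection from the question.

```ruby
# Pure-Ruby simulation of map/reduce with a "reduce"-style output mode.
# Nothing here uses the MongoDB driver; all names are illustrative only.

# Mirrors the JavaScript map function: emit(this.author, {votes: this.votes}).
MAP = lambda { |doc| [doc[:author], { votes: doc[:votes] }] }

# Mirrors the JavaScript reduce function: sum the votes for one key.
# It returns the same shape it consumes, so its output can be reduced again.
REDUCE = lambda { |_key, values| { votes: values.sum { |v| v[:votes] } } }

# Runs map and reduce over `docs`, then folds the results into `out`
# (a Hash standing in for the output collection). A key that already
# exists in `out` is re-reduced together with the fresh value.
def run_map_reduce(docs, out)
  grouped = Hash.new { |hash, key| hash[key] = [] }
  docs.each do |doc|
    key, value = MAP.call(doc)
    grouped[key] << value
  end

  grouped.each do |key, values|
    fresh = REDUCE.call(key, values)
    out[key] = out.key?(key) ? REDUCE.call(key, [out[key], fresh]) : fresh
  end
  out
end

mr_results  = {}
first_hour  = [{ author: "hwaet", votes: 10 }, { author: "kbanker", votes: 13 }]
second_hour = [{ author: "hwaet", votes: 11 }]

run_map_reduce(first_hour, mr_results)   # initial run over the first batch
run_map_reduce(second_hour, mr_results)  # later run sees only the new documents
```

The second call never looks at the first batch again; it only combines the stored totals with the new ones. That works because the reduce function accepts its own output as input.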
However, if you only want an in-memory version of the results, you can use the inline option.
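As a quick summary, these are the shapes the `out` value can take at the MongoDB command level, written as Ruby hashes. `OUT_OPTIONS` is just a hypothetical constant for illustration, and the commented-out call assumes the `@comments`, `map`, and `reduce` names from the question. Exactly how your driver version returns inline results is something to check in its documentation.

```ruby
# Shapes of the `out` value accepted by MongoDB's mapReduce command.
# "mr_results" is the collection name used in the question.
OUT_OPTIONS = {
  # A plain collection name: replace that collection's contents.
  :replace => "mr_results",
  # Merge new results in; an existing key is overwritten by the new value.
  :merge   => { :merge => "mr_results" },
  # Re-run the reduce function over the existing and new values for a key.
  :reduce  => { :reduce => "mr_results" },
  # Write no collection; results come back in the command result itself.
  :inline  => { :inline => 1 }
}

# Hypothetical usage with the helper from the question (not executed here):
# @results = @comments.map_reduce(map, reduce, :out => OUT_OPTIONS[:reduce])
```

So in the question's code, the output collection name is the `:out => "mr_results"` argument in the `map_reduce` call.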