Hadoop mapper emits a unique key. Can I run the reducer after each map?
My mapper emits a 'unique key' / 'very large value' pair.
My reducer doesn't know that the key is unique, so it waits until all the mappers have completed.
I tried using a combiner, but that is not an easy solution for me, because my reducer is very complicated.
My question: how can I run the reducer after each map, without using a combiner?
Comments (3)
If your keys are unique, there is no need to reduce them. Just copy the reducer code into the mapper and set the number of reducers to zero. By the way, many MapReduce jobs do not need a reduce step at all, so this is nothing unusual.
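A minimal sketch of that idea, assuming the job is run through Hadoop Streaming with a Python mapper and that each input line is a tab-separated key and value. `reduce_one` and `run_mapper` are hypothetical names, and the body of `reduce_one` is only a placeholder for the real reducer logic:

```python
import sys


def reduce_one(key, value):
    """Hypothetical stand-in for the logic that used to live in the reducer.

    Each key is unique, so the reducer only ever saw one value per key;
    the same work can therefore be done on a single (key, value) pair.
    """
    return key, str(len(value))  # placeholder computation, not the real logic


def run_mapper(lines, out):
    """Apply the former reducer logic to every record as the mapper reads it."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Assumption: records look like "key<TAB>value".
        key, _, value = line.partition("\t")
        out_key, out_value = reduce_one(key, value)
        out.write(f"{out_key}\t{out_value}\n")


if __name__ == "__main__":
    run_mapper(sys.stdin, sys.stdout)
```

When launching the streaming job, the reduce phase can be switched off with `-D mapreduce.job.reduces=0`; in the Java API the equivalent is `job.setNumReduceTasks(0)`. With zero reducers there is no shuffle or sort, and each mapper writes its output straight to the output directory, so no record waits for the other mappers to finish.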
If you know in advance that your keys are unique, you can move all the code from the reduce step into the map and do all the work there.
I don't understand your question. You can simply not specify a combiner in the Job configuration.