Hadoop: implement the interfaces from o.a.h.mapred, or extend the classes from o.a.h.mapreduce?
I'm learning Hadoop (0.20.205) and I'm a little bit confused. Which way is recommended:
A) Implement the Mapper and Reducer interfaces from org.apache.hadoop.mapred, and configure the job using JobConf, as in the PiEstimator example.
B) Extend the Mapper and Reducer classes from org.apache.hadoop.mapreduce, and configure the job using Job, as in the WordCount example.
Which one is more likely to become obsolete in the future?
HBase (0.90.4) seems to prefer the second way, since TableOutputFormat in o.a.h.h.mapred is deprecated, and TableOutputFormat in o.a.h.h.mapreduce is not. On the other hand, useful classes like IdentityMapper or IdentityReducer seem to exist only in o.a.h.mapred. Overall, I'm leaning towards version B.
Which way would you choose, and why? Thanks in advance.
Comments (1)
o.a.h.mapred is the old MR API and o.a.h.mapreduce is the new API. There is not much difference functionality-wise, but the new API is easier to maintain. Please see my response on StackOverflow here.
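To make the "easier to maintain" point a bit more concrete, here is a rough sketch of the structural difference between the two styles. It is plain Java with no Hadoop dependency: Collector, Context, OldStyleMapper, NewStyleMapper and DoublingMapper are simplified stand-ins made up for illustration, not the real Hadoop types. The idea, as I understand it, is that an interface-based API forces every implementer to change when a method is added, while a class-based API can grow new hooks with default bodies. As far as I know the base Mapper in o.a.h.mapreduce also passes each record through unchanged by default, which would explain why there is no separate IdentityMapper there.

```java
import java.util.ArrayList;
import java.util.List;

public class ApiStyleSketch {

    // ---- Old style, loosely modeled on o.a.h.mapred (stand-in types) ----

    // Stand-in for the output collector handed to map(); not a Hadoop class.
    interface Collector<K, V> {
        void collect(K key, V value);
    }

    // The mapper is an interface: every implementer must supply map(), and
    // adding a method here later would break all existing implementations.
    interface OldStyleMapper<KI, VI, KO, VO> {
        void map(KI key, VI value, Collector<KO, VO> out);
    }

    // Pass-through behaviour needs a dedicated class (compare IdentityMapper).
    static class OldStyleIdentityMapper<K, V> implements OldStyleMapper<K, V, K, V> {
        @Override
        public void map(K key, V value, Collector<K, V> out) {
            out.collect(key, value);
        }
    }

    // ---- New style, loosely modeled on o.a.h.mapreduce (stand-in types) ----

    // A single context object stands in for output, status and configuration.
    static class Context<K, V> {
        final List<String> output = new ArrayList<String>();

        void write(K key, V value) {
            output.add(key + "=" + value);
        }
    }

    // The mapper is a class: hooks such as setup() can be added with empty
    // default bodies, and the default map() already acts as an identity mapper.
    static class NewStyleMapper<KI, VI, KO, VO> {
        protected void setup(Context<KO, VO> context) { }

        @SuppressWarnings("unchecked")
        protected void map(KI key, VI value, Context<KO, VO> context) {
            context.write((KO) key, (VO) value);
        }
    }

    // A subclass overrides only what it needs.
    static class DoublingMapper extends NewStyleMapper<String, Integer, String, Integer> {
        @Override
        protected void map(String key, Integer value, Context<String, Integer> context) {
            context.write(key, value * 2);
        }
    }

    // Feeds one record ("a", 1) through the old-style identity mapper.
    static List<String> runOldStyle() {
        final List<String> out = new ArrayList<String>();
        OldStyleMapper<String, Integer, String, Integer> mapper =
                new OldStyleIdentityMapper<String, Integer>();
        mapper.map("a", 1, new Collector<String, Integer>() {
            @Override
            public void collect(String key, Integer value) {
                out.add(key + "=" + value);
            }
        });
        return out;
    }

    // Feeds one record ("a", 1) through any new-style mapper.
    static List<String> runNewStyle(NewStyleMapper<String, Integer, String, Integer> mapper) {
        Context<String, Integer> context = new Context<String, Integer>();
        mapper.setup(context);
        mapper.map("a", 1, context);
        return context.output;
    }

    public static void main(String[] args) {
        System.out.println(runOldStyle());
        System.out.println(runNewStyle(new NewStyleMapper<String, Integer, String, Integer>()));
        System.out.println(runNewStyle(new DoublingMapper()));
    }
}
```

In this sketch the unmodified NewStyleMapper gives the same result as the dedicated old-style identity class. With the real API, that would mean option B does not lose anything by lacking IdentityMapper and IdentityReducer, though it is worth checking the Javadoc for the exact Hadoop version in use.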