Hadoop JobConf class is deprecated, need updated example
I am writing Hadoop programs, and I really don't want to use deprecated classes. I can't find any programs online that use the updated org.apache.hadoop.conf.Configuration class instead of the org.apache.hadoop.mapred.JobConf class.
// Driver written against the old org.apache.hadoop.mapred API
public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(Test.class);
    conf.setJobName("TESST");

    // Types of the keys and values the job emits
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);

    // Map, combine, and reduce implementations
    conf.setMapperClass(Map.class);
    conf.setCombinerClass(Reduce.class);
    conf.setReducerClass(Reduce.class);

    // Plain-text input and output
    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);

    // Input and output paths come from the command line
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    JobClient.runJob(conf);
}
This is what my main() looks like.
Could anyone please provide an updated version of this function?
Comments (2)
Here is the classic WordCount example. You'll notice a ton of other imports that may not be necessary; reading the code, you'll figure out which is which.
What's different? I'm using the Tool interface and the GenericOptionsParser to parse the job command, i.e. hadoop jar ....
In the mapper you'll notice a run method. You can get rid of it; it is called by default when you supply the code for the map method. I put it there to show that you can further control the mapping stage. This is all using the new API. I hope you find it useful. Any other questions, let me know!
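The listing this answer refers to is not part of the text above. As a rough sketch of the approach it describes, assuming Hadoop 2.x or later on the classpath, a Tool-based WordCount might look like the following. The class names WordCountTool, TokenMapper, and SumReducer are hypothetical, and the overridden run method only approximates what the default Mapper.run does.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical driver name; ToolRunner parses generic options (-D, -files, ...)
// via GenericOptionsParser before calling run(String[]).
public class WordCountTool extends Configured implements Tool {

    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Emit (token, 1) for every whitespace-separated token in the line
            StringTokenizer tokens = new StringTokenizer(value.toString());
            while (tokens.hasMoreTokens()) {
                word.set(tokens.nextToken());
                context.write(word, ONE);
            }
        }

        // Optional override, shown only to illustrate the hook: it does roughly
        // what the inherited run() already does, so it can be deleted.
        @Override
        public void run(Context context) throws IOException, InterruptedException {
            setup(context);
            try {
                while (context.nextKeyValue()) {
                    map(context.getCurrentKey(), context.getCurrentValue(), context);
                }
            } finally {
                cleanup(context);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable total = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Add up the counts collected for this word
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            total.set(sum);
            context.write(key, total);
        }
    }

    @Override
    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: WordCountTool <input path> <output path>");
            return 2;
        }
        // getConf() returns the Configuration that ToolRunner populated
        Job job = Job.getInstance(getConf(), "wordcount");
        job.setJarByClass(WordCountTool.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new WordCountTool(), args));
    }
}
```

Because the driver goes through ToolRunner, generic options such as -D key=value can be passed on the hadoop jar command line ahead of the input and output paths without any extra parsing code in run.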
Also taking the classic WordCount as an example:
org.apache.hadoop.mapred.JobConf is old; in the new version we use Configuration and Job instead. Please use org.apache.hadoop.mapreduce.lib.* (the new API) instead of org.apache.hadoop.mapred.TextInputFormat (the old one). The Mapper and Reducer are nothing new; please see the main function. It includes a fairly complete configuration; feel free to change it according to your specific requirements.
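The main function this answer mentions is not included in the text above. A minimal sketch of how the question's main() could be ported to Configuration and Job, assuming Hadoop 2.x or later, is shown below. The class name WordCountDriver and the helper buildJob are hypothetical, and the library classes TokenCounterMapper and IntSumReducer stand in for the question's own Map and Reduce classes, which would need to extend the new-API Mapper and Reducer instead.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.map.TokenCounterMapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;

// Hypothetical driver name; every import comes from org.apache.hadoop.mapreduce,
// none from the old org.apache.hadoop.mapred package.
public class WordCountDriver {

    // Hypothetical helper that mirrors the question's JobConf setup line by line
    static Job buildJob(Configuration conf, String input, String output) throws Exception {
        Job job = Job.getInstance(conf, "wordcount");   // replaces new JobConf(...) + setJobName
        job.setJarByClass(WordCountDriver.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Stand-ins for the question's Map and Reduce classes
        job.setMapperClass(TokenCounterMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);

        // setInputFormat/setOutputFormat become setInputFormatClass/setOutputFormatClass
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        // The new-API format classes take the Job instead of the JobConf
        FileInputFormat.setInputPaths(job, new Path(input));
        FileOutputFormat.setOutputPath(job, new Path(output));
        return job;
    }

    public static void main(String[] args) throws Exception {
        Job job = buildJob(new Configuration(), args[0], args[1]);
        // Replaces JobClient.runJob(conf): submit and block until the job finishes
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Building the Job in a separate method is only a design choice for this sketch; it keeps the configuration inspectable without submitting anything to a cluster.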