MapReduce job fails with "File file job.jar does not exist"
I tried to submit the job to YARN by running the main method directly from Java, but got the following error:
2018-08-26 10:25:37,544 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1375)) - Job job_1535213323614_0010 failed with state FAILED due to: Application application_1535213323614_0010 failed 2 times due to AM Container for appattempt_1535213323614_0010_000002 exited with exitCode: -1000 due to: File file:/tmp/hadoop-yarn/staging/nasuf/.staging/job_1535213323614_0010/job.jar does not exist
.Failing this attempt.. Failing the application.
There is also no log output for this job anywhere in the log directory under HADOOP_HOME.
The mapper code is as follows:
public class WCMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        String[] words = StringUtils.split(line, " ");
        for (String word : words) {
            context.write(new Text(word), new LongWritable(1));
        }
    }
}
The reducer code is as follows:
public class WCReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
            throws IOException, InterruptedException {
        long count = 0;
        for (LongWritable value : values) {
            count += value.get();
        }
        context.write(key, new LongWritable(count));
    }
}
The main method is as follows:
public class WCRunner {
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        conf.set("mapreduce.job.jar", "wc.jar");
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("yarn.resourcemanager.hostname", "hdcluster01");
        conf.set("yarn.nodemanager.aux-services", "mapreduce_shuffle");

        Job job = Job.getInstance(conf);

        // Specify which jar contains the classes this job uses
        job.setJarByClass(WCRunner.class);

        // The mapper and reducer classes used by this job
        job.setMapperClass(WCMapper.class);
        job.setReducerClass(WCReducer.class);

        // Output key/value types of the reducer (if the mapper's output types below were not
        // set, these would apply to both the mapper and the reducer)
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        // Output key/value types of the mapper
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);

        // Location of the input data
        FileInputFormat.setInputPaths(job, new Path("hdfs://hdcluster01:9000/wc/srcdata"));

        // Output path for the results
        FileOutputFormat.setOutputPath(job, new Path("hdfs://hdcluster01:9000/wc/output3"));

        // Submit the job to the cluster and wait for it to finish
        job.waitForCompletion(true);
    }
}
I am running the code locally on macOS under the username nasuf. The remote Hadoop deployment is pseudo-distributed, with HDFS and YARN on a single server, and it is owned by the user parallels.
I checked the path mentioned in the log, /tmp/hadoop-yarn/staging/nasuf/.staging/job_1535213323614_0010/job.jar, and it indeed does not exist; there is no hadoop-yarn directory under /tmp at all.
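For reference, the default filesystem that the submitting-side Configuration resolves to can be printed with a few lines like the following (a diagnostic sketch, not part of the original code; FsCheck is just a placeholder class name):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FsCheck {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // The value of fs.defaultFS seen by the client; it is normally supplied by a
        // core-site.xml on the classpath.
        System.out.println(conf.get("fs.defaultFS"));
        // The filesystem against which the job staging directory
        // (/tmp/hadoop-yarn/staging/...) is resolved when the job is submitted.
        System.out.println(FileSystem.get(conf).getUri());
    }
}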
What is causing this problem?
Thanks, everyone.
Comments (1)
The problem is solved. Simply copy core-site.xml onto the classpath, or add the following configuration:
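The configuration itself did not make it into the comment; presumably it is the default-filesystem property that core-site.xml normally provides. Without it the client falls back to the local filesystem, so the staging directory and job.jar end up under the client's local /tmp (hence the file:/tmp/... path in the error), where the NodeManager cannot localize them. A minimal sketch, assuming the NameNode address hdfs://hdcluster01:9000 used in the input/output paths above:

conf.set("fs.defaultFS", "hdfs://hdcluster01:9000");
// Older configurations use the deprecated key instead:
// conf.set("fs.default.name", "hdfs://hdcluster01:9000");

With the default filesystem pointing at HDFS, the staging directory is created on HDFS, where the ApplicationMaster and NodeManagers can read job.jar.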