MapReduce job fails with error: File file job.jar does not exist

Posted on 2022-09-07 21:32:46 · 3221 characters · 15 views · 0 comments

I tried to submit a job to YARN by running the main method directly from Java, but got the following error:

2018-08-26 10:25:37,544 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1375)) - Job job_1535213323614_0010 failed with state FAILED due to: Application application_1535213323614_0010 failed 2 times due to AM Container for appattempt_1535213323614_0010_000002 exited with  exitCode: -1000 due to: File file:/tmp/hadoop-yarn/staging/nasuf/.staging/job_1535213323614_0010/job.jar does not exist
.Failing this attempt.. Failing the application.

Also, the log directory under HADOOP_HOME contains no logs at all for this job.

The mapper code is as follows:

import java.io.IOException;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WCMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

        // Split each input line into words and emit (word, 1) pairs
        String line = value.toString();
        String[] words = StringUtils.split(line, " ");

        for (String word : words) {
            context.write(new Text(word), new LongWritable(1));
        }
    }

}

The reducer code is as follows:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WCReducer extends Reducer<Text, LongWritable, Text, LongWritable> {

    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
            throws IOException, InterruptedException {

        // Sum the counts for each word
        long count = 0;
        for (LongWritable value : values) {
            count += value.get();
        }

        context.write(key, new LongWritable(count));
    }

}

The main method is as follows:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WCRunner {

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        conf.set("mapreduce.job.jar", "wc.jar");
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("yarn.resourcemanager.hostname", "hdcluster01");
        conf.set("yarn.nodemanager.aux-services", "mapreduce_shuffle");
        Job job = Job.getInstance(conf);

        // Specify which jar contains the classes used by this job
        job.setJarByClass(WCRunner.class);

        // The mapper and reducer classes used by this job
        job.setMapperClass(WCMapper.class);
        job.setReducerClass(WCReducer.class);

        // Reducer output key/value types (if the mapper output types below
        // were omitted, these would apply to both mapper and reducer output)
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        // Mapper output key/value types
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);

        // Location of the source data
        FileInputFormat.setInputPaths(job, new Path("hdfs://hdcluster01:9000/wc/srcdata"));

        // Output path for the results
        FileOutputFormat.setOutputPath(job, new Path("hdfs://hdcluster01:9000/wc/output3"));

        // Submit the job to the cluster and wait for it to finish
        job.waitForCompletion(true);
    }

}

I am running the code locally on macOS under the username nasuf; the remote Hadoop deployment is pseudo-distributed, with HDFS and YARN on the same server, owned by the user parallels.
I checked the path mentioned in the log, /tmp/hadoop-yarn/staging/nasuf/.staging/job_1535213323614_0010/job.jar, and it indeed does not exist; there is no /hadoop-yarn directory under /tmp at all.

What is causing this problem?
Thanks, everyone.


Comments (1)

神也荒唐 2022-09-14 21:32:46

Problem solved. Either copy core-site.xml onto the classpath, or add the following configuration:

conf.set("hadoop.tmp.dir", "/home/parallels/app/hadoop-2.4.1/data/");
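To sketch why this helps: without the cluster's core-site.xml on the classpath, the client falls back to the local filesystem defaults, so the staging directory resolves to file:/tmp/... on the Mac while the NodeManagers look for job.jar on the remote machine. A minimal sketch of the client-side configuration, assuming the pseudo-distributed cluster from the question (hostname hdcluster01, NameNode on port 9000; the core-site.xml path below is a placeholder, not from the original post):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class WCConfigSketch {

    // Builds a client Configuration whose staging directory resolves on the
    // remote cluster rather than on the local filesystem.
    static Configuration clusterConf() {
        Configuration conf = new Configuration();

        // Option 1: load the cluster's core-site.xml explicitly instead of
        // copying it onto the classpath (placeholder path):
        // conf.addResource(new Path("/path/to/core-site.xml"));

        // Option 2: set the relevant keys by hand. With fs.defaultFS pointing
        // at HDFS, paths without an explicit scheme resolve to hdfs://...
        // instead of file:/..., so job.jar is uploaded somewhere the
        // NodeManagers can actually see.
        conf.set("fs.defaultFS", "hdfs://hdcluster01:9000");
        conf.set("hadoop.tmp.dir", "/home/parallels/app/hadoop-2.4.1/data/");
        return conf;
    }
}
```

This is a configuration sketch under the assumptions above, not a verified fix beyond what the answer states; the key point is that the client and the cluster must agree on where the staging directory lives.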