MapReduceBase and Mapper are deprecated



public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable>

MapReduceBase, Mapper and JobConf are deprecated in Hadoop 0.20.203.

What should we use now?

Edit 1 - for Mapper and MapReduceBase, I found that we just need to extend the new Mapper class:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();

  @Override
  public void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    String line = value.toString();
    StringTokenizer tokenizer = new StringTokenizer(line);
    while (tokenizer.hasMoreTokens()) {
      word.set(tokenizer.nextToken());
      // Context replaces the old OutputCollector and Reporter.
      context.write(word, one);
    }
  }
}
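
For reference, the matching word-count reducer under the same new API would look roughly like this (a sketch; the Reduce class is an assumption, not part of the original post):

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
  @Override
  public void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    int sum = 0;
    // The new API passes values as an Iterable rather than an Iterator.
    for (IntWritable value : values) {
      sum += value.get();
    }
    context.write(key, new IntWritable(sum));
  }
}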

Edit 2 - for JobConf, we should use Configuration and Job like this:

public static void main(String[] args) throws Exception {
  Configuration conf = new Configuration();
  Job job = new Job(conf);
  job.setMapperClass(WordCount.Map.class);
}
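
That driver is the bare minimum. A fuller setup of the same new-API job might look like the sketch below (the Reduce class, the job name, and the use of args for input/output paths are assumptions, not from the original post):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public static void main(String[] args) throws Exception {
  Configuration conf = new Configuration();
  Job job = new Job(conf, "wordcount");        // job name is an assumption
  job.setJarByClass(WordCount.class);
  job.setMapperClass(WordCount.Map.class);
  job.setReducerClass(WordCount.Reduce.class); // Reduce class sketched above
  job.setOutputKeyClass(Text.class);
  job.setOutputValueClass(IntWritable.class);
  FileInputFormat.addInputPath(job, new Path(args[0]));
  FileOutputFormat.setOutputPath(job, new Path(args[1]));
  System.exit(job.waitForCompletion(true) ? 0 : 1);
}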

Edit 3 - I found a good tutorial on the new API: http://sonerbalkir.blogspot.com/2010/01/new-hadoop-api-020x.html


Comments (2)

千仐 2024-12-14 15:12:09


The Javadoc tells you what to use instead of these deprecated classes:

e.g. http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/JobConf.html

 Deprecated. Use Configuration instead

Edit: if you use Maven and open the class declaration (F3), Maven can automatically download the source code, and you'll see the Javadoc comments with the explanations.
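
For example, the deprecated JobConf driver and its Configuration/Job replacement contrast roughly like this (a sketch, assuming a WordCount driver class):

// Old API (deprecated): org.apache.hadoop.mapred
JobConf oldConf = new JobConf(WordCount.class);
oldConf.setJobName("wordcount");
JobClient.runJob(oldConf);

// New API: org.apache.hadoop.mapreduce
Configuration conf = new Configuration();
Job job = new Job(conf, "wordcount");
job.setJarByClass(WordCount.class);
job.waitForCompletion(true);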

美胚控场 2024-12-14 15:12:09


Functionality-wise there is not much difference between the old and the new API, except that the old API only pushes records to the map/reduce functions, while the new API supports both the push and the pull model. That said, the new API is much cleaner and easier to evolve.

Here is the JIRA issue for the introduction of the new API. Also, the old API has been un-deprecated in 0.21 and will be deprecated again in release 0.22 or 0.23.

You can find more information about the new API, sometimes called the 'context objects' API, here and here.
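
The 'pull' side shows up in Mapper.run(), which a mapper can override to drain its input itself instead of having records pushed into map() one at a time. A minimal sketch (the PullingMap name is hypothetical):

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public static class PullingMap extends Mapper<LongWritable, Text, Text, IntWritable> {
  @Override
  public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    // Pull model: this loop decides when to fetch the next record.
    while (context.nextKeyValue()) {
      map(context.getCurrentKey(), context.getCurrentValue(), context);
    }
    cleanup(context);
  }
}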
