如何在 Scala 2.9.0 中实现 Hadoop Mapper？

发布于 2024-11-07 21:05:12 字数 1210 浏览 1 评论 0原文

当我从 2.8.1 迁移到 Scala 2.9.0 时，除了 Hadoop 映射器之外，所有代码都可以正常工作。因为我有一些包装对象，所以我提炼出以下示例：

import org.apache.hadoop.mapreduce.{Mapper, Job}


object MyJob {
  def main(args:Array[String]) {
    val job = new Job(new Configuration())
    job.setMapperClass(classOf[MyMapper])

  }
}

class MyMapper extends Mapper[LongWritable,Text,Text,Text] {
  override def map(key: LongWritable, value: Text, context: Mapper[LongWritable,Text,Text,Text]#Context) {

  }
}

当我在 2.8.1 中运行它时，它运行得很好（并且我在 2.8.1 中有大量生产代码。在 2.9.0 中，我得到以下编译错误：

error: type mismatch;
found   : java.lang.Class[MyMapper](classOf[MyMapper])
required: java.lang.Class[_ <: org.apache.hadoop.mapreduce.Mapper]
job.setMapperClass(classOf[MyMapper])

当我在 Job 对象上调用 setMapperClass 时，调用失败。这是该方法的定义：

public void setMapperClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.Mapper> cls) throws java.lang.IllegalStateException { /* compiled code */ }

Mapper 类本身的定义是这样的：

public class Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>

有人知道我做错了什么吗？在我看来，这个类型基本上是正确的：MyMapper 确实扩展了 Mapper，并且该方法需要扩展 Mapper 的东西，并且它在 2.8.1 中工作得很好......

原文

When I migrated to Scala 2.9.0 from 2.8.1, all of the code was functional except for the Hadoop mappers. Because I had some wrapper objects in the way, I distilled down to the following example:

import org.apache.hadoop.mapreduce.{Mapper, Job}


object MyJob {
  def main(args:Array[String]) {
    val job = new Job(new Configuration())
    job.setMapperClass(classOf[MyMapper])

  }
}

class MyMapper extends Mapper[LongWritable,Text,Text,Text] {
  override def map(key: LongWritable, value: Text, context: Mapper[LongWritable,Text,Text,Text]#Context) {

  }
}

When I run this in 2.8.1, it runs quite well (and I have plenty of production code in 2.8.1. In 2.9.0 I get the following compilation error:

error: type mismatch;
found   : java.lang.Class[MyMapper](classOf[MyMapper])
required: java.lang.Class[_ <: org.apache.hadoop.mapreduce.Mapper]
job.setMapperClass(classOf[MyMapper])

The failing call is when I call setMapperClass on the Job object. Here's the definition of that method:

public void setMapperClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.Mapper> cls) throws java.lang.IllegalStateException { /* compiled code */ }

The definition of the Mapper class itself is this:

public class Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>

Does anyone have a sense of what I'm doing wrong? It looks to me like the type is fundamentally correct: MyMapper does extend Mapper, and the method wants something that extends Mapper. And it works great in 2.8.1...

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

会发光的星星闪亮亮i 2024-11-14 21:05:13

尽管看起来很愚蠢，但您可以通过在作业之前定义映射器来解决该问题。编译如下：

import org.apache.hadoop._
import org.apache.hadoop.io._
import org.apache.hadoop.conf._
import org.apache.hadoop.mapreduce._

class MyMapper extends Mapper[LongWritable,Text,Text,Text] {
  override def map(key: LongWritable, value: Text, context: Mapper[LongWritable,Text,Text,Text]#Context) {
  }
}

object MyJob {
  def main(args:Array[String]) {
    val job = new Job(new Configuration())
    job.setMapperClass(classOf[MyMapper])
  }
}

Silly as it seems, you can work around the problem by defining the Mapper before the Job. The following compiles:

import org.apache.hadoop._
import org.apache.hadoop.io._
import org.apache.hadoop.conf._
import org.apache.hadoop.mapreduce._

class MyMapper extends Mapper[LongWritable,Text,Text,Text] {
  override def map(key: LongWritable, value: Text, context: Mapper[LongWritable,Text,Text,Text]#Context) {
  }
}

object MyJob {
  def main(args:Array[String]) {
    val job = new Job(new Configuration())
    job.setMapperClass(classOf[MyMapper])
  }
}

回复收藏 0 原文

~没有更多了~