如何在 Scala 2.9.0 中实现 Hadoop Mapper?

发布于 2024-11-07 21:05:12 字数 1210 浏览 1 评论 0原文

当我从 2.8.1 迁移到 Scala 2.9.0 时,除了 Hadoop 映射器之外,所有代码都可以正常工作。因为我有一些包装对象,所以我提炼出以下示例:

import org.apache.hadoop.mapreduce.{Mapper, Job}


object MyJob {
  def main(args:Array[String]) {
    val job = new Job(new Configuration())
    job.setMapperClass(classOf[MyMapper])

  }
}

class MyMapper extends Mapper[LongWritable,Text,Text,Text] {
  override def map(key: LongWritable, value: Text, context: Mapper[LongWritable,Text,Text,Text]#Context) {

  }
}

当我在 2.8.1 中运行它时,它运行得很好(并且我在 2.8.1 中有大量生产代码。在 2.9.0 中,我得到以下编译错误:

error: type mismatch;
found   : java.lang.Class[MyMapper](classOf[MyMapper])
required: java.lang.Class[_ <: org.apache.hadoop.mapreduce.Mapper]
job.setMapperClass(classOf[MyMapper])

当我在 Job 对象上调用 setMapperClass 时,调用失败。 这是该方法的定义:

public void setMapperClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.Mapper> cls) throws java.lang.IllegalStateException { /* compiled code */ }

Mapper 类本身的定义是这样的:

public class Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>

有人知道我做错了什么吗?在我看来,这个类型基本上是正确的:MyMapper 确实扩展了 Mapper,并且该方法需要扩展 Mapper 的东西,并且它在 2.8.1 中工作得很好......

When I migrated to Scala 2.9.0 from 2.8.1, all of the code was functional except for the Hadoop mappers. Because I had some wrapper objects in the way, I distilled down to the following example:

import org.apache.hadoop.mapreduce.{Mapper, Job}


object MyJob {
  def main(args:Array[String]) {
    val job = new Job(new Configuration())
    job.setMapperClass(classOf[MyMapper])

  }
}

class MyMapper extends Mapper[LongWritable,Text,Text,Text] {
  override def map(key: LongWritable, value: Text, context: Mapper[LongWritable,Text,Text,Text]#Context) {

  }
}

When I run this in 2.8.1, it runs quite well (and I have plenty of production code in 2.8.1. In 2.9.0 I get the following compilation error:

error: type mismatch;
found   : java.lang.Class[MyMapper](classOf[MyMapper])
required: java.lang.Class[_ <: org.apache.hadoop.mapreduce.Mapper]
job.setMapperClass(classOf[MyMapper])

The failing call is when I call setMapperClass on the Job object. Here's the definition of that method:

public void setMapperClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.Mapper> cls) throws java.lang.IllegalStateException { /* compiled code */ }

The definition of the Mapper class itself is this:

public class Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>

Does anyone have a sense of what I'm doing wrong? It looks to me like the type is fundamentally correct: MyMapper does extend Mapper, and the method wants something that extends Mapper. And it works great in 2.8.1...

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

会发光的星星闪亮亮i 2024-11-14 21:05:13

尽管看起来很愚蠢,但您可以通过在作业之前定义映射器来解决该问题。编译如下:

import org.apache.hadoop._
import org.apache.hadoop.io._
import org.apache.hadoop.conf._
import org.apache.hadoop.mapreduce._

class MyMapper extends Mapper[LongWritable,Text,Text,Text] {
  override def map(key: LongWritable, value: Text, context: Mapper[LongWritable,Text,Text,Text]#Context) {
  }
}

object MyJob {
  def main(args:Array[String]) {
    val job = new Job(new Configuration())
    job.setMapperClass(classOf[MyMapper])
  }
}

Silly as it seems, you can work around the problem by defining the Mapper before the Job. The following compiles:

import org.apache.hadoop._
import org.apache.hadoop.io._
import org.apache.hadoop.conf._
import org.apache.hadoop.mapreduce._

class MyMapper extends Mapper[LongWritable,Text,Text,Text] {
  override def map(key: LongWritable, value: Text, context: Mapper[LongWritable,Text,Text,Text]#Context) {
  }
}

object MyJob {
  def main(args:Array[String]) {
    val job = new Job(new Configuration())
    job.setMapperClass(classOf[MyMapper])
  }
}
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文