hadoop-streaming example fails to run - Type mismatch in key from map

Posted on 2024-12-06 17:38:03

I was running  $HADOOP_HOME/bin/hadoop  jar $HADOOP_HOME/hadoop-streaming.jar \
    -D stream.map.output.field.separator=. \
    -D stream.num.map.output.key.fields=4 \
    -input myInputDirs \
    -output myOutputDir \
    -mapper org.apache.hadoop.mapred.lib.IdentityMapper \
    -reducer org.apache.hadoop.mapred.lib.IdentityReducer 
What should be the input file when IdentityMapper is the mapper?



I was hoping to see that it can sort on certain selected keys rather than the entire key. My input file is simply:
"aa bb"
"cc dd"
Not sure what I missed? I always get this error:
java.lang.Exception: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:371)
Caused by: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable

2 Answers

对风讲故事 2024-12-13 17:38:03


This is a known bug and here is the JIRA. The bug was identified in Hadoop 0.21.0, but I don't think the fix is in any of the Hadoop release versions. If you are really interested in fixing this, you can:

  • download the source code for Hadoop (for the release you are working with)
  • download the patch from JIRA and apply it
  • build and test Hadoop

Here are the instructions on how to apply a patch.

Or, instead of using IdentityMapper and IdentityReducer, use a Python/Perl script which reads the k/v pairs from STDIN and then writes the same k/v pairs to STDOUT without any processing. It's like creating your own IdentityMapper and IdentityReducer without using Java.
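The identity script described above can be sketched in a few lines of Python (the file name `identity.py` is my own choice, not from the original answer):

```python
#!/usr/bin/env python3
# identity.py - an identity mapper/reducer for Hadoop Streaming.
# Streaming hands records to the script as lines on STDIN and collects
# mapper/reducer output from STDOUT, so copying every line verbatim
# passes each key/value pair through untouched.
import sys

def identity(instream, outstream):
    # Echo each input line unchanged.
    for line in instream:
        outstream.write(line)

if __name__ == "__main__":
    identity(sys.stdin, sys.stdout)
```

The same file can then be passed as both `-mapper identity.py` and `-reducer identity.py` (shipped to the cluster with `-file identity.py`), sidestepping the Java IdentityMapper entirely.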

抱猫软卧 2024-12-13 17:38:03


I was trying my hand at Hadoop with my own example, but got the same error. I used KeyValueTextInputFormat to resolve the issue. You can have a look at the following blog for the same.

http://sanketraut.blogspot.in/2012/06/hadoop-example-setting-up-hadoop-on.html
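For the streaming command in the question, that fix amounts to adding an `-inputformat` switch (a sketch, untested against any specific Hadoop release): the default TextInputFormat produces LongWritable byte-offset keys, which IdentityMapper passes through unchanged, while KeyValueTextInputFormat produces Text keys and so avoids the type mismatch.

```shell
# Sketch: same job as in the question, but with KeyValueTextInputFormat
# so that map input (and therefore IdentityMapper output) keys are Text.
$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
    -D stream.map.output.field.separator=. \
    -D stream.num.map.output.key.fields=4 \
    -inputformat org.apache.hadoop.mapred.KeyValueTextInputFormat \
    -input myInputDirs \
    -output myOutputDir \
    -mapper org.apache.hadoop.mapred.lib.IdentityMapper \
    -reducer org.apache.hadoop.mapred.lib.IdentityReducer
```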

Hope it helps you.

Peace.
Sanket Raut
