Wordcount C++ Hadoop pipes not working
I am trying to run the wordcount example in C++, following the approach described in this link: Running the WordCount program in C++. The compilation works fine, but when I try to run my program, I get an error:
bin/hadoop pipes -conf ../dev/word.xml -input testtile.txt -output wordcount-out
11/06/06 14:23:40 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
11/06/06 14:23:40 INFO mapred.FileInputFormat: Total input paths to process : 1
11/06/06 14:23:40 INFO mapred.JobClient: Running job: job_201106061207_0007
11/06/06 14:23:41 INFO mapred.JobClient: map 0% reduce 0%
11/06/06 14:23:53 INFO mapred.JobClient: Task Id : attempt_201106061207_0007_m_000000_0, Status : FAILED
java.io.IOException
at org.apache.hadoop.mapred.pipes.OutputHandler.waitForAuthentication(OutputHandler.java:188)
at org.apache.hadoop.mapred.pipes.Application.waitForAuthentication(Application.java:194)
at org.apache.hadoop.mapred.pipes.Application.<init>(Application.java:149)
at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:68)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.mapred.Child.main(Child.java:253)
attempt_201106061207_0007_m_000000_0: Server failed to authenticate. Exiting
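For completeness, the word.xml passed with -conf is a standard Pipes job configuration, along these lines (the executable path below is illustrative; it points at the binary uploaded to HDFS):

<?xml version="1.0"?>
<configuration>
  <property>
    <!-- HDFS path of the compiled C++ binary; illustrative -->
    <name>hadoop.pipes.executable</name>
    <value>/user/hadoop/bin/wordcount</value>
  </property>
  <property>
    <name>hadoop.pipes.java.recordreader</name>
    <value>true</value>
  </property>
  <property>
    <name>hadoop.pipes.java.recordwriter</name>
    <value>true</value>
  </property>
</configuration>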
I am running Hadoop on Fedora on two nodes, and I followed the configuration instructions from this link: Running Hadoop on multi-node cluster. I tried Hadoop's own wordcount example with this command:
bin/hadoop jar hadoop-examples-0.20.203.0.jar wordcount testtile.txt wordcount-out
And that command works fine. That is why I do not understand why my program does not work. I hope someone has an idea of what I am doing wrong, or has already resolved this error.
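For reference, the program is the C++ Pipes wordcount from that link, which is essentially the following (reproduced from memory, so minor details may differ):

#include "hadoop/Pipes.hh"
#include "hadoop/TemplateFactory.hh"
#include "hadoop/StringUtils.hh"

// Mapper: emits (word, "1") for every whitespace-separated token in the line.
class WordCountMap : public HadoopPipes::Mapper {
public:
  WordCountMap(HadoopPipes::TaskContext& context) {}
  void map(HadoopPipes::MapContext& context) {
    std::vector<std::string> words =
        HadoopUtils::splitString(context.getInputValue(), " ");
    for (unsigned int i = 0; i < words.size(); ++i) {
      context.emit(words[i], "1");
    }
  }
};

// Reducer: sums the emitted counts for each word.
class WordCountReduce : public HadoopPipes::Reducer {
public:
  WordCountReduce(HadoopPipes::TaskContext& context) {}
  void reduce(HadoopPipes::ReduceContext& context) {
    int sum = 0;
    while (context.nextValue()) {
      sum += HadoopUtils::toInt(context.getInputValue());
    }
    context.emit(context.getInputKey(), HadoopUtils::toString(sum));
  }
};

int main(int argc, char* argv[]) {
  return HadoopPipes::runTask(
      HadoopPipes::TemplateFactory<WordCountMap, WordCountReduce>());
}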
1 Answer
I do not know whether I should answer my own question this way or edit my question. Anyway, I found the solution, and I just want to share it with everyone who runs into the same error.
After a few days of research and trial, I understood that Hadoop's C++ code on 64-bit Fedora is not a good match out of the box. I tried to compile the Hadoop wordcount C++ example with ant, as explained in the wiki, but ant gave me errors about libssl and stdint.
First, if you are on Fedora, you have to add -lcrypto to the LIBS variable in the configure script. This is because the dependency on libcrypto must now be stated explicitly on these platforms when linking against libssl (see the corresponding Fedora bug).
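For illustration, the edit inside each configure script amounts to something like this (the exact surrounding line differs between scripts):

# before: linking only against libssl
LIBS="-lssl $LIBS"
# after: the libcrypto dependency stated explicitly as well
LIBS="-lssl -lcrypto $LIBS"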
Second issue: ant produces a lot of errors about the C++ files. To resolve them, you just have to add an include of stdint.h at the top of each affected file.
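Concretely, that is a single line at the top of each file the compiler complains about:

#include <stdint.h>  /* provides int32_t, uint64_t, etc. with newer gcc/glibc */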
Then the build succeeded. I then tried to run the wordcount example on my Hadoop cluster, and it worked, while mine did not. I expected that the issue came from the libraries I had just corrected, and I was right: I tried to run the Hadoop example against the libraries from the Hadoop install directory, and it did not work, failing with the same error.
That can be explained by the fact that ant recompiles the C++ libraries Hadoop needs (with the corrections I made) and uses them instead of the libraries provided in the Hadoop install directory.
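Concretely, that means compiling and linking your own program against the freshly built libraries rather than the shipped ones, along these lines (the Linux-amd64-64 paths shown are the usual ant output location for a 64-bit Linux build; adjust them to your tree):

g++ -o wordcount wordcount.cpp \
    -I build/c++/Linux-amd64-64/include \
    -L build/c++/Linux-amd64-64/lib \
    -lhadooppipes -lhadooputils -lpthread -lssl -lcrypto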