Hadoop: Error in configuring object
I'm trying to run the TeraSort benchmark and I'm getting the following exception:
java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapTask$OldOutputCollector.<init>(MapTask.java:573)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.mapred.Child.main(Child.java:253)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
... 10 more
Caused by: java.lang.IllegalArgumentException: can't read paritions file
at org.apache.hadoop.examples.terasort.TeraSort$TotalOrderPartitioner.configure(TeraSort.java:213)
... 15 more
Caused by: java.io.FileNotFoundException: File _partition.lst does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:371)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:720)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
at org.apache.hadoop.examples.terasort.TeraSort$TotalOrderPartitioner.readPartitions(TeraSort.java:153)
at org.apache.hadoop.examples.terasort.TeraSort$TotalOrderPartitioner.configure(TeraSort.java:210)
... 15 more
The TeraGen commands run fine and have created the input files for TeraSort. Here is the listing of my input directory:
bin/hadoop fs -ls /user/hadoop/terasort-input/
Warning: Maximum heap size rounded up to 1024 MB
Found 5 items
-rw-r--r-- 1 sqatest supergroup 0 2012-01-23 14:13 /user/hadoop/terasort-input/_SUCCESS
drwxr-xr-x - sqatest supergroup 0 2012-01-23 13:30 /user/hadoop/terasort-input/_logs
-rw-r--r-- 1 sqatest supergroup 129 2012-01-23 15:49 /user/hadoop/terasort-input/_partition.lst
-rw-r--r-- 1 sqatest supergroup 50000000000 2012-01-23 13:30 /user/hadoop/terasort-input/part-00000
-rw-r--r-- 1 sqatest supergroup 50000000000 2012-01-23 13:30 /user/hadoop/terasort-input/part-00001
Here is my command for running the terasort:
bin/hadoop jar hadoop-examples-0.20.203.0.jar terasort -libjars hadoop-examples-0.20.203.0.jar /user/hadoop/terasort-input /user/hadoop/terasort-output
I do see the file _partition.lst in my input directory, so I don't understand why I am getting the FileNotFoundException.
I followed the setup details provided at: http://www.michael-noll.com/blog/2011/04/09/benchmarking-and-stress-testing-an-hadoop-cluster-with-terasort-testdfsio-nnbench-mrbench/
Comments (4)
I got this to work as follows:
I'm running in local mode from my Hadoop base directory, hadoop-1.0.0, with an input subdirectory under it, and I got the same error you did.
I edited the failing Java file so that it logs the full path instead of just the file name, rebuilt it ("ant binary"), and reran it. It was looking for the file in the directory I was running from. I can't tell whether it was looking in the Hadoop base directory or the execution directory.
...so I made a symbolic link in the directory I run terasort in pointing to the real file in the input directory.
It's a cheap hack, but it works.
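The symlink workaround can be sketched roughly as follows. The directory layout here is hypothetical (a temporary directory stands in for the Hadoop base directory, and an empty file stands in for the real partition file); only the `ln -s` step reflects the hack described above.

```shell
# Hypothetical layout: an input directory holding _partition.lst and a
# separate directory from which terasort is launched.
workdir=$(mktemp -d)
mkdir -p "$workdir/input" "$workdir/run"
: > "$workdir/input/_partition.lst"   # empty stand-in for the real partition file

# Link the partition file into the launch directory so that a lookup by
# bare file name ("_partition.lst") resolves there.
ln -s "$workdir/input/_partition.lst" "$workdir/run/_partition.lst"

# Show the link and its target.
ls -l "$workdir/run/_partition.lst"
```

In a real setup the link would be created in whatever directory the terasort command is started from, pointing at the _partition.lst inside the actual input directory.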
The problem occurred because I was deploying the job on NFS. I changed hadoop.tmp.dir to point to a local file system (/tmp), and the problem disappeared in a jiffy.
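One way to check whether a directory such as the one configured as hadoop.tmp.dir sits on NFS is to ask for its filesystem type. This is only a sketch: it assumes GNU coreutils `stat` (as found on most Linux systems), and the helper name `fs_type` is made up for illustration.

```shell
# fs_type DIR: print the type of the filesystem that backs DIR.
# Relies on GNU stat's -f (filesystem) mode and the %T format (type name).
fs_type() {
  stat -f -c %T "$1"
}

# /tmp is the location the answer above switched to; a result of "nfs"
# would indicate a network mount rather than a local disk.
fs_type /tmp
```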
Have you set Hadoop up to run in pseudo-distributed mode (or on a real cluster)? Unless you configure it otherwise, Hadoop runs in local job runner mode (as libraries inside a single process), and TeraSort does NOT work in LocalJobRunner mode. Look for the word LocalJobRunner in the output to check.
Here is a link for setting up HDFS, SSH and rsync:
http://hadoop.apache.org/docs/r1.1.1/single_node_setup.html#PseudoDistributed
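The LocalJobRunner check can be scripted against a saved copy of the job's console output. The sketch below assumes the output was captured to a file (for example with `2>&1 | tee terasort.log`); the function name `check_local_runner` and the placeholder text written to the demo file are made up for illustration and are not real Hadoop output.

```shell
# check_local_runner LOGFILE: print "local" if the saved job output
# mentions LocalJobRunner, otherwise print "not-local".
check_local_runner() {
  if grep -q 'LocalJobRunner' "$1"; then
    echo "local"
  else
    echo "not-local"
  fi
}

# Demonstration on a placeholder file (NOT real Hadoop log content).
log=$(mktemp)
echo 'placeholder text that mentions LocalJobRunner' > "$log"
check_local_runner "$log"
```

If the check reports "local", the job is running inside a single process rather than on a pseudo-distributed or real cluster.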
I am using Cloudera CDH4 and faced a similar issue with another Hadoop program.
I believe the issue is about linking external libraries.
The program worked fine in Eclipse (local mode), but when I tried to run it in pseudo-distributed mode, I got this error message.
Temporary solution:
- Created a JAR file from Eclipse with the library handling option "Copy required libraries into a sub-folder next to the generated JAR".
- Copied the JAR file to the Hadoop home directory (the path where the hadoop-examples.jar file is placed).
With this fix I am able to run the Hadoop program without any errors.
Hope this helps.