Pig 中使用 AvroStorage 的逗号分隔列表
我尝试使用逗号分隔列表在 Pig 中使用 AvroStorage 加载多个文件。我使用的语句是:
test_data= LOAD 'repo_1/part-r-00000.avro,repo_2/part-r-00000.avro' USING org.apache.pig.piggybank.storage.avro.AvroStorage();
Pig states that no input paths were returned in job.请参阅下面的堆栈跟踪。 我尝试了猪版本0.8.1-cdh3u2和0.9.1。
有人观察到同样的行为吗?这是一个错误还是一个功能?
堆栈跟踪:
rg.apache.pig.backend.executionengine.ExecException: ERROR 2118: No input paths specified in job
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:282)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
at java.lang.Thread.run(Thread.java:679)
Caused by: java.io.IOException: No input paths specified in job
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:186)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:270)
... 7 more
I tried to load several files with AvroStorage in Pig by using a comma separated list. The statement I used is:
test_data= LOAD 'repo_1/part-r-00000.avro,repo_2/part-r-00000.avro' USING org.apache.pig.piggybank.storage.avro.AvroStorage();
Pig states that no input paths were specified in job. Please see the stacktrace below.
I tried pig version0.8.1-cdh3u2 and 0.9.1.
Does anyone observe the same behavior? Is it a bug or a feature?
Stacktrace:
rg.apache.pig.backend.executionengine.ExecException: ERROR 2118: No input paths specified in job
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:282)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
at java.lang.Thread.run(Thread.java:679)
Caused by: java.io.IOException: No input paths specified in job
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:186)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:270)
... 7 more
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
这些零件文件由 Pig 自动加载,因此您只需指定目录即可。
尝试
Those part files are loaded automatically by Pig, so you only need to specify the directory.
Try