Apache Pig permissions problem

Posted 2024-12-01 13:28:27 · 1221 characters · 1 view · 0 comments

I'm attempting to get Apache Pig up and running on my Hadoop cluster, and am encountering a permissions problem. Pig itself launches and connects to the cluster just fine: from within the Pig shell, I can ls through and around my HDFS directories. However, when I try to actually load data and run Pig commands, I run into permissions-related errors:

grunt> A = load 'all_annotated.txt' USING PigStorage() AS (id:long, text:chararray, lang:chararray);
grunt> DUMP A;
2011-08-24 18:11:40,961 [main] ERROR org.apache.pig.tools.grunt.Grunt - You don't have permission to perform the operation. Error from the server: org.apache.hadoop.security.AccessControlException: Permission denied: user=steven, access=WRITE, inode="":hadoop:supergroup:r-xr-xr-x
2011-08-24 18:11:40,977 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias A
Details at logfile: /Users/steven/Desktop/Hacking/hadoop/pig/pig-0.9.0/pig_1314230681326.log
grunt> 

In this case, all_annotated.txt is a file in my HDFS home directory that I created and most definitely have permissions on; the same problem occurs no matter what file I try to load. However, I don't think the file is the problem, as the error itself indicates that Pig is trying to write somewhere. Googling around, I found a few mailing list posts suggesting that certain Pig Latin statements (order, etc.) need write access to a temporary directory on the HDFS file system whose location is controlled by the hadoop.tmp.dir property in hdfs-site.xml. I don't think load falls into that category, but just to be sure, I changed hadoop.tmp.dir to point to a directory within my HDFS home directory, and the problem persisted.

So, anybody out there have any ideas as to what might be going on?

Comments (2)

腻橙味 2024-12-08 13:28:27

It's probably your pig.temp.dir setting. It defaults to /tmp on HDFS, and Pig writes temporary results there. If you don't have write permission on /tmp, Pig will complain. Try overriding it with -Dpig.temp.dir.
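
A rough sketch of what that override could look like from the command line. The scratch path /user/steven/pig_tmp is a made-up example, not something from the question, and the run helper is only there so the commands can be printed and reviewed before anything touches the cluster (set DRY_RUN=0 to execute them for real).

```shell
#!/bin/sh
# Sketch: give Pig a scratch directory on HDFS that the current user can write to.
# PIG_TMP is a hypothetical path; change it to suit your cluster.
PIG_TMP="${PIG_TMP:-/user/steven/pig_tmp}"

# With DRY_RUN=1 (the default here) commands are only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Show owner and mode of the top-level HDFS directories, including /tmp if it exists.
run hadoop fs -ls /

# Create the scratch directory under the user's own HDFS home.
run hadoop fs -mkdir "$PIG_TMP"

# Start Pig with its temporary directory pointed at the new location.
run pig "-Dpig.temp.dir=$PIG_TMP"
```

If the listing shows that /tmp is missing or not writable by your user, that would be consistent with the access=WRITE error in the question; an administrator could also fix it by creating a world-writable /tmp on HDFS instead.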

糖粟与秋泊 2024-12-08 13:28:27

A problem might be that hadoop.tmp.dir is a directory on your local filesystem, not HDFS. Try setting that property to a local directory you know you have write access to. I've run into the same error using regular MapReduce in Hadoop.
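
One quick way to test this theory is to check whether the local directory is writable by the user running the job. The sketch below assumes the stock default for hadoop.tmp.dir, which is /tmp/hadoop-${user.name} on the local filesystem; if your configuration overrides the property, set LOCAL_TMP to that value instead. The check_writable helper is just an illustration, not part of Hadoop.

```shell
#!/bin/sh
# Sketch: check whether the local hadoop.tmp.dir location is writable.
# Assumes the default value /tmp/hadoop-<username>; override LOCAL_TMP if yours differs.
LOCAL_TMP="${LOCAL_TMP:-/tmp/hadoop-$(id -un)}"

# Succeeds if the directory, or its nearest existing ancestor when the
# directory has not been created yet, is a directory the current user can write to.
check_writable() {
    dir="$1"
    while [ ! -e "$dir" ]; do
        dir=$(dirname "$dir")
    done
    [ -d "$dir" ] && [ -w "$dir" ]
}

if check_writable "$LOCAL_TMP"; then
    echo "writable: $LOCAL_TMP"
else
    echo "NOT writable: $LOCAL_TMP"
fi
```

If the check fails, pointing hadoop.tmp.dir at a local directory you own, as suggested above, is the thing to try.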
