Problem loading a file with a complicated name in Pig

Posted on 2024-11-02 16:53:59

I need to load a file in Pig that has a long and complicated name:

dealnews-2011-04-01T12:00:00:00.211-02:00.csv

Pig complained:

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. java.net.URISyntaxException: Relative path in absolute URI:

Does anyone know what the problem is? Thanks.

Comments (2)

七七 2024-11-09 16:53:59

If Pig is forming a URI from that name, the colon (:) is a reserved character.

Think about it: file://a:b ... this would be taken as an FTP login.

Your error message seems to complain that what's left after the string is parsed is a relative path (I guess 00.csv, after the last colon), which is obviously no longer the whole filename.
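To illustrate how a colon can cut the name in two: generic URI syntax reads everything before the first colon as the scheme. The sketch below only mimics that split with shell parameter expansion. It is not Pig's or Hadoop's actual parsing code, and the helper names split_scheme and split_rest are made up for this example.

```shell
#!/bin/sh
# Hypothetical helpers: split a string at its FIRST colon, the way
# generic URI syntax separates "scheme" from the rest.
split_scheme() { printf '%s\n' "${1%%:*}"; }   # text before the first colon
split_rest()   { printf '%s\n' "${1#*:}"; }    # text after the first colon

name='dealnews-2011-04-01T12:00:00:00.211-02:00.csv'
split_scheme "$name"   # prints dealnews-2011-04-01T12
split_rest "$name"     # prints 00:00:00.211-02:00.csv
```

Under that assumption, the part before the colon would be treated as a scheme and the remainder does not start with a slash, which fits a "relative path in absolute URI" complaint regardless of exactly which colon the parser splits on.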

You will need to escape any reserved characters in the filename before forming a URI.
You could do this on the command line with, for example:
ls | sed -e 's/:/%3A/g'

to transform the colons in the filename.

Or you could rename any files in the directory whose names contain any of ";?:@&=+,$".
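The renaming option could be sketched roughly as follows. This is only an illustration: the function names sanitize_name and rename_reserved are invented here, and replacing each reserved character with an underscore is an arbitrary choice, so pick whatever replacement suits your naming scheme.

```shell
#!/bin/sh
# Replace every character from the reserved list ;?:@&=+,$ with "_".
sanitize_name() {
    printf '%s\n' "$1" | sed -e 's/[;?:@&=+,$]/_/g'
}

# Rename each regular file in directory $1 (default: current directory)
# whose name contains a reserved character. A file is skipped, with a
# warning, if the sanitized name is already taken.
rename_reserved() {
    dir=${1:-.}
    for path in "$dir"/*; do
        [ -f "$path" ] || continue
        old=$(basename "$path")
        new=$(sanitize_name "$old")
        if [ "$old" = "$new" ]; then
            continue
        fi
        if [ -e "$dir/$new" ]; then
            echo "skipping $old: $new already exists" >&2
            continue
        fi
        mv -- "$path" "$dir/$new"
    done
}
```

For the file in the question, sanitize_name gives dealnews-2011-04-01T12_00_00_00.211-02_00.csv. Calling rename_reserved with the directory that holds the CSV files renames them in place, so try it on a copy first.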

余生一个溪 2024-11-09 16:53:59

Not exactly the same case, but we got:

ERROR 2999: Unexpected internal error. java.net.URISyntaxException cannot be cast to java.lang.Error
java.lang.ClassCastException: java.net.URISyntaxException cannot be cast to java.lang.Error

for everything we tried to load. The problem was that the PIG_CONF_DIR environment variable was pointing to a folder that did not exist. We reset it in .bash_profile to a folder containing valid core-site.xml and mapred-site.xml files, and everything works now.

export PIG_CONF_DIR=/my_good_folder
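
A quick sanity check along these lines can catch a bad PIG_CONF_DIR before starting Pig. It is a minimal sketch: the function name check_conf_dir is made up, and it only tests that the two files mentioned above exist, not that their contents are valid.

```shell
#!/bin/sh
# Return 0 if directory $1 exists and contains both config files
# named in the answer; otherwise print the reason and return 1.
check_conf_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    for f in core-site.xml mapred-site.xml; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing file: $dir/$f" >&2
            return 1
        fi
    done
    return 0
}
```

One way to use it is to run check_conf_dir "$PIG_CONF_DIR" in the shell where you launch Pig and fix the export line if it reports a problem.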