Pig: problem loading a file with a complicated name
I need to load a file in Pig that has a long and complicated name:
dealnews-2011-04-01T12:00:00:00.211-02:00.csv
Pig complains:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. java.net.URISyntaxException: Relative path in absolute URI:
Does anyone know what the problem is? Thanks.
Comments (2)
If Pig is forming a URI from that name, the colon (:) is a reserved character.
Think about it: file://a:b ... would be parsed like an FTP-style login (user:password), not as a file name.
Your error message seems to complain that what is left after the string is parsed is a relative path (I guess 00.csv, the part after the last colon), which is obviously no longer the whole file name.
You will need to escape any reserved characters in the file name before a URI is formed from it.
You could do this on the command line, for example:
ls | sed -e 's/:/%3A/g'
which prints the file names with each colon replaced by %3A.
Or you could rename any files in the directory whose names use any of ";?:@&=+,$".
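The renaming option could be sketched roughly as below. This is only a minimal sketch, assuming the files live on the local filesystem and that an underscore is an acceptable replacement for the colon. The function names sanitize_name and rename_colon_files are made up for illustration and are not part of Pig or Hadoop.

```shell
#!/bin/sh
# Hypothetical helper: print a file name with every colon replaced by "_".
sanitize_name() {
    printf '%s\n' "$1" | sed -e 's/:/_/g'
}

# Hypothetical helper: rename every file in directory $1 whose name
# contains a colon. Existing targets are never overwritten.
rename_colon_files() {
    dir=$1
    for path in "$dir"/*:*; do
        # If the glob matched nothing, the pattern itself is returned; skip it.
        [ -e "$path" ] || continue
        base=$(basename "$path")
        new=$(sanitize_name "$base")
        if [ -e "$dir/$new" ]; then
            echo "skipping $base: $new already exists" >&2
        else
            mv -- "$path" "$dir/$new"
        fi
    done
}

# Example call (the directory is an assumption):
# rename_colon_files ./data
```

After such a rename, the file from the question would be called dealnews-2011-04-01T12_00_00_00.211-02_00.csv, a name with no reserved characters left for the URI parser to trip over.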
Not exactly the same case, but we got:
for everything we tried to load. The problem was that the PIG_CONF_DIR environment variable pointed to a folder that did not exist. We reset it in .bash_profile to a folder containing valid core-site.xml and mapred-site.xml files, and everything works now.
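A quick way to check for this situation might look like the sketch below. It only tests that the directory exists and contains the two files mentioned above; it does not validate their contents. The function name check_pig_conf_dir is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: check that directory $1 exists and holds the
# Hadoop config files named in the comment above.
check_pig_conf_dir() {
    conf_dir=$1
    if [ -z "$conf_dir" ]; then
        echo "PIG_CONF_DIR is not set" >&2
        return 1
    fi
    if [ ! -d "$conf_dir" ]; then
        echo "PIG_CONF_DIR points to a missing folder: $conf_dir" >&2
        return 1
    fi
    status=0
    for f in core-site.xml mapred-site.xml; do
        if [ ! -f "$conf_dir/$f" ]; then
            echo "missing $conf_dir/$f" >&2
            status=1
        fi
    done
    return "$status"
}

# Example call:
# check_pig_conf_dir "${PIG_CONF_DIR:-}"
```

If the check fails, pointing PIG_CONF_DIR at a valid folder in .bash_profile, as described above, would be the corresponding fix.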