STRSPLIT issue - Pig
I have the following tuple H1 and I want to split its $0 into a tuple with STRSPLIT. However, I always get an error message:
DUMP H1:
(item32;item31;,1)
m = FOREACH H1 GENERATE STRSPLIT($0, ";", 50);
ERROR 1000: Error during parsing. Lexical error at line 1, column 40.
Encountered: after : "\";"
Does anyone know what's wrong with the script?
Comments (2)
There is an escaping problem in Pig's parsing routines when they encounter this semicolon.
You can use the Unicode escape sequence for a semicolon, \u003B. However, the backslash must itself be escaped, and the delimiter must be put in a single-quoted string. Alternatively, you can rewrite the command over multiple lines, as per Neil's answer. In all cases, the delimiter must be a single-quoted string.
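A minimal sketch of the escape approach, assuming H1 is loaded from a hypothetical comma-delimited file `h1.txt`; the file name and the field names `items` and `cnt` are illustrative and not from the original post, and parser behaviour may differ between Pig versions:

```pig
-- Hypothetical input path and field names, for illustration only.
H1 = LOAD 'h1.txt' USING PigStorage(',') AS (items:chararray, cnt:int);

-- The semicolon is written as a Unicode escape, the backslash is doubled,
-- and the delimiter is a single-quoted string (not double-quoted).
m = FOREACH H1 GENERATE STRSPLIT(items, '\\u003B', 50);

DUMP m;
```

Note that the statement in the question passes the delimiter as a double-quoted string, which Pig does not accept as a string literal regardless of the semicolon.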
STRSPLIT on a semicolon is tricky. I got it to work by putting it inside a block.
Funny enough, this is how I originally implemented my STRSPLIT() command. Only after trying to get it to split on a semicolon did I run into the same issue.
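The code that accompanied this answer is not preserved here. One way the block form could look, as a sketch under the same assumptions as the question (the file name `h1.txt`, the field names, and the alias `parts` are hypothetical):

```pig
-- Hypothetical input path and field names, for illustration only.
H1 = LOAD 'h1.txt' USING PigStorage(',') AS (items:chararray, cnt:int);

-- Nested FOREACH block: the split is assigned inside the braces,
-- so the single-quoted ';' is used directly without a Unicode escape.
m = FOREACH H1 {
    parts = STRSPLIT(items, ';', 50);
    GENERATE parts;
}

DUMP m;
```

The only difference from the one-line form is that the STRSPLIT call sits on its own statement inside the braces, which is what the answers report as avoiding the lexical error.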