Referring to a column in FOREACH after a JOIN?
A = load 'a.txt' as (id, a1);
B = load 'b.txt' as (id, b1);
C = join A by id, B by id;
D = foreach C generate id,a1,b1;
dump D;
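The 4th line fails with: Invalid field projection. Projected field [id] does not exist in schema.
I tried changing it to A.id, but then the last line fails with: ERROR 0: Scalar has more than one row in the output.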
Comments (1)
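What you are looking for is the "disambiguate operator". What you want is A::id, not A.id.
A.id says "there is a relation/bag A, and there is a column called id in its schema". Inside a FOREACH over C, Pig treats that as a scalar reference to the whole relation A, which is why it complains once A has more than one row.
A::id says "there is a record from A, and it has a column called id".
So in the FOREACH over the joined relation you would project A::id rather than id.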
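The fix described above can be sketched as follows. This is the question's script with only the projection changed; it assumes a1 and b1 each appear in just one input, so they can stay unprefixed, and it closes the quote around 'b.txt'.

```pig
-- Load both inputs; each has a column named id, so id is ambiguous after the join.
A = load 'a.txt' as (id, a1);
B = load 'b.txt' as (id, b1);

-- The joined relation C carries both id columns, named A::id and B::id.
C = join A by id, B by id;

-- Pick the id that came from A using the disambiguate operator (::).
-- a1 and b1 are unique across A and B, so they need no prefix.
D = foreach C generate A::id, a1, b1;
dump D;
```

Running describe C; before the FOREACH is a quick way to see the prefixed field names Pig assigned after the join.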
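A dirty alternative, just because I'm lazy and disambiguation gets really weird once you start doing multiple joins one after another: use unique identifiers.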
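A minimal sketch of that alternative, assuming you control the schema names at load time. The names ida and idb are hypothetical, chosen here only to make each key column unique across the two inputs.

```pig
-- Give each key column a distinct name when loading (ida and idb are illustrative).
A = load 'a.txt' as (ida, a1);
B = load 'b.txt' as (idb, b1);

-- Join on the differently named keys.
C = join A by ida, B by idb;

-- No field name is shared between A and B, so no :: prefix is needed.
D = foreach C generate ida, a1, b1;
dump D;
```

The trade-off is that every script loading these files has to agree on the renamed columns, but later joins on C no longer stack up nested prefixes.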