Referencing a column in FOREACH after a JOIN?

Posted 2024-12-14 16:02:48

A = load 'a.txt' as (id, a1);
B = load 'b.txt' as (id, b1);
C = join A by id, B by id;
D = foreach C generate id,a1,b1;
dump D;

The 4th line fails with:
Invalid field projection. Projected field [id] does not exist in schema

I tried changing it to A.id, but then the last line fails with: ERROR 0: Scalar has more than one row in the output.

Comments (1)

乱了心跳 2024-12-21 16:02:49

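What you are looking for is the disambiguate operator (::). You want A::id, not A.id.

A.id says "there is a relation/bag A, and its schema has a column called id". Inside a foreach over C, Pig treats that as a scalar projection from relation A, which only works when A has exactly one row; that is why you get the "Scalar has more than one row" error.

A::id says "there is a record that came from A, and it has a column called id".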

So, you would do:

A = load 'a.txt' as (id, a1);
B = load 'b.txt' as (id, b1);
C = join A by id, B by id;
D = foreach C generate A::id,a1,b1;
dump D;
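
To see which names the join actually produced, it can help to describe the joined relation before projecting from it. A minimal sketch, reusing the a.txt and b.txt files from the question (the exact describe output may differ between Pig versions):

```pig
A = load 'a.txt' as (id, a1);
B = load 'b.txt' as (id, b1);
C = join A by id, B by id;

-- Print C's schema. Both inputs contribute an id column, so the joined
-- fields are expected to show up prefixed, e.g. A::id, A::a1, B::id, B::b1.
describe C;

-- id is ambiguous and needs the prefix; a1 and b1 are unique, so the
-- short names resolve on their own. 'as id' gives the output a plain name.
D = foreach C generate A::id as id, a1, b1;
dump D;
```

Positional references such as $0 would also work in the generate clause, at the cost of readability.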

A dirty alternative:

Because I'm lazy, and because disambiguation gets really awkward once you chain several joins one after another, I just give the columns unique names up front:

A = load 'a.txt' as (ida, a1);
B = load 'b.txt' as (idb, b1);
C = join A by ida, B by idb;
D = foreach C generate ida,a1,b1;
dump D;
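
For the multiple-join case mentioned above, another option is to keep the shared column name but rename the key right after each join, so that prefixes do not pile up. This is a sketch only: the third input, c.txt, and its column c1 are hypothetical and not part of the original question.

```pig
A = load 'a.txt' as (id, a1);
B = load 'b.txt' as (id, b1);
X = load 'c.txt' as (id, c1);  -- hypothetical third input

AB = join A by id, B by id;
-- Collapse the prefixed names back to plain ones before the next join.
AB_flat = foreach AB generate A::id as id, a1, b1;

ABX = join AB_flat by id, X by id;
-- Only one level of prefix is needed here; without the renaming step the
-- key would likely have to be written with nested prefixes (AB::A::id).
E = foreach ABX generate AB_flat::id as id, a1, b1, c1;
dump E;
```

Whether this is nicer than unique column names is a matter of taste; it costs one extra foreach per join but keeps the original names in the output.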