How to group a table and print a column that is not a function of the column we are grouping by

Posted 2025-02-13 17:23:49 · 513 characters · 1 view · 0 comments

I have a table, let us say as follows:

A     |    B    |     C
a1    |    b1   |     c1
a2    |    b2   |     c2
a3    |    b3   |     c3
a4    |    b4   |     c4

Here a1, a2, a3 need not be distinct, and the same goes for the b and c values. There aren't just a handful of rows, there are millions, and I can neither run select * on this table nor create a temporary table.

The problem is to group by column A (so we want 1 row per distinct entry of this column) and print lots of aggregate data, but column C isn't a function of column A. There are just 3 distinct entries in column C, and I have an ordering on them (say Michael > Rosen > John). For every distinct entry in A, I want to print the maximum of the C values it corresponds to.

For example, say a1 = 1, a2 = 1 and c1 = Michael, c2 = Rosen. I want to print Michael in the output in the row corresponding to 1 (and I want exactly one row per distinct entry in column A, that's why I am grouping by it).

I tried to use a PARTITION BY clause along with GROUP BY, but I couldn't get it to work. Please help!

Comments (1)

入怼 2025-02-20 17:23:49

I'm not familiar with HiveQL, but in MSSQL you could combine ROW_NUMBER with LEN. Michael, Rosen, and John happen to sort by length into your desired order (7 > 5 > 4 characters), so ORDER BY LEN(C) DESC puts the preferred name first in each partition.

-- Number the rows within each A; the longest C (Michael) gets rn = 1
SELECT yt.A, yt.C
FROM (SELECT A, C,
             ROW_NUMBER() OVER (PARTITION BY A ORDER BY LEN(C) DESC) AS rn
      FROM yourtable) yt
WHERE yt.rn = 1

If those names were only sample data and sorting by LEN works purely by coincidence, you could instead use a CASE expression to spell out the ranking explicitly, which also covers more complex orderings.

-- Explicit priority: Michael first, then Rosen, then John
SELECT yt.A, yt.C
FROM (SELECT A, C,
             ROW_NUMBER() OVER (PARTITION BY A ORDER BY
                 CASE WHEN C = 'Michael' THEN 1 WHEN C = 'Rosen' THEN 2
                      WHEN C = 'John' THEN 3 END ASC) AS rn
      FROM yourtable) yt
WHERE yt.rn = 1
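
Since the question also asks for other aggregates in the same GROUP BY, one possible variant is to skip the window function: map C to a rank with CASE, take MIN of that rank per group, and map the result back to the name. The sketch below is only an illustration under assumptions: the sample rows are made up, B is assumed to be numeric, COUNT and SUM stand in for the "lots of aggregate data", and the query is run through Python's built-in sqlite3 module purely so it can be executed end to end. The SQL uses only CASE, MIN and GROUP BY, so it should carry over to MSSQL or Hive, but I have not tested it there.

```python
import sqlite3

# Hypothetical sample rows (A, B, C); the table name follows the answer above.
rows = [
    (1, 10, "Michael"),
    (1, 20, "Rosen"),
    (2, 30, "John"),
    (2, 40, "Rosen"),
    (3, 50, "John"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (A INTEGER, B INTEGER, C TEXT)")
conn.executemany("INSERT INTO yourtable VALUES (?, ?, ?)", rows)

# Inner CASE: name -> rank (1 is the most preferred name).
# MIN picks the best rank in each group; outer CASE maps it back to the name.
QUERY = """
SELECT A,
       COUNT(*) AS n_rows,
       SUM(B)   AS sum_b,
       CASE MIN(CASE C WHEN 'Michael' THEN 1
                       WHEN 'Rosen'   THEN 2
                       WHEN 'John'    THEN 3 END)
            WHEN 1 THEN 'Michael'
            WHEN 2 THEN 'Rosen'
            WHEN 3 THEN 'John'
       END AS top_c
FROM yourtable
GROUP BY A
ORDER BY A
"""

result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

With these rows, A = 1 comes out with Michael, A = 2 with Rosen, and A = 3 with John, each alongside its row count and sum of B. The trade-off against the ROW_NUMBER version is that the ranking has to be written twice (name to rank and back), but everything stays in a single GROUP BY with no subquery.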