How to sort hierarchical data in Palantir?

Posted 2025-02-03 05:57:43


Let's say I have flight data (from Foundry Academy).

Starting dataset:

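| Date | flight_id | origin_state | carrier_name |
| ---- | --------- | ------------ | ------------ |
| jan  | 000000001 | California   | delta air    |
| jan  | 000000002 | Alabama      | delta air    |
| jan  | 000000003 | California   | southwest    |
| feb  | 000000004 | California   | southwest    |
| ...  | ...       | ...          | ...          |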

I'm aggregating the data monthly by state and by carrier. The header of my aggregated data looks like this:

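| origin state | carrier name | jan | feb | ... |
| ------------ | ------------ | --- | --- | --- |
| Alabama      | delta air    | 1   | 0   | ... |
| California   | delta air    | 1   | 0   | ... |
| California   | southwest    | 1   | 1   | ... |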

I need to get subtotals for each state;
I need to sort by the number of flights, highest first;
and I want the rows ordered by state first, then by carrier within each state.

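Desired output:

| origin state | carrier name | jan | feb | ... |
| ------------ | ------------ | --- | --- | --- |
| California   | null         | 2   | 1   | ... |
| California   | delta air    | 1   | 0   | ... |
| California   | southwest    | 1   | 1   | ... |
| Alabama      | null         | 1   | 0   | ... |
| Alabama      | delta air    | 1   | 0   | ... |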

PIVOT: doesn't provide subtotals for categories.

EXPRESSION: doesn't offer a way to split the date column into separate columns.


Comments (1)

没有伤那来痛 2025-02-10 05:57:43


I solved it in Contour. It's not the prettiest solution, but it works.

I created two paths from the same dataset:

| Date | flight_id | origin_state | carrier_name |
| ---- | --------- | ------------ | ------------ |
| ...  | ...       | ...          | ...          |

The first path calculates the full aggregation. I added a pivot table and switched to pivoted data:

Switch to pivoted data: using column "date", 
grouped by "origin_state" and "carrier_name",
aggregated by Count

The second path produces the subtotals:

Switch to pivoted data: using column "date", 
grouped by "origin_state",
aggregated by Count

Afterwards I added an empty column "carrier_name" to the second path's dataset and made a union of both datasets:

Add rows that appear in "second_path" by column name

After that I added an additional column with an expression:

Add new column "order" from max("Jan") OVER (
PARTITION BY "origin_state" )

Then I sorted the resulting dataset:

Sort dataset by "order" descending, then by "Jan" descending
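
For readers who want to check the logic outside Contour, here is a minimal pandas sketch of the same steps (pivot twice, union, add an "order" column, sort). It is not the Contour implementation; the toy data, the helper `pivot_counts`, and the `is_subtotal` flag are assumptions made for illustration. The extra sort keys are an addition of mine: with only "order" and "jan", a subtotal row can tie with a detail row (Alabama in this sample), so the sketch breaks ties explicitly.

```python
import pandas as pd

# Toy data mirroring the four sample rows from the question.
flights = pd.DataFrame({
    "Date": ["jan", "jan", "jan", "feb"],
    "flight_id": ["000000001", "000000002", "000000003", "000000004"],
    "origin_state": ["California", "Alabama", "California", "California"],
    "carrier_name": ["delta air", "delta air", "southwest", "southwest"],
})
months = ["jan", "feb"]  # desired column order for the pivoted months


def pivot_counts(df, keys):
    """Count flights per month (one column per month), grouped by `keys`."""
    wide = df.groupby(keys + ["Date"]).size().unstack("Date", fill_value=0)
    wide = wide.reindex(columns=months, fill_value=0)
    wide.columns.name = None
    return wide.reset_index()


# First path: full aggregation by state and carrier.
detail = pivot_counts(flights, ["origin_state", "carrier_name"])

# Second path: subtotals per state, with an empty carrier_name column.
subtotal = pivot_counts(flights, ["origin_state"])
subtotal["carrier_name"] = None

# Union by column name.
combined = pd.concat([detail, subtotal], ignore_index=True)

# "order" = max("jan") OVER (PARTITION BY "origin_state").
combined["order"] = combined.groupby("origin_state")["jan"].transform("max")
# Assumption: a helper flag so subtotal rows always lead their group.
combined["is_subtotal"] = combined["carrier_name"].isna()

result = (
    combined.sort_values(
        ["order", "origin_state", "is_subtotal", "jan", "carrier_name"],
        ascending=[False, True, False, False, True],
    )
    .drop(columns=["order", "is_subtotal"])  # helper columns are not part of the output
    .reset_index(drop=True)
)
print(result)
```

Dropping the helper columns at the end is one way to avoid carrying the extra "order" column into the final table.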

I got the result I wanted, but it has an additional column, and now I'd like to change the row formatting of the subtotals.

Other approaches are welcome, as my real data has more hierarchical levels.
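
One possible direction for deeper hierarchies, sketched in pandas rather than Contour: compute the pivot once per prefix of the hierarchy and union the results, similar in spirit to a SQL ROLLUP without the grand total. The function name `rollup_counts`, the `depth` column, and the toy data are hypothetical. For simplicity this sketch sorts alphabetically by level with subtotals first; sorting by flight counts would need an ordering key per level, as in the "order" column above.

```python
import pandas as pd


def rollup_counts(df, levels, pivot_col):
    """Pivot row counts at every prefix of `levels` and stack the results.

    Rows produced for a shorter prefix carry None in the deeper level
    columns, which marks them as subtotal rows.
    """
    frames = []
    for depth in range(1, len(levels) + 1):
        keys = levels[:depth]
        wide = df.groupby(keys + [pivot_col]).size().unstack(pivot_col, fill_value=0)
        wide.columns.name = None
        wide = wide.reset_index()
        for missing in levels[depth:]:
            wide[missing] = None  # empty column for levels below this subtotal
        wide["depth"] = depth  # hypothetical helper: 1 = top-level subtotal
        frames.append(wide)
    return pd.concat(frames, ignore_index=True)


# Toy data mirroring the sample rows from the question.
flights = pd.DataFrame({
    "Date": ["jan", "jan", "jan", "feb"],
    "origin_state": ["California", "Alabama", "California", "California"],
    "carrier_name": ["delta air", "delta air", "southwest", "southwest"],
})

levels = ["origin_state", "carrier_name"]  # append further levels here
rolled = (
    rollup_counts(flights, levels, "Date")
    .sort_values(levels, na_position="first")  # subtotal rows lead each group
    .reset_index(drop=True)
)
print(rolled)
```

The `depth` column could also serve as a handle for formatting subtotal rows differently from detail rows.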
