Is there a way to flatten values in a table using a rank function?

Posted 2025-01-28 16:18:15 · 1122 characters · 4 views · 0 comments

I have a table in SQL which looks like this -

| ITEM | ACTIVITY_ID | ACTIVITY_TYPE | ACTIVITY_DATE |
|---|---|---|---|
| Item 1 | Activity A | Call | Jan - 1 - 2022 |
| Item 1 | Activity B | Mail | Jan - 10 - 2022 |
| Item 1 | Activity C | Print | Jan - 12 - 2022 |

Similarly, there are thousands of Items and each can have one or more activities (up to 5).
I want to run a SQL query to flatten the data at the Item level for all the records and the output desired is something like this -

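| ITEM | ACTIVITY 1 | ACTIVITY 2 | ACTIVITY 3 | ACTIVITY 4 | ACTIVITY 5 | ACTIVITY 1 DATE | ACTIVITY 2 DATE | ACTIVITY 3 DATE | ACTIVITY 4 DATE | ACTIVITY 5 DATE |
|---|---|---|---|---|---|---|---|---|---|---|
| Item 1 | Call | Mail | Print | | | Jan - 1 - 2022 | Jan - 10 - 2022 | Jan - 12 - 2022 | | |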

The activity columns (1-5) are populated based on the ascending order of activity date.

Is there a way to achieve this? Also, I can import the raw data in Python and can do the transformation there as well if there's an elegant way to do it using Pandas.

Please note that the column values still remain as column values, so this is not the same as the unpivot operation in pandas. I saw the answers on unpivoting in pandas but was not able to solve this particular problem with them.

Thanks in advance,

Comments (2)

﹉夏雨初晴づ 2025-02-04 16:18:15

Pattern:

WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY item ORDER BY activity_date) rn
    FROM source_table
)
SELECT t1.item,
       t1.activity_type activity_1,
--     ...
       t5.activity_type activity_5,
       t1.activity_date date_1,
--     ...
       t5.activity_date date_5
FROM cte t1
LEFT JOIN cte t2 ON t1.item = t2.item AND t2.rn = 2
LEFT JOIN cte t3 ON t1.item = t3.item AND t3.rn = 3
LEFT JOIN cte t4 ON t1.item = t4.item AND t4.rn = 4
LEFT JOIN cte t5 ON t1.item = t5.item AND t5.rn = 5
WHERE t1.rn = 1

PS. The format of the data in the activity_date column seems to be non-standard, so a conversion to the DATE datatype may be required. Otherwise the values are ordered as strings rather than as dates.
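
Since the question also asks about pandas, here is a minimal sketch of the same row-number idea on the Python side. It is not part of the original answer. It assumes the raw rows are already loaded into a DataFrame with the column names from the question, that the dates really look like "Jan - 1 - 2022" with English month abbreviations, and that anything beyond the fifth activity can be dropped. The function name flatten_activities and the "Item 2" rows are made up for illustration.

```python
import pandas as pd

# Sample rows from the question, plus a hypothetical "Item 2" whose dates
# would sort in the wrong order if they were compared as plain strings.
raw = pd.DataFrame({
    "ITEM": ["Item 1", "Item 1", "Item 1", "Item 2", "Item 2"],
    "ACTIVITY_TYPE": ["Mail", "Call", "Print", "Print", "Call"],
    "ACTIVITY_DATE": ["Jan - 10 - 2022", "Jan - 1 - 2022", "Jan - 12 - 2022",
                      "Jan - 12 - 2022", "Jan - 2 - 2022"],
})


def flatten_activities(df, max_activities=5):
    """Return one row per ITEM with ACTIVITY_n / ACTIVITY_n_DATE columns."""
    out = df.copy()
    # Convert the non-standard text dates to real datetimes (assumed format).
    out["ACTIVITY_DATE"] = pd.to_datetime(
        out["ACTIVITY_DATE"].str.replace(" ", "", regex=False),
        format="%b-%d-%Y",
    )
    # Equivalent of ROW_NUMBER() OVER (PARTITION BY item ORDER BY activity_date).
    out = out.sort_values(["ITEM", "ACTIVITY_DATE"])
    out["rn"] = out.groupby("ITEM").cumcount() + 1
    out = out[out["rn"] <= max_activities]
    # Move the row number into the columns: one column per (field, rn) pair.
    wide = out.set_index(["ITEM", "rn"])[["ACTIVITY_TYPE", "ACTIVITY_DATE"]].unstack("rn")
    # Make sure all slots 1..max_activities exist, even if no item uses them.
    slots = pd.MultiIndex.from_product(
        [["ACTIVITY_TYPE", "ACTIVITY_DATE"], range(1, max_activities + 1)]
    )
    wide = wide.reindex(columns=slots)
    wide.columns = [
        f"ACTIVITY_{n}" if field == "ACTIVITY_TYPE" else f"ACTIVITY_{n}_DATE"
        for field, n in wide.columns
    ]
    return wide.reset_index()


result = flatten_activities(raw)
print(result)
```

Unused slots come back as missing values (NaN / NaT), which matches the empty ACTIVITY 4 and ACTIVITY 5 cells in the desired output.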

梦太阳 2025-02-04 16:18:15

You're looking for PIVOT, not UNPIVOT.

But in your case NPath works, too:

SELECT * 
FROM NPath
       ( ON (
               SELECT ITEM, ACTIVITY_TYPE, ACTIVITY_DATE
               FROM tab
             )
         PARTITION BY ITEM                   -- group by column
         ORDER BY ACTIVITY_DATE              -- order within list
         USING                                      
           MODE (NonOverlapping)             -- required syntax 
           Symbols (True AS T)               -- every row
           Pattern ('T*')                    --   is aggregated
           RESULT(First (item OF T) AS item  -- group by column
                 ,First (ACTIVITY_TYPE OF T) AS activity_1_type
                 ,NTH (ACTIVITY_TYPE,2 OF T) AS activity_2_type
                 ,NTH (ACTIVITY_TYPE,3 OF T) AS activity_3_type
                 ,NTH (ACTIVITY_TYPE,4 OF T) AS activity_4_type
                 ,NTH (ACTIVITY_TYPE,5 OF T) AS activity_5_type
                 ,First (ACTIVITY_DATE OF T) AS activity_1_date
                 ,NTH (ACTIVITY_DATE,2 OF T) AS activity_2_date
                 ,NTH (ACTIVITY_DATE,3 OF T) AS activity_3_date
                 ,NTH (ACTIVITY_DATE,4 OF T) AS activity_4_date
                 ,NTH (ACTIVITY_DATE,5 OF T) AS activity_5_date
                 ,Count(* OF T)
                 )
        )
;
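
NPath is, as far as I know, specific to Teradata. Where it is not available, the pivot mentioned above can usually be written as conditional aggregation over a row number. The sketch below is not from the original answer. It runs the query against an in-memory SQLite database purely so that it can be tried out, and it assumes a SQLite build with window functions (3.25 or later) and dates already stored in ISO format. The table name tab is taken from the answer; the sample rows follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (item TEXT, activity_type TEXT, activity_date TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?)",
    [
        ("Item 1", "Mail", "2022-01-10"),
        ("Item 1", "Call", "2022-01-01"),
        ("Item 1", "Print", "2022-01-12"),
    ],
)

# Number the activities per item by date, then pick slot n with
# MAX(CASE WHEN rn = n ...); slots that no row fills come back as NULL.
query = """
WITH ranked AS (
    SELECT item, activity_type, activity_date,
           ROW_NUMBER() OVER (PARTITION BY item ORDER BY activity_date) AS rn
    FROM tab
)
SELECT item,
       MAX(CASE WHEN rn = 1 THEN activity_type END) AS activity_1,
       MAX(CASE WHEN rn = 2 THEN activity_type END) AS activity_2,
       MAX(CASE WHEN rn = 3 THEN activity_type END) AS activity_3,
       MAX(CASE WHEN rn = 4 THEN activity_type END) AS activity_4,
       MAX(CASE WHEN rn = 5 THEN activity_type END) AS activity_5,
       MAX(CASE WHEN rn = 1 THEN activity_date END) AS activity_1_date,
       MAX(CASE WHEN rn = 2 THEN activity_date END) AS activity_2_date,
       MAX(CASE WHEN rn = 3 THEN activity_date END) AS activity_3_date,
       MAX(CASE WHEN rn = 4 THEN activity_date END) AS activity_4_date,
       MAX(CASE WHEN rn = 5 THEN activity_date END) AS activity_5_date
FROM ranked
GROUP BY item
ORDER BY item
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Compared with the five self-joins, this reads the ranked rows once and groups them, which may or may not matter depending on the engine and the data volume.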