Window function inside a pivot, is it possible?

Posted on 2024-11-20 00:02:28

I have this table

[image: contents of the emotions table]

I need to create a pivot that displays emotions as columns, with the average emotion_level grouped by user_id, user_date and emotion. For example, for user_id = 1, user_date = 2011-07-13 and emotion = 'Anger', the average emotion_level should be 4.0.

I created a pivot:

select USER_ID, user_date,  
AVG(case emotion when 'Anger' then convert(float, emotion_level) else 0 end)  as Anger,
AVG(case emotion when 'Sadness' then convert(float, emotion_level) else 0 end) as Sadness,
AVG(case emotion when 'Interest' then convert(float, emotion_level) else 0 end) as Interest
from emotions group by USER_ID, user_date;

This half-works: it calculates the average emotion_level across all emotions for a user and date, rather than the average per user, date and emotion.

For the first user, my result for emotion = 'Anger' is 2, but it should be 4.

[image: query result]

I guess I should use a window function (over (partition by user_id, user_date, emotion)), but I can't get the syntax to run.

Is it possible at all?

I'm using PostgreSQL 9 in prod, but the above example is written in SQL Server.

Comments (3)

爱格式化 2024-11-27 00:02:28
WITH    q (id, user_id, user_date, emotion, emotion_level) AS
        (
        VALUES
        (1, 1, '2011-07-13'::DATE, 'Anger', 3),
        (2, 1, '2011-07-13'::DATE, 'Anger', 5),
        (3, 1, '2011-07-13'::DATE, 'Sadness', 2),
        (4, 1, '2011-07-13'::DATE, 'Interest', 2),
        (5, 2, '2011-07-13'::DATE, 'Anger', 1),
        (6, 2, '2011-07-13'::DATE, 'Sadness', 4),
        (7, 2, '2011-07-13'::DATE, 'Sadness', 5),
        (8, 2, '2011-07-13'::DATE, 'Interest', 3),
        (9, 3, '2011-07-13'::DATE, 'Anger', 1),
        (10, 3, '2011-07-13'::DATE, 'Sadness', 3),
        (11, 3, '2011-07-13'::DATE, 'Interest', 4),
        (12, 3, '2011-07-13'::DATE, 'Interest', 5)
        ) 
SELECT  user_id, user_date,
        AVG(CASE emotion WHEN 'Anger' THEN emotion_level END)::numeric(3, 2) AS Anger,
        AVG(CASE emotion WHEN 'Sadness' THEN emotion_level END)::numeric(3, 2) AS Sadness,
        AVG(CASE emotion WHEN 'Interest' THEN emotion_level END)::numeric(3, 2) AS Interest
FROM    q
GROUP BY
        user_id, user_date
ORDER BY
        user_id, user_date

The problem was with the expressions you originally used, such as:

AVG(case emotion when 'Interest' then convert(float, emotion_level) else 0 end)

This averages over all records for a given user on a given date and treats non-Interest entries as 0, whereas they should be treated as NULL so that they do not contribute to the Interest average. A CASE expression without an ELSE branch returns NULL, and AVG ignores NULLs.
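
To see the difference concretely, here is a minimal sketch that runs both variants against the sample rows from the query above. It uses Python's built-in sqlite3 module rather than PostgreSQL or SQL Server, which is an assumption made only to keep the snippet self-contained; the helper name anger_avg is hypothetical. SQLite's AVG also skips NULLs, so the arithmetic is the same.

```python
import sqlite3

# Sample rows copied from the VALUES list in the query above.
ROWS = [
    (1, 1, "2011-07-13", "Anger", 3),
    (2, 1, "2011-07-13", "Anger", 5),
    (3, 1, "2011-07-13", "Sadness", 2),
    (4, 1, "2011-07-13", "Interest", 2),
    (5, 2, "2011-07-13", "Anger", 1),
    (6, 2, "2011-07-13", "Sadness", 4),
    (7, 2, "2011-07-13", "Sadness", 5),
    (8, 2, "2011-07-13", "Interest", 3),
    (9, 3, "2011-07-13", "Anger", 1),
    (10, 3, "2011-07-13", "Sadness", 3),
    (11, 3, "2011-07-13", "Interest", 4),
    (12, 3, "2011-07-13", "Interest", 5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emotions (id INTEGER, user_id INTEGER, user_date TEXT,"
    " emotion TEXT, emotion_level INTEGER)"
)
conn.executemany("INSERT INTO emotions VALUES (?, ?, ?, ?, ?)", ROWS)


def anger_avg(else_clause):
    """Return {user_id: average Anger level}.

    else_clause is "ELSE 0" (the question's version) or "" (no ELSE,
    so non-matching rows become NULL and are skipped by AVG).
    """
    sql = (
        "SELECT user_id, "
        "AVG(CASE emotion WHEN 'Anger' THEN emotion_level "
        + else_clause
        + " END) FROM emotions GROUP BY user_id, user_date ORDER BY user_id"
    )
    return dict(conn.execute(sql).fetchall())


with_zero = anger_avg("ELSE 0")  # (3 + 5 + 0 + 0) / 4 for user 1
with_null = anger_avg("")        # (3 + 5) / 2 for user 1
print(with_zero[1], with_null[1])  # prints: 2.0 4.0
```

With ELSE 0 the two non-Anger rows of user 1 are counted in the denominator, which is where the unexpected 2 comes from.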

ぃ双果 2024-11-27 00:02:28

I first changed the data type of emotion_level to float in the table and then performed the operation using PIVOT. If the data type is not changed, the averages are returned as int.

select user_id, user_date, Anger, Sadness, Interest 
from (select user_id,user_date,emotion, emotion_level 
from emotions) as emo
pivot(avg(emotion_level) 
for emotion in (Anger, Sadness, Interest)) as P;

The result of the above query is shown at https://i.sstatic.net/y6ML3.png

眼波传意 2024-11-27 00:02:28

I'd define a CTE, then join to it. Your main problem is probably that you're not grouping on emotion. Try this:

WITH average_emotion (user_id, date, emotion, average_level) as 
                     (SELECT user_id, user_date, emotion, AVG(convert(float, emotion_level))
                      FROM emotions
                      GROUP BY user_id, user_date, emotion)
-- One output row per user and date. b.average_level is listed in GROUP BY
-- because it is selected without an aggregate.
SELECT a.user_id, a.user_date, COALESCE(b.average_level, 0) as Anger
FROM emotions as a
LEFT JOIN average_emotion as b
ON b.user_id = a.user_id
AND b.date = a.user_date
AND b.emotion = 'Anger'
GROUP BY a.user_id, a.user_date, b.average_level

Then just add another left join every time you need a new column.

There are a number of ways to restructure this, mostly depending on the rest of your data model and the desired output.
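
One such restructuring, sketched under assumptions: aggregate per user, date and emotion in the CTE first, then pivot the pre-aggregated rows with MAX(CASE ...), so no extra join is needed per column. The snippet runs the SQL through Python's built-in sqlite3 module only to stay self-contained (the question targets PostgreSQL, and convert(float, ...) is dropped because SQLite's AVG already returns a floating-point value). The sample rows are the ones from the first answer, and the lowercase column aliases are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emotions (user_id INTEGER, user_date TEXT,"
    " emotion TEXT, emotion_level INTEGER)"
)
conn.executemany(
    "INSERT INTO emotions VALUES (?, ?, ?, ?)",
    [
        (1, "2011-07-13", "Anger", 3),
        (1, "2011-07-13", "Anger", 5),
        (1, "2011-07-13", "Sadness", 2),
        (1, "2011-07-13", "Interest", 2),
        (2, "2011-07-13", "Anger", 1),
        (2, "2011-07-13", "Sadness", 4),
        (2, "2011-07-13", "Sadness", 5),
        (2, "2011-07-13", "Interest", 3),
        (3, "2011-07-13", "Anger", 1),
        (3, "2011-07-13", "Sadness", 3),
        (3, "2011-07-13", "Interest", 4),
        (3, "2011-07-13", "Interest", 5),
    ],
)

SQL = """
WITH average_emotion AS (
    -- Step 1: one row per user, date and emotion with its average level.
    SELECT user_id, user_date, emotion, AVG(emotion_level) AS average_level
    FROM emotions
    GROUP BY user_id, user_date, emotion
)
-- Step 2: turn emotions into columns. Each group holds at most one
-- non-NULL value per column, so MAX simply picks it.
SELECT user_id, user_date,
       MAX(CASE emotion WHEN 'Anger' THEN average_level END) AS anger,
       MAX(CASE emotion WHEN 'Sadness' THEN average_level END) AS sadness,
       MAX(CASE emotion WHEN 'Interest' THEN average_level END) AS interest
FROM average_emotion
GROUP BY user_id, user_date
ORDER BY user_id, user_date
"""

rows = conn.execute(SQL).fetchall()
for row in rows:
    print(row)
```

An emotion that never occurs for a user and date comes out as NULL here; wrap the column in COALESCE(..., 0) if 0 is preferred, as in the query above.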
