Published 2024-06-17 01:02:57

3056. Snaps Analysis

Description

Table: Activities

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| activity_id   | int     |
| user_id       | int     |
| activity_type | enum    |
| time_spent    | decimal |
+---------------+---------+
activity_id is the column of unique values for this table.
activity_type is an ENUM (category) type of ('send', 'open').
This table contains activity id, user id, activity type and time spent.

Table: Age

+-------------+------+
| Column Name | Type |
+-------------+------+
| user_id     | int  |
| age_bucket  | enum |
+-------------+------+
user_id is the column of unique values for this table.
age_bucket is an ENUM (category) type of ('21-25', '26-30', '31-35'). 
This table contains user id and age group.

Write a solution to calculate the percentage of the total time spent on sending and opening snaps for each age group. Percentages should be rounded to 2 decimal places.

Return the result table in any order.

The result format is in the following example.

Example 1:

Input: 
Activities table:
+-------------+---------+---------------+------------+
| activity_id | user_id | activity_type | time_spent |
+-------------+---------+---------------+------------+
| 7274        | 123     | open          | 4.50       |
| 2425        | 123     | send          | 3.50       |
| 1413        | 456     | send          | 5.67       |
| 2536        | 456     | open          | 3.00       |
| 8564        | 456     | send          | 8.24       |
| 5235        | 789     | send          | 6.24       |
| 4251        | 123     | open          | 1.25       |
| 1435        | 789     | open          | 5.25       |
+-------------+---------+---------------+------------+
Age table:
+---------+------------+
| user_id | age_bucket |
+---------+------------+
| 123     | 31-35      |
| 789     | 21-25      |
| 456     | 26-30      |
+---------+------------+
Output: 
+------------+-----------+-----------+
| age_bucket | send_perc | open_perc |
+------------+-----------+-----------+
| 31-35      | 37.84     | 62.16     |
| 26-30      | 82.26     | 17.74     |
| 21-25      | 54.31     | 45.69     |
+------------+-----------+-----------+
Explanation: 
For age group 31-35:
  - There is only one user belonging to this group with the user ID 123.
  - The total time spent on sending snaps by this user is 3.50, and the time spent on opening snaps is 4.50 + 1.25 = 5.75.
  - The overall time spent by this user is 3.50 + 5.75 = 9.25.
  - Therefore, the sending snap percentage will be (3.50 / 9.25) * 100 = 37.84, and the opening snap percentage will be (5.75 / 9.25) * 100 = 62.16.
For age group 26-30: 
  - There is only one user belonging to this group with the user ID 456. 
  - The total time spent on sending snaps by this user is 5.67 + 8.24 = 13.91, and the time spent on opening snaps is 3.00. 
  - The overall time spent by this user is 13.91 + 3.00 = 16.91. 
  - Therefore, the sending snap percentage will be (13.91 / 16.91) * 100 = 82.26, and the opening snap percentage will be (3.00 / 16.91) * 100 = 17.74.
For age group 21-25: 
  - There is only one user belonging to this group with the user ID 789. 
  - The total time spent on sending snaps by this user is 6.24, and the time spent on opening snaps is 5.25. 
  - The overall time spent by this user is 6.24 + 5.25 = 11.49. 
  - Therefore, the sending snap percentage will be (6.24 / 11.49) * 100 = 54.31, and the opening snap percentage will be (5.25 / 11.49) * 100 = 45.69.
All percentages in the output table are rounded to two decimal places.
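
The arithmetic in the explanation can be re-checked with a short standard-library script. This is only a sketch for verifying the example, not the reference solution: the function name snap_percentages and the variable example_result are made up for illustration, the rows are copied from the Example 1 input, and half-up rounding on Decimal values is an assumption about how the expected output was rounded. It also assumes every age bucket has a non-zero total time.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP

# Rows copied from the Example 1 input tables.
activities = [
    (7274, 123, "open", "4.50"),
    (2425, 123, "send", "3.50"),
    (1413, 456, "send", "5.67"),
    (2536, 456, "open", "3.00"),
    (8564, 456, "send", "8.24"),
    (5235, 789, "send", "6.24"),
    (4251, 123, "open", "1.25"),
    (1435, 789, "open", "5.25"),
]
age = {123: "31-35", 789: "21-25", 456: "26-30"}


def snap_percentages(activities, age):
    """Return {age_bucket: (send_perc, open_perc)} (hypothetical helper)."""
    # Sum time_spent per age bucket and activity type.
    totals = defaultdict(lambda: {"send": Decimal(0), "open": Decimal(0)})
    for _activity_id, user_id, activity_type, time_spent in activities:
        bucket = age.get(user_id)
        if bucket is None:
            # Users without an age bucket are skipped, like an inner join.
            continue
        totals[bucket][activity_type] += Decimal(time_spent)

    result = {}
    for bucket, spent in totals.items():
        total = spent["send"] + spent["open"]  # assumed to be non-zero
        result[bucket] = tuple(
            (100 * spent[kind] / total).quantize(
                Decimal("0.01"), rounding=ROUND_HALF_UP
            )
            for kind in ("send", "open")
        )
    return result


example_result = snap_percentages(activities, age)
for bucket, (send_perc, open_perc) in example_result.items():
    print(bucket, send_perc, open_perc)
```

Decimal is used instead of float because time_spent is a decimal column, so the sums such as 5.67 + 8.24 = 13.91 stay exact.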

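Solutions

Solution 1: Equi-Join + Grouping and Summing

We can equi-join the Activities table with the Age table on user_id, group the joined rows by age_bucket, and then compute the percentage of time spent sending and opening snaps for each age group.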
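
Written as a formula, the per-group computation looks roughly as follows. The notation is introduced here for illustration only: A_b is the set of joined activity rows whose user falls in age bucket b, t_i is the time_spent of row i, and k_i is its activity_type.

```latex
\[
\text{send\_perc}(b) = \operatorname{round}\!\left(
  100 \cdot \frac{\sum_{i \in A_b,\; k_i = \text{send}} t_i}{\sum_{i \in A_b} t_i},\; 2
\right),
\qquad
\text{open\_perc}(b) = \operatorname{round}\!\left(
  100 \cdot \frac{\sum_{i \in A_b,\; k_i = \text{open}} t_i}{\sum_{i \in A_b} t_i},\; 2
\right)
\]
```

Both ratios share the same denominator, the total time of the whole age group, so each percentage is a ratio of sums rather than an average of per-user percentages. Because every row is either a send or an open, the two values add up to 100 before rounding.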

# Write your MySQL query statement below
SELECT
  age_bucket,
  -- Share of the group's total time that was spent sending snaps
  ROUND(100 * SUM(IF(activity_type = 'send', time_spent, 0)) / SUM(time_spent), 2) AS send_perc,
  -- Share of the group's total time that was spent opening snaps
  ROUND(100 * SUM(IF(activity_type = 'open', time_spent, 0)) / SUM(time_spent), 2) AS open_perc
FROM
  Activities
  JOIN Age USING (user_id)
GROUP BY age_bucket;

# Python 3 (pandas) version of the same approach
import pandas as pd


def snap_analysis(activities: pd.DataFrame, age: pd.DataFrame) -> pd.DataFrame:
    # Equi-join on user_id so every activity row carries its age bucket.
    merged_df = pd.merge(activities, age, on="user_id")
    # Total time per (age_bucket, activity_type) pair.
    total_time_per_age_activity = (
        merged_df.groupby(["age_bucket", "activity_type"])["time_spent"]
        .sum()
        .reset_index()
    )
    # One row per age bucket, with "send" and "open" as columns.
    pivot_df = total_time_per_age_activity.pivot(
        index="age_bucket", columns="activity_type", values="time_spent"
    ).reset_index()
    # A bucket with no rows of one type gets NaN from the pivot; treat it as 0.
    pivot_df = pivot_df.fillna(0)
    # Shared denominator: total time spent by the age group.
    total = pivot_df["send"] + pivot_df["open"]
    pivot_df["send_perc"] = (100 * pivot_df["send"] / total).round(2)
    pivot_df["open_perc"] = (100 * pivot_df["open"] / total).round(2)
    return pivot_df[["age_bucket", "send_perc", "open_perc"]]
