返回介绍

solution / 3000-3099 / 3060.User Activities within Time Bounds / README

发布于 2024-06-17 01:02:57 字数 4555 浏览 0 评论 0 收藏 0

3060. User Activities within Time Bounds

English Version

题目描述

Table: Sessions

+---------------+----------+
| Column Name   | Type   |
+---------------+----------+
| user_id     | int    |
| session_start | datetime |
| session_end   | datetime |
| session_id  | int    |
| session_type  | enum   |
+---------------+----------+
session_id is column of unique values for this table.
session_type is an ENUM (category) type of (Viewer, Streamer).
This table contains user id, session start, session end, session id and session type.

Write a solution to find the the users who have had at least one consecutive session of the same type (either 'Viewer' or 'Streamer') with a maximum gap of 12 hours between sessions.

Return _the result table ordered by _user_id_ in ascending order._

The result format is in the following example.

 

Example:

Input: 
Sessions table:
+---------+---------------------+---------------------+------------+--------------+
| user_id | session_start     | session_end     | session_id | session_type | 
+---------+---------------------+---------------------+------------+--------------+
| 101   | 2023-11-01 08:00:00 | 2023-11-01 09:00:00 | 1      | Viewer     |  
| 101   | 2023-11-01 10:00:00 | 2023-11-01 11:00:00 | 2      | Streamer   |   
| 102   | 2023-11-01 13:00:00 | 2023-11-01 14:00:00 | 3      | Viewer     | 
| 102   | 2023-11-01 15:00:00 | 2023-11-01 16:00:00 | 4      | Viewer     | 
| 101   | 2023-11-02 09:00:00 | 2023-11-02 10:00:00 | 5      | Viewer     | 
| 102   | 2023-11-02 12:00:00 | 2023-11-02 13:00:00 | 6      | Streamer   | 
| 101   | 2023-11-02 13:00:00 | 2023-11-02 14:00:00 | 7      | Streamer   | 
| 102   | 2023-11-02 16:00:00 | 2023-11-02 17:00:00 | 8      | Viewer     | 
| 103   | 2023-11-01 08:00:00 | 2023-11-01 09:00:00 | 9      | Viewer     | 
| 103   | 2023-11-02 20:00:00 | 2023-11-02 23:00:00 | 10     | Viewer     | 
| 103   | 2023-11-03 09:00:00 | 2023-11-03 10:00:00 | 11     | Viewer     | 
+---------+---------------------+---------------------+------------+--------------+
Output: 
+---------+
| user_id |
+---------+
| 102   |
| 103   |
+---------+
Explanation:
- User ID 101 will not be included in the final output as they do not have any consecutive sessions of the same session type.
- User ID 102 will be included in the final output as they had two viewer sessions with session IDs 3 and 4, respectively, and the time gap between them was less than 12 hours.
- User ID 103 participated in two viewer sessions with a gap of less than 12 hours between them, identified by session IDs 10 and 11. Therefore, user 103 will be included in the final output.
Output table is ordered by user_id in increasing order.

解法

方法一:窗口函数 + 时间函数

我们先使用 LAG 窗口函数,找到每个用户相同类型的会话的上一个会话的结束时间,记为 prev_session_end。然后我们使用 TIMESTAMPDIFF 函数计算当前会话的开始时间与上一个会话的结束时间的时间差,如果时间差小于等于 12 小时,那么这个用户就符合题目要求。

# Write your MySQL query statement below
WITH
  T AS (
    SELECT
      user_id,
      session_start,
      LAG(session_end) OVER (
        PARTITION BY user_id, session_type
        ORDER BY session_end
      ) AS prev_session_end
    FROM Sessions
  )
SELECT DISTINCT
  user_id
FROM T
WHERE TIMESTAMPDIFF(HOUR, prev_session_end, session_start) <= 12;
import pandas as pd


def user_activities(sessions: pd.DataFrame) -> pd.DataFrame:
  sessions = sessions.sort_values(by=["user_id", "session_start"])
  sessions["prev_session_end"] = sessions.groupby(["user_id", "session_type"])[
    "session_end"
  ].shift(1)
  sessions_filtered = sessions[
    sessions["session_start"] - sessions["prev_session_end"]
    <= pd.Timedelta(hours=12)
  ]
  return pd.DataFrame({"user_id": sessions_filtered["user_id"].unique()})

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。
列表为空,暂无数据
    我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
    原文