返回介绍

高级条件语句

发布于 2024-08-17 15:49:00 字数 25033 浏览 0 评论 0 收藏 0

筛选限定昵称成就值活跃日期的用户(较难)

描述

现有用户信息表 user_infouid 用户 ID, nick_name 昵称, achievement 成就值, level 等级, job 职业方向, register_time 注册时间):

iduidnick_nameachievementleveljobregister_time
11001牛客 1 号10002算法2020-01-01 10:00:00
21002牛客 2 号12003算法2020-01-01 10:00:00
31003进击的 3 号22005算法2020-01-01 10:00:00
41004牛客 4 号25006算法2020-01-01 10:00:00
51005牛客 5 号30007C++2020-01-01 10:00:00

试卷作答记录表 exam_recorduid 用户 ID, exam_id 试卷 ID, start_time 开始作答时间, submit_time 交卷时间, score 得分):

iduidexam_idstart_timesubmit_timescore
1100190012020-01-02 09:01:012020-01-02 09:21:5980
3100190022021-02-02 19:01:012021-02-02 19:30:0187
2100190012021-05-02 10:01:01(NULL)(NULL)
4100190012021-06-02 19:01:012021-06-02 19:32:0020
6100190022021-09-01 12:01:01(NULL)(NULL)
5100190022021-09-05 19:01:012021-09-05 19:40:0189
11100290012020-01-01 12:01:012020-01-01 12:31:0181
12100290022020-02-01 12:01:012020-02-01 12:31:0182
13100290022020-02-02 12:11:012020-02-02 12:31:0183
7100290022021-05-05 18:01:012021-05-05 18:59:0290
16100290012021-09-06 12:01:012021-09-06 12:21:0180
17100290012021-09-06 12:01:01(NULL)(NULL)
18100290012021-09-07 12:01:01(NULL)(NULL)
8100390032021-02-06 12:01:01(NULL)(NULL)
9100390012021-09-07 10:01:012021-09-07 10:31:0189
10100490022021-08-06 12:01:01(NULL)(NULL)
14100590012021-02-01 11:01:012021-02-01 11:31:0184
15100690012021-02-01 11:01:012021-02-01 11:31:0184

题目练习记录表 practice_recorduid 用户 ID, question_id 题目 ID, submit_time 提交时间, score 得分):

iduidquestion_idsubmit_timescore
1100180012021-08-02 11:41:0160
2100280012021-09-02 19:30:0150
3100280012021-09-02 19:20:0170
4100280022021-09-02 19:38:0170
5100380022021-09-01 19:38:0180

请找到昵称以『牛客』开头『号』结尾、成就值在 1200~2500 之间,且最近一次活跃(答题或作答试卷)在 2021 年 9 月的用户信息。

由示例数据结果输出如下:

uidnick_nameachievement
1002牛客 2 号1200

解释 :昵称以『牛客』开头『号』结尾且成就值在 1200~2500 之间的有 1002、1004;

1002 最近一次试卷区活跃为 2021 年 9 月,最近一次题目区活跃为 2021 年 9 月;1004 最近一次试卷区活跃为 2021 年 8 月,题目区未活跃。

因此最终满足条件的只有 1002。

思路

先根据条件列出主要查询语句

昵称以『牛客』开头『号』结尾: nick_name LIKE "牛客%号"

成就值在 1200~2500 之间: achievement BETWEEN 1200 AND 2500

第三个条件因为限定了为 9 月,所以直接写就行: ( date_format( record.submit_time, '%Y%m' )= 202109 OR date_format( pr.submit_time, '%Y%m' )= 202109 )

答案

SELECT DISTINCT u_info.uid,
                u_info.nick_name,
                u_info.achievement
FROM user_info u_info
LEFT JOIN exam_record record ON record.uid = u_info.uid
LEFT JOIN practice_record pr ON u_info.uid = pr.uid
WHERE u_info.nick_name LIKE "牛客%号"
  AND u_info.achievement BETWEEN 1200
  AND 2500
  AND (date_format(record.submit_time, '%Y%m')= 202109
       OR date_format(pr.submit_time, '%Y%m')= 202109)
GROUP BY u_info.uid

筛选昵称规则和试卷规则的作答记录(较难)

描述

现有用户信息表 user_infouid 用户 ID, nick_name 昵称, achievement 成就值, level 等级, job 职业方向, register_time 注册时间):

iduidnick_nameachievementleveljobregister_time
11001牛客 1 号19002算法2020-01-01 10:00:00
21002牛客 2 号12003算法2020-01-01 10:00:00
31003牛客 3 号 ♂22005算法2020-01-01 10:00:00
41004牛客 4 号25006算法2020-01-01 10:00:00
51005牛客 555 号20007C++2020-01-01 10:00:00
6100666666630006C++2020-01-01 10:00:00

试卷信息表 examination_infoexam_id 试卷 ID, tag 试卷类别, difficulty 试卷难度, duration 考试时长, release_time 发布时间):

idexam_idtagdifficultydurationrelease_time
19001C++hard602020-01-01 10:00:00
29002c#hard802020-01-01 10:00:00
39003SQLmedium702020-01-01 10:00:00

试卷作答记录表 exam_recorduid 用户 ID, exam_id 试卷 ID, start_time 开始作答时间, submit_time 交卷时间, score 得分):

iduidexam_idstart_timesubmit_timescore
1100190012020-01-02 09:01:012020-01-02 09:21:5980
2100190012021-05-02 10:01:01(NULL)(NULL)
4100190012021-06-02 19:01:012021-06-02 19:32:0020
3100190022021-02-02 19:01:012021-02-02 19:30:0187
5100190022021-09-05 19:01:012021-09-05 19:40:0189
6100190022021-09-01 12:01:01(NULL)(NULL)
11100290012020-01-01 12:01:012020-01-01 12:31:0181
16100290012021-09-06 12:01:012021-09-06 12:21:0180
17100290012021-09-06 12:01:01(NULL)(NULL)
18100290012021-09-07 12:01:01(NULL)(NULL)
7100290022021-05-05 18:01:012021-05-05 18:59:0290
12100290022020-02-01 12:01:012020-02-01 12:31:0182
13100290022020-02-02 12:11:012020-02-02 12:31:0183
9100390012021-09-07 10:01:012021-09-07 10:31:0189
8100390032021-02-06 12:01:01(NULL)(NULL)
10100490022021-08-06 12:01:01(NULL)(NULL)
14100590012021-02-01 11:01:012021-02-01 11:31:0184
15100690012021-02-01 11:01:012021-09-01 11:31:0184

找到昵称以"牛客"+纯数字+"号"或者纯数字组成的用户对于字母 c 开头的试卷类别(如 C,C++,c#等)的已完成的试卷 ID 和平均得分,按用户 ID、平均分升序排序。由示例数据结果输出如下:

uidexam_idavg_score
1002900181
1002900285
1005900184
1006900184

解释:昵称满足条件的用户有 1002、1004、1005、1006;

c 开头的试卷有 9001、9002;

满足上述条件的作答记录中,1002 完成 9001 的得分有 81、80,平均分为 81(80.5 取整四舍五入得 81);

1002 完成 9002 的得分有 90、82、83,平均分为 85;

思路

还是老样子,既然给出了条件,就先把各个条件先写出来

找到昵称以"牛客"+纯数字+"号"或者纯数字组成的用户: 我最开始是这么写的: nick_name LIKE '牛客%号' OR nick_name REGEXP '^[0-9]+$' ,如果表中有个 “牛客 H 号” ,那也能通过。

所以这里还得用正则: nick_name LIKE '^牛客[0-9]+号'

对于字母 c 开头的试卷类别: e_info.tag LIKE 'c%' 或者 tag regexp '^c|^C' 第一个也能匹配到大写 C

答案

SELECT UID,
       exam_id,
       ROUND(AVG(score), 0) avg_score
FROM exam_record
WHERE UID IN
    (SELECT UID
     FROM user_info
     WHERE nick_name RLIKE "^牛客[0-9]+号 $"
       OR nick_name RLIKE "^[0-9]+$")
  AND exam_id IN
    (SELECT exam_id
     FROM examination_info
     WHERE tag RLIKE "^[cC]")
  AND score IS NOT NULL
GROUP BY UID,exam_id
ORDER BY UID,avg_score;

根据指定记录是否存在输出不同情况(困难)

描述

现有用户信息表 user_infouid 用户 ID, nick_name 昵称, achievement 成就值, level 等级, job 职业方向, register_time 注册时间):

iduidnick_nameachievementleveljobregister_time
11001牛客 1 号190算法2020-01-01 10:00:00
21002牛客 2 号12003算法2020-01-01 10:00:00
31003进击的 3 号220算法2020-01-01 10:00:00
41004牛客 4 号250算法2020-01-01 10:00:00
51005牛客 555 号20007C++2020-01-01 10:00:00
6100666666630006C++2020-01-01 10:00:00

试卷作答记录表 exam_recorduid 用户 ID, exam_id 试卷 ID, start_time 开始作答时间, submit_time 交卷时间, score 得分):

iduidexam_idstart_timesubmit_timescore
1100190012020-01-02 09:01:012020-01-02 09:21:5980
2100190012021-05-02 10:01:01(NULL)(NULL)
3100190022021-02-02 19:01:012021-02-02 19:30:0187
4100190022021-09-01 12:01:01(NULL)(NULL)
5100190032021-09-02 12:01:01(NULL)(NULL)
6100190042021-09-03 12:01:01(NULL)(NULL)
7100290012020-01-01 12:01:012020-01-01 12:31:0199
8100290032020-02-01 12:01:012020-02-01 12:31:0182
9100290032020-02-02 12:11:01(NULL)(NULL)
10100290022021-05-05 18:01:01(NULL)(NULL)
11100290012021-09-06 12:01:01(NULL)(NULL)
12100390032021-02-06 12:01:01(NULL)(NULL)
13100390012021-09-07 10:01:012021-09-07 10:31:0189

请你筛选表中的数据,当有任意一个 0 级用户未完成试卷数大于 2 时,输出每个 0 级用户的试卷未完成数和未完成率(保留 3 位小数);若不存在这样的用户,则输出所有有作答记录的用户的这两个指标。结果按未完成率升序排序。

由示例数据结果输出如下:

uidincomplete_cntincomplete_rate
100400.000
100310.500
100140.667

解释 :0 级用户有 1001、1003、1004;他们作答试卷数和未完成数分别为:6:4、2:1、0:0;

存在 1001 这个 0 级用户未完成试卷数大于 2,因此输出这三个用户的未完成数和未完成率(1004 未作答过试卷,未完成率默认填 0,保留 3 位小数后是 0.000);

结果按照未完成率升序排序。

附:如果 1001 不满足『未完成试卷数大于 2』,则需要输出 1001、1002、1003 的这两个指标,因为试卷作答记录表里只有这三个用户的作答记录。

思路

先把可能满足条件 “0 级用户未完成试卷数大于 2” 的 SQL 写出来

SELECT ui.uid UID
FROM user_info ui
LEFT JOIN exam_record er ON ui.uid = er.uid
WHERE ui.uid IN
    (SELECT ui.uid
     FROM user_info ui
     LEFT JOIN exam_record er ON ui.uid = er.uid
     WHERE er.submit_time IS NULL
       AND ui.LEVEL = 0 )
GROUP BY ui.uid
HAVING sum(IF(er.submit_time IS NULL, 1, 0)) > 2

然后再分别写出两种情况的 SQL 查询语句:

情况 1. 查询存在条件要求的 0 级用户的试卷未完成率

SELECT
    tmp1.uid uid,
    sum(
    IF
    ( er.submit_time IS NULL AND er.start_time IS NOT NULL, 1, 0 )) incomplete_cnt,
    round(
        sum(
        IF
        ( er.submit_time IS NULL AND er.start_time IS NOT NULL, 1, 0 ))/ count( tmp1.uid ),
        3
    ) incomplete_rate
FROM
    (
    SELECT DISTINCT
        ui.uid
    FROM
        user_info ui
        LEFT JOIN exam_record er ON ui.uid = er.uid
    WHERE
        er.submit_time IS NULL
        AND ui.LEVEL = 0
    ) tmp1
    LEFT JOIN exam_record er ON tmp1.uid = er.uid
GROUP BY
    tmp1.uid
ORDER BY
    incomplete_rate

情况 2. 查询不存在条件要求时所有有作答记录的 yong 用户的试卷未完成率

SELECT
    ui.uid uid,
    sum( CASE WHEN er.submit_time IS NULL AND er.start_time IS NOT NULL THEN 1 ELSE 0 END ) incomplete_cnt,
    round(
        sum(
        IF
        ( er.submit_time IS NULL AND er.start_time IS NOT NULL, 1, 0 ))/ count( ui.uid ),
        3
    ) incomplete_rate
FROM
    user_info ui
    JOIN exam_record er ON ui.uid = er.uid
GROUP BY
    ui.uid
ORDER BY
    incomplete_rate

拼在一起,就是答案

WITH host_user AS
  (SELECT ui.uid UID
   FROM user_info ui
   LEFT JOIN exam_record er ON ui.uid = er.uid
   WHERE ui.uid IN
       (SELECT ui.uid
        FROM user_info ui
        LEFT JOIN exam_record er ON ui.uid = er.uid
        WHERE er.submit_time IS NULL
          AND ui.LEVEL = 0 )
   GROUP BY ui.uid
   HAVING sum(IF (er.submit_time IS NULL, 1, 0))> 2),
     tt1 AS
  (SELECT tmp1.uid UID,
                   sum(IF (er.submit_time IS NULL
                           AND er.start_time IS NOT NULL, 1, 0)) incomplete_cnt,
                   round(sum(IF (er.submit_time IS NULL
                                 AND er.start_time IS NOT NULL, 1, 0))/ count(tmp1.uid), 3) incomplete_rate
   FROM
     (SELECT DISTINCT ui.uid
      FROM user_info ui
      LEFT JOIN exam_record er ON ui.uid = er.uid
      WHERE er.submit_time IS NULL
        AND ui.LEVEL = 0 ) tmp1
   LEFT JOIN exam_record er ON tmp1.uid = er.uid
   GROUP BY tmp1.uid
   ORDER BY incomplete_rate),
     tt2 AS
  (SELECT ui.uid UID,
                 sum(CASE
                         WHEN er.submit_time IS NULL
                              AND er.start_time IS NOT NULL THEN 1
                         ELSE 0
                     END) incomplete_cnt,
                 round(sum(IF (er.submit_time IS NULL
                               AND er.start_time IS NOT NULL, 1, 0))/ count(ui.uid), 3) incomplete_rate
   FROM user_info ui
   JOIN exam_record er ON ui.uid = er.uid
   GROUP BY ui.uid
   ORDER BY incomplete_rate)
  (SELECT tt1.*
   FROM tt1
   LEFT JOIN
     (SELECT UID
      FROM host_user) t1 ON 1 = 1
   WHERE t1.uid IS NOT NULL )
UNION ALL
  (SELECT tt2.*
   FROM tt2
   LEFT JOIN
     (SELECT UID
      FROM host_user) t2 ON 1 = 1
   WHERE t2.uid IS NULL)

V2 版本(根据上面做出的改进,答案缩短了,逻辑更强):

SELECT
    ui.uid,
    SUM(
    IF
    ( start_time IS NOT NULL AND score IS NULL, 1, 0 )) AS incomplete_cnt,#3.试卷未完成数
    ROUND( AVG( IF ( start_time IS NOT NULL AND score IS NULL, 1, 0 )), 3 ) AS incomplete_rate #4.未完成率

FROM
    user_info ui
    LEFT JOIN exam_record USING ( uid )
WHERE
CASE

        WHEN (#1.当有任意一个 0 级用户未完成试卷数大于 2 时
        SELECT
            MAX( lv0_incom_cnt )
        FROM
            (
            SELECT
                SUM(
                IF
                ( score IS NULL, 1, 0 )) AS lv0_incom_cnt
            FROM
                user_info
                JOIN exam_record USING ( uid )
            WHERE
                LEVEL = 0
            GROUP BY
                uid
            ) table1
            )> 2 THEN
            uid IN ( #1.1 找出每个 0 级用户
            SELECT uid FROM user_info WHERE LEVEL = 0 ) ELSE uid IN ( #2.若不存在这样的用户,找出有作答记录的用户
            SELECT DISTINCT uid FROM exam_record )
        END
        GROUP BY
            ui.uid
    ORDER BY
    incomplete_rate #5.结果按未完成率升序排序

各用户等级的不同得分表现占比(较难)

描述

现有用户信息表 user_infouid 用户 ID, nick_name 昵称, achievement 成就值, level 等级, job 职业方向, register_time 注册时间):

iduidnick_nameachievementleveljobregister_time
11001牛客 1 号190算法2020-01-01 10:00:00
21002牛客 2 号12003算法2020-01-01 10:00:00
31003牛客 3 号 ♂220算法2020-01-01 10:00:00
41004牛客 4 号250算法2020-01-01 10:00:00
51005牛客 555 号20007C++2020-01-01 10:00:00
6100666666630006C++2020-01-01 10:00:00

试卷作答记录表 exam_record(uid 用户 ID, exam_id 试卷 ID, start_time 开始作答时间, submit_time 交卷时间, score 得分):

iduidexam_idstart_timesubmit_timescore
1100190012020-01-02 09:01:012020-01-02 09:21:5980
2100190012021-05-02 10:01:01(NULL)(NULL)
3100190022021-02-02 19:01:012021-02-02 19:30:0175
4100190022021-09-01 12:01:012021-09-01 12:11:0160
5100190032021-09-02 12:01:012021-09-02 12:41:0190
6100190012021-06-02 19:01:012021-06-02 19:32:0020
7100190022021-09-05 19:01:012021-09-05 19:40:0189
8100190042021-09-03 12:01:01(NULL)(NULL)
9100290012020-01-01 12:01:012020-01-01 12:31:0199
10100290032020-02-01 12:01:012020-02-01 12:31:0182
11100290032020-02-02 12:11:012020-02-02 12:41:0176

为了得到用户试卷作答的定性表现,我们将试卷得分按分界点[90,75,60]分为优良中差四个得分等级(分界点划分到左区间),请统计不同用户等级的人在完成过的试卷中各得分等级占比(结果保留 3 位小数),未完成过试卷的用户无需输出,结果按用户等级降序、占比降序排序。

由示例数据结果输出如下:

levelscore_graderatio
30.667
30.333
00.500
00.167
00.167
00.167

解释:完成过试卷的用户有 1001、1002;完成了的试卷对应的用户等级和分数等级如下:

uidexam_idscorelevelscore_grade
10019001800
10019002750
10019002600
10019003900
10019001200
10019002890
10029001993
10029003823
10029003763

因此 0 级用户(只有 1001)的各分数等级比例为:优 1/6,良 1/6,中 1/6,差 3/6;3 级用户(只有 1002)各分数等级比例为:优 1/3,良 2/3。结果保留 3 位小数。

思路

先把 “将试卷得分按分界点[90,75,60]分为优良中差四个得分等级” 这个条件写出来,这里可以用到 case when

CASE
        WHEN a.score >= 90 THEN
        '优'
        WHEN a.score < 90 AND a.score >= 75 THEN
        '良'
        WHEN a.score < 75 AND a.score >= 60 THEN
    '中' ELSE '差'
END

这题的关键点就在于这,其他剩下的就是条件拼接了

答案

SELECT a.LEVEL,
       a.score_grade,
       ROUND(a.cur_count / b.total_num, 3) AS ratio
FROM
  (SELECT b.LEVEL AS LEVEL,
          (CASE
               WHEN a.score >= 90 THEN '优'
               WHEN a.score < 90
                    AND a.score >= 75 THEN '良'
               WHEN a.score < 75
                    AND a.score >= 60 THEN '中'
               ELSE '差'
           END) AS score_grade,
          count(1) AS cur_count
   FROM exam_record a
   LEFT JOIN user_info b ON a.uid = b.uid
   WHERE a.submit_time IS NOT NULL
   GROUP BY b.LEVEL,
            score_grade) a
LEFT JOIN
  (SELECT b.LEVEL AS LEVEL,
          count(b.LEVEL) AS total_num
   FROM exam_record a
   LEFT JOIN user_info b ON a.uid = b.uid
   WHERE a.submit_time IS NOT NULL
   GROUP BY b.LEVEL) b ON a.LEVEL = b.LEVEL
ORDER BY a.LEVEL DESC,
         ratio DESC

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。
列表为空,暂无数据
    我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
    原文