简化SQL查询

发布于 2025-02-12 12:35:39 字数 1618 浏览 0 评论 0原文

我正在使用 https://www.bixi.com/en/开放数据并被要求找到“数据集中每年一年组合的平均旅行数量”。表

一个	是	的	：	我	示例	这
的	要	查询
	15 17:16:00	6315	2016-04-15 17:37:00	6315	1293	0

我过去成功回答问题的查询如下，但是我该如何简化它？

SELECT
    daily_trips_2016.avg_daily_trips AS avg_daily_trips_2016,
    daily_trips_2017.avg_daily_trips AS avg_daily_trips_2017,
    daily_trips_2016.month
FROM
    (
    SELECT -- This sub-query determines the average number of daily trips per month for the year 2016
        ROUND(COUNT(*) / (COUNT(DISTINCT DAY(start_date))), 0) AS avg_daily_trips,
        MONTH(start_date) AS month
    FROM trips
    WHERE YEAR(start_date) = 2016
    GROUP BY month) AS daily_trips_2016
LEFT JOIN
    (
    SELECT -- This sub-query determines the average number of daily trips per month for the year 2017
        ROUND(COUNT(*) / (COUNT(DISTINCT DAY(start_date))), 0) AS avg_daily_trips,
        MONTH(start_date) AS month
    FROM trips
    WHERE YEAR(start_date) = 2017
    GROUP BY month) AS daily_trips_2017
ON daily_trips_2016.month = daily_trips_2017.month; -- Since both sub-queries share the "month" column, the JOIN happens on this column
-- and we can see the daily averages side by side for each year

原文

I'm using the Bixi public dataset found at https://www.bixi.com/en/open-data and have been asked to find "the average number of trips a day for each year-month combination in the dataset". Here's an example of the table from which I'm querying:

id	start_date	start_station_code	end_date	end_station_code	duration_sec	is_member
85843	2016-04-15 00:00:00	6315	2016-04-15 00:05:00	6315	349	1
85844	2016-04-15 17:16:00	6315	2016-04-15 17:37:00	6315	1293	0

The query that I used to successfully answer the question was as follows, but how can I simplify it?:

SELECT
    daily_trips_2016.avg_daily_trips AS avg_daily_trips_2016,
    daily_trips_2017.avg_daily_trips AS avg_daily_trips_2017,
    daily_trips_2016.month
FROM
    (
    SELECT -- This sub-query determines the average number of daily trips per month for the year 2016
        ROUND(COUNT(*) / (COUNT(DISTINCT DAY(start_date))), 0) AS avg_daily_trips,
        MONTH(start_date) AS month
    FROM trips
    WHERE YEAR(start_date) = 2016
    GROUP BY month) AS daily_trips_2016
LEFT JOIN
    (
    SELECT -- This sub-query determines the average number of daily trips per month for the year 2017
        ROUND(COUNT(*) / (COUNT(DISTINCT DAY(start_date))), 0) AS avg_daily_trips,
        MONTH(start_date) AS month
    FROM trips
    WHERE YEAR(start_date) = 2017
    GROUP BY month) AS daily_trips_2017
ON daily_trips_2016.month = daily_trips_2017.month; -- Since both sub-queries share the "month" column, the JOIN happens on this column
-- and we can see the daily averages side by side for each year

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

枉心 2025-02-19 12:35:39

使用条件聚合：

SELECT ROUND(SUM(YEAR(start_date) = 2016) / COUNT(DISTINCT CASE WHEN YEAR(start_date) = 2016 THEN DAY(start_date) END), 0)  AS avg_daily_trips_2016,
       ROUND(SUM(YEAR(start_date) = 2017) / COUNT(DISTINCT CASE WHEN YEAR(start_date) = 2017 THEN DAY(start_date) END), 0)  AS avg_daily_trips_2017,
       MONTH(start_date) AS month
FROM trips
WHERE YEAR(start_date) IN (2016, 2017)
GROUP BY month;

Use conditional aggregation:

SELECT ROUND(SUM(YEAR(start_date) = 2016) / COUNT(DISTINCT CASE WHEN YEAR(start_date) = 2016 THEN DAY(start_date) END), 0)  AS avg_daily_trips_2016,
       ROUND(SUM(YEAR(start_date) = 2017) / COUNT(DISTINCT CASE WHEN YEAR(start_date) = 2017 THEN DAY(start_date) END), 0)  AS avg_daily_trips_2017,
       MONTH(start_date) AS month
FROM trips
WHERE YEAR(start_date) IN (2016, 2017)
GROUP BY month;

回复收藏 0 原文

等往事风中吹 2025-02-19 12:35:39

查找“每年平均每年旅行的平均数量
数据集中的组合”。

SELECT ALL
    EXTRACT(YEAR_MONTH FROM start_date) AS period,
    COUNT(*) / COUNT(DISTINCT DATE(start_date)) AS avg_by_day
FROM trips
GROUP BY 1
ORDER BY 1 ASC

。Yyyymm /平均值1行

find "the average number of trips a day for each year-month
combination in the dataset".

SELECT ALL
    EXTRACT(YEAR_MONTH FROM start_date) AS period,
    COUNT(*) / COUNT(DISTINCT DATE(start_date)) AS avg_by_day
FROM trips
GROUP BY 1
ORDER BY 1 ASC

1 row by YYYYMM / average

回复收藏 0 原文

~没有更多了~