Why is GROUP BY needed?
I'm learning SQL, and "aggregate" is another keyword that came up. Why does this query need a GROUP BY, and what exactly gets grouped?
SELECT
  usertype,
  CONCAT(start_station_name, " to ", end_station_name) AS route,
  COUNT(*) AS num_trips, -- counting all trips (gives distinct value?)
  ROUND(AVG(CAST(tripduration AS INT64) / 60), 2) AS duration -- /60 to convert seconds to minutes; the 2 is the number of decimal places
FROM `bigquery-public-data.new_york_citibike.citibike_trips`
GROUP BY
  start_station_name, end_station_name, usertype
ORDER BY
  num_trips DESC
LIMIT 10
Comments (1)
This is what the query does (conceptually; the actual implementation may vary):
1. It takes the rows in the table and groups them by start_station_name, end_station_name, and usertype. In other words, for every different combination of start station, end station, and user type, we have a separate group. For example, if there were 3 locations (A, B, and C) and 2 user types (1 and 2), and assuming there were no trips from a location to itself, you would see groups like these, each written as (start, end, user type): (A, B, 1), (A, B, 2), (A, C, 1), (A, C, 2), (B, A, 1), (B, A, 2), (B, C, 1), (B, C, 2), (C, A, 1), (C, A, 2), (C, B, 1), (C, B, 2).
2. For each group, it calculates num_trips, which is just the number of rows in the group, and duration, which is the rounded average of tripduration/60 for all the rows in the group.
3. It orders the groups (not the rows within each group, but the groups themselves) by num_trips in descending order, and LIMIT 10 then keeps only the first 10 groups.
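To make the grouping concrete, here is a minimal Python sketch of the same conceptual steps: build one group per (start, end, usertype) combination, collapse each group to a count and an average, then sort and limit. The sample rows, the function name top_routes, and the limit parameter are made up for illustration. This is not how BigQuery actually executes the query, and Python's round() can differ from BigQuery's ROUND on exact halfway values.

```python
from collections import defaultdict

# Hypothetical sample rows, not real Citi Bike data:
# (usertype, start_station_name, end_station_name, tripduration in seconds)
trips = [
    ("Subscriber", "A", "B", 600),
    ("Subscriber", "A", "B", 900),
    ("Customer", "A", "B", 1200),
    ("Subscriber", "B", "C", 300),
    ("Customer", "C", "A", 1800),
    ("Subscriber", "A", "B", 300),
]


def top_routes(rows, limit=10):
    # Step 1 (GROUP BY): one group per distinct
    # (start station, end station, usertype) combination.
    groups = defaultdict(list)
    for usertype, start, end, duration in rows:
        groups[(start, end, usertype)].append(duration)

    # Step 2 (COUNT / AVG): collapse each group into a single output row.
    results = []
    for (start, end, usertype), durations in groups.items():
        num_trips = len(durations)  # COUNT(*): number of rows in the group
        avg_minutes = round(sum(d / 60 for d in durations) / num_trips, 2)
        results.append((usertype, f"{start} to {end}", num_trips, avg_minutes))

    # Step 3 (ORDER BY num_trips DESC, LIMIT): sort the groups themselves,
    # then keep only the first `limit` of them.
    results.sort(key=lambda row: row[2], reverse=True)
    return results[:limit]


for row in top_routes(trips):
    print(row)
```

Each output row describes a whole group rather than a single trip, which is why every selected column has to be either one of the grouping columns or an aggregate such as a count or an average.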