mysql row_number()性能
对不起。这是一个重写的问题,因为没有答案。
我将其从rank()更改为row_number(),因为不需要相同的等级。
我目前正在研究查询,但我没有达到速度,所以我问一个问题。
我想找到的是“按Dataid计数每个类别中的最多名称”。
data_category { id:int主键auto_increment, dataid:int, 名称:Varchar2(200), 类别:varchar2(200) }
,可以重复输入。
select dataId, name, category, count(*) as cnt
, row_number() over (partition by dataId, category order by count(*) desc) as "ranking"
from data_category
group by dataId, name, category
在上述查询中,有和没有等级部分的速度有差异。
如果没有行_number,则需要0.0053,如果有的话,则需要0.5秒。 结果约为260,000行。
select *, row_number() over (partition by A.dataId, A.category order by cnt desc) as "ranking"
from(
select dataId, name, category, count(*) as cnt
from data_category
group by dataId, name, category
) A
以上查询也得出几乎相同的速度。将其包裹在精选语句中的那一刻,由于临时的表创建,它正在放慢速度。
数据约为40万例。
索引是(Dataid,类别,名称)。
我想通过数据ID提取5个名称数量最高的名称,但我无法弄清查询的方向。
我将使用此查询创建“视图”。 因此,没有其他条款。 之后,我想通过与另一个表一起使用它来使用它。
即使您不编写查询,我也真诚地要求一个简单的提示。
输出:
Dataid | 类别 | 名称 | 等级 |
---|---|---|---|
1 | CATE1 | NAME1_1 | 1 |
1 | CATE1 | NAME1_2 | 2 |
1 | CATE1 | CATE1 NAME1 name1_3 | 4 |
1 1 | CATE2 | NAME2_1 | 1 |
1 | CATE2 | NAME2 NAME2 NAME2_5 | 2 |
1 | CATE2 | NAME22_3 | NAME1_5 3 |
1 | 3 2 CATE2 | NAME2_3 | 3 |
2 | CATE1 CATE1 CATE1 | NAME3_1 NAME3_1 | 2 |
NAME3 | CATE3 NAME3 | _9 | 1 |
感谢您。
解释
#Query 1-2:没有rank_row
ID | select_type | 表 | 类型 | 键 | 行 | 过滤 | 额外 |
---|---|---|---|---|---|---|---|
1 | 简单 | data_cateogory | index | idx_39EA6497EA6CA40D9E783E | 432344 | 100.00 | 使用index; |
#Query 1-1:使用Rank_row
ID | Select_Type | 表 | 类型 | 键 | 行 | 过滤 | 额外 |
---|---|---|---|---|---|---|---|
1 | 简单 | data_cateogory | index | idx_39EA6497EA6CA40D9E783E | 432344 | 100.00 | 使用索引;使用临时;使用filesort |
Sorry. This is a rewritten question because there was no answer.
I changed it from rank() to row_number() because the same rank is not needed.
I'm currently working on a query, but I'm not getting the speed so I'm asking a question.
What I want to find is 'count the most names in each category by DataID'.
Data_Category {
id : INT PRIMARY KEY AUTO_INCREMENT,
DataId: INT,
name: varchar2(200),
category: varchar2(200)
}
and it is possible to input in duplicate.
select dataId, name, category, count(*) as cnt
, row_number() over (partition by dataId, category order by count(*) desc) as "ranking"
from data_category
group by dataId, name, category
In the above query, there is a difference in speed with and without the rank part.
If there is no row_number, it takes 0.0053, and if there is, it takes 0.5 seconds.
The result is about 260,000 rows.
select *, row_number() over (partition by A.dataId, A.category order by cnt desc) as "ranking"
from(
select dataId, name, category, count(*) as cnt
from data_category
group by dataId, name, category
) A
The above query also yields almost the same speed. The moment it is wrapped in a select statement, it is slowing down due to the temporary table creation.
The data is about 400,000 cases.
The index is (dataId, category, name) .
I want to extract 5 names with the highest number of names in each category by data ID, but I can't figure out the direction of the query.
And I will create 'view' using this query.
So there are no additional where clause.
After that, I want to use it by joining it with another table.
Even if you don't write a query, I sincerely ask for a simple hint..!
Output:
dataId | category | name | rank |
---|---|---|---|
1 | cate1 | name1_1 | 1 |
1 | cate1 | name1_2 | 2 |
1 | cate1 | name1_5 | 3 |
1 | cate1 | name1_3 | 4 |
1 | cate2 | name2_1 | 1 |
1 | cate2 | name2_5 | 2 |
1 | cate2 | name2_3 | 3 |
2 | cate1 | name3_1 | 1 |
2 | cate3 | name3_9 | 1 |
Thank you.
Explain
#Query 1-2 : Without rank_row
id | select_type | table | type | key | rows | filtered | extra |
---|---|---|---|---|---|---|---|
1 | SIMPLE | data_cateogory | index | IDX_39ea6497ea9ca40d9e783e | 432344 | 100.00 | Using index; |
#Query 1-1 : With rank_row
id | select_type | table | type | key | rows | filtered | extra |
---|---|---|---|---|---|---|---|
1 | SIMPLE | data_cateogory | index | IDX_39ea6497ea9ca40d9e783e | 432344 | 100.00 | Using index; Using temporary; Using filesort |
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
这会给您您需要的东西吗?
或者,如果
name
具有重复项,请替换count(*)
bycount(distract name)
Does this give you what you need?
Or, if
name
has duplicates, replaceCOUNT(*)
byCOUNT(DISTINCT name)