Does adding a unique constraint slow things down?

Posted on 2024-09-25 18:29:56

I have three columns in my table.

+-----------+-----------------------+------+-----+---------+-------+
| Field     | Type                  | Null | Key | Default | Extra |
+-----------+-----------------------+------+-----+---------+-------+
| hash      | mediumint(8) unsigned | NO   | PRI | 0       |       | 
| nums      | int(10) unsigned      | NO   | PRI | 0       |       | 
| acc       | smallint(5) unsigned  | NO   | PRI | 0       |       | 
+-----------+-----------------------+------+-----+---------+-------+

I am expecting duplicates in my data so I went ahead and added a unique constraint:

ALTER TABLE nt_accs ADD UNIQUE(hash,nums,acc);

I have about 500 million rows to insert into this table, and the table has been partitioned by RANGE on nums into about 20 partitions.

  1. Does the unique constraint slow down inserts? How does this differ from making the columns a primary key instead of imposing a unique constraint?
  2. I have a lot of GROUP BY type queries using both the hash and nums columns. Should I add a covering index on both, or just add individual indexes?

EDIT:

Explain plan after partitioning and inserting some test data

1. mysql> explain partitions select * from nt_accs;
+----+-------------+-----------+---------------------------------------------------------------------------+-------+---------------+----------+---------+------+------+-------------+
| id | select_type | table     | partitions                                                                | type  | possible_keys | key      | key_len | ref  | rows | Extra       |
+----+-------------+-----------+---------------------------------------------------------------------------+-------+---------------+----------+---------+------+------+-------------+
|  1 | SIMPLE      | nt_accs   | p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13,p14,p15,p16,p17,p18,p19,p20 | index | NULL          | hash     | 7       | NULL |   10 | Using index | 
+----+-------------+-----------+---------------------------------------------------------------------------+-------+---------------+----------+---------+------+------+-------------+
1 row in set (0.00 sec)



2. mysql> explain partitions select * from nt_accs WHERE nums=1504887570;
+----+-------------+-----------+------------+-------+---------------+----------+---------+------+------+--------------------------+
| id | select_type | table     | partitions | type  | possible_keys | key      | key_len | ref  | rows | Extra                    |
+----+-------------+-----------+------------+-------+---------------+----------+---------+------+------+--------------------------+
|  1 | SIMPLE      | nt_accs   | p7         | index | NULL          | hash     | 7       | NULL |   10 | Using where; Using index | 
+----+-------------+-----------+------------+-------+---------------+----------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

3. mysql> explain partitions select * from nt_accs WHERE hash=2347200;
+----+-------------+-----------+---------------------------------------------------------------------------+------+---------------+----------+---------+-------+------+-------------+
| id | select_type | table     | partitions                                                                | type | possible_keys | key      | key_len | ref   | rows | Extra       |
+----+-------------+-----------+---------------------------------------------------------------------------+------+---------------+----------+---------+-------+------+-------------+
|  1 | SIMPLE      | nt_accs   | p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13,p14,p15,p16,p17,p18,p19,p20 | ref  | hash          | hash     | 3       | const |   27 | Using index |
+----+-------------+-----------+---------------------------------------------------------------------------+------+---------------+----------+---------+-------+------+-------------+
1 row in set (0.00 sec)

4. mysql> EXPLAIN PARTITIONS SELECT hash, count(distinct nums) FROM nt_accs GROUP BY hash;
+----+-------------+-----------+---------------------------------------------------------------------------+-------+---------------+----------+---------+------+------+-------------+
| id | select_type | table     | partitions                                                                | type  | possible_keys | key      | key_len | ref  | rows | Extra       |
+----+-------------+-----------+---------------------------------------------------------------------------+-------+---------------+----------+---------+------+------+-------------+
|  1 | SIMPLE      | nt_accs   | p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13,p14,p15,p16,p17,p18,p19,p20 | index | NULL          | hash     | 7       | NULL |   10 | Using index | 
+----+-------------+-----------+---------------------------------------------------------------------------+-------+---------------+----------+---------+------+------+-------------+
1 row in set (0.00 sec)

5. mysql> EXPLAIN PARTITIONS SELECT nums, count(distinct hash) FROM nt_accs GROUP BY nums;
+----+-------------+-----------+---------------------------------------------------------------------------+-------+---------------+----------+---------+------+------+-----------------------------+
| id | select_type | table     | partitions                                                                | type  | possible_keys | key      | key_len | ref  | rows | Extra                       |
+----+-------------+-----------+---------------------------------------------------------------------------+-------+---------------+----------+---------+------+------+-----------------------------+
|  1 | SIMPLE      | nt_accs   | p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13,p14,p15,p16,p17,p18,p19,p20 | index | NULL          | hash     | 7       | NULL |   10 | Using index; Using filesort | 
+----+-------------+-----------+---------------------------------------------------------------------------+-------+---------------+----------+---------+------+------+-----------------------------+
1 row in set (0.00 sec)

I am perfectly fine with the first and second queries, but I'm not sure about the performance of the third, fourth and fifth. Is there anything else I can do at this point to optimize them?


Comments (1)

别想她 2024-10-02 18:29:56

Does the unique constraint slow down inserts? How does this differ from making the columns a primary key instead of imposing a unique constraint?

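Yes. MySQL implements a unique constraint as an index, and every index slows down inserts, because each inserted row must also be checked against and written to that index.
The same goes for a primary key, which is itself a unique index on NOT NULL columns. This is why tables expecting high insert loads (e.g. logging tables) are often defined without a primary key: it makes insertions faster.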
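Since each index adds work per insert, it may be worth confirming which indexes the table actually carries before loading 500 million rows. The statements below are a minimal sketch using standard MySQL commands against the nt_accs table from the question. One caveat: the Key column of DESCRIBE can show PRI for a unique index on NOT NULL columns when no primary key is defined, so DESCRIBE alone does not tell a primary key and a unique constraint apart.

```sql
-- Lists every index on the table, one row per indexed column.
-- Key_name is PRIMARY for a primary key; an unnamed UNIQUE(hash, nums, acc)
-- is normally named after its first column (hash).
SHOW INDEX FROM nt_accs;

-- Shows the full table definition, including all keys and the
-- PARTITION BY clause.
SHOW CREATE TABLE nt_accs;
```

If both a PRIMARY KEY and a UNIQUE index turn out to cover the same three columns, one of them is doing redundant work on every insert.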

I have a lot of GROUP BY type queries using both the hash and nums columns. Should I add a covering index on both, or just add individual indexes?

The only way to know for certain is to test and check the EXPLAIN plan.

UPDATE

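In light of the provided EXPLAIN plans, I don't see a concern with the third and fourth queries: both are answered from the index alone (Using index). MySQL generally uses only one index per table in each SELECT. The fifth query might benefit from a different covering index: it reports Using filesort, likely because it groups by nums while the index it reads leads with hash.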
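One way to test the covering-index idea for the fifth query is to add an index that leads with nums and re-run the plan. This is a sketch, not a measured recommendation: the index name idx_nums_hash is hypothetical, and whether the optimizer picks the index and drops the filesort has to be confirmed on real data.

```sql
-- Hypothetical index name. Leading with nums means index entries are
-- already ordered by the GROUP BY column, and including hash lets the
-- query be answered from the index alone.
ALTER TABLE nt_accs ADD INDEX idx_nums_hash (nums, hash);

-- Re-check the plan for the fifth query. EXPLAIN PARTITIONS matches the
-- syntax used in the question; on MySQL 8.0 use plain EXPLAIN, which
-- reports partitions by default.
EXPLAIN PARTITIONS
SELECT nums, COUNT(DISTINCT hash) FROM nt_accs GROUP BY nums;
```

Keep in mind that this is one more index to maintain during the bulk load, so it trades insert speed for query speed. Creating it after the load is an option to test.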

Addendum

Just want to make sure that you are aware that:

ALTER TABLE nt_accs ADD UNIQUE(hash, nums, acc);

...means that the combination of the three column values must be unique, not each column on its own. For example, the unique constraint allows all of these rows:

hash  nums  acc
----------------
1     1     1
1     1     2
1     2     1
2     1     1
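
The allowed rows above can be checked on a scratch table. The following is a minimal sketch: the table name nt_accs_demo and the key name uq_hash_nums_acc are made up for illustration, and the column types are copied from the question.

```sql
-- Hypothetical scratch table mirroring the column types in the question.
CREATE TABLE nt_accs_demo (
  hash MEDIUMINT UNSIGNED NOT NULL DEFAULT 0,
  nums INT UNSIGNED       NOT NULL DEFAULT 0,
  acc  SMALLINT UNSIGNED  NOT NULL DEFAULT 0,
  UNIQUE KEY uq_hash_nums_acc (hash, nums, acc)
);

-- All four rows differ in at least one of the three columns, so all are accepted.
INSERT INTO nt_accs_demo (hash, nums, acc) VALUES
  (1, 1, 1), (1, 1, 2), (1, 2, 1), (2, 1, 1);

-- Repeats an existing (hash, nums, acc) triple: rejected with a
-- duplicate-key error (error 1062).
INSERT INTO nt_accs_demo (hash, nums, acc) VALUES (1, 1, 1);

-- INSERT IGNORE skips the duplicate row with a warning instead of failing.
INSERT IGNORE INTO nt_accs_demo (hash, nums, acc) VALUES (1, 1, 1);
```

When duplicates are expected in the source data, as in the question, INSERT IGNORE is one way to let the load continue past them. Run the statements interactively, since the third one is meant to fail.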