Performance drawbacks of composite primary keys in MySQL
We have a table with a composite primary key consisting of three fields (in MySQL 5.1). There are nearly 200 inserts and 200 selects per second on this table; it holds around 1 million rows and is still growing.
My question is: does the composite primary key decrease the performance of the inserts and selects on this table?
Should I be using a simple auto-increment INT ID field instead of a composite primary key? (I think the answer is very much related to the way MySQL handles indexes on multiple columns.)
Comments (3)
INSERT and UPDATE performance varies little: it will be almost the same for (INT) and (INT, INT) keys.

SELECT performance of a composite PRIMARY KEY depends on many factors.

If your table is InnoDB, then the table is implicitly clustered on the PRIMARY KEY value. That means that searches on both values will be faster if both values make up the key: no extra key lookup will be required.

Assuming your query filters on exact values of both col1 and col2, and the table's PRIMARY KEY is (col1, col2), the engine will just need to look up the exact key value in the table itself.
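
As a minimal sketch of the case described above, assuming a table named mytable with a hypothetical payload column data (only col1 and col2 come from the answer), the layout and query could look like this:

```sql
-- Assumed layout: `data` is an illustrative payload column, not from the answer.
CREATE TABLE mytable (
    col1 INT NOT NULL,
    col2 INT NOT NULL,
    data VARCHAR(200) NOT NULL,
    PRIMARY KEY (col1, col2)      -- InnoDB clusters the rows on this key
) ENGINE=InnoDB;

-- Equality on both key columns: the row is found directly in the clustered index.
SELECT col1, col2, data
FROM   mytable
WHERE  col1 = 10
  AND  col2 = 20;
```

The literal values 10 and 20 are placeholders. Running the query through EXPLAIN should report the PRIMARY key as the one used.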
If you use an auto-increment field as a fake id, with a unique index ix_mytable_col1_col2 on (col1, col2), then the engine will first need to look up the values of (col1, col2) in the index ix_mytable_col1_col2, retrieve the row pointer from the index (the value of id), and make another lookup by id in the table itself.

For MyISAM tables, however, this makes no difference, because MyISAM tables are heap-organized and the row pointer is just a file offset. In both cases the same index will be created (for PRIMARY KEY or for UNIQUE KEY) and will be used in the same way.

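
The auto-increment variant discussed in this answer can be sketched the same way. The index name ix_mytable_col1_col2 and the columns id, col1 and col2 come from the answer; the data column and the literal values are assumptions for illustration:

```sql
-- Alternative definition of the same (hypothetical) table with a surrogate key.
CREATE TABLE mytable (
    id   INT NOT NULL AUTO_INCREMENT,
    col1 INT NOT NULL,
    col2 INT NOT NULL,
    data VARCHAR(200) NOT NULL,   -- illustrative payload column
    PRIMARY KEY (id),
    UNIQUE KEY ix_mytable_col1_col2 (col1, col2)
) ENGINE=InnoDB;

-- On InnoDB this takes two steps: find (col1, col2) in ix_mytable_col1_col2,
-- read the stored id, then fetch the row from the clustered index by id.
SELECT col1, col2, data
FROM   mytable
WHERE  col1 = 10
  AND  col2 = 20;
```

Declaring the same table with ENGINE=MyISAM would remove the difference between the two layouts, since both indexes would then point at a file offset.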
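If it's InnoDB, the composite primary key will be included in each entry in each of the secondary indexes.
This means that every secondary index is larger, because each of its entries also stores the primary key columns, and that a secondary index can serve as a covering index for queries that need only the indexed columns plus the primary key columns.
These are, of course, a disadvantage and an advantage respectively.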
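
To illustrate the covering-index side of this, here is a small sketch. The table events, its columns and the index name are all hypothetical; only the idea of a three-column composite primary key comes from the question:

```sql
-- Hypothetical table with a three-column composite primary key
-- and one secondary index on a non-key column.
CREATE TABLE events (
    col1    INT NOT NULL,
    col2    INT NOT NULL,
    col3    INT NOT NULL,
    status  TINYINT NOT NULL,
    payload VARCHAR(200) NOT NULL,
    PRIMARY KEY (col1, col2, col3),
    KEY ix_events_status (status)   -- each entry also carries col1, col2, col3
) ENGINE=InnoDB;

-- Only `status` and the primary key columns are needed, so InnoDB can
-- answer this from ix_events_status without reading the table rows.
EXPLAIN
SELECT col1, col2, col3
FROM   events
WHERE  status = 1;
```

When the index covers the query, the Extra column of the EXPLAIN output is expected to contain "Using index". The cost is that ix_events_status stores three extra INT values per entry.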
Composite primary keys are not necessarily bad. Sometimes they can be really helpful because InnoDB clusters on them, which means that (disk-bound) range scans over the PK can be satisfied using far fewer IO operations than would be required on a non-clustered index.
Of course, if you've got foreign keys in other tables, they are wider as well, since they need to include the whole key from your main table.
But I'd say on balance, generally, no. Having a composite primary key does NOT cause a problem by itself. Having a "big" primary key (e.g. big varchars) may, however, if that outweighs the advantages of clustering and being able to use covering indexes.
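
The point about wider foreign keys can be sketched as follows. The table names parent and child, the constraint name and the index name are hypothetical:

```sql
-- Hypothetical main table with a three-column composite primary key.
CREATE TABLE parent (
    col1 INT NOT NULL,
    col2 INT NOT NULL,
    col3 INT NOT NULL,
    PRIMARY KEY (col1, col2, col3)
) ENGINE=InnoDB;

-- A referencing table has to repeat all three key columns,
-- and index them, in order to point at one parent row.
CREATE TABLE child (
    child_id INT NOT NULL AUTO_INCREMENT,
    col1     INT NOT NULL,
    col2     INT NOT NULL,
    col3     INT NOT NULL,
    PRIMARY KEY (child_id),
    KEY ix_child_parent (col1, col2, col3),
    CONSTRAINT fk_child_parent
        FOREIGN KEY (col1, col2, col3)
        REFERENCES parent (col1, col2, col3)
) ENGINE=InnoDB;
```

With a single INT surrogate key on parent, child would carry one column and a one-column index instead of three.
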
A composite primary key will slow down your SELECTs a tiny bit, though the effect is pretty much negligible and not worth worrying about.
It will also slow down your INSERTs, and you certainly are doing enough INSERTs to worry about it. This is much more of a concern if it's a MyISAM table, where an INSERT locks the table, than if it's an InnoDB table. If, by going with the auto_increment primary key, you would be able to leave those columns unindexed, you would benefit from the change. If you would still need to keep those three columns indexed, though (for example, if you need to enforce uniqueness on the combination of them), it isn't going to do anything for you performance-wise.