Can you design a simple example in MySQL where a clustered index is the solution, rather than a normal index?

Posted on 2024-08-26 09:18:53

I don't understand the point of clustered indexes. When would we benefit from one?

菊凝晚露 2024-09-02 09:18:53

Clustered indexes

A clustered index means that the records are physically stored in order (at least near each other), based on the index. Clustered indexes are most important when you are retrieving various columns from each record, in order, because the database engine does not have to jump around to get the next record. Instead, the records are stored sequentially, therefore the seek time between records is at its minimum.

Clustered indexes are most important when reading multiple records that appear near each other in the index.

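With InnoDB, the primary key is the clustered index. If a table has no primary key, InnoDB clusters on the first UNIQUE index whose columns are all NOT NULL, and otherwise on a hidden row ID.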

Use case for clustered indexes

If you were doing an incremental search like Google's or Yahoo's, where you see the first few records matching what you've typed so far as you type, performance is paramount. If you were returning just a single indexed column in the result set, you wouldn't need a clustered index, but let's pretend that you also want to return the number of hits for each key_word, which forces the database engine to access the actual row. Since you want to return sequential rows, you should store them sequentially for optimal performance.

SELECT key_word, hits FROM keywords
WHERE key_word LIKE 'britney s%'
ORDER BY key_word
LIMIT 10

You'd want your primary key (clustered index) to be on key_word.
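
As a minimal sketch of what that could look like, the table below puts the primary key on key_word, so InnoDB stores the rows in key_word order. The column types and sizes are assumptions for illustration; the answer does not specify a schema.

```sql
-- Hypothetical schema: column types and sizes are assumptions.
CREATE TABLE keywords (
    key_word VARCHAR(100) NOT NULL,
    hits     INT UNSIGNED NOT NULL DEFAULT 0,
    PRIMARY KEY (key_word)   -- in InnoDB, the primary key is the clustered index
) ENGINE = InnoDB;

-- The prefix match becomes a range scan over adjacent rows in the clustered
-- index, and hits is read from the same pages as key_word.
EXPLAIN
SELECT key_word, hits FROM keywords
WHERE key_word LIKE 'britney s%'
ORDER BY key_word
LIMIT 10;
```

On a populated table, EXPLAIN would be expected to report the PRIMARY key with a range access type, though the optimizer's choice depends on the data.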

Comparison to nonclustered indexes

All indexes are stored in sorted order (strictly speaking as a B-tree, but the effect is the same), so if you return only columns that are stored in the index, you still get the same benefit. That is because the indexed column's actual value is stored in the index, so MySQL will use the index value instead of reading the record. However, once you start retrieving columns that aren't part of the index, you also want the actual records stored in order, as they are with a clustered index.
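
To make the contrast concrete, here is a sketch of the alternative design, assuming a hypothetical table keywords_by_id that clusters on a surrogate id and only has a secondary index on key_word. The table name, index name, and column types are illustrative assumptions.

```sql
-- Hypothetical alternative: rows are clustered by id, not by key_word.
CREATE TABLE keywords_by_id (
    id       BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    key_word VARCHAR(100) NOT NULL,
    hits     INT UNSIGNED NOT NULL DEFAULT 0,
    PRIMARY KEY (id),
    KEY idx_key_word (key_word)   -- secondary (nonclustered) index
) ENGINE = InnoDB;

-- Only an indexed column is returned: the secondary index alone can answer this.
EXPLAIN
SELECT key_word FROM keywords_by_id
WHERE key_word LIKE 'britney s%'
ORDER BY key_word
LIMIT 10;

-- hits is not in idx_key_word: each match needs an extra lookup by id
-- in the clustered index, and those rows may sit on scattered pages.
EXPLAIN
SELECT key_word, hits FROM keywords_by_id
WHERE key_word LIKE 'britney s%'
ORDER BY key_word
LIMIT 10;
```

For the first query, the Extra column of EXPLAIN would typically show "Using index", meaning no row lookup is needed. The second query loses that, which is the case where clustering on key_word helps.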

MySQL Documentation on clustered indexes

Accessing a row through the clustered index is fast because the row data is on the same page where the index search leads. If a table is large, the clustered index architecture often saves a disk I/O operation when compared to storage organizations that store row data using a different page from the index record. (For example, MyISAM uses one file for data rows and another for index records.)

In InnoDB, the records in nonclustered indexes (also called secondary indexes) contain the primary key columns for the row that are not in the secondary index. InnoDB uses this primary key value to search for the row in the clustered index. If the primary key is long, the secondary indexes use more space, so it is advantageous to have a short primary key.
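
The point about secondary indexes carrying the primary key can be sketched as follows. The table and index names are hypothetical, and the column sizes are assumptions chosen only to show the trade-off.

```sql
-- Hypothetical table: clustered on a fairly long string key.
CREATE TABLE keywords_with_hits_index (
    key_word VARCHAR(100) NOT NULL,
    hits     INT UNSIGNED NOT NULL DEFAULT 0,
    PRIMARY KEY (key_word),
    KEY idx_hits (hits)   -- each entry also stores the row's key_word
) ENGINE = InnoDB;

-- Because idx_hits entries include the primary key columns, this query
-- can be answered from the secondary index without touching the rows.
EXPLAIN
SELECT key_word FROM keywords_with_hits_index
WHERE hits > 1000;
```

The cost is that every idx_hits entry repeats the key_word value, so a long primary key makes every secondary index larger. That is the space trade-off the documentation describes.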

MySQL Clustered and Secondary Indexes

半寸时光 2024-09-02 09:18:53

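The best example I can think of is a reporting table that is regularly queried on transaction date. I would put a clustered index on the TransactionDate column, and add any other needed indexes based on query optimization.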

So a query such as select sum(amount) from transactiondetails where TransactionDate > '2010-01-01' and TransactionDate < '2010-02-01' would use the clustered index for searching and provide the results in a more efficient manner.
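
In MySQL this cannot be declared as a separate "clustered index", because InnoDB always clusters on the primary key. One possible way to get the same effect, sketched here under assumed column types and a hypothetical surrogate id column, is to make TransactionDate the leading column of a composite primary key.

```sql
-- Hypothetical schema: id and the column types are assumptions.
CREATE TABLE transactiondetails (
    id              BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    TransactionDate DATE NOT NULL,
    amount          DECIMAL(12,2) NOT NULL,
    -- Dates are not unique, so id is appended to make the key unique;
    -- rows are then physically ordered by TransactionDate first.
    PRIMARY KEY (TransactionDate, id),
    -- InnoDB requires an AUTO_INCREMENT column to lead some index.
    UNIQUE KEY uq_id (id)
) ENGINE = InnoDB;

-- The date range maps to one contiguous slice of the clustered index,
-- and amount is read directly from those pages.
SELECT SUM(amount) FROM transactiondetails
WHERE TransactionDate > '2010-01-01'
  AND TransactionDate < '2010-02-01';
```

The design choice here is to trade a slightly wider primary key for sequential reads on the most common report query.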
