当前位置：文江博客话题详情

聚集索引和非聚集索引之间的混淆。包含5个疑点

发布于 2024-09-17 16:43:03 字数 437 浏览 3 评论 0原文

聚集索引和非聚集索引都适用于 B-Tree 吗？我读到聚集索引会影响数据在表中物理存储的方式，而使用非聚集索引会创建列的单独副本并按排序顺序存储。另外，Sql Server 默认在主键上创建聚集索引。

这是否意味着：

1）非聚集索引比聚集索引占用更多空间，因为列的单独副本存储在非聚集中？

2）当我们有基于两列（学生姓名，分数）的主键时，聚集索引和非聚集索引如何工作？

3）索引只有2种吗？如果是这样，那么什么是位图索引？我似乎在 Sql Server Management Studio 中找不到任何此类索引类型，但在我的数据仓库书中提到了所有这些类型。

4）在主键上创建聚集索引还是非聚集索引有效？

5) 假设我们在名称上创建聚集索引，即数据按名称排序顺序物理存储，然后创建一条新记录。新记录将如何在表中找到它的位置？

提前致谢：）

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

橘香 2024-09-24 16:43:03

索引是与实际数据页分开存储的结构，并且仅包含指向数据页的指针。在 SQL Server 中，索引是 B 树。

聚集索引根据为索引定义的列对表中的数据页进行排序和存储。在 SQL Server 2005 中，您可以向索引添加其他列，因此当您具有复合主键时，这应该不是问题。您可以将聚集索引想象成一组带有文件夹的文件柜。在第一次抽奖中，您拥有以 A 开头的文档，在该抽奖的第一个文件夹中，您可能拥有以 AA 到 AC 开头的文档，依此类推。那么要搜索“Spider”，你可以直接跳到S抽，寻找包含“SP”的文件夹，很快就能找到你要找的东西。但很明显，如果您通过一个索引对所有文档进行物理排序，那么您就无法通过另一个索引对同一组文档进行物理排序。因此，每个表只有一个聚集索引。

非聚集索引是一种独立的结构，很像目录或书后的索引。所以我想我只具体回答了您的一些问题：

是的，索引确实占用空间，但没有原始表那么多。这就是为什么您必须仔细选择索引的原因。由于必须维护索引，更新操作的性能也会受到影响。
您的书将提及索引的所有理论类型。位图索引在数据仓库应用程序或具有一些不同值（例如一周中的几天等）的数据中非常有用。因此它们通常不用于基本 RDBMS 中。我知道 Oracle 有一些实现，但我对此了解不多。
我认为索引的效率取决于字段的使用方式。预计表中的大部分数据扫描将在主键上完成，那么主键上的索引就有意义。您通常会为查询的where子句或连接条件中出现的列添加索引。
插入时必须维护索引，因此系统必须完成一些额外的工作来重新排列内容。

Indexes are structures stored separately from the actual datapages and simply contain pointers to the datapages. In SQL Server indexes are B-Trees.

Clustered indexes sort and store the datapages in the table according to the columns defined for the index. In SQL Server 2005 you can add additional columns to an index so it should not be a problem when you have composite primary keys. You can think of a clustered index like a set of filing cabinets with folders. In the first draw you have documents starting with A and in the first folder of that draw you may have documents starting from AA to AC and so on. To search for "Spider" then, you can jump straight to the S draw and look for the folder containing "SP" and quickly find what you are looking for. But it is obvious that if you sort all documents physically by one index then you cannot physically sort the same set of documents by another index. Hence, only one clustered index per table.

A Non Clustered index is a separate structure much like the table of contents or the index at the back of a book. So I think I have only answered some of your questions specifically:

Yes the index does occupy space but not as much as the original table. That is why you must choose your indexes carefully. There is also a small performance hit for update operations since the index has to be maintained.
Your book will mention all the theoretical types of indexes. Bitmap indexes are useful in data warehousing applications or for data that has a few distinct values like days of the week etc. So they are not generally used in your basic RDBMS. I know that Oracle has some implementations but I don't know much about that.
I think that efficiency of an index is determined by how the field is used. It is expected that the majority of the data scanning in your table will be done on the primary key then an index on the primary key makes sense. You usually add indexes to columns that appear in the where clause or the join condition of your queries.
On insert the index has to be maintained, so there is a little extra work that has to be done by the system to rearrange things a bit.

回复收藏 0 原文

~没有更多了~