聚集索引和非聚集索引有什么区别?

发布于 2024-07-04 12:45:05 字数 49 浏览 6 评论 0原文

聚集索引非聚集索引有什么区别?

What are the differences between a clustered and a non-clustered index?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(13

淡看悲欢离合 2024-07-11 12:45:05

聚集索引

  • 聚集索引检索速度更快,插入速度更慢
    并更新。
  • 一张表只能有一个聚集索引。
  • 不需要额外的空间来存储逻辑结构。
  • 确定数据在磁盘上的存储顺序。

非聚集索引

  • 非聚集索引检索数据的速度较慢,但​​检索数据的速度较快
    插入和更新。

  • 一个表可以有多个非聚集索引。

  • 需要额外的空间来存储逻辑结构。

  • 不影响磁盘上存储数据的顺序。

Clustered Indexes

  • Clustered Indexes are faster for retrieval and slower for insertion
    and update.
  • A table can have only one clustered index.
  • Don't require extra space to store logical structure.
  • Determines the order of storing the data on the disk.

Non-Clustered Indexes

  • Non-clustered indexes are slower in retrieving data and faster in
    insertion and update.

  • A table can have multiple non-clustered indexes.

  • Require extra space to store logical structure.

  • Has no effect of order of storing data on the disk.

甚是思念 2024-07-11 12:45:05

// 复制自MSDN,第二点非聚集索引其他答案中没有明确提及。

聚集

  • 聚集索引对表或视图中的数据行进行排序和存储
    基于他们的关键价值观。 这些是包含在
    索引定义。 每个表只能有一个聚集索引,
    因为数据行本身只能以一种顺序存储。
  • 表中数据行唯一按排序顺序存储的时间是
    当表包含聚集索引时。 当一个表有一个
    有聚集索引的表称为聚集表。 如果一张桌子
    没有聚集索引,其数据行存储在无序中
    称为堆的结构。

非聚集

  • 非聚集索引具有独立于数据行的结构。 A
    非聚集索引包含非聚集索引键值和
    每个键值条目都有一个指向包含该键值的数据行的指针
    核心价值。
  • 从非聚集索引中的索引行到数据行的指针
    称为行定位器。 行定位器的结构取决于
    数据页是否存储在堆或聚簇表中。
    对于堆来说,行定位器是指向行的指针。 对于一个集群
    表中,行定位器是聚集索引键。

// Copied from MSDN, the second point of non-clustered index is not clearly mentioned in the other answers.

Clustered

  • Clustered indexes sort and store the data rows in the table or view
    based on their key values. These are the columns included in the
    index definition. There can be only one clustered index per table,
    because the data rows themselves can be stored in only one order.
  • The only time the data rows in a table are stored in sorted order is
    when the table contains a clustered index. When a table has a
    clustered index, the table is called a clustered table. If a table
    has no clustered index, its data rows are stored in an unordered
    structure called a heap.

Nonclustered

  • Nonclustered indexes have a structure separate from the data rows. A
    nonclustered index contains the nonclustered index key values and
    each key value entry has a pointer to the data row that contains the
    key value.
  • The pointer from an index row in a nonclustered index to a data row
    is called a row locator. The structure of the row locator depends on
    whether the data pages are stored in a heap or a clustered table.
    For a heap, a row locator is a pointer to the row. For a clustered
    table, the row locator is the clustered index key.
千鲤 2024-07-11 12:45:05

您可能已经了解了上述帖子中的理论部分:

-聚集索引,因为我们可以直接看到点来记录,即它是直接的,因此搜索所需的时间更少。 此外,它不会占用任何额外的内存/空间来存储索引

- 而在非聚集索引中,由于其间接性质,它间接指向聚集索引,然后它将访问实际记录访问需要更多时间。而且它需要自己的内存/空间来存储索引

在此处输入图像描述

You might have gone through theory part from the above posts:

-The clustered Index as we can see points directly to record i.e. its direct so it takes less time for a search. Additionally it will not take any extra memory/space to store the index

-While, in non-clustered Index, it indirectly points to the clustered Index then it will access the actual record, due to its indirect nature it will take some what more time to access.Also it needs its own memory/space to store the index

enter image description here

我ぃ本無心為│何有愛 2024-07-11 12:45:05

聚集索引本质上是索引列中数据的排序副本。

聚集索引的主要优点是,当您的查询(查找)在索引中定位数据时,不需要额外的 IO 来检索该数据。

维护聚集索引的开销(尤其是在频繁更新的表中)可能会导致性能不佳,因此最好创建非聚集索引。

A clustered index is essentially a sorted copy of the data in the indexed columns.

The main advantage of a clustered index is that when your query (seek) locates the data in the index then no additional IO is needed to retrieve that data.

The overhead of maintaining a clustered index, especially in a frequently updated table, can lead to poor performance and for that reason it may be preferable to create a non-clustered index.

晨与橙与城 2024-07-11 12:45:05

索引数据库有两部分:一组以任意顺序排列的物理记录,以及一组索引,它们标识应读取记录以产生按某种标准排序的结果的顺序。 如果物理排列和索引之间没有相关性,那么按顺序读出所有记录可能需要进行大量独立的单记录读取操作。 由于数据库可能能够在比读取两个非连续记录更短的时间内读取数十个连续记录,因此如果索引中连续的记录也连续存储在磁盘上,则性能可能会得到提高。 指定索引是聚集的将导致数据库做出一些努力(不同的数据库在多少方面有所不同)来排列事物,以便索引中连续的记录组在磁盘上也是连续的。

例如,如果从一个空的非集群数据库开始并以随机顺序添加 10,000 条记录,则这些记录可能会按照添加顺序添加到末尾。 按索引顺序读取数据库需要 10,000 次单记录读取。 然而,如果使用集群数据库,系统可能会在添加每条记录时检查前一条记录是否是自己存储的; 如果发现情况如此,它可能会将该记录与新记录一起写入数据库末尾。 然后,它可以查看移动记录曾经驻留的槽之前的物理记录,并查看其后的记录是否是单独存储的。 如果发现情况确实如此,它可以将该记录移至该位置。 使用这种方法会导致许多记录成对分组在一起,从而可能使顺序读取速度几乎翻倍。

实际上,集群数据库使用比这更复杂的算法。 不过,需要注意的一个关键问题是,更新数据库所需的时间和顺序读取数据库所需的时间之间存在权衡。 维护集群数据库将显着增加以任何影响排序顺序的方式添加、删除或更新记录所需的工作量。 如果数据库的顺序读取频率远高于更新频率,则集群可能是一个巨大的胜利。 如果经常更新但很少按顺序读取,则集群可能会造成很大的性能消耗,特别是如果将项目添加到数据库的顺序与其在集群索引方面的排序顺序无关。

An indexed database has two parts: a set of physical records, which are arranged in some arbitrary order, and a set of indexes which identify the sequence in which records should be read to yield a result sorted by some criterion. If there is no correlation between the physical arrangement and the index, then reading out all the records in order may require making lots of independent single-record read operations. Because a database may be able to read dozens of consecutive records in less time than it would take to read two non-consecutive records, performance may be improved if records which are consecutive in the index are also stored consecutively on disk. Specifying that an index is clustered will cause the database to make some effort (different databases differ as to how much) to arrange things so that groups of records which are consecutive in the index will be consecutive on disk.

For example, if one were to start with an empty non-clustered database and add 10,000 records in random sequence, the records would likely be added at the end in the order they were added. Reading out the database in order by the index would require 10,000 one-record reads. If one were to use a clustered database, however, the system might check when adding each record whether the previous record was stored by itself; if it found that to be the case, it might write that record with the new one at the end of the database. It could then look at the physical record before the slots where the moved records used to reside and see if the record that followed that was stored by itself. If it found that to be the case, it could move that record to that spot. Using this sort of approach would cause many records to be grouped together in pairs, thus potentially nearly doubling sequential read speed.

In reality, clustered databases use more sophisticated algorithms than this. A key thing to note, though, is that there is a tradeoff between the time required to update the database and the time required to read it sequentially. Maintaining a clustered database will significantly increase the amount of work required to add, remove, or update records in any way that would affect the sorting sequence. If the database will be read sequentially much more often than it will be updated, clustering can be a big win. If it will be updated often but seldom read out in sequence, clustering can be a big performance drain, especially if the sequence in which items are added to the database is independent of their sort order with regard to the clustered index.

錯遇了你 2024-07-11 12:45:05

优点:

聚集索引非常适合范围(例如 select * from my_table where my_key 位于 @min 和 @max 之间)

在某些情况下,如果您使用 orderby 语句,DBMS 将不需要进行排序工作。

缺点:

聚集索引会减慢插入速度,因为如果新键不按顺序排列,则在放入记录时必须修改记录的物理布局。

Pros:

Clustered indexes work great for ranges (e.g. select * from my_table where my_key between @min and @max)

In some conditions, the DBMS will not have to do work to sort if you use an orderby statement.

Cons:

Clustered indexes are can slow down inserts because the physical layouts of the records have to be modified as records are put in if the new keys are not in sequential order.

世俗缘 2024-07-11 12:45:05

集群基本上意味着数据按照表中的物理顺序排列。 这就是为什么每桌只能吃一份。

非聚集意味着它“只是”一个逻辑顺序。

Clustered basically means that the data is in that physical order in the table. This is why you can have only one per table.

Unclustered means it's "only" a logical order.

孤芳又自赏 2024-07-11 12:45:05

聚集索引实际上描述了记录在磁盘上物理存储的顺序,因此只能有一个索引。

非聚集索引定义的逻辑顺序与磁盘上的物理顺序不匹配。

A clustered index actually describes the order in which records are physically stored on the disk, hence the reason you can only have one.

A Non-Clustered Index defines a logical order that does not match the physical order on disk.

独﹏钓一江月 2024-07-11 12:45:05

聚集索引

  • 一张表中只能有一个聚集索引
  • 对记录进行排序并按照顺序进行物理存储
  • 数据检索比非聚集索引快
  • 不需要额外的空间来存储逻辑结构

非聚集索引

  • 可以有任意多个表中的非聚集索引
  • 不影响物理顺序。 为数据行创建逻辑顺序并使用指向物理数据文件的指针
  • 数据插入/更新比聚集索引更快
  • 使用额外的空间来存储逻辑结构

除了这些差异之外,您还必须知道当表是非聚集时(当表不聚集时)没有聚集索引)数据文件是无序的,它使用堆数据结构作为数据结构。

Clustered Index

  • Only one clustered index can be there in a table
  • Sort the records and store them physically according to the order
  • Data retrieval is faster than non-clustered indexes
  • Do not need extra space to store logical structure

Non Clustered Index

  • There can be any number of non-clustered indexes in a table
  • Do not affect the physical order. Create a logical order for data rows and use pointers to physical data files
  • Data insertion/update is faster than clustered index
  • Use extra space to store logical structure

Apart from these differences you have to know that when table is non-clustered (when the table doesn't have a clustered index) data files are unordered and it uses Heap data structure as the data structure.

骑趴 2024-07-11 12:45:05

聚集索引物理存储在表上。 这意味着它们是最快的,并且每个表只能有一个聚集索引。

非聚集索引是单独存储的,您可以拥有任意数量的索引。

最好的选择是在最常用的唯一列(通常是 PK)上设置聚集索引。 您的表中应该始终有一个经过精心挑选的聚集索引,除非出现一个非常令人信服的原因(想不出一个原因,但是嘿,它可能就在那里)不这样做。

Clustered indexes are stored physically on the table. This means they are the fastest and you can only have one clustered index per table.

Non-clustered indexes are stored separately, and you can have as many as you want.

The best option is to set your clustered index on the most used unique column, usually the PK. You should always have a well selected clustered index in your tables, unless a very compelling reason--can't think of a single one, but hey, it may be out there--for not doing so comes up.

总攻大人 2024-07-11 12:45:05

聚集索引

  1. 一张表只能有一个聚集索引。
  2. 通常在主键上进行。
  3. 聚集索引的叶节点包含数据页。

非聚集索引

  1. 一张表只能有249个非聚集索引(直到sql version 2005以后版本最多支持999个非聚集索引)。
  2. 通常在任意键上进行。
  3. 非聚集索引的叶节点不包含数据页。 相反,叶节点包含索引行。

Clustered Index

  1. There can be only one clustered index for a table.
  2. Usually made on the primary key.
  3. The leaf nodes of a clustered index contain the data pages.

Non-Clustered Index

  1. There can be only 249 non-clustered indexes for a table(till sql version 2005 later versions support upto 999 non-clustered indexes).
  2. Usually made on the any key.
  3. The leaf node of a nonclustered index does not consist of the data pages. Instead, the leaf nodes contain index rows.
皇甫轩 2024-07-11 12:45:05

聚集索引对磁盘上的数据进行物理排序。 这意味着索引不需要额外的数据,但只能有一个聚集索引(显然)。 使用聚集索引访问数据是最快的。

所有其他索引必须是非聚集索引。 非聚集索引具有来自索引列的数据的副本,这些数据与指向实际数据行的指针保持有序(如果存在则指向聚集索引)。 这意味着通过非聚集索引访问数据必须经过额外的间接层。 但是,如果您仅选择索引列中可用的数据,则可以直接从重复的索引数据中获取数据(这就是为什么最好只选择您需要的列而不使用 *)

Clustered indexes physically order the data on the disk. This means no extra data is needed for the index, but there can be only one clustered index (obviously). Accessing data using a clustered index is fastest.

All other indexes must be non-clustered. A non-clustered index has a duplicate of the data from the indexed columns kept ordered together with pointers to the actual data rows (pointers to the clustered index if there is one). This means that accessing data through a non-clustered index has to go through an extra layer of indirection. However if you select only the data that's available in the indexed columns you can get the data back directly from the duplicated index data (that's why it's a good idea to SELECT only the columns that you need and not use *)

凉栀 2024-07-11 12:45:05

聚集索引

  • 每个表仅一个 比
  • 非聚集索引读取速度更快,因为数据按索引顺序物理存储

非聚集索引

  • 每个表可以使用多次
  • 插入和更新操作比聚集索引更快

这两种类型的索引都会提高选择数据时的性能使用索引但会减慢更新和插入操作的字段。

由于插入和更新速度较慢,应在通常增量的字段(即 Id 或 Timestamp)上设置聚集索引。

如果索引的选择性高于 95%,SQL Server 通常只会使用索引。

Clustered Index

  • Only one per table
  • Faster to read than non clustered as data is physically stored in index order

Non Clustered Index

  • Can be used many times per table
  • Quicker for insert and update operations than a clustered index

Both types of index will improve performance when select data with fields that use the index but will slow down update and insert operations.

Because of the slower insert and update clustered indexes should be set on a field that is normally incremental ie Id or Timestamp.

SQL Server will normally only use an index if its selectivity is above 95%.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文