我应该使用哪种非聚集索引?

发布于 2024-09-02 12:48:54 字数 1934 浏览 5 评论 0原文

这里我正在研究SQL Server Management Studio上的非聚集索引。

我创建了一个包含超过 100 万条记录的表。该表有一个主键。

CREATE TABLE [dbo].[Customers](
    [CustomerId] [int] IDENTITY(1,1) NOT NULL,
    [CustomerName] [varchar](100) NOT NULL,
    [Deleted] [bit] NOT NULL,
    [Active] [bit] NOT NULL,
 CONSTRAINT [PK_Customers] PRIMARY KEY CLUSTERED 
(
    [CustomerId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF,     ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

这是我将用来查看显示的执行计划的查询:

SELECT CustomerName FROM Customers

好吧,在没有附加非聚集索引的情况下执行此命令,它会导致执行计划向我显示:

I/O cost = 3.45646
Operator cost = 4.57715

现在我正在尝试查看是否可以为了提高性能,所以我为此表创建了一个非聚集索引:

1) 第一个非聚集索引

CREATE NONCLUSTERED INDEX [IX_CustomerID_CustomerName] ON [dbo].[Customers] 
(
    [CustomerId] ASC,
    [CustomerName] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF,
IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON,
ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

再次对 Customers 表执行 select,执行计划向我显示:

I/O cost = 2.79942
Operator cost = 3.92001

看起来更好。现在我删除了刚刚创建的非聚集索引,以便创建一个新索引:

2) 第一个非聚集索引

CREATE NONCLUSTERED INDEX [IX_CustomerIDIncludeCustomerName] ON [dbo].[Customers] 
(
    [CustomerId] ASC
)
INCLUDE ( [CustomerName]) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF,  
SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF,   
ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

使用这个新的非聚集索引,我执行了 select 语句再次执行计划向我显示相同结果:

I/O cost = 2.79942
Operator cost = 3.92001

那么,我应该使用哪个非聚集索引? 为什么 I/O 和 Operator 的执行计划成本相同? 我做错了什么还是这是预期的?

谢谢

Here I am studying nonclustered indexes on SQL Server Management Studio.

I've created a table with more than 1 million records. This table has a primary key.

CREATE TABLE [dbo].[Customers](
    [CustomerId] [int] IDENTITY(1,1) NOT NULL,
    [CustomerName] [varchar](100) NOT NULL,
    [Deleted] [bit] NOT NULL,
    [Active] [bit] NOT NULL,
 CONSTRAINT [PK_Customers] PRIMARY KEY CLUSTERED 
(
    [CustomerId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF,     ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

This is the query I'll be using to see what execution plan is showing:

SELECT CustomerName FROM Customers

Well, executing this command with no additional non-clustered index, it leads the execution plan to show me:

I/O cost = 3.45646
Operator cost = 4.57715

Now I'm trying to see if it's possible to improve performance, so I've created a non-clustered index for this table:

1) First non-clustered index

CREATE NONCLUSTERED INDEX [IX_CustomerID_CustomerName] ON [dbo].[Customers] 
(
    [CustomerId] ASC,
    [CustomerName] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF,
IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON,
ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

Executing again the select against Customers table, the execution plan shows me:

I/O cost = 2.79942
Operator cost = 3.92001

It seems better. Now I've deleted this just created non-clustered index, in order to create a new one:

2) First non-clustered index

CREATE NONCLUSTERED INDEX [IX_CustomerIDIncludeCustomerName] ON [dbo].[Customers] 
(
    [CustomerId] ASC
)
INCLUDE ( [CustomerName]) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF,  
SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF,   
ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

With this new non-clustered index, I've executed the select statement again and the execution plan shows me the same result:

I/O cost = 2.79942
Operator cost = 3.92001

So, which non-clustered index should I use?
Why the costs are the same on execution plan for I/O and Operator?
Am I doing something wrong or this is expected?

thank you

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

幸福不弃 2024-09-09 12:48:54

这是因为“CustomerName”在第二个索引中被 INCLUDE 编辑(请参阅 关于已包含的列)。

基本上,两个索引的工作方式完全相同 - 它们是第一个索引列与 WHERE 子句不匹配的覆盖索引。

这意味着这两种情况下的查询都将扫描索引但不接触表。

我希望对于该特定查询性能更好的索引将是仅针对 CustomerName 的索引。

This is because of "CustomerName" being INCLUDE-ed in the second index (see this about INLCUDEd columns).

Basically, BOTH indexes work the same exact way for you - they are covered indexes with the first index column NOT matching the WHERE clause.

Meaning that the query in both cases will be scanning the index but NOT touching the table.

The index that I'd expect ti be better performing for that specific query would be an index on CustomerName alone.

软糖 2024-09-09 12:48:54

在通过使用 WHERE 过滤结果、使用 ORDER 对结果进行排序或将结果连接到索引列上的另一个表来使用索引之前,您不会注意到有或没有索引的太大差异。

尝试在没有索引的情况下执行如下查询:

SELECT *
FROM Customers
WHERE CustomerName = 'Marcus Adams'

然后在 CustomerName 列上添加索引并重试。

您还需要表中有足够的行,以便数据库系统实际使用索引,并且足够多的行,以便您会注意到扫描行和使用索引之间的区别。

You're not going to notice much difference with or without indexes until you make use of the index by filtering the results using WHERE, ordering the results with ORDER, or joining the results to another table on the indexed column.

Try doing a query like this without an index:

SELECT *
FROM Customers
WHERE CustomerName = 'Marcus Adams'

Then add an index on the CustomerName column and try again.

You'll also need enough rows in the table that the database system will actually use the index, and enough that you'll notice the difference between scanning the rows and using the index.

第七度阳光i 2024-09-09 12:48:54

实际上,您的两个非聚集索引都没有多大意义。

要点是:聚集索引的列(在您的例子中为 CustomerId)- 已包含在您拥有的每个非聚集索引的每个条目中。毕竟,如果找到条目,则该聚类列将用于实际数据查找。因此,将其添加到非聚集索引通常是多余的,而且只是浪费空间。

问题更多:如何选择想要显示的行? WHERE 子句中将显示哪些列?

如果您在这里找到一个模式(例如您总是按City进行选择),那么满足您需求的非聚集索引就是

CREATE NONCLUSTERED INDEX [IX_Customer_City] ON [dbo].[Customers] 
(
    [City] ASC
)
INCLUDE ( [CustomerName]) 

这样,您为SQL Server提供了一种方法来轻松查找匹配的行给定的城市,并包括您想要返回的列 (CustomerName),允许 SQL Server 直接从索引页获取必要的信息(使其成为所谓的覆盖索引< /code> - 它涵盖您的查询,例如返回所需的所有信息) - 您不需要执行“书签查找”,例如从实际数据页面获取整个 Customer 数据行(通过 CustomerId 找到它,它也位于非聚集索引中,因为它是聚集键)。

Neither of your two nonclustered indices makes a lot of sense, really.

The point is this: the column(s) of the clustered index - in your case CustomerId - is already included in every single entry of every single non-clustered index you have. That clustering column after all is what is used for the actual data lookup, if an entry is found. So adding that to a non-clustered index typically is superfluous and just a waste of space.

The question is more: how do you select those rows that you want to have displayed? What columns will show up in the WHERE clause?

If you find a pattern here (e.g. you always select by e.g. City), then the non-clustered index to suit your needs would be

CREATE NONCLUSTERED INDEX [IX_Customer_City] ON [dbo].[Customers] 
(
    [City] ASC
)
INCLUDE ( [CustomerName]) 

That way, you give SQL Server a method to easily find rows that match a given city, and including the column that you want to have returned (CustomerName) allows SQL Server to get the necessary information directly from the index page (making it a so-called covering index - it covers your query, e.g. returns all the information needed) - you won't need to do a "bookmark lookup", e.g. fetch the entire Customer data row from the actual data pages (finding it via the CustomerId which is in the non-clustered index, too, since it's the clustering key).

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文