使用索引查询排序速度非常慢
我有一个大约有 350 万行的数据库表。该表保存合同数据记录,其中包含金额、日期和一些与其他表相关的 ID(VendorId、AgencyId、StateId),这是数据库表:
CREATE TABLE [dbo].[VendorContracts]
(
[Id] [uniqueidentifier] NOT NULL,
[ContractDate] [datetime2](7) NOT NULL,
[ContractAmount] [decimal](19, 4) NULL,
[VendorId] [uniqueidentifier] NOT NULL,
[AgencyId] [uniqueidentifier] NOT NULL,
[StateId] [uniqueidentifier] NOT NULL,
[CreatedBy] [nvarchar](max) NULL,
[CreatedDate] [datetime2](7) NOT NULL,
[LastModifiedBy] [nvarchar](max) NULL,
[LastModifiedDate] [datetime2](7) NULL,
[IsActive] [bit] NOT NULL,
CONSTRAINT [PK_VendorContracts]
PRIMARY KEY CLUSTERED ([Id] ASC)
WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
我的网站上有一个页面,用户可以在其中过滤分页网格按 VendorId
和 ContractDate
排序,并按 ContractAmount
或 ContractDate
排序。这是 EF Core 在按 ContractAmount
为这个拥有超过一百万条记录的特定供应商排序时生成的查询:
DECLARE @__vendorId_0 uniqueIdentifier = 'f39c7198-b05a-477e-b7bc-cb189c5944c0';
DECLARE @__startDate_1 datetime2 = '2017-01-01T07:00:00.0000000';
DECLARE @__endDate_2 datetime2 = '2018-01-02T06:59:59.0000000';
DECLARE @__p_3 int = 0;
DECLARE @__p_4 int = 50;
SELECT [v].[Id], [v].[AdminFee], [v].[ContractAmount], [v].[ContractDate], [v].[PONumber], [v].[PostalCode], [v].[AgencyId], [v].[StateId], [v].[VendorId]
FROM [VendorContracts] AS [v]
WHERE (([v].[VendorId] = @__vendorId_0) AND ([v].[ContractDate] >= @__startDate_1)) AND ([v].[ContractDate] <= @__endDate_2)
ORDER BY [v].[ContractAmount] ASC
OFFSET @__p_3 ROWS FETCH NEXT @__p_4 ROWS ONLY
当我运行此查询时,无论是排序 ASC
还是排序,都需要 50 秒DESC
或偏移数千,始终为 50 秒。
如果我查看我的执行计划,我会发现它确实使用了我的索引,但排序成本导致查询花费如此长的时间
这是我的索引:
CREATE NONCLUSTERED INDEX [IX_VendorContracts_VendorIdAndContractDate] ON [dbo].[VendorContracts]
(
[VendorId] ASC,
[ContractDate] DESC
)
INCLUDE([ContractAmount],[AdminFee],[PONumber],[PostalCode],[AgencyId],[StateId])
WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF)
奇怪的是我有一个相似索引按 ContractDate
排序,并且即使在拥有数百万条记录的供应商上,也能在不到一秒的时间内返回结果。
我的索引有问题吗?或者按 decimal
数据类型排序是否非常密集?
I have a database table with about 3.5 million rows. The table holds contract data records, with an amount, a date, and some IDs related to other tables (VendorId, AgencyId, StateId), this is the database table:
CREATE TABLE [dbo].[VendorContracts]
(
[Id] [uniqueidentifier] NOT NULL,
[ContractDate] [datetime2](7) NOT NULL,
[ContractAmount] [decimal](19, 4) NULL,
[VendorId] [uniqueidentifier] NOT NULL,
[AgencyId] [uniqueidentifier] NOT NULL,
[StateId] [uniqueidentifier] NOT NULL,
[CreatedBy] [nvarchar](max) NULL,
[CreatedDate] [datetime2](7) NOT NULL,
[LastModifiedBy] [nvarchar](max) NULL,
[LastModifiedDate] [datetime2](7) NULL,
[IsActive] [bit] NOT NULL,
CONSTRAINT [PK_VendorContracts]
PRIMARY KEY CLUSTERED ([Id] ASC)
WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
I have a page on my site where the user can filter a paged grid by VendorId
and ContractDate
, and sort by the ContractAmount
or ContractDate
. This is the query that EF Core produces when sorting by ContractAmount
for this particular vendor that has over a million records:
DECLARE @__vendorId_0 uniqueIdentifier = 'f39c7198-b05a-477e-b7bc-cb189c5944c0';
DECLARE @__startDate_1 datetime2 = '2017-01-01T07:00:00.0000000';
DECLARE @__endDate_2 datetime2 = '2018-01-02T06:59:59.0000000';
DECLARE @__p_3 int = 0;
DECLARE @__p_4 int = 50;
SELECT [v].[Id], [v].[AdminFee], [v].[ContractAmount], [v].[ContractDate], [v].[PONumber], [v].[PostalCode], [v].[AgencyId], [v].[StateId], [v].[VendorId]
FROM [VendorContracts] AS [v]
WHERE (([v].[VendorId] = @__vendorId_0) AND ([v].[ContractDate] >= @__startDate_1)) AND ([v].[ContractDate] <= @__endDate_2)
ORDER BY [v].[ContractAmount] ASC
OFFSET @__p_3 ROWS FETCH NEXT @__p_4 ROWS ONLY
When I run this, it takes 50s, whether sorting ASC
or DESC
or offsetting by thousands, it's always 50s.
If I look at my Execution Plan, I see that it does use my index, but the Sort Cost is what's making the query take so long
This is my index:
CREATE NONCLUSTERED INDEX [IX_VendorContracts_VendorIdAndContractDate] ON [dbo].[VendorContracts]
(
[VendorId] ASC,
[ContractDate] DESC
)
INCLUDE([ContractAmount],[AdminFee],[PONumber],[PostalCode],[AgencyId],[StateId])
WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF)
The strange thing is that I have a similar index for sorting by ContractDate
, and that one returns results in less than a second, even on the vendor that has millions of records.
Is there something wrong with my index? Or is sorting by a decimal
data type just incredibly intensive?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
的索引
您有一个允许精确查找谓词
。
SQL Server 估计有 6,657 行将与此谓词匹配并且需要进行排序,因此它会请求适合该行数的内存授予。
实际上,对于您看到问题的参数值,有近 50 万个已排序,并且内存授予不足,并且排序溢出到磁盘。
50 秒处理 10,299 个溢出页面听起来仍然出乎意料地慢,但我认为您很可能使用的是 Azure SQL 数据库中某些非常低的 SKU?
解决该问题的一些可能的解决方案可能是
OPTIMIZE FOR
提示)。这对于较小的供应商来说意味着过多的内存授予,但这可能意味着其他查询必须等待内存授予。OPTION (RECOMPILE)
,以便针对传递的特定参数值重新编译每次调用。这意味着理论上每次执行都会获得适当的内存授予,但代价是花费更多的编译时间。VendorId, ContractAmount INCLUDE (ContractDate)
上有索引,则可以查找VendorId = @__vendorId_0
部分,并在ContractAmount
中读取索引命令。一旦找到 50 行与ContractDate BETWEEN @__startDate_1 AND @__endDate_2
谓词匹配,查询执行就可以停止。不过,如果没有提示,SQL Server 可能不会选择此执行计划。我不确定通过 EF 应用查询提示有多容易,但如果您设法让所需的计划出现在那里,您可以考虑通过查询存储强制执行计划。
You have an index that allows the
predicate to be seeked exactly.
SQL Server estimates that 6,657 rows will match this predicate and need to be sorted so it requests a memory grant suitable for that amount of rows.
In reality for the parameter values where you see the problem nearly half a million are sorted and the memory grant is insufficient and the sort spills to disc.
50 seconds for 10,299 spilled pages does still sound unexpectedly slow but I assume you may well be on some very low SKU in Azure SQL Database?
Some possible solutions to resolve the issue might be to
OPTIMIZE FOR
hint). This will mean an excessive memory grant for smaller vendors though which may mean other queries have to incur memory grant waits.OPTION (RECOMPILE)
so every invocation is recompiled for the specific parameter values passed. This means in theory every execution will get an appropriate memory grant at the cost of more time spent in compilation.VendorId, ContractAmount INCLUDE (ContractDate)
then theVendorId = @__vendorId_0
part can be seeked and the index read inContractAmount
order. Once 50 rows have been found that match theContractDate BETWEEN @__startDate_1 AND @__endDate_2
predicate then query execution can stop. SQL Server might not choose this execution plan without hints though.I'm not sure how easy or otherwise it is to apply query hints through EF but you could look at forcing a plan via query store if you manage to get the desired plan to appear there.