使用索引查询排序速度非常慢

发布于 2025-01-15 06:45:42 字数 2577 浏览 1 评论 0原文

我有一个大约有 350 万行的数据库表。该表保存合同数据记录,其中包含金额、日期和一些与其他表相关的 ID(VendorId、AgencyId、StateId),这是数据库表:

CREATE TABLE [dbo].[VendorContracts]
(
    [Id] [uniqueidentifier] NOT NULL,   
    [ContractDate] [datetime2](7) NOT NULL,
    [ContractAmount] [decimal](19, 4) NULL, 
    [VendorId] [uniqueidentifier] NOT NULL,
    [AgencyId] [uniqueidentifier] NOT NULL,
    [StateId] [uniqueidentifier] NOT NULL,

    [CreatedBy] [nvarchar](max) NULL,
    [CreatedDate] [datetime2](7) NOT NULL,
    [LastModifiedBy] [nvarchar](max) NULL,
    [LastModifiedDate] [datetime2](7) NULL,
    [IsActive] [bit] NOT NULL,

    CONSTRAINT [PK_VendorContracts] 
        PRIMARY KEY CLUSTERED ([Id] ASC)
                WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, 
                      OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

我的网站上有一个页面,用户可以在其中过滤分页网格按 VendorIdContractDate 排序,并按 ContractAmountContractDate 排序。这是 EF Core 在按 ContractAmount 为这个拥有超过一百万条记录的特定供应商排序时生成的查询:

DECLARE @__vendorId_0 uniqueIdentifier = 'f39c7198-b05a-477e-b7bc-cb189c5944c0';
DECLARE @__startDate_1 datetime2 = '2017-01-01T07:00:00.0000000';
DECLARE @__endDate_2 datetime2 = '2018-01-02T06:59:59.0000000';
DECLARE @__p_3 int = 0;
DECLARE @__p_4 int = 50;

SELECT [v].[Id], [v].[AdminFee], [v].[ContractAmount], [v].[ContractDate], [v].[PONumber], [v].[PostalCode], [v].[AgencyId], [v].[StateId], [v].[VendorId]
FROM [VendorContracts] AS [v]
WHERE (([v].[VendorId] = @__vendorId_0) AND ([v].[ContractDate] >= @__startDate_1)) AND ([v].[ContractDate] <= @__endDate_2)
ORDER BY [v].[ContractAmount] ASC
OFFSET @__p_3 ROWS FETCH NEXT @__p_4 ROWS ONLY

当我运行此查询时,无论是排序 ASC 还是排序,都需要 50 秒DESC 或偏移数千,始终为 50 秒。

如果我查看我的执行计划,我会发现它确实使用了我的索引,但排序成本导致查询花费如此长的时间

在此处输入图像描述

这是我的索引:

CREATE NONCLUSTERED INDEX [IX_VendorContracts_VendorIdAndContractDate] ON [dbo].[VendorContracts]
(
    [VendorId] ASC,
    [ContractDate] DESC
)
INCLUDE([ContractAmount],[AdminFee],[PONumber],[PostalCode],[AgencyId],[StateId]) 
WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF)

奇怪的是我有一个相似索引按 ContractDate 排序,并且即使在拥有数百万条记录的供应商上,也能在不到一秒的时间内返回结果。

我的索引有问题吗?或者按 decimal 数据类型排序是否非常密集?

I have a database table with about 3.5 million rows. The table holds contract data records, with an amount, a date, and some IDs related to other tables (VendorId, AgencyId, StateId), this is the database table:

CREATE TABLE [dbo].[VendorContracts]
(
    [Id] [uniqueidentifier] NOT NULL,   
    [ContractDate] [datetime2](7) NOT NULL,
    [ContractAmount] [decimal](19, 4) NULL, 
    [VendorId] [uniqueidentifier] NOT NULL,
    [AgencyId] [uniqueidentifier] NOT NULL,
    [StateId] [uniqueidentifier] NOT NULL,

    [CreatedBy] [nvarchar](max) NULL,
    [CreatedDate] [datetime2](7) NOT NULL,
    [LastModifiedBy] [nvarchar](max) NULL,
    [LastModifiedDate] [datetime2](7) NULL,
    [IsActive] [bit] NOT NULL,

    CONSTRAINT [PK_VendorContracts] 
        PRIMARY KEY CLUSTERED ([Id] ASC)
                WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, 
                      OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

I have a page on my site where the user can filter a paged grid by VendorId and ContractDate, and sort by the ContractAmount or ContractDate. This is the query that EF Core produces when sorting by ContractAmount for this particular vendor that has over a million records:

DECLARE @__vendorId_0 uniqueIdentifier = 'f39c7198-b05a-477e-b7bc-cb189c5944c0';
DECLARE @__startDate_1 datetime2 = '2017-01-01T07:00:00.0000000';
DECLARE @__endDate_2 datetime2 = '2018-01-02T06:59:59.0000000';
DECLARE @__p_3 int = 0;
DECLARE @__p_4 int = 50;

SELECT [v].[Id], [v].[AdminFee], [v].[ContractAmount], [v].[ContractDate], [v].[PONumber], [v].[PostalCode], [v].[AgencyId], [v].[StateId], [v].[VendorId]
FROM [VendorContracts] AS [v]
WHERE (([v].[VendorId] = @__vendorId_0) AND ([v].[ContractDate] >= @__startDate_1)) AND ([v].[ContractDate] <= @__endDate_2)
ORDER BY [v].[ContractAmount] ASC
OFFSET @__p_3 ROWS FETCH NEXT @__p_4 ROWS ONLY

When I run this, it takes 50s, whether sorting ASC or DESC or offsetting by thousands, it's always 50s.

If I look at my Execution Plan, I see that it does use my index, but the Sort Cost is what's making the query take so long

enter image description here

This is my index:

CREATE NONCLUSTERED INDEX [IX_VendorContracts_VendorIdAndContractDate] ON [dbo].[VendorContracts]
(
    [VendorId] ASC,
    [ContractDate] DESC
)
INCLUDE([ContractAmount],[AdminFee],[PONumber],[PostalCode],[AgencyId],[StateId]) 
WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF)

The strange thing is that I have a similar index for sorting by ContractDate, and that one returns results in less than a second, even on the vendor that has millions of records.

Is there something wrong with my index? Or is sorting by a decimal data type just incredibly intensive?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

我的痛♀有谁懂 2025-01-22 06:45:42

的索引

VendorId = @__vendorId_0 and ContractDate BETWEEN @__startDate_1 AND @__endDate_2

您有一个允许精确查找谓词

输入图片此处的描述

SQL Server 估计有 6,657 行将与此谓词匹配并且需要进行排序,因此它会请求适合该行数的内存授予。

实际上,对于您看到问题的参数值,有近 50 万个已排序,并且内存授予不足,并且排序溢出到磁盘。

50 秒处理 10,299 个溢出页面听起来仍然出乎意料地慢,但我认为您很可能使用的是 Azure SQL 数据库中某些非常低的 SKU?

解决该问题的一些可能的解决方案可能是

  1. 强制它使用针对最大供应商和宽日期范围的参数值编译的执行计划(例如使用OPTIMIZE FOR提示)。这对于较小的供应商来说意味着过多的内存授予,但这可能意味着其他查询必须等待内存授予。
  2. 使用OPTION (RECOMPILE),以便针对传递的特定参数值重新编译每次调用。这意味着理论上每次执行都会获得适当的内存授予,但代价是花费更多的编译时间。
  3. 根本不需要排序。如果您在 VendorId, ContractAmount INCLUDE (ContractDate) 上有索引,则可以查找 VendorId = @__vendorId_0 部分,并在 ContractAmount 中读取索引命令。一旦找到 50 行与 ContractDate BETWEEN @__startDate_1 AND @__endDate_2 谓词匹配,查询执行就可以停止。不过,如果没有提示,SQL Server 可能不会选择此执行计划。

我不确定通过 EF 应用查询提示有多容易,但如果您设法让所需的计划出现在那里,您可以考虑通过查询存储强制执行计划。

You have an index that allows the

VendorId = @__vendorId_0 and ContractDate BETWEEN @__startDate_1 AND @__endDate_2

predicate to be seeked exactly.

enter image description here

SQL Server estimates that 6,657 rows will match this predicate and need to be sorted so it requests a memory grant suitable for that amount of rows.

In reality for the parameter values where you see the problem nearly half a million are sorted and the memory grant is insufficient and the sort spills to disc.

50 seconds for 10,299 spilled pages does still sound unexpectedly slow but I assume you may well be on some very low SKU in Azure SQL Database?

Some possible solutions to resolve the issue might be to

  1. Force it to use an execution plan that is compiled for parameter values with your largest vendor and wide date range (e.g. with OPTIMIZE FOR hint). This will mean an excessive memory grant for smaller vendors though which may mean other queries have to incur memory grant waits.
  2. Use OPTION (RECOMPILE) so every invocation is recompiled for the specific parameter values passed. This means in theory every execution will get an appropriate memory grant at the cost of more time spent in compilation.
  3. Remove the need for a sort at all. If you have an index on VendorId, ContractAmount INCLUDE (ContractDate) then the VendorId = @__vendorId_0 part can be seeked and the index read in ContractAmount order. Once 50 rows have been found that match the ContractDate BETWEEN @__startDate_1 AND @__endDate_2 predicate then query execution can stop. SQL Server might not choose this execution plan without hints though.

I'm not sure how easy or otherwise it is to apply query hints through EF but you could look at forcing a plan via query store if you manage to get the desired plan to appear there.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文