SQL Server 索引 - 非常大的表,带有针对非常小范围值的 where 子句 - 我是否需要为 where 子句建立索引?

发布于 2024-12-23 17:28:33 字数 1854 浏览 0 评论 0原文

我正在设计一个带有单个表的数据库,用于我需要实施解决方案的特殊场景。不久之后,该表将拥有数亿行,但每行将相当紧凑。即使有很多行,我也需要插入、更新和选择速度又好又快,所以我需要为作业选择最佳索引。

我的表如下所示:

create table dbo.Domain
(
    Name varchar(255) not null,
    MetricType smallint not null, -- very small range of values, maybe 10-20 at most
    Priority smallint not null, -- extremely small range of values, generally 1-4
    DateToProcess datetime not null,
    DateProcessed datetime null,

    primary key(Name, MetricType)
);

选择查询将如下所示:

select Name from Domain
where MetricType = @metricType
    and DateProcessed is null
    and DateToProcess < GETUTCDATE()
order by Priority desc, DateToProcess asc

第一种更新类型将如下所示:

merge into Domain as target
using @myTablePrm as source
on source.Name = target.Name
    and source.MetricType = target.MetricType
when matched then
    update set
        DateToProcess = source.DateToProcess,
        Priority = source.Priority,
        DateProcessed = case -- set to null if DateToProcess is in the future
            when DateToProcess < DateProcessed then DateProcessed
            else null end
when not matched then
    insert (Name, MetricType, Priority, DateToProcess)
    values (source.Name, source.MetricType, source.Priority, source.DateToProcess);

第二种更新类型将如下所示:

update Domain
set DateProcessed = source.DateProcessed
from @myTablePrm source
where Name = source.Name and MetricType = @metricType

这些是实现最佳插入、更新和选择速度的最佳索引吗?

-- for the order by clause in the select query
create index IX_Domain_PriorityQueue
    on Domain(Priority desc, DateToProcess asc)
    where DateProcessed is null;

-- for the where clause in the select query
create index IX_Domain_MetricType
    on Domain(MetricType asc);

I am designing a database with a single table for a special scenario I need to implement a solution for. The table will have several hundred million rows after a short time, but each row will be fairly compact. Even when there are a lot of rows, I need insert, update and select speeds to be nice and fast, so I need to choose the best indexes for the job.

My table looks like this:

create table dbo.Domain
(
    Name varchar(255) not null,
    MetricType smallint not null, -- very small range of values, maybe 10-20 at most
    Priority smallint not null, -- extremely small range of values, generally 1-4
    DateToProcess datetime not null,
    DateProcessed datetime null,

    primary key(Name, MetricType)
);

A select query will look like this:

select Name from Domain
where MetricType = @metricType
    and DateProcessed is null
    and DateToProcess < GETUTCDATE()
order by Priority desc, DateToProcess asc

The first type of update will look like this:

merge into Domain as target
using @myTablePrm as source
on source.Name = target.Name
    and source.MetricType = target.MetricType
when matched then
    update set
        DateToProcess = source.DateToProcess,
        Priority = source.Priority,
        DateProcessed = case -- set to null if DateToProcess is in the future
            when DateToProcess < DateProcessed then DateProcessed
            else null end
when not matched then
    insert (Name, MetricType, Priority, DateToProcess)
    values (source.Name, source.MetricType, source.Priority, source.DateToProcess);

The second type of update will look like this:

update Domain
set DateProcessed = source.DateProcessed
from @myTablePrm source
where Name = source.Name and MetricType = @metricType

Are these the best indexes for optimal insert, update and select speed?

-- for the order by clause in the select query
create index IX_Domain_PriorityQueue
    on Domain(Priority desc, DateToProcess asc)
    where DateProcessed is null;

-- for the where clause in the select query
create index IX_Domain_MetricType
    on Domain(MetricType asc);

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

戈亓 2024-12-30 17:28:33

观察:

  • 您的更新应该使用 PK
  • 为什么不使用tinyint(范围0-255)来使行更窄?
  • 您需要日期时间吗?你可以使用smalledatetime吗?

想法:

  • 您的 SELECT 查询没有索引来覆盖它。您需要一个 (DateToProcess, MetricType, Priority DESC) INCLUDE (Name) WHERE DateProcessed IS NULL
    `:您必须尝试键列顺序才能获得最佳顺序

  • 您可以扩展该索引,以便也为每个 MetricType 提供过滤索引(保留 DateProcessed IS NULL 过滤器)。当我确实有数百万行要测试时,我会在另一行之后执行此操作

Observations:

  • Your updates should use the PK
  • Why not use tinyint (range 0-255) to make the rows even narrower?
  • Do you need datetime? Can you use smalledatetime?

Ideas:

  • Your SELECT query doesn't have an index to cover it. You need one on (DateToProcess, MetricType, Priority DESC) INCLUDE (Name) WHERE DateProcessed IS NULL
    `: you'll have to experiment with key column order to get the best one

  • You could extent that index to have a filtered indexes per MetricType too (keeping DateProcessed IS NULL filter). I'd do this after the other one when I do have millions of rows to test with

北凤男飞 2024-12-30 17:28:33

我怀疑您的最佳性能来自于在 PriorityMetricType 上没有索引。对于索引来说,基数可能太低,无法发挥多大作用。

DateToProcess 上的索引几乎肯定会有所帮助,因为该列中的基数很高,并且它用于 WHEREORDER BY > 条款。我首先从那开始。

DateProcessed 上的索引是否有帮助还有待讨论。这取决于您期望此列中 NULL 值的百分比。与往常一样,最好的选择是使用一些真实数据检查查询计划。

I suspect that your best performance will come from having no indexes on Priority and MetricType. The cardinality is likely too low for the indexes to do much good.

An index on DateToProcess will almost certainly help, as there is lilely to be high cardinality in that column and it is used in a WHERE and ORDER BY clause. I would start with that first.

Whether an index on DateProcessed will help is up for debate. That depends on what percentage of NULL values you expect for this column. Your best bet, as usual, is to examine the query plan with some real data.

太傻旳人生 2024-12-30 17:28:33

在表架构部分中,您已突出显示“MetricType”是两个主键之一,因此它肯定应该与“名称”列一起建立索引。至于“Priority”和“DateToProcess”字段,因为它们将出现在 where 子句中,将它们编入索引不会有什么坏处,但我不建议“DateProcessed”索引上的 where 子句为 null ,仅对一组数据建立索引不是一个好主意,请将其删除并为所有这些列建立索引。

In the table schema section, you have highlighted that 'MetricType' is one of two Primary keys, therefore this should definately be indexed along with the Name column. As for the 'Priority' and 'DateToProcess' fields as these will be present in a where clause it can't hurt to have them indexed also but I don't recommend the where clause you have on that index of 'DateProcessed' is null, indexing just a set of the data is not a good idea, remove this and index the whole of both those columns.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文