在更新SQL Server之前检查更改数据时的性能增益

发布于 2025-02-13 03:06:31 字数 2370 浏览 0 评论 0原文

我们在Azure SQL Server上的表上进行了很多插入/更新。

是否有任何首先加载某些列（如果更新）并在进行更新之前与存储的数据相比是否更改的收益？

为了避免记录，当数据相同时，编写和索引更新。

顺便说一句：这是在存储过程中，而参数（此处未显示）是新数据。

附加信息：大多数情况下，将有一个更新，没有更改数据，因为这链接到一个网络钩，该网络钩在我们使用的数据没有更改时多次发射。我想大概有4-5个电话中有1个与数据更改的电话。

'''

DECLARE @CurrentCustomerDisplayName nvarchar(50);
DECLARE @CurrentCancelledAt datetimeoffset(7);
DECLARE @CurrentCustomerId bigint;
DECLARE @CurrentTrackingNo nvarchar(50);
DECLARE @CurrentTags nvarchar(2000);
DECLARE @CurrentToCountryCode nchar(2);


SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

BEGIN TRANSACTION
    -- First load the current row data from the table
    SELECT TOP(1) 
        @CurrentCustomerDisplayName = CustomerDisplayName,
        @CurrentCancelledAt = CancelledAt,
        @CurrentCustomerId = CustomerId,
        @CurrentTrackingNo = TrackingNo,
        @CurrentTags = Tags,
        @CurrentToCountryCode = ToCountryCode
    FROM dbo.Orders with (UPDLOCK)
    where  OrderId = @orderId;

    -- Check if exist
    if @@ROWCOUNT = 0
    BEGIN
        -- Do insert
    END
    ELSE
    BEGIN
        -- Check if data has changed
        IF ((@CurrentCustomerDisplayName IS NULL AND @customerDisplayName IS NULL) OR (@CurrentCustomerDisplayName IS NOT NULL AND @customerDisplayName IS NOT NULL AND @CurrentCustomerDisplayName = @customerDisplayName))
            AND ((@CurrentCancelledAt IS NULL AND @cancelledAt IS NULL) OR (@CurrentCancelledAt IS NOT NULL AND @cancelledAt IS NOT NULL AND @CurrentCancelledAt = @cancelledAt))
            AND ((@CurrentCustomerId IS NULL AND @customerId IS NULL) OR (@CurrentCustomerId IS NOT NULL AND @customerId IS NOT NULL AND @CurrentCustomerId = @customerId))
            AND ((@CurrentTrackingNo IS NULL AND @trackingNo IS NULL) OR (@CurrentTrackingNo IS NOT NULL AND @trackingNo IS NOT NULL AND @CurrentTrackingNo = @trackingNo))
            AND ((@CurrentTags IS NULL AND @tags IS NULL) OR (@CurrentTags IS NOT NULL AND @tags IS NOT NULL AND @CurrentTags = @tags))
            AND ((@CurrentToCountryCode IS NULL AND @toCountryCode IS NULL) OR (@CurrentToCountryCode IS NOT NULL AND @toCountryCode IS NOT NULL AND @CurrentToCountryCode = @toCountryCode))
        BEGIN
            -- Do nothing, identical data
        END
        ELSE
        BEGIN
            -- Do UPDATE
        END
    END
COMMIT

'''

原文

We are doing quite a lot of insert/updates on a table on Azure SQL Server.

Is there any gain of loading some columns first (if update) and checking if they have changed compared to the stored data before doing an UPDATE?

To avoid logging, writing and index updates when the data is identical.

BTW: This is in a stored procedure, and the parameters (not shown here) are the new data.

Additional info: Most of the time there will be an UPDATE with no data changed because this is linked to a webhook which fire many times when the data we use is not changed. I would guess maybe 1 in 4-5 calls are with changed data.

'''

DECLARE @CurrentCustomerDisplayName nvarchar(50);
DECLARE @CurrentCancelledAt datetimeoffset(7);
DECLARE @CurrentCustomerId bigint;
DECLARE @CurrentTrackingNo nvarchar(50);
DECLARE @CurrentTags nvarchar(2000);
DECLARE @CurrentToCountryCode nchar(2);


SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

BEGIN TRANSACTION
    -- First load the current row data from the table
    SELECT TOP(1) 
        @CurrentCustomerDisplayName = CustomerDisplayName,
        @CurrentCancelledAt = CancelledAt,
        @CurrentCustomerId = CustomerId,
        @CurrentTrackingNo = TrackingNo,
        @CurrentTags = Tags,
        @CurrentToCountryCode = ToCountryCode
    FROM dbo.Orders with (UPDLOCK)
    where  OrderId = @orderId;

    -- Check if exist
    if @@ROWCOUNT = 0
    BEGIN
        -- Do insert
    END
    ELSE
    BEGIN
        -- Check if data has changed
        IF ((@CurrentCustomerDisplayName IS NULL AND @customerDisplayName IS NULL) OR (@CurrentCustomerDisplayName IS NOT NULL AND @customerDisplayName IS NOT NULL AND @CurrentCustomerDisplayName = @customerDisplayName))
            AND ((@CurrentCancelledAt IS NULL AND @cancelledAt IS NULL) OR (@CurrentCancelledAt IS NOT NULL AND @cancelledAt IS NOT NULL AND @CurrentCancelledAt = @cancelledAt))
            AND ((@CurrentCustomerId IS NULL AND @customerId IS NULL) OR (@CurrentCustomerId IS NOT NULL AND @customerId IS NOT NULL AND @CurrentCustomerId = @customerId))
            AND ((@CurrentTrackingNo IS NULL AND @trackingNo IS NULL) OR (@CurrentTrackingNo IS NOT NULL AND @trackingNo IS NOT NULL AND @CurrentTrackingNo = @trackingNo))
            AND ((@CurrentTags IS NULL AND @tags IS NULL) OR (@CurrentTags IS NOT NULL AND @tags IS NOT NULL AND @CurrentTags = @tags))
            AND ((@CurrentToCountryCode IS NULL AND @toCountryCode IS NULL) OR (@CurrentToCountryCode IS NOT NULL AND @toCountryCode IS NOT NULL AND @CurrentToCountryCode = @toCountryCode))
        BEGIN
            -- Do nothing, identical data
        END
        ELSE
        BEGIN
            -- Do UPDATE
        END
    END
COMMIT

'''

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

苄①跕圉湢 2025-02-20 03:06:32

通过不更新不必要的行，可以获得性能的增长，但是您可能会通过将过程分为多个语句而失去此性能。

使用当前的代码，假设大多数操作是插入或更新（带有更改），您将始终在orders表（Check + UPSERT）上至少执行两个查询。您可以从插入物开始，然后对此进行检查，以便如果要插入所需的操作，则只需一个查询即可比两个查询更有效。例如，

SET XACT_ABORT ON;
BEGIN TRANSACTION

INSERT dbo.Orders (CustomerDisplayName, CancelledAt, CustomerId, TrackingNo, Tags, ToCountryCode)
SELECT  @CustomerDisplayName, @CancelledAt, @CustomerId, @TrackingNo, @Tags, @CurrentToCountryCode
WHERE   NOT EXISTS (SELECT 1 FROM dbo.Orders WITH (UPDLOCK, SERIALIZABLE) WHERE OrderId = @orderId)

IF @@ROWCOUNT = 0
BEGIN
    UPDATE  dbo.Orders
    SET     CustomerDisplayName = @CustomerDisplayName,
            CancelledAt = @CancelledAt,
            CustomerId = @CustomerId,
            TrackingNo = @TrackingNo,
            Tags= @Tags,
            ToCountryCode = @ToCountryCode
    WHERE   OrderId = @orderId
    AND     NOT EXISTS 
            (   SELECT  CustomerDisplayName, CancelledAt, CustomerId, TrackingNo, Tags, ToCountryCode
                INTERSECT
                SELECT  @CustomerDisplayName, @CancelledAt, @CustomerId, @TrackingNo, @Tags, @CurrentToCountryCode
            );
END

COMMIT TRANSACTION;

上述几个参考文献将提供更多的阅读。对于常规插入/更新模式，请参见请停止使用这是亚伦·伯特兰（Aaron Bertrand）的抗议。在上面的WHERE子句上使用不存在更多内容（... Intersect ...）请参阅Paul White的以下文章：无证件查询计划：平等比较。它实际上是ANSI标准的SQL-Server 与不同。

与基于绩效的问题通常情况一样，您的里程可能会有所不同。如果您对存储过程的电话中有99％是更新，那么您可能会发现这不是最有效的方法，并且在插入之前进行更新更好（在Aaron的文章中进行了讨论），或者大多数更新您可能会发现首先进行更改检查是最有效的。

There is a performance gain by not updating unnecessary rows, but you're likely going to lose out on this performance by splitting the process up into multiple statements.

With your current code, assuming most actions are either an insert or an update (with changes) you'll always execute at least two queries on the Orders table (Check + Upsert). You could start off with an insert and make this your check for if the row exists, that way if the required action is to insert, then you only require one query which will be more efficient than two. e.g.

SET XACT_ABORT ON;
BEGIN TRANSACTION

INSERT dbo.Orders (CustomerDisplayName, CancelledAt, CustomerId, TrackingNo, Tags, ToCountryCode)
SELECT  @CustomerDisplayName, @CancelledAt, @CustomerId, @TrackingNo, @Tags, @CurrentToCountryCode
WHERE   NOT EXISTS (SELECT 1 FROM dbo.Orders WITH (UPDLOCK, SERIALIZABLE) WHERE OrderId = @orderId)

IF @@ROWCOUNT = 0
BEGIN
    UPDATE  dbo.Orders
    SET     CustomerDisplayName = @CustomerDisplayName,
            CancelledAt = @CancelledAt,
            CustomerId = @CustomerId,
            TrackingNo = @TrackingNo,
            Tags= @Tags,
            ToCountryCode = @ToCountryCode
    WHERE   OrderId = @orderId
    AND     NOT EXISTS 
            (   SELECT  CustomerDisplayName, CancelledAt, CustomerId, TrackingNo, Tags, ToCountryCode
                INTERSECT
                SELECT  @CustomerDisplayName, @CancelledAt, @CustomerId, @TrackingNo, @Tags, @CurrentToCountryCode
            );
END

COMMIT TRANSACTION;

A couple of references for the above that will provide a bit more reading. For the general INSERT/UPDATE pattern see Please stop using this UPSERT anti-pattern by Aaron Bertrand. For a bit more on the where clause above using NOT EXISTS ( ... INTERSECT ...) see the following article from Paul White: Undocumented Query Plans: Equality Comparisons. It is effectively the SQL-Server of the ANSI standard IS DISTINCT FROM.

As is often the case with performance based questions your mileage may vary. If 99% of your calls to the stored procedure are updates, then you may find that this is not the most efficient approach, and that doing the update before the insert is better (this is discussed in Aaron's article), or if most of those updates you may find that doing a check for changes first is the most efficient.

回复收藏 0 原文

~没有更多了~