Simple SQL Update statement not being transactional

638 Views Asked by At

Using SQL 2016.

I have an Orders table:

 OrderID int identity
 NumberOfItems int

And an Items table:

ItemId int identity
OrderId int
DateUpdated datetime

An order is created and an OrderId is assigned via identity. Then I have to assign the "Freshest" "NumberOfItems" Items to it from the Items table. Freshest meaning, they have been updated the most recently, according to the DateUpdated date. Items are "assigned" by updating their OrderId to the OrderId in question.

I have this SQL to assign the items in a transactional fashion (@OrderID and @NumberOfItems are input parameters):

    UPDATE Items
        SET OrderId = @OrderId
        WHERE ItemId IN
               (SELECT TOP(@NumberOfItems) ItemId FROM Items
                WHERE OrderId IS NULL -- not already assigned
                ORDER BY DateStatusUpdated DESC -- freshest first
               )

Should be good right? This should transactionally assign Items to Orders and no matter how frequently or concurrently this statement is run against the server, the same Item should never be re-assigned to another order once it's already been assigned. This should be guaranteed by pretty much any relational database ever implemented due to the necessary transactional nature of the single UPDATE statement used.

Well, it had been working that way for several tens of millions of orders. Then, last night, two orders came in about 50ms apart (which isn't particularly close for this application), and one Item was assigned to OrderN and then the same item re-assigned to OrderN+1!!

What could have possibly caused this to happen?

2

There are 2 best solutions below

5
Vitaly Borisov On

Technically there are two calls to the Item table. Try this refactored query which does the same with single call:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
;WITH LimitedUpdate AS (
    SELECT TOP(@NumberOfItems) i.OrderId
    FROM Items i WITH (UPDLOCK)
    WHERE i.OrderId IS NULL
    ORDER BY i.DateStatusUpdated DESC
)
UPDATE LimitedUpdate SET OrderId = @OrderId
;

Keep original query:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
UPDATE Items WITH (UPDLOCK)
SET OrderId = @OrderId
WHERE ItemId IN
        (SELECT TOP(@NumberOfItems) ItemId FROM Items WITH (UPDLOCK)
        WHERE OrderId IS NULL -- not already assigned
        ORDER BY DateStatusUpdated DESC -- freshest first
        )
;
6
Vladimir Baranov On

READ COMMITTED isolation level doesn't provide the guarantee that you want, as you have already discovered yourself the hard way.

In short, engine executes SELECT part first, without locking the table/rows. It locks the table/rows only when performing the UPDATE phase, but this step happens later.

So, two queries running simultaneously may read the same rows and then try to UPDATE.

No matter how you try to rewrite the query there will always be two separate phases - SELECT, then UPDATE. You can see them in the execution plan.

The "easy fix" is to use the SERIALIZABLE isolation level for the whole query, but it may be overkill and some hint (such as UPDLOCK may be enough).

I don't have much experience with these hints, though.