Locks are not applying and duplicates are being inserted. Which lock to apply and how to avoid duplicate inserts?

74 Views Asked by At

Here I have TResult table that inserts TestResult data into table using a procedure call.

If a test already exists with the same employeeid and collectedtime etc. with below conditions, cancel the insert.

This is being done to help prevent duplicate records from getting into the TResult table.

I tried NOLOCK and TABLOCKX in the IF condition. Here when the table is locked from a job, and different users are calling this procedure with the same data, in some rare scenarios, then it is inserting duplicates.

I should avoid the duplicates as per below condition.

When we have different data from 2 different procedure calls, it is fine to insert. But when we have same data from 2 different procedure calls when the table is locked, it is inserting duplicates.

Here I have few more columns also in the actual scenario. So, it is not possible to add a composite key constraints here to avoid duplicates.

How to avoid duplicate inserts? Any idea?

How can I move this to WHERE clause? So that I can get rid of extra IF condition?

/*
--Table creation Script
CREATE TABLE [dbo].[TResult]
(
    [TResultID] [bigint] IDENTITY(1,1) NOT NULL,
    [ClientID] [numeric](18, 0) NULL,
    [EmployeeId] [numeric](18, 0) NULL,
    [PolicyId] [numeric](18, 0) NULL,
    [TestSourceId] [int] NULL,
    [TestPanelId] [numeric](18, 0) NULL,
    [TestStatusId] [int] NULL,
    [CollectedTime] [datetime] NULL,
    [CreateDateTime] [datetime] NULL,

    CONSTRAINT [PK_TResult] 
        PRIMARY KEY CLUSTERED ([TResultID] ASC)
                WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
                      IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, 
                      ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
*/

-- stored procedure

CREATE OR ALTER PROCEDURE [temsweb].[TResult_Add]
    (@ClientId bigint,
     @EmployeeId numeric(18,0),
     @PolicyId numeric(18,0), 
     @TestSourceId bigint,
     @TestPanelId numeric(18,0),
     @TestStatusId bigint,
     @CollectedTime datetime = NULL,
     @retval bigint OUTPUT)
AS
BEGIN    
    DECLARE @TResultId  BIGINT
 
    IF EXISTS (SELECT * FROM TResult --WITH (TABLOCKX)
               WHERE employeeid = @EmployeeId 
                 AND CollectedTime = @CollectedTime 
                 AND (PolicyId = @PolicyId)
                 AND TestPanelId = @TestPanelId
                 AND TestStatusId <> 12) -- refusal
    BEGIN
        IF @TestSourceId NOT IN (3,12) OR @TestStatusId <> 13
        BEGIN
            SET @retval = -1
            RETURN
        END
    END

    -- SET @retval = -1

    -- Insert TResult
    INSERT INTO TResult WITH (TABLOCKX) 
        (ClientId, EmployeeId, PolicyId, 
         TestSourceId, TestPanelId, TestStatusId,   
         CollectedTime, CreateDateTime) 
        SELECT 
            @ClientId, @EmployeeId, @PolicyId,
            @TestSourceId, @TestPanelId, @TestStatusId,
            @CollectedTime, GETDATE()           
    
    -- Return the OrderId of the newly created Order
    SELECT @TResultID = SCOPE_IDENTITY()

    IF (@TResultID > 0 ) 
        SET @retval = @TResultID
END
GO

-------------------- Window1 ---------------------------
--Testing and mimicing Mutiple Users
-- (Lock the Table for 1 minute)
-- Block ( Lock ) table :
BEGIN TRAN  
SELECT TOP (1) 1 FROM TResult WITH (TABLOCKX)
WAITFOR DELAY '00:01:00' 
ROLLBACK TRAN   
GO

 select * From TResult

-------------------- Window2 ---------------------------
-- 1st Proc Call 

DECLARE @return_value int, @retval bigint

EXEC    @return_value = [TResult_Add]
        @ClientId = 12605,
        @EmployeeId = 16805332,
        @PolicyId = 31963,
        @TestSourceId = 6,
        @TestPanelId = 356,
        @TestStatusId = 11,
        @CollectedTime = N'2023-10-10 13:40:00.000',
        @retval = @retval OUTPUT

SELECT  @retval as N'@retval'
SELECT  'Return Value' = @return_value
GO 
  
-------------------- Window3 ---------------------------

-- 2nd Proc Call 

DECLARE @return_value int, @retval bigint

EXEC    @return_value = [TResult_Add]
        @ClientId = 12605,
        @EmployeeId = 16805332,
        @PolicyId = 31963,
        @TestSourceId = 6,
        @TestPanelId = 356,
        @TestStatusId = 11,
        @CollectedTime = N'2023-10-10 13:40:01.000',
        @retval = @retval OUTPUT

SELECT  @retval as N'@retval'
SELECT  'Return Value' = @return_value
GO
1

There are 1 best solutions below

1
Charlieface On

There is no need for TABLOCKX and you should definitely avoid NOLOCK.

The correct hints are SERIALIZABLE, UPDLOCK, and you put this on the EXISTS, not on the INSERT.

You could use a transaction, but in this case it seems simpler to combine the EXISTS into a INSERT...SELECT...WHERE NOT EXISTS... and check @@ROWCOUNT afterwards.

Note that you only need to check @@ROWCOUNT if you put a WHERE, otherwise it will always be 1 as the INSERT will always happen.

CREATE OR ALTER PROCEDURE temsweb.TResult_Add
     @ClientId bigint,
     @EmployeeId numeric(18,0),
     @PolicyId numeric(18,0), 
     @TestSourceId bigint,
     @TestPanelId numeric(18,0),
     @TestStatusId bigint,
     @CollectedTime datetime = NULL,
     @retval bigint OUTPUT
AS

SET NOCOUNT, XACT_ABORT ON;  -- needs XACT_ABORT for correct rollback

IF @TestSourceId NOT IN (3,12) OR @TestStatusId <> 13
BEGIN
    SET @retval = -1;
    RETURN;
END;


INSERT INTO TResult (
  ClientId, EmployeeId, PolicyId, 
  TestSourceId, TestPanelId, TestStatusId,   
  CollectedTime, CreateDateTime) 
SELECT 
  @ClientId, @EmployeeId, @PolicyId,
  @TestSourceId, @TestPanelId, @TestStatusId,
  @CollectedTime, GETDATE()           
WHERE NOT EXISTS (SELECT 1
    FROM TResult tr WITH (SERIALIZABLE, UPDLOCK)
    WHERE tr.employeeid = @EmployeeId 
      AND tr.CollectedTime = @CollectedTime 
      AND tr.PolicyId = @PolicyId
      AND tr.TestPanelId = @TestPanelId
      AND tr.TestStatusId <> 12
  );

IF @@ROWCOUNT = 0
    SET @retval = -1;
ELSE
    SET @retval = SCOPE_IDENTITY();

You should also place the correct index to prevent the EXISTS from locking up the whole table. You probably need something like:

TResult (employeeid, CollectedTime, PolicyId, TestPanelId) INCLUDE (TestStatusId)

The key columns can be any order, so you may want to look at what other queries might want. If any of the columns are unique/primary keys then the rest can be moved to the INCLUDE.