Event hub azure functions bulk insert produces violation of primary key

83 Views Asked by At

I have an EventHub-triggered Azure Functions to receive many datasets and insert them to the SQL Server database. To avoid duplication, reduce SQL concurrent connection and fast the process I am using SqlBulk insert.

I have this code:

public class StaffEntrance
{
    public string StaffId { get; set; }
    public long UnixDate { get; set; }
    public string Gate { get; set; }
    ... other properties
}

Both StaffId and UnixDate are primary keys.

The following methods are

[Function("sync")]
public async Task Run([EventHubTrigger("sync", Connection = "constevthub", ConsumerGroup = "syncgroup")] string[] input)
{
    if (input == null || input.Length == 0) return;
    try
    {
        ...desirializations and validations
        var staff = staffEntrances.DistinctBy(x => new { x.StaffId, x.UnixDate }).ToList();
        await Insert(staff);
    }
    catch(Exception ex)
    {
        ..Log to table
    }
}

async Task<List<StaffEntrance>> Insert(List<StaffEntrance> staff)
{
    BulkConfig bulkConfig = new()
    {
        WithHoldlock = false,
        UseOptionLoopJoin = false
    };
    await context.BulkInsertOrUpdateAsync(staff, bulkConfig); 
    await context.BulkSaveChangesAsync();
    return staff;
}

This functions is triggered every one or two minutes and contains hundred or thousand of records. Before inserting these records I distinct the values by primary keys to avoid duplication. Unfortunately, this duplication still happens occasionally. In one day it can be 50 to 90 errors indicating duplicate values

Violation of PRIMARY KEY constraint 'PK_StaffEntrance'. Cannot insert duplicate key in object 'dbo.StaffEntrance'. The duplicate key value is (1697793420, 204403454). The statement has been terminated

0

There are 0 best solutions below