SQL Server change tracking - see only the last updated columns on the row?

1k Views Asked by At

I am struggling to understand how to use Change tracking for a simple task (not so simple it looks like) of tracking individual column changes to the same row?

Consider the following example:

I have a table:

CREATE TABLE [dbo].[Products](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Name] [nvarchar](max) NULL,
    [Manufacturer] [nvarchar](max) NULL,
    [Country] [nvarchar](max) NULL,
 CONSTRAINT [PK_Products] PRIMARY KEY CLUSTERED ([Id] ASC)

Now I insert a row:

insert into Products ([Name],[Manufacturer],[Country])
values ('phone', 'sony', 'japan')

I add change tracking:

ALTER DATABASE [Test]
SET CHANGE_TRACKING = ON
(CHANGE_RETENTION = 90 MINUTES, AUTO_CLEANUP = ON);

ALTER TABLE [dbo].[Products]
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = ON);

I run 1st update:

update Products set Manufacturer = 'Sega' where Name = 'phone'

I check 2 column updates:

select 
    CHANGE_TRACKING_IS_COLUMN_IN_MASK(COLUMNPROPERTY(OBJECT_ID('dbo.Products'),'Manufacturer', 'ColumnId'), CT.SYS_CHANGE_COLUMNS) 'Changed?', *
FROM
    CHANGETABLE(CHANGES dbo.Products, 0) AS CT
order by sys_change_version;

select 
    CHANGE_TRACKING_IS_COLUMN_IN_MASK(COLUMNPROPERTY(OBJECT_ID('dbo.Products'),'Country', 'ColumnId'), CT.SYS_CHANGE_COLUMNS) 'Changed?', *
FROM
    CHANGETABLE(CHANGES dbo.Products, 0) AS CT
order by sys_change_version;

This is what I get back: enter image description here

So far so good, as you can see only the Manufacturer column is reported as changed as expected.

However if I now try to run a 2nd update on the same row for a different column:

update Products set Country = 'USA' where Name = 'phone'

and run the same query above to check columns that have been updated, I get back that BOTH! columns have been updated: enter image description here

I would expect that the 2nd time I run update the change tracking would only show me the 2nd column as the one that has changed, not both of them!

It gets worse. If I then try and separate the updates by their SYS_CHANGE_VERSION number, that is supposed to track separate updates, I realise that both of these updates have been assigned the same number!:

select * from CHANGETABLE(CHANGES dbo.Products, 0) as changes

enter image description here

Which looks like a bug to me!

I don't want to trigger processing for Manufacturer column when I am updating a Country column on the row.

I scoured the internet but couldn't find any obvious solution to this problem. The obvious solution of cleaning table manually also seems not so straightforward and comes with its own can of worms.

Obviously change tracking has been around for a long time, so it is either I am missing something, or I am very surprised that bug has not been fixed (or even encountered by many people)

Am I missing something obvious here? Or does anyone knows how to make change tracking work for identifying column changes ONE AT A TIME like I imagine they were supposed to work?

2

There are 2 best solutions below

2
siggemannen On BEST ANSWER

Here's a little script demostrating change tracking.

create procedure spTrackChanges
AS
BEGIN

    begin tran
    
    declare @synchronization_version bigint;
    declare @last_synchronization_version bigint;

    -- Obtain the current synchronization version. This will be used next time that changes are obtained.   
    select  @synchronization_version = CHANGE_TRACKING_CURRENT_VERSION();
    -- Get version ID to get changes SINCE
    select  @last_synchronization_version = change
    from    tracking
    where   tablename = 'products'
    -- Return changes
    SELECT     CHANGE_TRACKING_IS_COLUMN_IN_MASK(COLUMNPROPERTY(OBJECT_ID('dbo.Products'),'Manufacturer', 'ColumnId'), CT.SYS_CHANGE_COLUMNS) 'Changed?',*
    FROM    CHANGETABLE(CHANGES Products, @last_synchronization_version) AS CT;
    -- Set new value
    update t
    set change = @synchronization_version
    from tracking t
    where tablename = 'Products'
    
    commit tran;

END

I create a separate procedure to do the change tracking "thing", because it's a repetitive task.

ALTER DATABASE [YourDB]
SET CHANGE_TRACKING = ON
(CHANGE_RETENTION = 90 MINUTES, AUTO_CLEANUP = ON);

if object_id('Products') IS NOT NULL
    drop table Products;
if object_id('Tracking') IS NOT NULL
drop table Tracking;

create table Tracking (tableName sysname PRIMARY KEY, change INT)
-- Initiate
insert into tracking (tableName, change)
select 'Products', 0

CREATE TABLE [dbo].[Products](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Name] [nvarchar](max) NULL,
    [Manufacturer] [nvarchar](max) NULL,
    [Country] [nvarchar](max) NULL,
 CONSTRAINT [PK_Products] PRIMARY KEY CLUSTERED ([Id] ASC)
)

ALTER TABLE [dbo].[Products]
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = ON);

-- Seed with value
insert into Products ([Name],[Manufacturer],[Country])
values ('phone', 'sony', 'japan');

-- Run procedure that gets initial sync
exec sptrackchanges; -- returns 0 as changed

update Products set Manufacturer = 'Sega' where Name = 'phone';

exec sptrackchanges; -- returns 1 as changed


update Products set Country = 'USA' where Name = 'phone';

exec sptrackchanges; -- returns 0 as changed

exec sptrackchanges; -- returns nothing, no changes

Change tracking relies on the "current version" value, which needs to be advanced all the time. For this reason, one needs to store the value in some table, i've implemented a simple per table tracking by storing it in the tracking table.

Every time changes occur, you get them by using the previous value you obtain by calling CHANGE_TRACKING_CURRENT_VERSION, these changes can affect both multiple rows and multiple columns.

By using the CHANGE_TRACKING_IS_COLUMN_IN_MASK you can obtain which columns were affected by the changes.

I don't quite understand why this doesn't work for you, since you:

  • are able to obtain specific columns involved in changes since last
  • update of one column doesn't generate update of another column, unless there has been change to other columns since last processing.
  • there's no duplicate processing of same changes occuring at all

If you're only interested in changes occuring to specific columns you can do the CHANGE_TRACKING_IS_COLUMN_IN_MASK thing and it should work

There is no difference here if multiple changes are collapsed into one or not, unless you're really interested in knowing that column X was changed Y number of times - but this is seldom needed.

The collapse changes example is:

update Products set Manufacturer = 'Apple' where Name = 'phone';
update Products set Manufacturer = 'M$' where Name = 'phone';
update Products set Country = 'USA' where Name = 'phone';

exec sptrackchanges; -- returns 1 as changed
0
Tanuki On

Thanks to solution provided by Mr. siggemannen above, here is how I ended up implementing it in code.

I am posting a full, ready, idempotent solution here in case someone else needs it.


Function to check if change tracking is enabled:

CREATE OR ALTER FUNCTION IsChangeTrackingEnabledForTable(@DbName sysname)
RETURNS BIT
AS
BEGIN
    DECLARE @RowCount INT
    DECLARE @ReturnValue BIT 
    SET @ReturnValue = 1
    
    SELECT @RowCount = COUNT(*) FROM sys.change_tracking_tables
    JOIN sys.tables ON sys.tables.object_id = sys.change_tracking_tables.object_id
    JOIN sys.schemas ON sys.schemas.schema_id = sys.tables.schema_id
    WHERE sys.tables.name = @DbName;

    IF @RowCount > 0
    BEGIN
        SET @ReturnValue = 1
    END
    ELSE
    BEGIN
        SET @ReturnValue = 0
    END
    
    RETURN @ReturnValue
END;
GO

Enable change tracking for your database:

IF NOT EXISTS (SELECT 1 FROM sys.change_tracking_databases WHERE database_id = DB_ID('your-database-name'))
begin
    ALTER DATABASE [your-database-name]
    SET CHANGE_TRACKING = ON
    (CHANGE_RETENTION = 7 DAYS, AUTO_CLEANUP = ON);
    SELECT 'Change tracking enabled for Database [your-database-name]';
end

Enable change tracking for your table. 'Products' in my case

if (dbo.IsChangeTrackingEnabledForTable('Products') = 0)
begin
    ALTER TABLE Products
    ENABLE CHANGE_TRACKING
    WITH (TRACK_COLUMNS_UPDATED = ON);
    SELECT 'Change tracking enabled for table [dbo.Products]';
end

Create change tracking table:

IF OBJECT_ID(N'__ChangeTrackingForUpdates', N'U') IS NULL 
BEGIN   
    -- This table is necessary to figure out which columns have been updated in the last update/synchronisation
    -- It contains table name and the last change version number that has been processed for this table.
    create table __ChangeTrackingForUpdates
    (TableName sysname PRIMARY KEY, ChangeNumber INT);

    insert into __ChangeTrackingForUpdates 
    (TableName, ChangeNumber)
    values ('Products', 0);

    select 'Change tracking table created';
END
GO

Function that gets last synchronization version number:

create or alter function ChangeTracking_GetLastSynchronizationVersionForTable(@TableName sysname)
returns bigint
as
begin
    declare @last_synchronization_version bigint;

    -- Get version ID to get changes since last synchronisation 
    select  @last_synchronization_version = ct.ChangeNumber
    from    __ChangeTrackingForUpdates ct
    where   ct.TableName = @TableName

    return ISNULL(@last_synchronization_version, 0)
end
go

Function that checks whether 'Products' table had a specific column updated since last synchronisation:

-- Please note that same procedure will need to be created for each table that you want to track updates from.
-- I have experimented with dynamic SQL, but in my experience it sporadically takes forever when executing, so I had to drop that approach
create or alter function ChangeTracking_IsValueUpdated_Products(@Id int, @ColumnName sysname)
returns bit
as
begin
    declare @ReturnIsValueUpdated bit = 0
    declare @TableName sysname = 'Products'

    declare @last_synchronization_version bigint = dbo.ChangeTracking_GetLastSynchronizationVersionForTable(@TableName)

    -- Check whether this specific column has been updated
    select @ReturnIsValueUpdated = COALESCE(CHANGE_TRACKING_IS_COLUMN_IN_MASK(COLUMNPROPERTY(OBJECT_ID(@TableName), @ColumnName, 'ColumnId'), CT.SYS_CHANGE_COLUMNS), 0)
    from   CHANGETABLE(CHANGES Products, @last_synchronization_version) AS ct
    where  ct.Id = @Id and ct.SYS_CHANGE_OPERATION = 'U'

    return ISNULL(@ReturnIsValueUpdated, 0)
end
go

After done with update processing, make sure to update synchronisation number to the latest:

create or alter procedure ChangeTracking_UpdateTableToLatestChangeVersion @TableName sysname
as
begin
    declare @synchronization_version bigint;
    select  @synchronization_version = CHANGE_TRACKING_CURRENT_VERSION();

    update ct
    set ChangeNumber = @synchronization_version
    from __ChangeTrackingForUpdates ct
    where TableName = @TableName;
end
go

And here is the usage:

Seed data:

insert into Products ([Name],[Manufacturer],[Country])
values 
('phone', 'sony', 'japan'),
('phone1', 'sony1', 'japan1'),
('phone2', 'sony2', 'japan2'),
('phone3', 'sony3', 'japan3'),
('phone4', 'sony4', 'japan4'),
('phone5', 'sony5', 'japan5');

Run updates (Substitute with your own Ids):

update Products set Manufacturer = 'Sony' where Id = 1002
update Products set Manufacturer = 'Phillips' where Id = 1005
update Products set Country = 'Cuba' where Id = 1002
update Products set Country = 'USA', Manufacturer = 'Google' where Id = 1003
update Products set Name = Name where Id = 1002

Check which columns for which IDs have changed:

select dbo.ChangeTracking_IsValueUpdated_Products(1001, 'Manufacturer')
select dbo.ChangeTracking_IsValueUpdated_Products(1002, 'Manufacturer')
select dbo.ChangeTracking_IsValueUpdated_Products(1003, 'Country')
select dbo.ChangeTracking_IsValueUpdated_Products(1003, 'Manufacturer')
select dbo.ChangeTracking_IsValueUpdated_Products(1004, 'Manufacturer')
select dbo.ChangeTracking_IsValueUpdated_Products(1005, 'Manufacturer')
select dbo.ChangeTracking_IsValueUpdated_Products(1006, 'Manufacturer')

Here is the result: enter image description here You should only get 1 for columns that you have updated previously.

In the end make sure to synchronise to the latest version after finished processing updates:

exec ChangeTracking_UpdateTableToLatestChangeVersion 'Products'

Profit!

Note: Be careful implementing this Change Tracking on database which is used by ORMS. For example if you use .Net and database changes are saved using Entity Framework ORM, by default it saves ALL of the columns to the database on any update, including Id. So in that case change tracking for updates will not help, as it will always return that the column was updated regardless of the new value being different from the old value or not.