How to update older "duplicate" records (duplicate except for the date column)

Question

How to update older "duplicate" records (duplicate except for the date column)

76 Views Asked by user3302110 At 28 August 2015 at 18:38

We have a table that contains, for this example, links to demographic questions (questionID) for each subscriber, with a date indicating when the subscriber answered a particular demographic question. In some cases, a subscriber may have answered the same question again at a later date, and we now have multiple records for the same subscriber and questionID, but with different answer dates (see sample data):

subscriberID questionID  dateAnswered            isDeleted
------------ ----------- ----------------------- ---------
100          559         2015-07-29 13:07:26.153 0
100          560         2015-07-29 13:07:26.153 0
100          561         2015-07-29 13:07:26.153 0
100          562         2015-07-29 13:07:26.153 0
100          575         2015-07-29 13:07:26.153 0
102          559         2015-07-30 15:12:46.143 0
102          564         2015-07-30 15:12:46.143 0
102          588         2015-07-30 15:12:46.143 0
102          559         2015-07-31 16:11:53.323 0
114          575         2015-08-21 11:27:14.253 0
114          588         2015-08-21 11:27:14.253 0
114          560         2015-08-21 11:27:14.253 0
114          588         2015-08-24 05:44:42.030 0
114          562         2015-08-21 11:27:14.253 0
114          575         2015-08-24 05:44:42.030 0

The app that was storing the answers should have flagged the older records as "deleted" (set isDeleted = 1) but it did not do so, and I now need to clean up the older records.

This seems like it should be simple, but it's got me stumped. How do I (a) select any records where there are duplicate subscriberID and questionIDs but with different answer dates? And (b) how do I do an update to set all but the newest records for each subscriber to have isDeleted=1?

Any help would be appreciated! I suspect a self join may be in order, but I haven't figured it out yet. Thus the question!

Original Q&A

There are 2 best solutions below

Richard Vivian On 28 August 2015 at 19:41

The select /update below will affect all records that are not marked as deleted , except for the last record by each subscriber for each question. Just another approach.

   WITH LastAnswers AS
    (
    SELECT    subscriberID ,questionID , MAX(dateAnswered) AS LastAnsweredDate
    FROM      TableName 
    GROUP BY  subscriberID ,questionID
    )

UPDATE TableName
   SET TableName.isDeleted = 1
FROM 
  TableName

LEFT JOIN LastAnswers 
ON  TableName.subscriberID = LastAnswers.subscriberID 
AND TableName.questionID   = LastAnswers.questionID 
AND TableName.dateAnswered = LastAnswers.LastAnsweredDate

WHERE  LastAnswers.LastAnsweredDate IS NULL AND TableName.isDeleted = 0

**M.Ali** · Accepted Answer · 2015-08-28T18:41:44.227000

M.Ali On 28 August 2015 at 18:41 BEST ANSWER

;WITH X AS 
 ( 
  SELECT ROW_NUMBER() OVER (PARTITION BY 
                            subscriberID, questionID  
                            ORDER BY dateAnswered DESC) rn 
  , * 
  FROM TableName 
 )
UPDATE X 
 SET isDeleted = 1
WHERE rn > 1

How to update older "duplicate" records (duplicate except for the date column)

There are 2 best solutions below

Related Questions in SQL-SERVER

Related Questions in T-SQL

Related Questions in DUPLICATE-DATA

Trending Questions

Popular # Hahtags

Popular Questions