We have a table that contains, for this example, links to demographic questions (questionID) for each subscriber, with a date indicating when the subscriber answered a particular demographic question. In some cases, a subscriber may have answered the same question again at a later date, and we now have multiple records for the same subscriber and questionID, but with different answer dates (see sample data):
subscriberID questionID dateAnswered isDeleted
------------ ----------- ----------------------- ---------
100 559 2015-07-29 13:07:26.153 0
100 560 2015-07-29 13:07:26.153 0
100 561 2015-07-29 13:07:26.153 0
100 562 2015-07-29 13:07:26.153 0
100 575 2015-07-29 13:07:26.153 0
102 559 2015-07-30 15:12:46.143 0
102 564 2015-07-30 15:12:46.143 0
102 588 2015-07-30 15:12:46.143 0
102 559 2015-07-31 16:11:53.323 0
114 575 2015-08-21 11:27:14.253 0
114 588 2015-08-21 11:27:14.253 0
114 560 2015-08-21 11:27:14.253 0
114 588 2015-08-24 05:44:42.030 0
114 562 2015-08-21 11:27:14.253 0
114 575 2015-08-24 05:44:42.030 0
The app that was storing the answers should have flagged the older records as "deleted" (set isDeleted = 1) but it did not do so, and I now need to clean up the older records.
This seems like it should be simple, but it's got me stumped. How do I (a) select any records where there are duplicate subscriberID and questionIDs but with different answer dates? And (b) how do I do an update to set all but the newest records for each subscriber to have isDeleted=1?
Any help would be appreciated! I suspect a self join may be in order, but I haven't figured it out yet. Thus the question!