We have a number of SQL instances which have the same structure. Transactional replication has been set up to publish these into a single Subscriber. (A data warehouse scenario.)
We are using MS SQL 2012 R2 with the standard transactional replication setup
Each publisher instance table has a identifier column which is not part of that tables primary or clustered key. On the Subscriber we have added the identifier column to the primary or clustered key. We now have issues on deletion where the rows submitted cannot be found in the Subscriber as they have already been removed by the first publishers deletion. We are missing that identifier column at source.
As the publisher instances are supplied by the ERP vendor, I don't want to modify these tables to include the identifier column in the clustered keys.
How can I add the additional identifier column to the clustered key through the replication process?
If I've understood your scenario correctly, I would build the process as follows:
For each source system, I would add a source system identifier to each dataset as it lands in the ingestion layer of your subscriber
Initially process data from each source system in isolation. So if a record is deleted in a source system, then flag that source system's record in your subscriber as deleted
Where the same data can be sourced from multiple systems (which I think is the scenario you are describing) determine what the source system precedence logic is. For example, if you have customer data in multiple systems
Build your next data layer (after the ingestion layer) using this logic
BTW I probably wouldn't be deleting records in your subscriber, instead I would soft-delete them and then you wouldn't have these types of issues.