I have to maintain a history table. Table A is the main table and table B the history table. On 1st run table A has 100 records, so the table B has 100 records. On next day run, In table A, I have 2 new records and 1 updated record. So I need only those 3 records to be pulled and loaded in table B. Column "ID" is unique column and also have "created date" and "updated date" columns in both the tables. On day 2, I will have 2 records for one particular ID in history, counting to total 103 records.
How can I achieve this?
- Step 1: pull new records: insert into left join on id column
- Step 2: to pull updated record: Insert overwrite join on id where
source.updated date>target.updated date
Issue is I having duplicate records for other IDs as well. I wasn't able to pull only new and updated records in subsequent loads.
I can't use merge syntax. I need to achieve this using basic insert and insert overwrite.