I have 2 tables, #"Lead Data1" and #"Enrolment Data", with ~300k and ~16k rows respectively with key (Lead Number). I am using a separate query to pull #"Enrolment Data". I want to update the #"Lead Data1" columns {"Created On", "Course", "Program", "Business"} from the #"Enrolment Data", retaining the other columns. If any key (Lead Number) in #"Enrolment Data" is not found in #"Lead Data1", those records need to be appended to it.
I tried the below code:
let
Source = Excel.Workbook(File.Contents("xxx.xlsb"), null, true),
#"Lead Data1" = Source{[Name="Lead Data"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(#"Lead Data1", [PromoteAllScalars=true]),
#"Removed Other Columns" = Table.SelectColumns(#"Promoted Headers",{"Created On", "Owner", "Lead Number", "Source", "Source type", "TL", "Course", "Program", "Business"}),
Here I remove all unnecessary columns, these are the ones I need in my output from combining #"Lead Data1" and #"Enrolment Data". I will use #"Removed Other Columns" table again to append non-matched items.
#"Removed Other Columns1" = Table.SelectColumns(#"Removed Other Columns",{"Owner", "Lead Number", "Source", "Source type", "TL"}),
I only keep columns I need to merge to #"Enrolment Data" (which don't have these)
#"Merged Queries" = Table.Join(#"Removed Other Columns1", {"Lead Number"}, #"Enrolment Data", {"Lead Number"}, JoinKind.RightOuter, 2),
I am merging with #"Enrolment Data" and not the other way around because it has much lower number of rows (if that makes a difference). This would give me the "updated" rows, including those with keys not matching #"Lead Data1".
#"Rows to Remove" = Table.Column(#"Merged Queries","Lead Number"),
Once merged, I select the list of matched keys (updated rows) in #"Enrolment Data".
#"Selected Rows" = Table.RemoveMatchingRows(#"Removed Other Columns", each List.Contains(#"Rows to Remove",[#"Lead Number"])),
I remove those rows from #"Lead Data1" (giving "non-updated" rows)
#"Appended Queries" = Table.Combine({#"Merged Queries", #"Selected Rows"})
I append the "Updated" and "Non-Updated" rows.
in #"Appended Queries"
However, the query throws an error of "Data Source not found" - I think this might be because I am referencing #"Removed Other Columns" table wrong. Is there any way around it, or my approach itself is faulty? If so, any solution will be highly appreciated.




Your example uses the function Table.Join but as far as i know, it can't work with some columns having the same name in both joined tables so I guess your sample code has been reworked.
Here is a proposal with Table.NestedJoin (that you can eventually replace by Table.Join after a few changes in your column names; I guess Table.Join might offer better performances) where I directly perform an outer join in order to get all expected lines. Then I simply update the original columns (from "Lead data") with "Enrollment data" columns content whenever not null.
Type is lost through such an operation (there might be some workaround?) so you have to re-type impacted columns.