Merge rows if they contain a duplicate value in one of many columns

60 Views Asked by At

SQL Fiddle

I'd like a query that will merge rows that have if they have the same value in one of 3 columns.

  • If one of of the following columns matches the rows should merge.
    • customerid, externalid, customername
  • It should retain the date and customerId of the record most recent joindate.
  • The externalId, customerName & shiptocode columns should be concatinated if they are not the same
  • I don't want to update the table, I just want a query.

Given this table:

customerid externalid customername shiptocode joindate
01 1234567 Test Customer 1 2023-01-04
01 123567 Test Customer 2 2022-12-31
03 1234567 Test 10 2022-01-05
04 ARACODE ARACODE Customer 1 2022-12-28
05 ARACODE2 ARACODE Customer 2 2023-01-04
06 CBE1EX Normal Customer 1 2023-01-04
07 ZOCDOC NormAL Customer 1 2023-01-01

They query should result in this result:

customerid externalid customername shiptocode joindate
03 1234567, 123567 Test Customer, Test 1, 2, 10 2023-01-04
05 ARACODE2,ARACODE ARACODE Customer 2, 1 2023-01-04
06 CBE1EX, ZOCDOC Normal Customer 1 2023-01-04

I have this working for one field but I'm not sure how to get it to concat the additional fields. It seems like I'd need to use a Partition by but I don't see a way to concat with that.

SELECT 
  DISTINCT(customerId),
  array_to_string(array_agg(distinct externalId),', ') AS externalId,
  array_to_string(array_agg(distinct customerName),', ') AS customerName, 
  array_to_string(array_agg(distinct shipToCode),', ') AS shipToCode,
  MAX(joinDate)
 FROM customers
 GROUP BY customerId
 ORDER BY MAX(joinDate) DESC;
0

There are 0 best solutions below