Can we select rows with duplicate entries in Alter Row transformation in Azure Data Factory?

313 Views Asked by At

There are two columns:

col_A col_B
111 2.0
222 1.0
222 2.0
333 1.0

Using alter row transformation i would like to select rows that have repeated entries in col_A, in this example 222 and select the corresponding highest value in col_B i.e. 2.0

The output should look as follows:

col_A col_B
111 2.0
222 2.0
333 1.0
1

There are 1 best solutions below

6
Rakesh Govindula On BEST ANSWER

You can use Aggregate transformation in Data flow.

This is my source data:

enter image description here

In the Aggregate transformation, give col_A for group by to select the distict rows and col_B for aggregate. Use the max(col_B) in this to get the max values.

Group By:

enter image description here

Aggregate:

enter image description here

Result:

enter image description here

Update about Extra Columns:

To Carry the Remaining columns after aggregation transformation, use Join transformation with Join type as inner join(Join of above with source) and give the above two columns as Keys.

enter image description here

After Join, you can remove the extra columns with select transformation.

enter image description here

Result with Extra rows:

enter image description here