I have a TSV file in Azure Data Lake which contains, among others, two main columns:
| Id | ResponsePath |
|---|---|
| 1 | response path for id 1 |
| 2 | response path for id 2 |
The response path for each record is a path to a JSON file which is also stored in Azure Data Lake.
I want to iterate over each record in the file, read the JSON stored at that record's response path, extract a property from the JSON, add it as a column alongside the data from the original TSV file, and upload the output to another data store.
So the end result would look something like this:
| Id | ResponsePath | trace id |
|---|---|---|
| 1 | response path for id 1 | 61sd3414e |
| 2 | response path for id 2 | 6153eidkl |
How do I do this in Azure Data Factory?
I tried setting up a Lookup activity to fetch the column, but I cannot query for files in ADLS from the Lookup activity. I was thinking about using a data flow, but I have no idea how to set it up for this purpose, since I would have to create a source based on a column from another file.
You can use a Lookup activity to read the .tsv file and then a ForEach activity to iterate over each row of the Lookup output. Inside the ForEach activity, use a Data flow activity: pass the current row's `ResponsePath` value in as a data flow parameter, and use that parameter in the JSON source dataset's file path. The data flow can then extract the property from the JSON and write it, together with the row's other columns, to the sink.
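To make the Lookup/ForEach/extract flow concrete, here is a minimal local sketch in plain Python of the same logic the pipeline performs (this is an illustration only, not ADF code; the property name `traceId` and the `read_json` callable standing in for the ADLS read are assumptions):

```python
import csv
import io
import json

def add_trace_ids(tsv_text, read_json):
    """Append a 'trace id' column to each TSV row.

    tsv_text:  contents of the source TSV (Id, ResponsePath, ...).
    read_json: callable returning the raw JSON text for a given
               ResponsePath (in ADF, the parameterized JSON source).
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for row in reader:                        # the ForEach step
        doc = json.loads(read_json(row["ResponsePath"]))
        row["trace id"] = doc["traceId"]      # hypothetical property name
        rows.append(row)
    return rows

# In-memory stand-in for the data lake:
files = {
    "path/1.json": '{"traceId": "61sd3414e"}',
    "path/2.json": '{"traceId": "6153eidkl"}',
}
tsv = "Id\tResponsePath\n1\tpath/1.json\n2\tpath/2.json"
result = add_trace_ids(tsv, files.__getitem__)
```

Each element of `result` is the original row plus the extracted `trace id`, which mirrors the table in the question.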