Add File name column to Dynamic Frame


I am trying to read JSON files from an S3 prefix into a Glue DynamicFrame. I also want to add a column containing the file name, so that each record can be traced back to its source file. I am trying this with the attachFilename format option, as shown below:

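def read_data(self) -> DynamicFrame:
    # Read every JSON file under the S3 prefix, including nested prefixes.
    dyf = self.glue_context.create_dynamic_frame.from_options(
        connection_type="s3",
        connection_options={
            "paths": [f"s3://{self.args['SOURCE_S3_BUCKET']}/{self.args['SOURCE_S3_KEY']}"],
            "recurse": True
        },
        format="json",
        format_options={
            "jsonPath": "$",
            "multiline": True,
            # Name of the column that should hold each record's source file name.
            "attachFilename": "source_file_name"
        },
        transformation_ctx="extract_data"
    )
    print("dyf")
    # show() prints the records itself and returns None, so it is not wrapped in print().
    dyf.show(2)

    return dyf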

But I get the following error:

An error occurred while calling o118.toDF. source_file_name already exists

I have tried changing the column name passed to attachFilename, but I get the same error with every name I try. Can someone please help me identify what I am doing wrong here?
