Adding custom metadata to a DataFrame schema using the Iceberg table format

Asked by Almog Gelber on 22 November 2021 at 09:38

I'm adding custom metadata to the DataFrame's schema in my PySpark application using StructField's metadata field.

It worked fine when I wrote parquet files directly into s3. The custom metadata was available when reading these parquet files as expected.

But it does not work with the Iceberg table format. There is no error, but when I read the table back, the metadata of every field in df.schema.fields is always empty.

Is there a way to solve it?


Almog Gelber On 29 November 2021 at 05:16

Solved by making sure the metadata key is always 'comment'.

For example: {'comment': 'my_metadata_info_field'}