I am trying to load CSV files from the data lake into Delta tables, but I am getting a duplicate column name error when the tables are created.
My CSV looks something like this -
Id,Alpha Source,Alpha source
1,AKH,null
2,AKG,
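Note the two Alpha columns differ only in letter case. I believe even a plain batch read hits the same collision, since Spark's analyzer resolves column names case-insensitively by default. This is just a sketch with a hypothetical local path:

# Sketch of the collision as I understand it; /tmp/sample_logs/ is a
# hypothetical local copy of the CSV. With the default
# spark.sql.caseSensitive=false, schema inference treats "Alpha Source"
# and "Alpha source" as the same name and fails with an AnalysisException
# about duplicate columns in the data schema.
df = (
    spark.read.format("csv")
    .option("header", "true")
    .option("inferSchema", "true")
    .load("/tmp/sample_logs/")
)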
And I am trying to load the table from abfss like this -
import dlt

@dlt.table(
    comment="load csv files in bronze",
    name="dev.bronze.logs",
    table_properties={
        "delta.columnMapping.mode": "name"
    },
)
def table():
    landing_zone_path = "abfss://[email protected]/log/"
    df = spark.readStream.format("cloudFiles") \
        .option("cloudFiles.format", "csv") \
        .option("header", "true") \
        .option("inferSchema", "true") \
        .load(landing_zone_path)
    return df
I would expect the additional column to go into _rescued_data, but that's not happening. I also tried spark.conf.set('spark.sql.caseSensitive', True), but that didn't work either.
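From what I have read, session confs set with spark.conf.set() inside a DLT notebook don't reliably reach the pipeline; spark.sql.caseSensitive would instead need to go into the pipeline's settings, something like this fragment of the pipeline settings JSON (untested assumption on my part):

{
  "configuration": {
    "spark.sql.caseSensitive": "true"
  }
}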

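The only workaround I have come up with so far is to skip inference and pass an explicit schema with de-duplicated names. My understanding is that the CSV reader's default enforceSchema=true applies a user-supplied schema positionally and ignores the header names, so the colliding headers should never become columns; I am assuming this also holds when the reader runs under cloudFiles. The Alpha_source_2 rename below is my own invention, not something from the source data:

import dlt
from pyspark.sql.types import IntegerType, StringType, StructField, StructType

# Explicit schema with manually de-duplicated names; "Alpha_source_2" is my
# own rename for the second case-colliding column.
log_schema = StructType([
    StructField("Id", IntegerType(), True),
    StructField("Alpha_Source", StringType(), True),
    StructField("Alpha_source_2", StringType(), True),
])

@dlt.table(
    comment="load csv files in bronze with an explicit schema",
    name="dev.bronze.logs",
)
def table():
    landing_zone_path = "abfss://[email protected]/log/"
    # With header=true and the default enforceSchema=true, the reader skips
    # the header row and applies log_schema positionally, so the duplicate
    # header names are never inferred or validated.
    df = spark.readStream.format("cloudFiles") \
        .option("cloudFiles.format", "csv") \
        .option("header", "true") \
        .schema(log_schema) \
        .load(landing_zone_path)
    return df

This keeps the pipeline running, but I would prefer to keep schema inference and have the extra column land in _rescued_data if there is a cleaner way.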