I want to read data from a Delta table (`signals`) with the following structure:
```python
StructType(
    [
        StructField("timestamp", TimestampType(), True),   # nullable
        StructField("name", StringType(), False),
        StructField("value", DoubleType(), True),
    ]
)
```
Note that the timestamp is allowed to be NULL.
One of multiple jobs reads this data with:

```python
spark.readStream.option("badRecordsPath", "path_to_bad_records") \
    .schema(expected_schema) \
    .table("signals")
```
The expected schema, however, is:
```python
StructType(
    [
        StructField("timestamp", TimestampType(), False),  # NOT nullable
        StructField("name", StringType(), False),
        StructField("value", DoubleType(), True),
    ]
)
```
Note that I do not want to read any records with timestamp == NULL, even though I know and expect the source to contain some of them. Ideally, I want to save these "malformed" records (the ones with a NULL timestamp) to a log file under the badRecordsPath.
I am trying to answer the following questions: What happens to records that do not match the expected schema? If they are ignored, will they be written to the badRecordsPath? I can only find examples of badRecordsPath for file sources (JSON, CSV), not for Delta tables.