I have the following schema:
 |-- item: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- key: string (nullable = true)
 |    |    |-- type: string (nullable = true)
 |    |    |-- one: string (nullable = true)
 |    |    |-- two: boolean (nullable = true)
 |    |    |-- three: long (nullable = true)
- I want to create a new column for each key in the array, with its value chosen by the element's type (for example, if type = "one", the value for that key is element.one).
- I want to remove a struct from the item array if its key equals "electronic".
I couldn't understand exactly what you want, but as Kafels said, you can use inline to explode the array of structs into rows (one per array element) and columns (one per struct field). The result has the top-level columns key, type, one, two and three.
After that, filter out the "electronic" rows. Note that after inline the column to compare is key, not item:
from pyspark.sql.functions import col
df.filter(col("key") != "electronic")