SQL query to explode mutiple values in a nested json using databricks

73 Views Asked by At

I'm trying to fetch the data of a nested json using databricks sql, but unable to perform the explode on multiple array columns and its throwing below error

[UNSUPPORTED_GENERATOR.MULTI_GENERATOR] The generator is not supported: only one generator allowed

when trying to execute the below query

SELECT
explode(array_join(from_json(food:Fruits.children\[*\].`Fruit Quantity`, 'array\<string\>'), ' - ')) AS `Fruit Quantity`,
explode(array_join(from_json(food:Fruits.children\[*\].`Fruit Weight`, 'array\<string\>'), ' - ')) AS `Fruit Weight`,
explode(array_join(from_json(food:Fruits.children\[\*\].`Fruit Packaging`, 'array\<string\>'), ' - ')) AS `Fruit Packaging`
FROM
input_table

Input

| Day | Food |
| -------- | -------- |
| Day 1   | {
  "breakfast": {
    "carbs": [
    {
    "carbs name": "No food"
    }

    ]
  },
  "Fruits": {
    "batch": "batch01",
    "children": [
      {
        "Fruit name": "apple01",
        "Fruit Quantity": 3,
        "Fruit Weight": "50g",
        "Fruit Packaging": "Basket"
      },
      {
        "Fruit name": "apple01",
        "Fruit Quantity": 3,
        "Fruit Weight": "50g",
        "Fruit Packaging": ""
      },
      {
        "Fruit name": "apple02",
        "Fruit Quantity": 5,
        "Fruit Weight": "100g",
        "Fruit Packaging": "Foil"
      }
    ]
  }
}   |
Day 2 | {dinner:["bread", "egg"]}

Output enter image description here

Note: I need sql without any CTEs or subqueries.

The food column will have different data types i might need different functionality to be done on those as separate columns. The Lateral view explode option is excluding the other data types

1

There are 1 best solutions below

5
s.polam On BEST ANSWER

Use LATERAL VIEW EXPLODE to explode multiple array columns. Check below query.

SELECT
  day,  
  FROM_JSON(food, 'Fruits struct<batch string>').Fruits.batch AS Batch,
  children['Fruit name'] AS `Fruit name`,
  children['Fruit Quantity'] AS `Fruit Quantity`,
  children['Fruit Weight'] AS `Fruit Weight`,
  children['Fruit Packaging'] AS `Fruit Packaging`,
  carbs['carbs name'] AS `carbs name`
FROM VALUES 
  ('Day 1 ', '{ "breakfast": { "carbs": [ { "carbs name": "No food" } ] }, "Fruits": { "batch": "batch01", "children": [ { "Fruit name": "apple01", "Fruit Quantity": 3, "Fruit Weight": "50g", "Fruit Packaging": "Basket" }, { "Fruit name": "apple01", "Fruit Quantity": 3, "Fruit Weight": "50g", "Fruit Packaging": "" }, { "Fruit name": "apple02", "Fruit Quantity": 5, "Fruit Weight": "100g", "Fruit Packaging": "Foil" } ] } }') AS (day, food) 
LATERAL VIEW EXPLODE(FROM_JSON(food, 'Fruits struct<children array<map<string,string>>>').Fruits.children) AS children
LATERAL VIEW EXPLODE(from_json(food, 'breakfast struct<carbs array<map<string,string>>>').breakfast.carbs) AS carbs

+------+-------+----------+--------------+------------+---------------+----------+
|day   |Batch  |Fruit name|Fruit Quantity|Fruit Weight|Fruit Packaging|carbs name|
+------+-------+----------+--------------+------------+---------------+----------+
|Day 1 |batch01|apple01   |3             |50g         |Basket         |No food   |
|Day 1 |batch01|apple01   |3             |50g         |               |No food   |
|Day 1 |batch01|apple02   |5             |100g        |Foil           |No food   |
+------+-------+----------+--------------+------------+---------------+----------+