Terraform loop over Map variable to provision multiple Databricks catalogs


I'm following this example to provision multiple Databricks catalogs (see provider method) with for_each, but terraform plan detects no changes against the state. The provider itself definitely works; the issue appears to be in the HCL loop syntax, which isn't picking up any of the catalogs.

variables.tf (where the configuration is declared):

variable "catalog" {
  type = map(object({
    catalog_grants         = optional(map(list(string)))
    catalog_owner          = optional(string)         # Username/groupname/sp application_id of the catalog owner.
    catalog_storage_root   = optional(string)         # Location in cloud storage where data for managed tables will be stored
    catalog_isolation_mode = optional(string, "OPEN") # Whether the catalog is accessible from all workspaces or a specific set of workspaces. Can be ISOLATED or OPEN.
    catalog_comment        = optional(string)         # User-supplied free-form text
    catalog_properties     = optional(map(string))    # Extensible Catalog Tags.
    schema_name            = optional(list(string))   # List of Schema names relative to parent catalog.
    schema_grants          = optional(map(list(string)))
    schema_owner           = optional(string) # Username/groupname/sp application_id of the schema owner.
    schema_comment         = optional(string)
    schema_properties      = optional(map(string))
  }))
  description = "Map of catalog name and its parameters"
  default = {
    catalog = {
      example_catalog1 = {
        catalog_grants = {
          "[email protected]" = ["USE_CATALOG", "USE_SCHEMA", "CREATE_SCHEMA", "CREATE_TABLE", "SELECT", "MODIFY"]
        }
        schema_name = ["raw", "refined", "data_product"]
      }
      example_catalog2 = {
        catalog_grants = {
          "[email protected]" = ["USE_CATALOG", "USE_SCHEMA", "CREATE_SCHEMA", "CREATE_TABLE", "SELECT", "MODIFY"]
        }
        schema_name = ["raw", "refined", "data_product"]
      }
    }
  }
}

main.tf (where the loop is performed):

resource "databricks_catalog" "this" {
  for_each = var.catalog

  name           = each.key
  owner          = each.value.catalog_owner
  storage_root   = each.value.catalog_storage_root
  isolation_mode = each.value.catalog_isolation_mode
  comment        = lookup(each.value, "catalog_comment", "default comment")
  properties     = lookup(each.value, "catalog_properties", {})
  force_destroy  = true
}

The only required argument for the databricks_catalog resource is name, which comes from each key of the map above.
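When terraform plan reports nothing to do, a quick sanity check is to inspect the keys for_each will actually iterate over, either in terraform console or via a throwaway output (this output is my own debugging addition, not part of the original configuration):

# Hypothetical debugging aid: lists the instance keys for_each will
# iterate over, so the plan's expected instances can be checked at a glance.
output "catalog_keys" {
  value = keys(var.catalog)
}

If the listed keys are not the catalog names you expect, the problem is in the shape of the variable, not in the resource block.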

On the other hand, provisioning a catalog without the loop works:

resource "databricks_catalog" "sandbox" {
  name    = "example_catalog3"
}

Answered by marcjanek (accepted answer):

The default value in your variable block has an incorrect structure: you have wrapped the catalogs in an extra top-level map. To solve your issue, remove that top-level map, which is unnecessary for your use case. After removing it, your variable will look like this:

variable "catalog" {
  type = map(object({
    catalog_grants         = optional(map(list(string)))
    catalog_owner          = optional(string)         # Username/groupname/sp application_id of the catalog owner.
    catalog_storage_root   = optional(string)         # Location in cloud storage where data for managed tables will be stored
    catalog_isolation_mode = optional(string, "OPEN") # Whether the catalog is accessible from all workspaces or a specific set of workspaces. Can be ISOLATED or OPEN.
    catalog_comment        = optional(string)         # User-supplied free-form text
    catalog_properties     = optional(map(string))    # Extensible Catalog Tags.
    schema_name            = optional(list(string))   # List of Schema names relative to parent catalog.
    schema_grants          = optional(map(list(string)))
    schema_owner           = optional(string) # Username/groupname/sp application_id of the schema owner.
    schema_comment         = optional(string)
    schema_properties      = optional(map(string))
  }))
  description = "Map of catalog name and its parameters"
  default = {
    # HERE I REMOVED THE MAP "CATALOG"
    example_catalog1 = {
      catalog_grants = {
        "[email protected]" = ["USE_CATALOG", "USE_SCHEMA", "CREATE_SCHEMA", "CREATE_TABLE", "SELECT", "MODIFY"]
      }
      schema_name = ["raw", "refined", "data_product"]
    }
    example_catalog2 = {
      catalog_grants = {
        "[email protected]" = ["USE_CATALOG", "USE_SCHEMA", "CREATE_SCHEMA", "CREATE_TABLE", "SELECT", "MODIFY"]
      }
      schema_name = ["raw", "refined", "data_product"]
    }
  }
}
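As an aside (not part of the fix): the catalog_grants map declared in this variable can be applied with the provider's databricks_grants resource using the same for_each pattern. A sketch, assuming the databricks_catalog.this resource from main.tf above:

# Sketch only: apply catalog_grants per catalog.
resource "databricks_grants" "catalog" {
  # Only catalogs that actually define grants.
  for_each = { for k, v in var.catalog : k => v if v.catalog_grants != null }

  catalog = databricks_catalog.this[each.key].name

  # One grant block per principal in the catalog_grants map.
  dynamic "grant" {
    for_each = each.value.catalog_grants
    content {
      principal  = grant.key
      privileges = grant.value
    }
  }
}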

As validation I ran terraform plan and got the 2 expected resources:

Terraform will perform the following actions:

  # databricks_catalog.this["example_catalog1"] will be created
  + resource "databricks_catalog" "this" {
      + force_destroy  = true
      + id             = (known after apply)
      + isolation_mode = "OPEN"
      + metastore_id   = (known after apply)
      + name           = "example_catalog1"
      + owner          = (known after apply)
    }

  # databricks_catalog.this["example_catalog2"] will be created
  + resource "databricks_catalog" "this" {
      + force_destroy  = true
      + id             = (known after apply)
      + isolation_mode = "OPEN"
      + metastore_id   = (known after apply)
      + name           = "example_catalog2"
      + owner          = (known after apply)
    }

Plan: 2 to add, 0 to change, 0 to destroy.
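The variable also declares per-catalog schema_name lists, which the configuration above never wires up. One possible sketch (my own addition, using the provider's databricks_schema resource) flattens the per-catalog lists into a single map so that each schema becomes its own for_each instance:

# Sketch only: fan out schemas per catalog.
locals {
  # "catalog.schema" => { catalog, name } for every declared schema.
  schemas = merge([
    for catalog_name, cfg in var.catalog : {
      for schema in coalesce(cfg.schema_name, []) :
      "${catalog_name}.${schema}" => {
        catalog = catalog_name
        name    = schema
      }
    }
  ]...)
}

resource "databricks_schema" "this" {
  for_each = local.schemas

  catalog_name = databricks_catalog.this[each.value.catalog].name
  name         = each.value.name
  comment      = var.catalog[each.value.catalog].schema_comment
}

Building a flat map keyed "catalog.schema" keeps instance addresses stable if a schema is later added to or removed from one catalog's list.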