How to address a Kubernetes load balancer profile cyclic dependency

I have a Kubernetes cluster in my Terraform template, as follows:

resource "azurerm_kubernetes_cluster" "k8s-cluster-dev" {
  name                = "k8s-cluster-dev"
  location            = azurerm_resource_group.api-rg.location
  resource_group_name = azurerm_resource_group.api-rg.name
  dns_prefix          = "k8s-cluster-dev"
  depends_on = [
    azurerm_public_ip.k8s-cluster-dev-ingress-load-balancer-public-ip-address
  ]

  default_node_pool {
    name                        = "default"
    node_count                  = 1
    vm_size                     = "Standard_B4ms"
    vnet_subnet_id              = azurerm_subnet.api-vnet-subnet-01.id
  }

  network_profile {
    network_plugin = "azure"
    network_policy = "azure"
    service_cidr   = "10.0.0.0/16"
    dns_service_ip = "10.0.0.10"
    load_balancer_profile {
      outbound_ip_address_ids = [azurerm_public_ip.k8s-cluster-dev-ingress-load-balancer-public-ip-address.id]
    }
  }

  identity {
    type = "SystemAssigned"
  }
}

The public IP mentioned above has to be deployed into the managed cluster resource group (MC_***) in order for the ingress controller to be able to find it. If it is deployed elsewhere, the ingress controller cannot be deployed successfully.

This is the public IP address Terraform:

resource "azurerm_public_ip" "k8s-cluster-dev-ingress-load-balancer-public-ip-address" {
  name                = "k8s-cluster-dev-ingress-load-balancer-public-ip-address"
  resource_group_name = "MC_${var.resource_group_name_api_rg}_${var.k8s_cluster_dev_name}_${var.location_code}"
  location            = azurerm_resource_group.api-rg.location
  allocation_method   = "Static"
  sku                 = "Standard"
}

If I first deploy the Kubernetes cluster without the public IP address (and remove load_balancer_profile.outbound_ip_address_ids from the template), everything works fine.

But when I try to deploy both of them in one go, this creates a cyclic dependency: I cannot deploy the Kubernetes cluster until the public IP address is deployed into the right resource group, and that resource group is not created until the Kubernetes cluster itself has been created successfully.

If I try to create the MC_*** resource group beforehand, AKS doesn't like it and the deployment fails, complaining about an "already existing" resource group.

I use this NGINX Ingress Controller: https://docs.nginx.com/nginx-ingress-controller/installation/installing-nic/installation-with-helm/. I couldn't find a way to define a custom resource group while installing it with Helm.
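The closest thing I found, though I haven't verified it works with this controller, is passing Azure's service annotation for a custom public-IP resource group through the chart's service values. A hedged sketch (the service.beta.kubernetes.io/azure-load-balancer-resource-group annotation is documented by the Azure cloud provider; the nginx-stable repo name and controller.service.* value keys are assumptions based on the chart's documentation, and the IP is illustrative):

helm install nginx-ingress nginx-stable/nginx-ingress \
  --set controller.service.loadBalancerIP=20.0.0.1 \
  --set "controller.service.annotations.service\.beta\.kubernetes\.io/azure-load-balancer-resource-group=api-rg"

If the annotation were honored, the public IP could live in api-rg instead of MC_***, which would remove the cycle altogether.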

The only other way is to first deploy the Terraform template without the load balancer profile and then re-deploy it with the profile added, but I'd like to avoid that if possible (a gated variant of that two-pass approach is sketched below).
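If the two-pass route is taken anyway, a single boolean variable can gate both the public IP and the profile so nothing has to be commented out between applies. A minimal sketch (the attach_outbound_ip variable name is mine):

variable "attach_outbound_ip" {
  description = "Set to true on the second apply, once the MC_*** resource group exists"
  type        = bool
  default     = false
}

resource "azurerm_public_ip" "k8s-cluster-dev-ingress-load-balancer-public-ip-address" {
  count               = var.attach_outbound_ip ? 1 : 0
  name                = "k8s-cluster-dev-ingress-load-balancer-public-ip-address"
  resource_group_name = "MC_${var.resource_group_name_api_rg}_${var.k8s_cluster_dev_name}_${var.location_code}"
  location            = azurerm_resource_group.api-rg.location
  allocation_method   = "Static"
  sku                 = "Standard"
}

# Inside the cluster's network_profile block, the profile becomes conditional:
dynamic "load_balancer_profile" {
  for_each = var.attach_outbound_ip ? [1] : []
  content {
    outbound_ip_address_ids = [azurerm_public_ip.k8s-cluster-dev-ingress-load-balancer-public-ip-address[0].id]
  }
}

The first apply runs with the default (false) and creates the cluster along with the MC_*** group; the second apply runs with -var attach_outbound_ip=true to create the IP and attach it.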

How do I resolve this without having to apply the template twice?

1 Answer (from Vinay B):

I verified deploying the AKS cluster together with the public IP in a single terraform apply, and was able to achieve this requirement successfully.

The main obstacle in the initial scenario is the cyclic dependency resulting from the requirement to deploy a public IP address within a managed cluster (MC_***) resource group, which only materializes after the Azure Kubernetes Service (AKS) cluster has been created. This leads to a catch-22: the AKS cluster requires the public IP for its load balancer profile, yet the public IP cannot be provisioned until the AKS cluster is operational and the MC_*** resource group exists.

The local-exec provisioner was selected because it can execute commands after a resource has been created: the AKS cluster is provisioned first, and the public IP address is then created in the appropriate resource group via the Azure CLI, all within a single terraform apply. This resolves the cyclic dependency without requiring multiple Terraform runs.

My terraform configuration:

provider "azurerm" {
  features {}
}

variable "resource_group_name" {
  description = "Name of the resource group to contain the AKS cluster"
  type        = string
}

variable "location" {
  description = "Azure region for the resource group"
  type        = string
}

variable "aks_cluster_name" {
  description = "Name of the AKS cluster"
  type        = string
}

resource "azurerm_resource_group" "main" {
  name     = var.resource_group_name
  location = var.location
}

resource "azurerm_kubernetes_cluster" "aks" {
  name                = var.aks_cluster_name
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
  dns_prefix          = var.aks_cluster_name

  default_node_pool {
    name       = "default"
    node_count = 1
    vm_size    = "Standard_B4ms"
  }

  identity {
    type = "SystemAssigned"
  }

  tags = {
    Environment = "Development"
  }

  provisioner "local-exec" {
    when        = create
    interpreter = ["pwsh", "-Command"]
    command     = <<-EOT
      az network public-ip create --name '${var.aks_cluster_name}-public-ip' --resource-group '${azurerm_kubernetes_cluster.aks.node_resource_group}' --allocation-method Static --sku Standard
    EOT
  }
}

output "node_resource_group" {
  value = azurerm_kubernetes_cluster.aks.node_resource_group
}

Deployment succeeded (screenshots omitted).
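As a follow-up: the public IP created by the provisioner lives outside Terraform state, but it can be read back through a data source and exposed, for example to pass to Helm. A sketch, assuming the name created by the az command above:

data "azurerm_public_ip" "aks_ingress" {
  # Referencing the cluster's computed attribute defers this read until the
  # cluster (and the IP created by its provisioner) exists.
  name                = "${var.aks_cluster_name}-public-ip"
  resource_group_name = azurerm_kubernetes_cluster.aks.node_resource_group
}

output "ingress_public_ip" {
  value = data.azurerm_public_ip.aks_ingress.ip_address
}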