Airflow on Azure Container Apps with Terraform: Errno 13 - Permission denied: '/opt/airflow/logs/scheduler'


I have deployed Airflow on Azure Container Apps using Terraform, and I am using Azure Files shares as the volumes for:

/opt/airflow/logs
/opt/airflow/dags
/opt/airflow/files
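
For reference, the environment storage resources referenced further down look roughly like this (the azurerm_storage_account reference is a placeholder for my actual account; only logs is shown, dags and files are identical):

resource "azurerm_container_app_environment_storage" "logs" {
  name                         = "logs"
  container_app_environment_id = azurerm_container_app_environment.app_environment.id
  account_name                 = azurerm_storage_account.storage.name # placeholder resource name
  share_name                   = "logs"
  access_key                   = azurerm_storage_account.storage.primary_access_key
  access_mode                  = "ReadWrite"
}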

I tested an example DAG and it worked fine, but when I ran a production DAG I started getting this issue (509 decimal is 0o775 octal, the default mode Airflow's file task handler applies to newly created log folders):

Changing /opt/airflow/logs/dag_id=purshaseOrderProposal/run_id=manual__2024-03-06T14:59:07.165291+00:00/task_id=validate_dfs.load_expectation_suite permission to 509
2024-03-06T15:30:49.302222600Z Failed to change /opt/airflow/logs/dag_id=purshaseOrderProposal/run_id=manual__2024-03-06T14:59:07.165291+00:00/task_id=validate_dfs.load_expectation_suite permission to 509: [Errno 1] Operation not permitted: '/opt/airflow/logs/dag_id=purshaseOrderProposal/run_id=manual__2024-03-06T14:59:07.165291+00:00/task_id=validate_dfs.load_expectation_suite'

This is my Dockerfile:

# Use the official Apache Airflow image as the base image
FROM apache/airflow:2.8.1-python3.8

# Set the working directory to /usr/local/airflow
WORKDIR /usr/local/airflow

# Copy the wheel file into the container at /usr/local/airflow
COPY ./dist/*.whl .

# Copy the requirements.txt file into the container at /usr/local/airflow
COPY ./requirements.txt .

# Install dependencies from requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Install the specific wheel file
RUN find . -name '*.whl' -type f -exec pip install --no-cache-dir {} +
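
For what it's worth, the apache/airflow base image runs as user airflow (UID 50000, GID 0), not root, which matters for who owns the files on the mounted shares. A quick local check after building (airflow-custom is just a placeholder tag):

docker build -t airflow-custom .
docker run --rm --entrypoint bash airflow-custom -c "id"
# prints: uid=50000(airflow) gid=0(root) groups=0(root)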
 

This is my scheduler resource in Terraform:

resource "azurerm_container_app" "scheduler" {
  name                         = "airflow-scheduler"
  container_app_environment_id = azurerm_container_app_environment.app_environment.id
  resource_group_name          = azurerm_resource_group.resource_group.name
  revision_mode                = "Single"


  template {
    
    volume {
      name = "logs"
      storage_name = "${azurerm_container_app_environment_storage.logs.name}"
      storage_type = "AzureFile"
    }
    volume {
      name = "dags"
      storage_name = "${azurerm_container_app_environment_storage.dags.name}"
      storage_type = "AzureFile"
    }
    volume {
      name = "files"
      storage_name = "${azurerm_container_app_environment_storage.files.name}"
      storage_type = "AzureFile"
    }
    container {
      name   = "scheduler-app"
      image  = "***/airflow-***:latest"
      cpu    = 1
      memory = "2Gi"
      command = ["airflow","scheduler"]
      env {
        name = "AIRFLOW__CORE__EXECUTOR"
        value = "CeleryExecutor"
      }
      env {
        name = "AIRFLOW__WEBSERVER__RBAC"
        value = "False"
      }
      env {
        name = "AIRFLOW__WEBSERVER__SECRET_KEY"
        value = "lj7/ZeylaZc+AlTYPrR2Tw=="
      } 
      env {
        name = "AIRFLOW__CORE__STORE_SERIALIZED_DAGS" 
        value = "False"
      }
      env {
        name = "AIRFLOW__CORE__CHECK_SLAS"
        value = "False"
      }
      env {
        name = "AIRFLOW__CORE__PARALLELISM"
        value = "50"
      }
      env {
        name = "AIRFLOW__CORE__LOAD_EXAMPLES"
        value = "False"
      }
      env {
        name = "AIRFLOW__CORE__LOAD_DEFAULT_CONNECTIONS"
        value = "False"
      }
      env {
        name = "AIRFLOW__SCHEDULER__SCHEDULER_HEARTBEAT_SEC"
        value = "10"
      }
      env {
        name = "AIRFLOW__CELERY__BROKER_URL"
        value = ":${azurerm_redis_cache.redis.primary_access_key}@***:6379/0"
      }
      env {
        name = "AIRFLOW__CELERY__RESULT_BACKEND"
        value = "db+postgresql://airflow@airflow@***:***@****:5432/postgres_db?sslmode=require"
      }
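      # NB: [core] sql_alchemy_conn was moved to [database] in Airflow 2.3, so on
      # 2.8.1 the current name is AIRFLOW__DATABASE__SQL_ALCHEMY_CONN.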
      env {
        name = "AIRFLOW__CORE__SQL_ALCHEMY_CONN"
        value = "postgresql+psycopg2://airflow@***:***@****:5432/postgres_db?sslmode=require"
      }
      env {
        name = "AIRFLOW__CORE__FERNET_KEY"
        value = "P**="
      }
      volume_mounts {
        name = "logs"
        path = "/opt/airflow/logs"
      }
      volume_mounts {
        name = "dags"
        path = "/opt/airflow/dags"
      }
      volume_mounts {
        name = "files"
        path = "/opt/airflow/files"
      }
    }
    # init_container {
    #   image = "alpine:latest"
    #   name = "init-scheduler"
    #   command = ["sh", "-c", "chown -R 50000:0 /opt/airflow/logs/"]
    #   cpu = 0.5
    #   memory = "0.5Gi"
    #   volume_mounts {
    #     path = "/opt/airflow/logs"
    #     name = "logs"
    #   }
    # }
  }
  depends_on = [null_resource.run_initdb]
}
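
Since Azure Files is mounted over SMB, the file and directory modes are fixed by mount options at mount time, so chmod/chown from inside the container cannot succeed; that appears to be what the "Operation not permitted" warnings above are. One workaround I am looking at is keeping task logs off the share entirely and enabling Airflow's remote logging to Blob Storage, roughly like this (wasb_logs and wasb-airflow-logs are placeholders; this assumes apache-airflow-providers-microsoft-azure is in requirements.txt and a wasb-type Airflow connection exists):

      env {
        name = "AIRFLOW__LOGGING__REMOTE_LOGGING"
        value = "True"
      }
      env {
        # placeholder id; must point at an Airflow connection of type wasb
        name = "AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID"
        value = "wasb_logs"
      }
      env {
        # the wasb prefix is what makes Airflow pick the Azure Blob task handler
        name = "AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER"
        value = "wasb-airflow-logs"
      }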

I have found the GitHub issue for this problem, but I can't really find a way to implement the solution there, since I am working with Azure Container Apps and not self-managed Kubernetes.

Inside the container I tried chmod -R 777 /opt, chown -R airflow:root /opt, and chown -R 50000:0 /opt, and I set AIRFLOW_UID=0 and AIRFLOW_GID=0. I also tried changing the container user to root, but that crashed the container as a whole, since Airflow could not find the /home directory. Passing AIRFLOW_UID as an env variable in Terraform (see the sketch below) did not work either!
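
The Terraform env attempt was along these lines (tried both 50000 and 0):

      env {
        name = "AIRFLOW_UID"
        value = "50000"
      }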
