I have deployed Airflow on Azure Container Apps using Terraform, and I am using Azure Files shares as the volumes for the following paths (each share is wired up roughly as sketched after the list):
/opt/airflow/logs
/opt/airflow/dags
/opt/airflow/files
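Each of these paths is backed by an azurerm_container_app_environment_storage resource pointing at an Azure Files share. For the logs share it looks roughly like this; the storage account and share resource names below are placeholders for the real ones in my config:

resource "azurerm_container_app_environment_storage" "logs" {
  name                         = "logs"
  container_app_environment_id = azurerm_container_app_environment.app_environment.id
  account_name                 = azurerm_storage_account.storage.name   # placeholder resource name
  share_name                   = azurerm_storage_share.logs.name        # placeholder resource name
  access_key                   = azurerm_storage_account.storage.primary_access_key
  access_mode                  = "ReadWrite"
}

The dags and files shares are defined the same way.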
I tested an example DAG and it worked fine, but when I ran a production DAG I started getting this error:
Changing /opt/airflow/logs/dag_id=purshaseOrderProposal/run_id=manual__2024-03-06T14:59:07.165291+00:00/task_id=validate_dfs.load_expectation_suite permission to 509
2024-03-06T15:30:49.302222600Z Failed to change /opt/airflow/logs/dag_id=purshaseOrderProposal/run_id=manual__2024-03-06T14:59:07.165291+00:00/task_id=validate_dfs.load_expectation_suite permission to 509: [Errno 1] Operation not permitted: '/opt/airflow/logs/dag_id=purshaseOrderProposal/run_id=manual__2024-03-06T14:59:07.165291+00:00/task_id=validate_dfs.load_expectation_suite'
This is my Dockerfile:
# Use the official Apache Airflow image as the base image
FROM apache/airflow:2.8.1-python3.8
# Set the working directory to /usr/local/airflow
WORKDIR /usr/local/airflow
# Copy the wheel file into the container at /usr/local/airflow
COPY ./dist/*.whl .
# Copy the requirements.txt file into the container at /usr/local/airflow
COPY ./requirements.txt .
# Install dependencies from requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
# Install the specific wheel file
RUN find . -name '*.whl' -type f -exec pip install --no-cache-dir {} +
This is my scheduler resource in Terraform:
resource "azurerm_container_app" "scheduler" {
  name                         = "airflow-scheduler"
  container_app_environment_id = azurerm_container_app_environment.app_environment.id
  resource_group_name          = azurerm_resource_group.resource_group.name
  revision_mode                = "Single"

  template {
    # Azure Files shares mounted into the scheduler container
    volume {
      name         = "logs"
      storage_name = azurerm_container_app_environment_storage.logs.name
      storage_type = "AzureFile"
    }
    volume {
      name         = "dags"
      storage_name = azurerm_container_app_environment_storage.dags.name
      storage_type = "AzureFile"
    }
    volume {
      name         = "files"
      storage_name = azurerm_container_app_environment_storage.files.name
      storage_type = "AzureFile"
    }

    container {
      name    = "scheduler-app"
      image   = "***/airflow-***:latest"
      cpu     = 1
      memory  = "2Gi"
      command = ["airflow", "scheduler"]

      env {
        name  = "AIRFLOW__CORE__EXECUTOR"
        value = "CeleryExecutor"
      }
      env {
        name  = "AIRFLOW__WEBSERVER__RBAC"
        value = "False"
      }
      env {
        name  = "AIRFLOW__WEBSERVER__SECRET_KEY"
        value = "lj7/ZeylaZc+AlTYPrR2Tw=="
      }
      env {
        name  = "AIRFLOW__CORE__STORE_SERIALIZED_DAGS"
        value = "False"
      }
      env {
        name  = "AIRFLOW__CORE__CHECK_SLAS"
        value = "False"
      }
      env {
        name  = "AIRFLOW__CORE__PARALLELISM"
        value = "50"
      }
      env {
        name  = "AIRFLOW__CORE__LOAD_EXAMPLES"
        value = "False"
      }
      env {
        name  = "AIRFLOW__CORE__LOAD_DEFAULT_CONNECTIONS"
        value = "False"
      }
      env {
        name  = "AIRFLOW__SCHEDULER__SCHEDULER_HEARTBEAT_SEC"
        value = "10"
      }
      env {
        name  = "AIRFLOW__CELERY__BROKER_URL"
        value = ":${azurerm_redis_cache.redis.primary_access_key}@***:6379/0"
      }
      env {
        name  = "AIRFLOW__CELERY__RESULT_BACKEND"
        value = "db+postgresql://airflow@airflow@***:***@****:5432/postgres_db?sslmode=require"
      }
      env {
        name  = "AIRFLOW__CORE__SQL_ALCHEMY_CONN"
        value = "postgresql+psycopg2://airflow@***:***@****:5432/postgres_db?sslmode=require"
      }
      env {
        name  = "AIRFLOW__CORE__FERNET_KEY"
        value = "P**="
      }

      volume_mounts {
        name = "logs"
        path = "/opt/airflow/logs"
      }
      volume_mounts {
        name = "dags"
        path = "/opt/airflow/dags"
      }
      volume_mounts {
        name = "files"
        path = "/opt/airflow/files"
      }
    }

    # One of the attempts to fix ownership of the logs share: an init container
    # that chowns the mount before the scheduler starts (currently commented out)
    # init_container {
    #   image   = "alpine:latest"
    #   name    = "init-scheduler"
    #   command = ["sh", "-c", "chown -R 50000:0 /opt/airflow/logs/"]
    #   cpu     = 0.5
    #   memory  = "0.5Gi"
    #   volume_mounts {
    #     path = "/opt/airflow/logs"
    #     name = "logs"
    #   }
    # }
  }

  depends_on = [null_resource.run_initdb]
}
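For reference, null_resource.run_initdb (which the scheduler depends on) just runs the Airflow metadata database initialization once. A simplified, hypothetical sketch of it is below; the real command and connection handling in my config are more involved:

resource "null_resource" "run_initdb" {
  provisioner "local-exec" {
    # hypothetical simplification: initialize the Airflow metadata database
    command = "airflow db init"
  }
}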
I found the GitHub issue for this problem, but I can't find a way to implement the suggested solution, since I'm working with Azure Container Apps and not self-managed Kubernetes.
I tried chmod 777 -R /opt, chown -R airflow:root /opt, chown -R 50000:0 /opt, and setting AIRFLOW_UID=0 / AIRFLOW_GID=0. I also tried changing the container user to root, but that crashed the container entirely because Airflow could no longer find its /home directory.
I also tried passing AIRFLOW_UID as an environment variable in Terraform (roughly as shown below), but that didn't work either.
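That attempt was just an extra env block on each Airflow container, roughly like this (50000 is the default UID of the official apache/airflow image; I also tried 0, matching the chown experiments above):

env {
  name  = "AIRFLOW_UID"
  value = "50000" # also tried "0"
}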