I'm using Airflow 2.2.5 built from the official Docker image with a Postgres database. Let's say as an example I have a DAG set to run daily at 2am. As long as I only turn it on and let it run, the process works fine and the DAG runs daily at 2am as intended, but if for some reason I need to do a manual run, say at 11am, then on the next day (and all the following ones) the DAG runs automatically at 11am instead of at the scheduled 2am.
I've tried using a cron expression instead of datetime.timedelta for the schedule_interval, and even though the UI shows the DAG will run at 2am (despite the 11am manual run), the DAG actually runs at 11am, contrary to what the UI indicates.
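For reference, this is roughly how the DAG is scheduled; the DAG id and task are simplified placeholders, not my real DAG:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.dummy import DummyOperator

with DAG(
    dag_id="example_daily_2am",           # placeholder DAG id
    start_date=datetime(2022, 1, 1),
    # what I had originally:
    # schedule_interval=timedelta(days=1),
    # what I switched to, hoping to pin the run time to 02:00:
    schedule_interval="0 2 * * *",
    catchup=False,
) as dag:
    DummyOperator(task_id="do_nothing")
```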
Has anyone else noticed this behavior? And is there anything I can do to prevent manual runs from interfering with the scheduled ones?
Thanks
Airflow is a data flow tool, and each run gives you some context variables to use in your processing:
- `data_interval_start`: the first date in the data you will process; it is equal to the previous run's end date
- `data_interval_end`: the last date in the data you will process

So if you are using these variables to filter the data you want to process, you cannot change the `schedule_interval` just to fit a manual run, because if you use `0 11,12 * * *` for example, you will have two runs per day:

- `data_interval_start` = 12h00 of the previous day and `data_interval_end` = 11h00 of the current day (23 hours)
- `data_interval_start` = 11h00 of the current day and `data_interval_end` = 12h00 of the current day (1 hour)
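As a minimal sketch of what "filtering on these variables" means (the DAG id, task and the query it prints are made up for illustration; only `data_interval_start` / `data_interval_end` come from Airflow's context):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(data_interval_start=None, data_interval_end=None, **kwargs):
    # Airflow injects the run's interval bounds into the task context,
    # so each run processes exactly one interval of data.
    print(
        f"processing rows where ts >= {data_interval_start} "
        f"and ts < {data_interval_end}"
    )


with DAG(
    dag_id="interval_aware_dag",          # hypothetical DAG id
    start_date=datetime(2022, 1, 1),
    schedule_interval="0 2 * * *",
    catchup=False,
) as dag:
    PythonOperator(task_id="extract", python_callable=extract)
```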
But if you have a DAG which uses the full data at each run, or a DAG which does some tasks without using any data, you can trigger it manually in different ways:

- "Trigger DAG" in the DAG page
- `airflow dags trigger <dag_id>` (doc)
- `POST api/v1/dags/{dag_id}/dagRuns` (doc); a sketch of this option follows below
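For the REST API option, a minimal sketch looks like this; the host, credentials and DAG id are placeholders, and it assumes the `basic_auth` API backend is enabled on your webserver (it is not the default in 2.2):

```python
import requests

# Trigger a manual run via the stable REST API; an empty conf payload
# creates a run with the current time as its logical date.
response = requests.post(
    "http://localhost:8080/api/v1/dags/example_daily_2am/dagRuns",
    auth=("admin", "admin"),   # placeholder credentials for basic_auth
    json={"conf": {}},
)
response.raise_for_status()
print(response.json()["dag_run_id"])
```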