How to run a Nextflow pipeline from an arbitrary process?


How can I run a Nextflow pipeline from an arbitrary process?

The -resume option does not help when no process fails but the output file of one process is wrong, and you want to rerun the pipeline from that process onward. I could not find a working answer to this seemingly simple question on Google.


There are 2 best solutions below

3
dthorbur On

You can add checkpoints throughout your workflow declaration. For example, here is a simple pipeline with two processes in which the input channel for Process2 is created either from the collected output of Process1 or from the files in Process1's publishDir directory. I used collect() only to show some processing of the all_p1 channel.

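workflow ExamplePipe {
  main:
  if( !params.skip_p1 ) {
    // Normal run: execute Process1 and collect its output
    Process1()
    Process1
      .out
      .output1
      .collect()
      .set { all_p1 }
  }
  else {
    // Checkpoint: skip Process1 and reuse the files it published earlier
    Channel
      .fromPath("path/to/Process1publishDir/*.fasta")
      .collect()
      .set { all_p1 }
  }

  // Process2 receives the same channel in both cases
  Process2( all_p1 )
}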
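
The workflow above reads params.skip_p1, so the parameter needs a default value somewhere. A minimal sketch, assuming the default lives in a nextflow.config file next to the pipeline script (the file placement is an assumption, not part of the original answer):

```groovy
// nextflow.config (assumed location, next to the pipeline script)
params {
    // Default: run Process1 as usual.
    // Override on the command line to start from Process2 instead.
    skip_p1 = false
}
```

With this default in place, passing --skip_p1 on the command line (for example `nextflow run main.nf --skip_p1`, where main.nf is a hypothetical script name) sets the parameter to true, so the run skips Process1 and reads its published files.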
0
entropy On

The most practical solution I found is to rename the process's folder under the work directory and then run with the -resume option. Nextflow then reruns only the last process, as I wanted. I usually make changes only in the last process, so this solution seems practical enough for now. If I needed to rerun several processes in sequence, I would rename the folders of those processes too. The relevant folder names are printed to the terminal by Nextflow after each run.