Submitting parallel jobs on HTCondor using Python


I am trying to submit parallel jobs in a loop on HTCondor. The following is a simple example of the Python script:

import numpy as np
import pyhf

test_mus = np.linspace(0, 5, 10)
results = [pyhf.infer.hypotest(test_mu, data, model)
           for test_mu in test_mus]

I would like to submit each iteration of the loop as its own job (so 10 jobs in total), run them simultaneously on 10 machines, and then combine all the results into a single pickle file.
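
For example, assuming each job writes its own result_<i>.pkl file (just a placeholder name), the combination step at the end would be something like:

import pickle

# gather the per-job pickles back into a single file
results = []
for i in range(10):
    with open(f"result_{i}.pkl", "rb") as f:
        results.append(pickle.load(f))

with open("combined_results.pkl", "wb") as f:
    pickle.dump(results, f)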

The submit file I have for this job is below:

executable            = CLs.sh
+JobFlavour           = "testmatch"
arguments             = $(ClusterId) $(ProcId)
Input                 = LHpruned_NEW_32.json
output                = output_sigbkg/CLs.$(ClusterId).$(ProcId).out
error                 = output_sigbkg/CLs.$(ClusterId).$(ProcId).err
log                   = output_sigbkg/CLs.$(ClusterId).log
transfer_input_files  = CLs_test1point.py, LHpruned_NEW_32.json
should_transfer_files = YES
queue

I would like to know how to submit these 10 jobs so that they run in parallel. Thank you!

Best, Shreya

Answer by Matt Pitkin:

It's worth looking at the description of how to submit an HTCondor DAG via Python here. In your case, if you install the htcondor Python module, you could do something like:

import htcondor
from htcondor import dags

# create the submit description
sub = htcondor.Submit(
    {
        "executable": "CLs.sh",
        "+JobFlavour": "testmatch",
        "arguments": "$(ClusterId) $(ProcId)",
        "Input": "LHpruned_NEW_32.json",
        "output": "output_sigbkg/CLs.$(ClusterId).$(ProcId).out",
        "error": "output_sigbkg/CLs.$(ClusterId).$(ProcId).err",
        "log": "output_sigbkg/CLs.$(ClusterId).log",
        "transfer_input_files": "CLs_test1point.py, LHpruned_NEW_32.json",
        "should_transfer_files": "YES",
    }
)

# create DAG
dag = dags.DAG()

# add layer with 10 jobs
layer = dag.layer(
    name="CLs_layer",
    submit_description=sub,
    vars=[{} for i in range(10)],
)

# write out the DAG to current directory
dagfile = dags.write_dag(dag, ".")

You can use the vars argument to add macros that give values specific to each job. For example, if you wanted mu as one of the executable arguments, you could switch this to:

import numpy as np

import htcondor
from htcondor import dags

# create the submit description
sub = htcondor.Submit(
    {
        "executable": "CLs.sh",
        "+JobFlavour": "testmatch",
        "arguments": "$(ClusterId) $(ProcId) $(MU)",
        "Input": "LHpruned_NEW_32.json",
        "output": "output_sigbkg/CLs.$(ClusterId).$(ProcId).out",
        "error": "output_sigbkg/CLs.$(ClusterId).$(ProcId).err",
        "log": "output_sigbkg/CLs.$(ClusterId).log",
        "transfer_input_files": "CLs_test1point.py, LHpruned_NEW_32.json",
        "should_transfer_files": "YES",
    }
)

# create DAG
dag = dags.DAG()

# add layer with 10 jobs
layer = dag.layer(
    name="CLs_layer",
    submit_description=sub,
    vars=[{"MU": mu} for mu in np.linspace(0, 5, 10)],
)

# write out the DAG to current directory
dagfile = dags.write_dag(dag, ".")

Once the DAG is created, you can either submit it as normal with condor_submit_dag from the terminal, or submit it via Python using the instructions here.
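
For the Python route, something along these lines should work with a reasonably recent version of the htcondor bindings (the options dict here is just an example):

import htcondor

# build a submit description for the DAGMan job itself and hand it to the local schedd
dag_submit = htcondor.Submit.from_dag(str(dagfile), {"force": 1})
schedd = htcondor.Schedd()
submit_result = schedd.submit(dag_submit)
print(f"DAGMan job cluster is {submit_result.cluster()}")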

Note: your +JobFlavour value for the submit file will actually get converted to MY.JobFlavour in the file that gets created, but that's ok and means the same thing.
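
i.e., in the generated submit file you should see a line along the lines of:

MY.JobFlavour = "testmatch"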