Batch workflow

The workflow to submit batch processing to the Condor system is as follows:

  1. Create a directory in which to submit jobs to the Condor system.

    Make sure that the directory and files with which you plan to work are readable and writable by other users, which include Condor processes.

  2. Choose an execution environment, called a universe, for your jobs.

    At HMDC, you always use the vanilla universe. This execution environment supports processing of individual serial jobs, but has few other restrictions on the types of jobs that you can execute.

  3. Make your jobs batch ready.

    Batch processing runs in the background, meaning that you cannot input to your executable interactively. You must create a program or script that reads in your inputs from a file, and writes out your outputs to another file.

    You also must identify the full path and executable source to use for your Condor cluster. The default executable for the condor_submit_util script is the R language. In the RCE, the path and executable source for this language is /usr/bin/R.

  4. If you choose to use the condor_submit_util script to create the submit description file (or submit file) and submit your jobs to the Condor system for batch processing automatically, skip to step the next step.

    If you choose to submit your batch processing to the Condor system manually, create a submit file.

    A submit file is a plain-text file that describes a batch of jobs for the Condor software. This file contains the following descriptors:

    • Environment (vanilla)

    • Executable program path and file name

    • Program arguments

    • Input and output file names

    • Log and error file names

  5. Execute the condor_submit_util command to write the submit file and submit your program automatically to the Condor job queue.

    If you chose to write your own submit file, execute the condor_submit <submit file>.submit command to submit your jobs to the queue.

    Condor then checks the submit file for errors, creates a ClassAd object and the object attributes for that cluster, and then places this object in the queue for processing.