Batch workflow

The workflow for submitting batch jobs to the Condor system is as follows:

  1. Create a directory in which to submit jobs to the Condor system.

    Make sure that the directory and the files you plan to work with are readable and writable by other users, including the Condor processes.

    For example, type the following:

    mkdir condor
    cd condor
    

    You can request that a project directory be set up for you to use for batch processing. If you perform your batch processing within your home directory, your data and program files can consume much of your allotted space, which can cause problems when logging in to the system. Working in a project space is therefore recommended. For more information on project spaces, go to our Projects and Shared Space page.
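
As a minimal sketch of the directory setup in this step, the commands below create the working directory and open its permissions. The WORKDIR variable and the choice of a+rwx are assumptions for illustration; a narrower group permission may be enough at your site, so check with your administrators before making a directory world-writable.

```shell
# Hypothetical location; substitute your project space if you have one.
workdir="${WORKDIR:-$PWD/condor}"

# Create the directory (no error if it already exists).
mkdir -p "$workdir"

# Assumption: "other users" means all users, so grant read, write,
# and directory traversal to everyone.
chmod a+rwx "$workdir"

# Show the resulting permissions.
ls -ld "$workdir"
```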

  2. Choose an execution environment, called a universe, for your jobs.

    At HMDC, you always use the vanilla universe. This execution environment supports processing of individual serial jobs, but has few other restrictions on the types of jobs that you can execute.

  3. Make your jobs batch ready.

    Batch processing runs in the background, meaning that you cannot provide input to your executable interactively. You must create a program or script that reads its inputs from a file and writes its outputs to another file.
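
A minimal sketch of a batch-ready program, assuming a hypothetical script named sum.sh that takes an input file and an output file as arguments and never prompts the user. The file names and the scratch directory are illustrative only.

```shell
# Work in a scratch directory so the sketch does not touch real data.
demo_dir="$(mktemp -d)"

# Hypothetical input file: one number per line.
printf '1\n2\n3\n' > "$demo_dir/input.txt"

# Write the batch-ready script. The quoted 'EOF' keeps $1, $2, and
# $total from being expanded while the script is written out.
cat > "$demo_dir/sum.sh" <<'EOF'
#!/bin/sh
# Usage: sum.sh INPUT_FILE OUTPUT_FILE
total=0
while read -r n; do
    total=$((total + n))
done < "$1"
echo "total=$total" > "$2"
EOF

# Run it the way a batch system would: no terminal input, files only.
sh "$demo_dir/sum.sh" "$demo_dir/input.txt" "$demo_dir/output.txt"
cat "$demo_dir/output.txt"
```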

    You must also identify the full path of the executable to use for your Condor cluster. The default executable for the condor_submit_util script is the R language. In the RCE, the full path of the R executable is /usr/bin/R. Any command-line application or program can be submitted as a batch job (Matlab, Stata, Python, etc.).
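
One way to find the full path of an executable is the shell's command -v lookup. The helper name full_path below is hypothetical, and the result depends on what is installed and on your PATH.

```shell
# Hypothetical helper: print the full path of a program found on PATH.
full_path() {
    command -v "$1"
}

# sh is looked up here only because it exists on virtually every system;
# on the RCE you would look up R, Stata, Python, and so on instead.
exe="$(full_path sh)"
echo "Executable = $exe"
```

The printed line has the same form as the Executable entry of a submit file.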

  4. If you choose to use the condor_submit_util script to create the submit description file (or submit file) and submit your jobs to the Condor system for batch processing automatically, skip to the next step.

    If you choose to submit your batch processing to the Condor system manually, create a submit file.

    A submit file is a plain-text file that describes a batch of jobs for the Condor software. This file contains the following descriptors:

    • Environment (vanilla)

    • Executable program path and file name

    • Program arguments

    • Input and output file names

    • Log and error file names

    Here is an example of a basic submit file:

    Universe        = vanilla
    Executable      = /usr/bin/R
    Arguments       = --no-save --no-restore
    should_transfer_files = NO
    Requirements = Memory >= 32
    output  = $HOME/mybatchjob/output.txt
    error   = $HOME/mybatchjob/error.txt
    Log     = $HOME/mybatchjob/log.txt
    Queue   1
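
The basic submit file above does not set an input file and queues a single job. As a hedged sketch of a variant, the commands below write a submit file that feeds a hypothetical R script (analysis.R) to R on standard input and queues three jobs, using Condor's $(Process) macro to give each job its own output and error files. The job directory and all file names are assumptions.

```shell
# Hypothetical job directory; a scratch directory is used by default.
jobdir="${JOBDIR:-$(mktemp -d)}"
submit_file="$jobdir/myjob.submit"

# Unquoted EOF: $jobdir is expanded by the shell now.
# \$(Process) is escaped so that Condor, not the shell, expands it.
cat > "$submit_file" <<EOF
Universe     = vanilla
Executable   = /usr/bin/R
Arguments    = --no-save --no-restore
should_transfer_files = NO
Requirements = Memory >= 32
Input        = $jobdir/analysis.R
Output       = $jobdir/output.\$(Process).txt
Error        = $jobdir/error.\$(Process).txt
Log          = $jobdir/log.txt
Queue 3
EOF

# Inspect the generated file before submitting it.
cat "$submit_file"
```

With Queue 3, the process numbers are 0, 1, and 2, so the jobs write to output.0.txt, output.1.txt, and output.2.txt.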
    
  5. Execute the condor_submit_util command to write the submit file and submit your program automatically to the Condor job queue.

    If you chose to write your own submit file, execute the condor_submit <submit file>.submit command to submit your jobs to the queue.

    Condor then checks the submit file for errors, creates a ClassAd object and the object attributes for that cluster, and then places this object in the queue for processing.
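
A sketch of the manual submission path, assuming a submit file named myjob.submit (a hypothetical name). It checks that condor_submit is available, submits the file, and then lists the queue with condor_q.

```shell
# Hypothetical submit file name; override with SUBMIT_FILE if needed.
submit_file="${SUBMIT_FILE:-myjob.submit}"

if ! command -v condor_submit >/dev/null 2>&1; then
    # Condor tools are not installed on this machine.
    status="skipped"
elif condor_submit "$submit_file"; then
    # Submission accepted: show the jobs now in the queue.
    status="submitted"
    condor_q
else
    # condor_submit reported an error, for example in the submit file.
    status="failed"
fi

echo "submission status: $status"
```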