Once you have submitted your jobs to the queue, you can check on them in several ways: e-mail notification when a job completes, and command-line access to both your jobs' status and the current state of the cluster.
Managing Job Status
You can monitor the progress of your batch processing using the condor_q command. This section describes how to check the status of your processes at any time and how to remove a process from the Condor queue.
After you submit a cluster for processing, you can check the status of the Condor pool and verify that machines are available to run your jobs.
To check the status of the Condor pool, type the command condor_status. This command returns information about the pool resources. The output lists the virtual machines (VMs) in the pool and whether they are in use. If there are no idle VMs, your jobs wait in the queue after they are submitted. An example is shown here:
Name                            OpSys  Arch    State      Activity  LoadAv  Mem   ActvtyTime
email@example.com               LINUX  X86_64  Claimed    Busy      1.060   1975  0+17:43:50
firstname.lastname@example.org  LINUX  X86_64  Claimed    Busy      1.060   1975  0+17:43:48
email@example.com               LINUX  X86_64  Claimed    Busy      1.000   1975  0+17:44:43
firstname.lastname@example.org  LINUX  X86_64  Claimed    Busy      1.000   1975  0+17:44:36
email@example.com               LINUX  X86_64  Unclaimed  Idle      0.010   1975  0+00:03:57
firstname.lastname@example.org  LINUX  X86_64  Unclaimed  Idle      0.000   1975  0+00:00:04
email@example.com               LINUX  X86_64  Unclaimed  Idle      0.000   1975  0+00:00:04

              Total  Owner  Claimed  Unclaimed  Matched  Preempting  Backfill
X86_64/LINUX      7      0        4          3        0           0         0
       Total      7      0        4          3        0           0         0
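If you want to check for free VMs from a script, the listing above can be tallied by its State column. The following is a minimal Python sketch, not part of Condor itself: the function name count_states is hypothetical, and it assumes the eight-column layout shown above, where each VM row ends with an activity time such as 0+17:43:50.

```python
import re
from collections import Counter

# Three rows copied from the example listing, plus a matching totals block.
SAMPLE = """\
Name                            OpSys  Arch    State      Activity  LoadAv  Mem   ActvtyTime
email@example.com               LINUX  X86_64  Claimed    Busy      1.060   1975  0+17:43:50
firstname.lastname@example.org  LINUX  X86_64  Claimed    Busy      1.060   1975  0+17:43:48
email@example.com               LINUX  X86_64  Unclaimed  Idle      0.010   1975  0+00:03:57

              Total  Owner  Claimed  Unclaimed  Matched  Preempting  Backfill
X86_64/LINUX      3      0        2          1        0           0         0
"""

# Assumption: a VM row has eight fields and its last field is an activity time.
ACTIVITY_TIME = re.compile(r"^\d+\+\d{2}:\d{2}:\d{2}$")


def count_states(status_text):
    """Count VM rows per State value in condor_status-style text."""
    counts = Counter()
    for line in status_text.splitlines():
        fields = line.split()
        # Header and totals lines fail the activity-time check and are skipped.
        if len(fields) == 8 and ACTIVITY_TIME.match(fields[7]):
            counts[fields[3]] += 1
    return counts


counts = count_states(SAMPLE)
print("Unclaimed VMs:", counts["Unclaimed"])
```

To run the same tally against a live pool, you could pass in the text captured from condor_status, for example with subprocess.run(["condor_status"], capture_output=True, text=True).stdout.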
To check the cumulative use of resources within the Condor pool, include the option -submitter with the command condor_status. This command returns information about each user in the Condor pool. The output lists each user's name, the machine in use, and the current number of jobs per machine. Use this command to help determine how many resources Condor has available to run your jobs. An example is shown here:
Name                            Machine     Running  IdleJobs  HeldJobs
firstname.lastname@example.org  w4.hmdc.ha        2         0         0
email@example.com               x1.hmdc.ha        9         0         0
firstname.lastname@example.org  x3.hmdc.ha       40         0         0
email@example.com.              x5.hmdc.ha       32         0         0

                                RunningJobs  IdleJobs  HeldJobs
firstname.lastname@example.org           49         0         0
email@example.com.                       32         0         0
firstname.lastname@example.org            2         0         0
                         Total           83         0         0
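The per-user totals in the second block can also be recomputed from the per-machine rows, which is handy if you only want the figures for one user. This is a rough Python sketch under stated assumptions: the function name running_by_submitter and the user names alice and bob are placeholders for illustration, and the parser assumes the five-column per-machine layout shown above.

```python
from collections import defaultdict

# Placeholder users; machine names follow the example listing.
SAMPLE = """\
Name               Machine     Running  IdleJobs  HeldJobs
alice@example.org  w4.hmdc.ha        2         0         0
bob@example.org    x1.hmdc.ha        9         0         0
bob@example.org    x3.hmdc.ha       40         0         0

                   RunningJobs  IdleJobs  HeldJobs
alice@example.org            2         0         0
bob@example.org             49         0         0
            Total           51         0         0
"""


def running_by_submitter(status_text):
    """Sum the Running column per user across the per-machine rows."""
    totals = defaultdict(int)
    for line in status_text.splitlines():
        fields = line.split()
        # Per-machine rows have five fields: name, machine, and three counts.
        # The summary block has only four fields per row, so it is skipped.
        if len(fields) == 5 and all(f.isdigit() for f in fields[2:]):
            totals[fields[0]] += int(fields[2])
    return dict(totals)


for user, running in running_by_submitter(SAMPLE).items():
    print(user, running)
```

The sums it produces should agree with the RunningJobs column that condor_status -submitter prints in its summary block.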
Removing your job
To remove a process from the queue, type the command condor_rm <cluster ID>.<process ID>. For example:
> condor_rm 9.9
Job 9.9 marked for removal
To find a list of your jobs, type condor_q.
To remove all jobs affiliated with a cluster, type the command condor_rm <cluster ID>. For example, the command condor_rm 4 removes all jobs assigned to cluster 4.
To remove all of your clusters' jobs from the Condor queue, type condor_rm -a. For example:
> condor_rm -a
All jobs marked for removal.
Jobs must be removed from the host from which they were submitted.
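If you remove jobs from a script, the two ID forms described above (a single job as <cluster ID>.<process ID>, or a whole cluster as <cluster ID>) can be built by a small helper. This is only an illustrative Python sketch; the function name condor_rm_args is hypothetical, and it builds the command without running it.

```python
def condor_rm_args(cluster_id, process_id=None):
    """Return the argument list that removes one job or a whole cluster."""
    if process_id is None:
        # No process ID: target every job in the cluster.
        target = str(cluster_id)
    else:
        # Single job: <cluster ID>.<process ID>
        target = f"{cluster_id}.{process_id}"
    return ["condor_rm", target]


print(condor_rm_args(9, 9))
print(condor_rm_args(4))
```

The returned list can be passed to subprocess.run(args, check=True) on the host where the jobs were submitted.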