If you would like to use multiple CPU cores simultaneously, be sure to create the appropriate resource reservations so that your jobs do not compete with other jobs for a CPU and are allocated cores for your exclusive use.
Many applications and programming libraries can make use of multiple CPU cores simultaneously by running multiple active threads or processes.
We do not currently place any technological limitations on CPU core usage in cluster computing, and instead ask that you observe the rule of "one CPU core per job instance" unless specifically reserving additional cores for your jobs. In both batch and interactive cluster computing, available cluster resources may be lower during periods of intense utilization.
Batch cluster jobs are allocated one CPU core (and 4GB memory) per instance, although you can queue an unlimited number of job instances for high-throughput parallel computing.
For RCE Powered (interactive) jobs, you can request an allocation of up to 24 CPU cores (and up to 250GB RAM), but each job submission queues only one instance.
Stata/MP will reserve 8 cores per job by default.
To request a different number of CPU cores, use: condorInteractiveSubmit.pl -c num_cpu -x command
Some applications and libraries require additional options to set the number of CPU cores used.
To set the number of CPU cores used by the R Goto linear algebra library: Sys.setenv(GOTO_NUM_THREADS=1)
To run Matlab on a single CPU core: matlab -singleCompThread
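When launching R from a shell wrapper (for example, the script you point Condor at), the same Goto setting can be applied as an environment variable instead of from within R. This is a sketch; the echo line is only there to show the value any R process started from this shell would inherit:

```shell
# Limit the Goto linear algebra library to one thread for any
# R process started from this shell session
export GOTO_NUM_THREADS=1
echo "GOTO_NUM_THREADS=$GOTO_NUM_THREADS"
```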
To track iteration number for batch submissions, use one of the following:
Add --args '$(Process)' to the Arguments line of your Condor submit file. This passes the process number of the run to your R process; the number progresses from 0 to one less than the number of runs.
Capture the argument in a variable in your R code by entering the following line: run <- commandArgs(TRUE). The R object run then contains the run number (as a character string). You can use this object to construct appropriate output file names for your job.
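Put together, the relevant lines of a Condor submit file might look like the following sketch; the executable name, output and error file names, and instance count are placeholders, not values from this guide:

```
Executable = myscript.sh
Arguments  = --args '$(Process)'
Output     = myscript.$(Process).out
Error      = myscript.$(Process).err
Queue 10
```

With Queue 10, Condor runs ten instances whose $(Process) values are 0 through 9, so each instance receives a distinct run number and writes to a distinct file.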
When you submit a batch job, the R script is copied to a staging area and then executed by a cluster node. This means that you must set paths explicitly in your R scripts. To do so, add a line such as the following to the beginning of all your R scripts; it finds the absolute path to your home directory and sets the working directory to that path: setwd(Sys.getenv("HOME"))
Use this code to address such problems as the following:
Loading required package: MASS
Error in file(file, "r") : unable to open connection
In addition: Warning message:
cannot open file '<filename>', reason 'No such file or directory'
Execution halted
Note: If you use a subdirectory, include the path to the subdirectory in the setwd command referenced previously.
The status of the cluster can be viewed interactively as a graph showing current cluster usage and cluster usage over time. These graphs show both the number of jobs on the cluster, and the owners of those jobs.
If you look at the "Pool Resource (Machine) Statistics" for the past day, you will see the usage, over time, of the Interactive Cluster nodes. If you look at "Pool User (Job) Statistics" for the past hour you can see who currently has jobs running on the Interactive Cluster ("User" column) and how many nodes they are using ("JobsRunning Average" column).
Alternatively, you can view the cluster status from the command line in the RCE.
To view the status of the Batch Cluster, run: condor_status -pool batch-head.priv.hmdc.harvard.edu
To view the status of the Interactive Cluster (where "RCE Powered" jobs are run), run: condor_status -pool cod-head.priv.hmdc.harvard.edu
To better understand the output of the condor_status command, refer to the Condor Documentation. For detailed information on how to check the status of jobs in the cluster, please refer to the Batch Processing guide.
In order to run a Matlab job on the RCE Batch Cluster, you must set a number of environment variables and run Matlab in command-line mode. HMDC has developed a simple program, submitMatlabBatch.sh, to automate this process.
To run your Matlab code as a batch job, open a terminal window in the RCE and type: submitMatlabBatch.sh myfile.m
Supply the name of your Matlab code file in place of myfile.m, and your code will run one iteration as a batch job.
Standard output from your job will be captured in the file condor_submit_util/myfile.m.condor.out.
Standard error messages will be captured in the file condor_submit_util/myfile.m.condor.err.
Make your public key an authorized key: cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
The public key file on the RCE can be deleted: rm ~/id_rsa.pub
Exit the RCE and log in via SSH or PuTTY; you will no longer need to type your password. You will still need to type your passphrase once to unlock your key, but after that an SSH agent can retain your credentials and authorize multiple SSH connections. You can think of an SSH agent as being similar to a "password manager" in a web browser. Examples of SSH agents include Pageant (PuTTY/Windows), Keychain (Mac), and ssh-agent (Linux and Mac).
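On Linux or Mac, the agent workflow can be sketched as follows. The hostname below is a placeholder for your actual RCE login host, and the key path assumes the default RSA key location:

```shell
# Start an ssh-agent for this shell session
eval "$(ssh-agent -s)"

# Load your private key; you type the passphrase once here
ssh-add ~/.ssh/id_rsa

# Subsequent connections reuse the unlocked key without prompting
# (rce.example.edu is a placeholder, not the real host)
ssh yourusername@rce.example.edu
```

On Windows, Pageant plays the same role: load your key into Pageant once, and PuTTY sessions authenticate without further prompts.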