R supports two primary ways of accessing compressed data, so you can keep your data files compressed on disk. This saves space and often time, since the file I/O avoided by compression is often more expensive than the CPU cycles that compression uses.
If you are storing your data in native format, simply use the compress option of save:
tst.df = as.data.frame(cbind(1:10, 2:11))  # just some testing data
save(tst.df, file = "test.Rbin", compress = TRUE)  # save a compressed R file
You can use load as normal to read the compressed file, for example load("test.Rbin").
To access any other kind of file with compression, simply use gzfile("") around the file name:
write.table(tst.df, gzfile("test.dat.gz"))  # write a compressed file
read.table(gzfile("test.dat.gz"), row.names = 1)  # read it back in
Files compressed using the gzfile method can also be compressed and uncompressed using the UNIX gzip and gunzip commands (respectively).
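As a quick check of that interoperability, the shell sketch below round-trips a small file through gzip and gunzip. The file name sample.dat and its contents are made up for illustration; a file written from R through gzfile should behave the same way.

```shell
# Work in a throwaway directory so no real data is touched.
workdir=$(mktemp -d)

# sample.dat is a hypothetical stand-in for a data file.
printf '"V1" "V2"\n"1" 1 2\n"2" 2 3\n' > "$workdir/sample.dat"
cp "$workdir/sample.dat" "$workdir/original.dat"

gzip "$workdir/sample.dat"        # replaces sample.dat with sample.dat.gz
gzip -t "$workdir/sample.dat.gz"  # check the integrity of the compressed file
gunzip "$workdir/sample.dat.gz"   # restores sample.dat

# cmp exits with status 0 when the two files are identical.
cmp "$workdir/sample.dat" "$workdir/original.dat" && echo "round trip OK"
```

Running gzip -t before decompressing is optional, but it is a cheap way to catch a truncated or damaged file.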
When running a Stata .do file on the batch cluster via condor_submit_util, you have to add some additional arguments in order to get your job to stop when Stata encounters an unrecoverable error (which is probably the behavior you want).
The command line below runs Stata with the example file my_dofile.do:
To configure your user account such that every time you connect to the RCE, some action is performed:
Write a script that performs the desired action. The scripting languages available in the RCE include Bash (/bin/bash), multiple versions of Python (/usr/bin/python), and Perl (/usr/bin/perl).
Copy this script to the directory ~/.rce/startup with the command cp [scriptname] ~/.rce/startup/. If the directory does not exist, create it with the command mkdir -p ~/.rce/startup.
Make sure the permissions on your script permit execution; to ensure that this is the case, run the command chmod +x ~/.rce/startup/[scriptname].
Your script is run every time you connect to the RCE.
Note: Be sure to test your script; a misbehaving script can prevent you from being able to connect to the RCE! In particular, your script must not require any keyboard input or other interaction with the user; it will not be able to communicate with you while it is running, and you will not be able to connect to the RCE while your script sits waiting for input.
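The steps above can be sketched in the shell as follows. This is only an illustration: the script name log_login.sh and the log file rce_login.log are hypothetical, and the script is written directly into the startup directory rather than copied there. The last command runs the script by hand with no keyboard input available, which is one simple way to check that it does not wait for the user.

```shell
# Create the startup directory if it does not exist yet.
mkdir -p ~/.rce/startup

# Write a small, non-interactive script. The name log_login.sh and the
# log file path are hypothetical, chosen only for this example.
cat > ~/.rce/startup/log_login.sh <<'EOF'
#!/bin/bash
# Append a timestamp to a log file; never prompt for input.
date >> "$HOME/rce_login.log"
EOF

# Make the script executable.
chmod +x ~/.rce/startup/log_login.sh

# Test it by hand, with stdin redirected from /dev/null, before relying on it.
~/.rce/startup/log_login.sh < /dev/null && echo "startup script ran cleanly"
```

If the manual test hangs or prints an error, fix the script before your next connection.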
Login nodes: These servers provide a graphical or command-line interface to the meat and potatoes of the RCE, the Interactive and Batch clusters. They operate like a personal desktop environment, but are not meant for running jobs.
Interactive cluster: Run memory-intensive jobs that require user interaction on these COD (compute-on-demand) nodes.
Home directories are allocated 500 MB. We do not increase this space, but you may request a project directory.
Project directories are available in storage sizes suited to each particular researcher. Please contact us to request storage space. Fees may be applicable to sizes over a certain amount. To read about using your storage space, please see:
1 TB of shared scratch space is available to all users on our interactive cluster. Top level scratch space is world-writeable and -readable (Unix 1777 permissions). User created directories are only owner writeable/readable (1700) or owner/group (2770) if you are a member of a research group. Do not use the scratch space for permanent storage.
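To make the permission modes mentioned above concrete, here is a small sketch that applies them to two directories. A temporary directory stands in for the scratch area, since the actual scratch path is not given here, and the directory names are invented for the example.

```shell
# A temporary directory stands in for the scratch area; the real
# scratch path is not given here, so none is assumed.
demo=$(mktemp -d)

mkdir "$demo/private" "$demo/group_shared"

# 1700: read/write/search for the owner only, plus the sticky bit.
chmod 1700 "$demo/private"

# 2770: read/write/search for owner and group, plus setgid so that
# new files inherit the directory's group.
chmod 2770 "$demo/group_shared"

# Show the resulting mode strings.
ls -ld "$demo/private" "$demo/group_shared"
```

On a typical Linux system the listing should show a trailing T in the mode string for the sticky bit on the first directory and an s in the group position for setgid on the second.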
We have retired the login.hmdc.harvard.edu name in favor of the other name for the Kennedy RCE login nodes, kennedy.fas.harvard.edu. If you are a KSG user, you can log in to kennedy.fas.harvard.edu. These nodes are exclusively for KSG students, staff, and faculty.
The RCE is a service that is remotely accessible, and many users work with sensitive data. Since we cannot guarantee the security of the physical environment around the user's computer, we are required to follow the Harvard information security policy for application availability which indicates that the screen lock timeout should "only be a few minutes."
If you are running an RCE desktop session from a Windows client, you are probably used to using ctrl+c for copying and ctrl+v for pasting. Terminals remap these functions to shift+ctrl+c and shift+ctrl+v, respectively, because the ctrl key on its own is used for specific operations, such as terminating command-line operations (ctrl+c). You can also right-click in the terminal to copy and paste.
The default window behavior in the RCE uses the focus-follows-mouse model, in which the user selects windows by merely pointing at them, and the selected window jumps to the foreground. To change this behavior go to:
Applications → RCE Utilities → Change Windows Focus Behavior
The simplest way to share files with collaborators in the RCE is to use a project space, which is a folder you all have access to. Send us a support request asking for a shared project space and we will create one for you. (Storage sizes over a certain amount are subject to a nominal fee.) Be sure to include the names of your collaborators.
Once you have a project space it will be linked from your RCE home directory under ~/shared_space/ by the name of the project. You may request more than one project space, but limitations on storage allocation apply.
Grant your collaborators access to the files you create in the project space
There are two ways to allow your collaborators access to the files you create in a project space. First and simplest -- do nothing. We run an automatic process each night that will change the permissions on your project files so that any permissions you have will also be granted to your project group.
If, however, you want to grant your collaborators immediate access to project files you have created, run the fixGroupPerm.sh command, as described in Projects & Shared Space.
Change your default file creation mode (optional)
If most or all of your work in the RCE is done in collaborative project spaces, you may want to change the default file creation mode (i.e. the file access permissions) for your RCE account so that all files you create can be modified by members of the group that owns them. For instructions, see Projects & Shared Space.
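On Unix systems, the default file creation mode is governed by the shell's umask. The sketch below only illustrates the effect of a group-friendly umask (002) in the current shell. It is an assumption that this is the kind of change meant here; the Projects & Shared Space instructions remain the authoritative procedure for RCE accounts. The file names are invented for the example.

```shell
# Work in a throwaway directory.
demo=$(mktemp -d)

# With a umask of 022, new files are not group-writable.
umask 022
touch "$demo/default.txt"

# With a umask of 002, new files are created group-writable.
umask 002
touch "$demo/shared.txt"

# Compare the two mode strings.
ls -l "$demo"
```

A umask set this way lasts only for the current shell session; making it permanent usually means putting the umask line in a shell startup file.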