News

RCE outage

July 8, 2014

We experienced an unscheduled network connectivity outage today from 11AM-12PM due to a problem with the HUIT router. HUIT has resolved the problem and all HMDC services have been restored.

Unscheduled RCE outage: Mon June 16 10:47am

June 16, 2014

While troubleshooting an issue with an independent RCE node, a configuration change caused NX sessions to be interrupted. If you are reconnecting to your lost NX session, you should now be able to resume the previous session. Please contact support@help.hmdc.harvard.edu if you had cluster jobs running and are unable to resume your NX session. SSH logins should not have been affected.

June 8-9 Maintenance Complete

June 9, 2014

We have completed our maintenance and all services are online and available.  Here are some highlights of what we accomplished during the maintenance:

*Removed old RCE5 compute node equipment and migrated new RCE6 equipment into our cluster rack.

*Migrated all RCE project and home volumes from our old back-end storage to our new cluster storage hardware. The new cluster storage will allow us to scale into the petabytes.

Unplanned DNS Outage

May 21, 2014

There was an unplanned outage today (Wed May 21) starting at 5pm and lasting until approximately 5:15pm. It was caused by a DNS misconfiguration, which has now been resolved. We apologize for any confusion or inconvenience this caused. If you have any questions or concerns, please contact us at support@help.hmdc.harvard.edu. Thank you.

NX4 Beta released

May 15, 2014

HMDC announces the BETA release of NoMachine NX4 powered virtual desktops for the Research Computing Environment (RCE). The NoMachine NX4 BETA infrastructure provides both a web interface and a full-featured desktop client with printer and storage sharing. NoMachine NX4 allows you to run interactive and batch jobs on the RCE cluster from anywhere -- your tablet, desktop or web browser -- and to copy files directly from your desktop or USB storage into the RCE with ease.

Update to StatTransfer v12 and minor fixes.

April 9, 2014

At 12:30 today we updated to version 12 of StatTransfer. You can use StatTransfer locally by going to the Mathematics menu, or you can run it on an RCE Powered node. There is also a command-line version; to use it, simply type `st` at a command prompt.

We also made minor fixes to a few applications that spawned terminals when they started; the fix keeps the unnecessary terminal from starting. The batch watch utility was also fixed. If you choose, email is now sent upon batch job completion.

Critical patch to openssl installed

April 8, 2014

Due to a vulnerability in the openssl package installed in the RCE, we pushed out a workaround this morning at 09:45. No users should have been affected by this change. If you are interested in the openssl vulnerability, you can read more about it here: CVE-2014-0160.

Retired hostname: login.hmdc.harvard.edu

March 18, 2014

We have retired the RCE server login.hmdc.harvard.edu. This has been replaced by the server kennedy.fas.harvard.edu.

All KSG RCE researchers can point their NX clients to kennedy.fas.harvard.edu to connect to the RCE.

Introducing the New RCE

February 28, 2014

The Harvard-MIT Data Center at IQSS is proud to announce the next major version of the Research Computing Environment.

The RCE offers a centralized place in which to store data, and collaborate on social science research analysis. The newest version is a major overhaul of the environment, making new software and tools available to our users. Known as RCE6, it is based on Enterprise Linux 6, and built on a new Puppet infrastructure to allow for flexible configuration and rapid feature updates.

RCE Powered job failures

February 27, 2014

During our scheduled LDAP outage this morning (2/27) we discovered that some RCE Powered jobs were killed due to a failure to look up user identities. Our Condor resource manager was configured to continue jobs even on user lookup failure, but there appears to be a bug in the system, as it did fail jobs when it was unable to look up user information.