Outages and Maintenance
-
The maintenance work was completed successfully and Halstead has been returned to normal operations as of Wednesday, December 14, 2016 at 10:00am. Original message: The Halstead cluster will be unavailable beginning Wednesday, December 14th, 2016 a...
-
UPDATE: As of 7:50 pm, Wednesday, 14 December 2016, this issue is completely resolved. UPDATE: As of about 6:00 pm, another problem has been found in the EXRC scheduler code. We will update this news item once we have more details. Original item: The EXR...
-
Unscheduled Outage for EXRC Cluster
Following the restoration of power to the affected building, the EXRC cluster was returned to service on Thursday, December 22nd, 2016 at 2:45pm EST. Original article: As of Tuesday, December 20th, 2016 at 12:00pm EST, EXRC is unavailable due to...
-
The Halstead cluster is back online as of 4:50 PM after scheduled early-access maintenance. Unfortunately, queued jobs were lost due to complications during maintenance. If you had any jobs queued and waiting before maintenance started, you will need...
-
The Halstead cluster will be unavailable beginning Wednesday, January 4th, 2017 at 10:00am EST, for scheduled early-access maintenance (see Halstead Cluster Early Access Policies). The cluster will return to full production by Wednesday, January 4...
-
The Scholar cluster will be unavailable beginning Thursday, January 5th, 2017 at 8:00am EST, for scheduled maintenance. The cluster will return to full production by Thursday, January 5th, 2017 at 5:00pm EST. This work is being done during the se...
-
The maintenance for the Carter cluster was cancelled and will be rescheduled at a later date. The cluster has remained in service. Original notice: The Carter cluster will be unavailable beginning Tuesday, January 10th, 2017 at 8:00am EST, for emergen...
-
The maintenance work was completed successfully and Halstead has been returned to normal operations as of Wednesday, January 11, 2017 at 12:00pm. Original message: The Halstead cluster will be unavailable beginning Wednesday, January 11th, 2017 at...
-
Emergency maintenance for GitHub
Patching has been completed and the github.rcac.purdue.edu service is back in full production mode. Original message: Tonight, Thursday, January 12, 2017, from 9:00pm to 10:00pm EST, github.rcac.purdue.edu will be taken down for brief emergency maintenance. G...
-
Conte is back in production, and jobs have started running. Thank you for your patience. Previous update: Because of additional work required to fix a configuration problem, this maintenance is running past the scheduled end time. We are extending the outage...
-
Connectivity issues to Research Data Depot
System monitoring has revealed intermittent issues connecting to the Research Data Depot on Thursday, January 19. When this issue occurs, users will experience it as pauses when working in a UNIX shell on community cluster systems, or as interrupted or drop...
-
Unscheduled scratch outage on Conte
The scratch filesystem serving Conte is currently unavailable. Both currently running jobs and attempts to access files in scratch will block until the filesystem is back online. Job scheduling on Conte has been paused while storage engineers addres...
-
Emergency Security Patching of RCAC Clusters
Due to a recent security vulnerability, the Carter, Halstead, Hammer, Radon, Rice, Scholar, and Snyder clusters will have their operating system upgraded to a newer version between February 2, 2017 at 5:00pm and March 2, 2017 at 5:00pm EST. Unlike other cl...
-
Halstead MPI problem, scheduling paused
Following the security updates on Halstead, an issue was discovered that prevented multi-node MPI jobs from running properly. Scheduling on Halstead has been stopped, and systems engineers are working on fixing the issue. We will provide further stat...
-
Unscheduled scratch outage on Rice, Snyder, and Hammer
The scratch filesystem serving Hammer, Rice, and Snyder is currently unavailable. Both currently running jobs and attempts to access files in scratch will block until the filesystem is back online. Job scheduling on Hammer, Rice, and Snyder has been...
-
Conte and Hathi Cluster Maintenance
The Conte and Hathi clusters have been updated and returned to full production. Original message: This is a gentle reminder that the Conte and Hathi clusters will be undergoing scheduled maintenance beginning Tuesday, February 21st, 2017 at 8:00am EST. Please sa...
-
The Research Data Depot has been restored to service. Original message: A portion of the systems serving the Research Data Depot has suffered a failure. Some systems using Depot have been affected, particularly research clusters and users accessing the Depot over NFS...
-
Partial scratch outages on Rice, Snyder, Carter, Scholar and Hammer
The scratch filesystems serving Carter, Hammer, Rice, Scholar, and Snyder started behaving abnormally this morning. This may have affected some jobs, and anyone using one of the login nodes for these clusters may have had sessions freeze or seen dela...
-
The Fortress archival storage system is currently experiencing intermittent connectivity. We expect the situation to be resolved by approximately 1pm. UPDATE: Storage engineers have resolved the connectivity problems and Fortress is back in full prod...
-
UPDATE: The maintenance has been completed and Thinlinc is back in service. Original message: The Thinlinc cluster will be unavailable from 5pm on March 14th until midnight for necessary maintenance and upgrades. During this time, the remote desktop servic...