Outages and Maintenance
-
Nearly all major clusters operated by ITaP Research Computing are stopped due to issues with their storage systems relating to the power loss on the West Lafayette campus in the wake of the severe weather Sunday night. This includes: Conte, Carter,...
-
The Fortress HPSS Archive is offline due to issues with its storage systems relating to the power loss on the West Lafayette campus in the wake of the severe weather Sunday night. Engineers are investigating the problem now, but until this is reso...
-
All ITaP Research Computing systems are currently experiencing an issue with accessing network filesystems. A case has been opened with our vendor as ITaP engineers troubleshoot the issue. Cluster users may experience issues accessing files in /home,...
-
Maintenance completed on LustreD filesystem
UPDATE 6:00 pm 14 Dec 2013 As of 5:45 pm we believe this problem has been corrected and Conte has returned to normal operation. The LustreD filesystem, serving the Conte cluster, is experiencing some issues as of about 4:30 pm Saturday 14 Dec 2013. S...
-
LustreD filesystem unavailable
Update - 2:25pm, 12/16/2013 The LustreD scratch filesystem has been returned to service and both the filesystem and scheduler appear to be working properly. Conte has been returned to normal production service as of 2:20pm. Update - 10:30am, 12/16/2...
-
Hansen and WinHPC clusters at reduced capacity
On December 21, 2013, the Hansen and WinHPC clusters will operate at reduced capacity while datacenter power maintenance is performed on a portion of the system. In the days leading up to December 21st, this will appear as potentially increased queue...
-
The Hansen, Coates, and Rossmann clusters will be unavailable beginning at 8:00am on Tuesday, January 7, 2014, for scheduled maintenance. The clusters will return to full production by 5:00pm, Wednesday, January 8. During this time, these systems wil...
-
The Lustre D filesystem, serving the Conte cluster, has been unavailable since about 8:00 pm Thursday 13 Feb 2014. System engineers are working to bring the system back to 100% operation. Currently running jobs should be able to continue, but sch...
-
UPDATE - As of 7:45pm Sunday, March 16th, 2014, the fileserver maintenance has completed successfully, and cluster systems are back online. All Research Computing systems will be unavailable from 8:00am Saturday, 3/15/2014 through Sunday, 3/16/2014...
-
During the maintenance scheduled for 3/15/2014-3/16/2014, the Rossmann cluster will be upgraded to Red Hat Enterprise Linux, version 6. Only those PBS jobs with walltimes short enough that they will finish prior to the beginning of this maintenance...
-
Fortress HPSS Archive Maintenance
The IBM T3584 tape library serving Fortress is scheduled to be down Wednesday, March 19, 2014 from 8AM to 5PM for a hardware upgrade. Additional tape capacity and tape drives will be added to support increased demand. HPSS will remain accessible, ne...
-
In order to repair a hardware issue with the underlying disk storage comprising LustreC, ITaP storage engineers will execute a brief maintenance on the filesystem on Monday morning, April 7, 2014. This issue is currently impacting the filesystem's re...
-
Scheduling Paused on Hansen and Carter
The scratch filesystem on Hansen and Carter is currently unavailable due to a hardware issue. Attempts to access scratch will block until the filesystem is back online. Job scheduling on Hansen and Carter has been paused while storage engineers addre...
-
The Peregrine1 cluster will be unavailable beginning at 8:00am on Monday, May 19, 2014, for scheduled maintenance. The cluster will return to full production by 5:00pm, Tuesday, May 20. During this time, the cluster's network link to West Lafayette...
-
During the maintenance scheduled for May 19-20 the Hansen cluster will be upgraded to Red Hat Enterprise Linux, version 6. Only those PBS jobs with walltimes short enough that they will finish prior to the beginning of this maintenance period are bei...
-
The Carter cluster will be unavailable beginning at 8:00am on Monday, May 19, 2014, for scheduled maintenance. The cluster will return to full production by 5:00pm, Wednesday, May 21. During this time, Carter will receive operating system patches, an...
-
Emergency Chilled Water Maintenance
In early June 2014, all RCAC systems housed in the Math Sciences building will be unavailable due to an emergency repair to the redundant chilled-water system serving the MATH datacenter. A major chilled-water line has developed a leak, and must...
-
UPDATE: Fortress was successfully returned to service as of 7:35 pm Wednesday, 15 July. As of 8:30am on July 15, 2014, the Fortress HPSS Archive is unavailable due to a hardware issue. Access to Fortress via HSI, HTAR, Globus, or CIFS is not availabl...
-
On Tuesday and Wednesday, July 22-23, 2014, the Conte cluster will be unavailable for system upgrades. During this upgrade, Conte will receive several significant new capabilities: InfiniBand drivers will be upgraded, enabling native InfiniBand conne...
-
Lustre Maintenance - 9/30/2014
The Lustre Filesystem for Hansen and Carter will be briefly unavailable, beginning at 9:00 AM on Tuesday, September 30. At this time, ITaP engineers will perform repairs on one of the Lustre storage nodes. Access to scratch may be intermittent while...