Outages and Maintenance
-
Unscheduled outage of multiple clusters and Data Depot
The Bell, Brown, Gilbreth, Halstead, Hammer, Scholar, and Workbench clusters and Data Depot began experiencing issues with intermittent high load on the Data Depot servers around 4:30pm EDT. Engineers are currently diagnosing the issue and are working to...
-
The Weber cluster will be unavailable Wednesday, October 20, 2021 from 8:00am - 8:00pm EDT for scheduled maintenance. The cluster will return to full production by Wednesday, October 20th, 2021 at 8:00pm EDT. During this time, Weber will be expanded...
-
The Bell cluster began experiencing issues with high load and sluggish performance on the scratch filesystem around 1:20pm EDT. Engineers are currently diagnosing the issue and are working to identify a fix. Job scheduling has been paused while this...
-
Fortress Archive Monthly Maintenance
The Fortress Archive will be unavailable Wednesday, November 3, 2021 from 8:30am - 12:00pm EDT for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any trans...
-
Fortress Archive Monthly Maintenance
The Fortress Archive will be unavailable Wednesday, December 1, 2021 from 8:30am - 12:00pm EST for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any trans...
-
The Weber cluster began experiencing issues with an expired VPN certificate around 10:00am EST. Engineers are currently diagnosing the issue and are working to identify a fix. We will provide an update by 5pm.
-
The Bell cluster began experiencing issues with its scratch filesystem around 6:30pm EST. Engineers are currently diagnosing the issue and are working to identify a fix. Job scheduling has been paused while this issue is being addressed. We will prov...
-
The Scholar cluster will be unavailable from January 4, 2022 8:00am - January 5, 2022 6:00pm EST for scheduled maintenance. The cluster will return to full production by Wednesday, January 5th, 2022 at 6:00pm EST. During this time, Scholar will have the...
-
Fortress Archive Monthly Maintenance
The Fortress Archive will be unavailable Wednesday, January 5, 2022 from 8:30am - 12:00pm EST for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any transf...
-
The Weber cluster began experiencing issues with the weber-sftp subsystem around 2:00pm EST. The problem affects the ingress/egress path to the cluster. Engineers are currently diagnosing the issue and are working to identify a fix. We will provide an upda...
-
The Bell cluster began experiencing issues with its scheduler database around 11:35am EST. The problem manifests as freezing and/or "socket timed out" and "Unable to contact slurm controller" error messages upon the usual Slurm comman...
-
Fortress Archive Monthly Maintenance
The Fortress Archive will be unavailable Wednesday, February 2, 2022 from 8:30am - 12:00pm EST for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any trans...
-
The Gilbreth cluster began experiencing issues with its Data Depot mounts around 9:00am EST. The /depot filesystem is not visible on some of the login and compute nodes. Engineers are currently diagnosing the issue and are working to identify a fix....
-
As of 8:00pm EST on Friday, February 11th, 2022, the Data Depot filesystem outage has been resolved and scheduling has been resumed on all clusters. The Bell, Brown, Gilbreth, Halstead, Scholar, and Workbench clusters and Data Depot began experiencing i...
-
Fortress Tape Archive Maintenance
The Fortress tape archive library will undergo replacement work on one of its tape-picking robotic arms on Thursday, February 24, 2022 from 8:30am - 5:00pm EST. During this time, Fortress will remain available and functional, but users may observe de...
-
Unscheduled Math data center cooling outage
The Math building data center began experiencing issues with its cooling system around 11:40am EST. As one manifestation, users may experience issues while logging in to the Anvil, Bell, Gilbreth, Halstead, Workbench, and Data Depot clusters. To m...
-
Fortress Archive Monthly Maintenance
The Fortress Archive will be unavailable Wednesday, March 2, 2022 from 8:30am - 12:00pm EST for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any transfer...
-
Whole-Floor Cluster Maintenance
The majority of Research Computing computational resources (Bell, Brown, Geddes, Gilbreth, Halstead, Hammer, Scholar, Weber, and Workbench clusters) will be unavailable March 15, 2022 4:00pm - March 16, 2022 12:00pm EDT during Whole-Floor Data Depo...
-
Whole-Floor Data Depot Maintenance
The Data Depot filesystem will be undergoing scheduled maintenance and will be unavailable for use from March 15, 2022 5:00pm - March 16, 2022 8:00am EDT for critical software updates which can only be applied during a full service downtime. During...
-
github.itap Offline for Scheduled Whole-Floor Maintenance
During RCAC whole-floor downtime due to scheduled Data Depot maintenance, ITaP’s GitHub Enterprise service will be unavailable. In an effort to reduce the impact on developers, this work will be performed during off hours on Tuesday, March 15, 2022 from...