Outages and Maintenance
-
DXUL/Fortress maintenance planned for Wednesday, August 15
The DXUL/Fortress archival storage system will be unavailable from 9am-5pm Wednesday, August 15, while system maintenance is performed. Please refer questions about this service interruption to rcac-help@purdue.edu.
-
HPC system maintenance planned for Wednesday through Friday, August 15-17
The RCAC HPC systems will be unavailable Wednesday through Friday, August 15-17, so that system maintenance can be performed. Systems involved include all RCAC Linux clusters, Condor, the SGI Altix, the IBM SP and Regattas, and the Sun F6800 cluster....
-
Radon Linux cluster unavailable
The Radon Linux cluster will be unavailable from 8am-5pm on Thursday, August 14, while its job scheduler is upgraded from PBSPro version 8.0 to version 9.1. A scheduling reservation has been installed and only those PBS jobs with walltimes less than...
-
RCAC system and data center maintenance
RCAC systems, including the DXUL/Fortress archival storage system, will be unavailable beginning at 8am Tuesday, 3/17, while system and MATH data center maintenance is performed. All systems are expected to be back in service by 6pm Thursday, 3/19, and...
-
The Radon Linux cluster will be unavailable Tuesday and Wednesday, August 25-26, for hardware upgrades and a conversion from Debian Linux to Red Hat Enterprise Linux 5 (RHEL5). Since this upgrade changes the version of Linux used on the cluster, all...
-
RCAC system maintenance scheduled
RCAC systems will be unavailable from 8am-6pm Friday, October 9, for electrical work in the MATH data center and system maintenance. The Coates Linux cluster will not be returned to service until Tuesday afternoon, October 13, so RCAC staff can cond...
-
Coates cluster network problems
Network problems arose following Coates cluster maintenance Tuesday, January 5. ITaP staff are working to resolve these problems, but we are currently unable to say when Coates will be returned to production. Final Update: The problems which arose f...
-
Cooling problems on coates-b, -c, and -e nodes
Coates-b, -c, and -e nodes have been powered down due to a problem with a CDU (cooling distribution unit) that cools those systems. PBS jobs running on those nodes at the time have been requeued for execution after cooling has been restored and the...
-
Coates and Rossmann cluster job scheduling temporarily suspended
Job scheduling on the Coates and Rossmann Linux clusters was disabled from 7:15-10:20pm Saturday, October 30, due to a partial cooling loss in the MATH datacenter.
-
Lustre scratch storage system unavailable
The Lustre storage system that provides scratch storage on the Rossmann and Coates Linux clusters (via /scratch/lustreA) failed at approximately 1:30pm Thursday, February 3. ITaP Storage Engineers are in MATH working on the problem, but we are curre...
-
ITaP research computers to be down during building upgrades
What’s happening? ITaP’s research computing systems will be shut down beginning at 3 a.m. Tuesday, March 29. The Coates and Rossmann cluster supercomputers could be off through 6 p.m. Thursday, March 31. Why? An outage related to an ongoing power and...
-
All RCAC systems unavailable some portion of Tue-Fri, 3/29-4/1
All RCAC systems will be unavailable on Tuesday, March 29th from 3:00am – 6:00pm. The Rossmann, Coates, Radon, and Moffett clusters will remain down through 6:00pm Thursday, March 31st. Update, 9:00am, March 29: Power has been restored to the Math b...
-
Aug. 5-17 research computing system outage FAQ
What’s happening? ITaP’s research computing systems will be shut down beginning at 5 p.m. Friday, Aug 5, including the Rossmann, Coates, Moffett and Radon clusters. The supercomputers are scheduled to be off until Wednesday, Aug. 17. Why? An outage r...
-
MATH Datacenter upgrades, starting Friday, August 5
Beginning at 5:00 pm, Friday, August 5th, the Coates and Rossmann supercomputer clusters will be unavailable due to work to complete a power and cooling upgrade to the Math Sciences building datacenter. We estimate that these clusters will be unavail...
-
Major research computing systems down during Aug. 5-17 for building upgrades
Major ITaP research computing systems will be shut down beginning at 5 p.m. Friday, Aug 5, including the Rossmann, Coates, Moffett and Radon clusters. The supercomputers are scheduled to be off until Wednesday, Aug. 17. An outage to complete power an...
-
ITaP research computers to be down two weekends in August for building upgrade
ITaP’s research computing systems will be shut down at least part of August 6-7 and August 13-14 because of an ongoing power upgrade project at the Mathematical Sciences Building. Some of the research computing systems also might be down or have to r...
-
This week, ITaP engineers have been troubleshooting issues with the Coates cluster, with the most common symptom being PBS jobs that abort or restart after some period of run time. Late yesterday afternoon, a change was made to the cluster's networki...
-
Fortress archive upgrade to HPSS
UPDATE: Archive Conversion moved to Oct. 7-14 Information Technology at Purdue (ITaP) is upgrading the research computing archival storage system Fortress. Currently based upon EMC's DiskXtender (DXUL), Fortress is being upgraded to new, more powerfu...
-
The LustreA scratch filesystem, used by Rossmann and Coates, suffered an unknown failure sometime in the early morning of November 15, 2011. LustreA was returned to normal operation at about 10:30am. Any jobs on those systems run overnight before t...
-
Updated 11/30/11: Network engineers have identified the cause of the network issue in question and have applied a workaround, which has restored the Hansen network to full functionality. The next maintenance window to address the root cau...