Outages

  • Radon Linux cluster unavailable

    The Radon Linux cluster will be unavailable from 8am-5pm on Thursday, August 14, while its job scheduler is upgraded from PBSPro version 8.0 to version 9.1. A scheduling reservation has been installed and only those PBS jobs with walltimes less than...

  • Coates cluster network problems

    Network problems arose following Coates cluster maintenance Tuesday, January 5. ITaP staff are working to resolve these problems, but we are currently unable to say when Coates will be returned to production. Final Update: The problems which arose f...

  • Cooling problems on coates-b, -c, and -e nodes

    Coates-b, -c, and -e nodes have been powered down due to a problem with a CDU (cooling distribution unit) that cools those systems. PBS jobs running on those nodes at the time have been requeued for execution after cooling has been restored and the...

  • Coates and Rossmann cluster job scheduling temporarily suspended

    Job scheduling on the Coates and Rossmann Linux clusters was disabled from 7:15-10:20pm Saturday, October 30, due to a partial cooling loss in the MATH datacenter.

  • Lustre scratch storage system unavailable

    The Lustre storage system that provides scratch storage on the Rossmann and Coates Linux clusters (via /scratch/lustreA) failed at approximately 1:30pm Thursday, February 3. ITaP Storage Engineers are in MATH working on the problem, but we are curre...

  • Aug. 5-17 research computing system outage FAQ

    What’s happening? ITaP’s research computing systems will be shut down beginning at 5 p.m. Friday, Aug 5, including the Rossmann, Coates, Moffett and Radon clusters. The supercomputers are scheduled to be off until Wednesday, Aug. 17. Why? An outage r...

  • Coates PBS scheduler issues

    This week, ITaP engineers have been troubleshooting issues with the Coates cluster, with the most common symptom being PBS jobs that abort or restart after some period of run time. Late yesterday afternoon, a change was made to the cluster's networki...

  • Unscheduled LustreA Outage

    The LustreA scratch filesystem, used by Rossmann and Coates, suffered an unknown failure sometime in the early morning of November 15, 2011. LustreA was returned to normal operation at about 10:30am. Any jobs on those systems run overnight before t...

  • Fortress: ADIC Scalar 10k tape robot unavailable

    Update 12/2/11 (4:15pm) The tape robot has been returned to service and Fortress is back in production. Please contact us at rcac-help@purdue.edu if you encounter further issues. Update 12/2/11: The ADIC Scalar 10K robot is temporarily down again wit...

  • Hansen: unscheduled outage to Lustre scratch

    Update The error condition on the Lustre filesystem has been cleared, and Hansen is back in production and accepting new jobs. Jobs already running should have resumed at the point where they were blocked waiting when the Lustre error occurred. This...

  • Fortress: ADIC Scalar 10k tape robot unavailable (1/4/2012)

    Update - 1/9/2012 The repairs to the ADIC tape library have been completed and Fortress' tape functionality is back in operation. Update - 1/6/2012 Following work today by vendor engineers, the latest estimate for the ADIC tape robot's return to serv...

  • Coates Scheduling unavailable

    This morning, the PBS system on Coates developed an issue with the storage holding its internal state. While systems engineers are working on recovering it from backup, any new job submissions will not be possible, nor will you be able to query job st...

  • Lustre unavailable on Hansen cluster

    Update: As of 9:45pm, Lustre is back in production and scheduling has resumed on Hansen. Original Notice: As of approximately 8:00pm February 7, an issue was found with the Lustre filesystem on Hansen, making the filesystem unavailable for use. ITaP engine...

  • Unscheduled outage to Rossmann cluster

    At approximately 10:50pm, Thursday, March 15, the power distribution to large portions of the Rossmann cluster failed. These feeds also power the cluster's login nodes; while those nodes are down, Rossmann is unavailable for use. Power was r...

  • PBS unavailable on Rossmann cluster

    Due to a network issue, the server running the PBS software for Rossmann is unavailable. While the server is unavailable, attempts to use PBS commands ("qsub", "qstat", "pbsnodes") will fail with error messages like: qst...

  • Unscheduled outage to MATH datacenter

    Update - 9:30pm, 4/1/2012: As of about 9:30pm, Sunday, 1 April, ITaP systems staff have returned Hansen to production status, and job scheduling is re-enabled. The scratch filesystem on Hansen has been restored with no apparent loss of files; if you...

  • Partial outage affecting some Coates queues

    Update - 6:45 pm Tuesday, 10 April 2012 ITaP engineers have found and repaired the network issue that was affecting Coates nodes type B, C and E. Job scheduling has been resumed for all queues. If you encounter any problems, please report them to rc...

  • Unscheduled Samba Outage

    Update: 1:45pm As of 1:45pm this afternoon, systems staff have completed patching the samba servers used to access storage systems. You should now be able to connect to samba.rcac.purdue.edu for samba access to home and scratch directories and...

  • Unscheduled HPSS outage

    Update - April 11, 2012 2:40pm At around 2:40pm, ITaP engineers restored communications between the HPSS system and the tape library. Access to Fortress from Samba, HSI/HTAR and other methods has been restored. I apologize for the inconvenience th...

  • Unscheduled Power outage in Math Datacenter

    Update: 10:00pm Tuesday As of 8:30pm Tuesday 21 August 2012, the LustreB filesystem has been returned to full service. Our storage engineers, with the assistance of the vendor, have verified that the system is stable. If you encounter any issues, please co...