File Server Outage
May 13, 2009
Around 2 pm, the primary file server for CIMS home directories experienced a severe load spike and had to be rebooted. We are still investigating the cause of the problem. Due to at least one bug in the file sharing software, restoring the system and all home directories to full service was a slow process, and because of residual client-side effects, service was not fully stable until around 6 pm. During this time, most CIMS computing services, including mail delivery and the web servers, were disrupted.
In addition to investigating the cause, we are improving the reboot process to reduce the time it takes to restore all services in the event of a similar outage in the future. We are also planning changes over the summer that will make other critical services, like mail and the web servers, less dependent on home directories.
Should there be a recurrence, please check this page for information, or call systems support at (212) 998-3037.