Web381 will be taken down on Wednesday, February 13th, between 09:00 UTC and 12:00 UTC for a RAID adapter swap. We will update this post as maintenance progresses.
2013-02-13 12:24 UTC: We had to start an offline RAID rebuild; it is now at 72%.
2013-02-13 15:45 UTC: After the RAID rebuild, the OS was unable to boot. We’re currently investigating the issue and considering a hardware replacement.
2013-02-13 17:24 UTC: The RAID controller is causing the kernel to panic on boot. The server is currently in a rescue environment, and we are copying all data off the current hard drives. Once this is finished, we will have the entire server chassis swapped and begin restoring the data to the machine.
2013-02-13 18:33 UTC: The server chassis is currently being swapped.
2013-02-13 19:33 UTC: The relevant hardware has been swapped and we are working to bring the server back online in order to verify the hardware swap has corrected the problem.
2013-02-13 21:58 UTC: The new hardware has not resolved the kernel panics. We’ve decided to bring the machine back online in a rescue environment to pull as much data off it as possible, then have all of the hardware replaced and restore from the backups we are taking now.
2013-02-14 00:14 UTC: We’ve replaced the entire machine, including the hard disks. We are now preparing an OS reload.
2013-02-14 03:06 UTC: We’ve restored the MySQL and PostgreSQL databases and all $HOME directories on the machine. We’re now rigorously testing the machine to verify that the complete hardware swap has corrected the problem we were seeing.
2013-02-14 04:02 UTC: We’ve restored all cron jobs and the SSH fingerprint information, so you will not see any warnings that the host key has changed. We’re taking the machine through one final test before allowing logins.
2013-02-14 04:05 UTC: The final test has completed and the server is now back online and functioning normally. We’ll continue to closely monitor the machine throughout the next few hours.