The Web65 filesystem went read-only. We’re currently taking the server down to run an fsck. We will update this post as maintenance progresses.
2013-09-09 16:30 UTC: The initial fsck pass has finished, and fsck is now scanning for multiply-claimed blocks.
2013-09-09 20:37 UTC: The multiply-claimed blocks check has now reached the point where the file system may be too unstable to boot. We’re stopping the file system check and preparing a new server to which we can migrate the data. We’ll post more information when it is available.
2013-09-09 22:49 UTC: The new machine is set up and online. We are now transferring the valid data from the old server and our backups to the new machine. The new server’s IP address is 18.104.22.168.
2013-09-10 00:57 UTC: We are still restoring data to the new server.
2013-09-10 02:48 UTC: We are still restoring data to the new server.
2013-09-10 05:34 UTC: We have transferred approximately 155G of data to the new server. The transfer of the MySQL and PostgreSQL databases has finished. We will now begin restoring the databases.
2013-09-10 08:01 UTC: We have now transferred approximately 221G of data to the new server. All MySQL and PostgreSQL databases have been reloaded. Logins have been enabled for users.
2013-09-10 17:19 UTC: We have finished restoring the server from our backup servers, and we are continuing to transfer files from the old server. Due to the extent of the file system damage, some files transferred from the old server may be damaged or missing, so we have opted to use the data from our backup servers.
2013-09-11 04:40 UTC: Data from our backup systems is still being restored. If you notice any files or databases still missing, please open a support ticket so we can look into it further.
Web65 has been taken offline for emergency maintenance. The server was experiencing extremely high load and was not using any swap space even though the OOM killer was running. We’re now investigating possible hardware problems.
2012-06-13 02:34 UTC: The server is back online, but our testing and investigation aren’t finished, so the server may go offline multiple times before the maintenance is complete.
2012-06-13 03:06 UTC: We’ve taken the server offline again to replace all of its RAM and rule out faulty memory as the cause.
2012-06-13 03:43 UTC: The server is back online now, and we’re closely monitoring it to verify that the RAM replacement has fixed the problems we were seeing.
2012-06-13 04:00 UTC: We’ll continue to monitor the server closely throughout the night, but it looks like the hardware problems we found have been fixed by the RAM replacement.
2012-06-13 04:32 UTC: The problems seem to have returned and we have had to reboot the server. We are still monitoring the server to find out the exact cause.
2012-06-14 12:49 UTC: The problem was with one of our backup subsystems. We have now corrected it, and the server is stable.