[Fixed] MySQL problems on Web193 – Emergency Maintenance

Posted in Problems by

We’re investigating a problem with MySQL on Web193 and will update this post when we have more information.

Update: [20:13 CDT] – Our investigation into the MySQL problems has led to some emergency maintenance on the server as we look for the root cause. We’ll update this post as we get more information.

Update: [20:58 CDT] – The server is now back online and responding to requests. We are continuing to monitor the situation closely to verify that the server stays stable.

Update: [21:06 CDT] – We’ve taken the server mostly offline again to verify the integrity of the MySQL databases.

Update: [22:52 CDT] – We’ve found and repaired the error. Some of the MySQL databases had become corrupted. We ran a check on all of the databases and restored most of the corrupted ones from our latest backups. Due to the nature of the database errors, some databases couldn’t be backed up and weren’t restored.

Update: [05:36 CDT] – The problems with MySQL on Web193 are back. We’ve temporarily disabled InnoDB in MySQL while we work on fixing the problem.

Update: [11:51 CDT] – The InnoDB engine is back up but all InnoDB tables are currently read-only. We are still working on restoring full functionality.

Update: [00:30 CDT] – Our data center is about to swap the entire chassis for the server.

Update: [07:06 CDT] – All of the new hardware is in place. InnoDB tables are going offline for a few minutes, hopefully for the last time, so that any remaining corrupted data can be fixed.

Update: [07:11 CDT] – All MySQL tables are now back online, free of any corruption and fully writeable.


[Fixed] Scheduled maintenance on Web193

Posted in Scheduled downtime by

Web193 will be taken down at 2:00 PM UTC on Friday, June 3rd, 2011 to repair its RAID controller. Downtime is expected to be less than one hour. We will update this post as the maintenance progresses.

2011-06-03 16:41 UTC: Web193 has been taken offline for repair.

2011-06-03 17:52 UTC: The RAID controller has been replaced and the server is currently running a filesystem check.

2011-06-03 18:33 UTC: Web193 is back online, but we’re still seeing some problems with the disk hardware. We’ll update this post when we have more information.

2011-06-03 20:44 UTC: Web193 has been taken offline for an emergency disk replacement. We’ll update this post when we have more information.

2011-06-03 23:11 UTC: Web193 is back online following replacement of the RAID controller and one disk. The MySQL service would not restart due to filesystem corruption that resulted from the faulty hardware, so we restored all MySQL databases from a backup made yesterday around 11:30 PM UTC. Please open a support ticket if you have any questions or notice any problems with your sites or data.
