[Fixed] Filesystem check on Web300

Posted in Problems by

Web300 needs to be taken down for a full filesystem check due to corruption.

We will be taking it down at 03:30 UTC on 31-03-2013.

We will update this post as the maintenance progresses.

[Mar 31 03:53 UTC] We are having some problems getting the server into rescue mode to start the fsck and are working with the datacenter to solve them.

[Mar 31 06:49 UTC] The server is down and the datacenter is still working on booting it into rescue mode.

[Mar 31 08:00 UTC] The server has finally been booted into rescue mode and the fsck is now running.

[Mar 31 09:11 UTC] The fsck is on its second pass now and is at 60%.

[Mar 31 09:56 UTC] The server is back and OK now.

-
-

[Fixed] Degraded performance on Web242

Posted in Downtime by

Web242 suffered a disk failure earlier today. No data was lost, but we had to swap the disk and rebuild the RAID. The rebuild is ongoing at this time, and disk performance is degraded while it is in progress. Because of this, sites may be slow and your ability to connect to MySQL databases may be impacted.

We will update this post when the rebuild is complete, at which time normal performance should be restored.

2013-03-29 14:50 UTC: The RAID array has finished rebuilding and service has returned to normal.

-
-

[Fixed] Web310 offline

Posted in Downtime by

Web310 is currently offline due to multiple fan failures. We’re working to restore service at this time and will update this post when we have more information.

2013-03-27 22:52 UTC: The fan failure problem has been resolved and the server is now back online and functioning normally.

2013-03-27 00:37 UTC: Web310 has gone offline again. We’re investigating the issue at this time.

2013-03-27 00:52 UTC: Web310 is back online at this time.

-
-

[Done] Emergency maintenance on Web308, March 27th, 2013

Posted in Downtime by

After a reboot, the machine came back with a read-only filesystem. We’re now booting the server into a rescue environment to run an fsck on it. We will update this post as the maintenance progresses.

2013-03-27 12:28 UTC: The server is back in operation.

-
-

[Fixed] Web301 offline

Posted in Downtime by

Web301 is currently offline. We’re working to restore service and will update this post when we have more information.

2013-03-25 23:56 UTC: Web301 stopped at a kernel panic after we attempted to reboot it. We’re booting the machine to a rescue environment for further troubleshooting.

2013-03-26 01:29 UTC: After ruling out network issues, we’ve begun a filesystem check on the server’s main filesystem. We’ll continue to update this post as the check progresses.

2013-03-26 02:47 UTC: The file system check is at 50% on the first pass.

2013-03-26 04:04 UTC: The file system check is at 95% on the first pass.

2013-03-26 04:36 UTC: The file system check has completed and the server is OK now.

 

-
-

[Fixed] Intermittent connectivity to Web346

Posted in Problems by

A new DDoS attack has started against Web346. The upstream threat mitigation system has been activated to identify and block the specific packets and flows responsible for the attack while allowing legitimate traffic to pass. Connectivity may be intermittent due to the scale of the attack. We’ll update this post as the situation progresses.

2013-03-20 17:26 UTC: The attack has subsided and normal connectivity has been restored.

-
-

[Fixed] Performance problems with Web335

Posted in Downtime by

There are some performance problems with Web335.

We are investigating the problem.

2013-03-17 18:43 UTC: We are taking the server down to run fsck on the filesystem.

2013-03-17 20:11 UTC: An fsck was not required at this time. The performance issue seems to be under control. We are continuing to monitor for further issues.

2013-03-17 21:25 UTC: The performance issue has been resolved.

-
-

[Complete] Scheduled Web334 migration

Posted in Downtime by

As scheduled, we are starting the migration of Web334 to new hardware. During the migration, services on the machine will be unavailable. We will update this post once the migration is over.

2013-03-15 20:10 UTC: The migration is complete and the new server is now online and serving traffic normally.

-
-

[Fixed] Read-only filesystem on Web346

Posted in Downtime by

The filesystem on Web346 is currently read-only. We have taken the server offline and are checking its filesystem. We’ll update this post as soon as we have more information.

2013-03-12 03:10 UTC: The filesystem check is done and the server is back up.

-
-

[Fixed] Emergency disk swap for Mail5

Posted in Downtime by

One of the server’s disks is failing, so we are having the datacenter replace it.

We will update this post with the status as the maintenance progresses.

10:53:07 UTC: The disk has been replaced and the server is back and OK now.

-
-