Web73 down (fixed)

Posted in Downtime

Web73 is currently down while we investigate some filesystem errors. We’ll update this post as soon as we have more information.

Update (12.40pm GMT): The filesystem on the server is corrupted beyond recovery, so we’re going to do an OS reload and restore the data from backup. We’ll update this post with our progress.

Update (3.30pm GMT): We have now moved the server onto new hardware (in case the filesystem errors were hardware-related) and we have started copying all the data from backup.

Update (5.30pm GMT): The server is now back up on new hardware, with the data from yesterday’s backup. Note that the RSA host key has changed, so your SSH client may display a warning when you connect.
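
For anyone who scripts their connections, here is a minimal sketch of clearing the stale host key so the SSH warning goes away. It only wraps the standard `ssh-keygen -R` command. The hostname `web73.webfaction.com` and the function names are assumptions made for illustration, not something we have configured for you. Check the new key fingerprint through a trusted channel before accepting it on your next connection.

```python
import subprocess


def forget_host_key_command(hostname):
    """Build the ssh-keygen command that removes stored keys for a host.

    `ssh-keygen -R <hostname>` deletes every key recorded for that
    hostname from the user's known_hosts file.
    """
    return ["ssh-keygen", "-R", hostname]


def forget_host_key(hostname):
    """Run the removal command and return its exit code (0 on success)."""
    result = subprocess.run(forget_host_key_command(hostname), check=False)
    return result.returncode


# Example (hostname is an assumption; substitute the name you connect to):
# forget_host_key("web73.webfaction.com")
```

After the old entry is removed, the next SSH connection will prompt you to accept the server's new key instead of refusing to connect.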


Drive replacement on web42 (fixed)

Posted in Downtime

One of the drives on Web42 died and we are currently rebuilding the RAID with the new drive. We will update this post once the server is back online.

2009-04-24 12:26 CDT – the drive rebuild is complete and Web42 is back online.


Mail5 and Webmail problems (fixed)

Posted in Downtime

Mail services on mail5.webfaction.com and webmail.webfaction.com are currently not working. We are looking into the problem and hope to have normal service restored soon. We will update this entry as we have more information.

2009-04-17 10:24 CDT – troubleshooting on mail5/webmail is still in progress.

2009-04-17 11:33 CDT – we’ve just pointed webmail.webfaction.com at a different server IP. Webmail users will be able to access webmail as soon as the DNS change propagates, but you will not have access to your usual webmail address book, since it is stored on mail5. If your mailbox resides on mail5, you will still not be able to access your mail. Troubleshooting on mail5 is still in progress.

2009-04-17 15:37 CDT – the problem on mail5 appears to be a failed OS upgrade. We are re-installing packages now.

2009-04-17 17:29 CDT – mail5 is back online and webmail.webfaction.com has been pointed back to mail5. All mail5 users should be able to access their mail now, but the server may be slow to respond for the next several hours as it catches up with today’s incoming mail.


Web67 Down (fixed)

Posted in Downtime

Web67 is currently down while we investigate a potential problem on the filesystem. We will update this entry as we have more information.

2009-04-14 12:13 CDT – A filesystem repair is in progress on Web67. We hope to have service restored soon.

2009-04-14 12:42 CDT – The filesystem repair on Web67 is still in progress.

2009-04-14 12:44 CDT – The filesystem repair on Web67 completed successfully and the server is now online.


Datacenter DNS issues (fixed)

Posted in Downtime

Our datacenter is experiencing DNS issues right now. If your app on one of our servers needs to resolve domain names, it may be unable to do so. We’ll update this post as soon as we have more information.

Update: the DNS issues are now resolved.
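
If your app depends on outbound name resolution, a short retry can help it ride out brief resolver problems like this one. The following is only a sketch under stated assumptions: the function name, the retry count and the delay are illustrative choices, and whether retrying suits your app is for you to judge.

```python
import socket
import time


def resolve_with_retry(hostname, attempts=3, delay=1.0,
                       resolver=socket.gethostbyname):
    """Try to resolve a hostname, retrying on DNS lookup failures.

    Returns the IPv4 address as a string, or None if every attempt
    fails. `resolver` is injectable so the logic can be tested
    without touching the network.
    """
    for attempt in range(attempts):
        try:
            return resolver(hostname)
        except socket.gaierror:
            # Lookup failed; wait before the next attempt, if any remain.
            if attempt < attempts - 1:
                time.sleep(delay)
    return None
```

Returning `None` rather than raising lets the caller decide how to degrade, for example by serving cached data or skipping the outbound request.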


Web42 audible alarm (fixed)

Posted in Downtime

Web42 is currently down while we investigate an audible alarm. We’ll update this post once we know more about the issue.

Update: The problem was a degraded RAID on the server. The server is now back online and the RAID is rebuilding in the background.


Web59 Down (fixed)

Posted in Downtime

Web59 is currently inaccessible and we’re looking into it.

Update: The problem was a misconfiguration in the firewall and it is now fixed.


Memory upgrade on Web64 (done)

Posted in Scheduled downtime

Web64 will be down for a few minutes tomorrow at 9am GMT while we add more memory to the server.

Update: The upgrade is now complete.


Emails rejected by Hotmail (fixed)

Posted in Downtime

E-mails sent through our e-mail platform are currently being rejected by Hotmail. We are talking to the Hotmail team to get this resolved as soon as possible. We will update this post once it is fixed.

Update: We are still working with Hotmail and the ban should be lifted tomorrow at the latest.

Update (4.09am GMT): E-mails to Hotmail are now going through again.


Web69 Down (fixed)

Posted in Downtime

Web69 is currently down. Its root partition went read-only, and rebooting the server revealed an issue that we are now working to resolve. We will post updates as they become available.

The issue appears to be with the RAID controller. We are replacing the hardware and restoring all data from backup.

2009-03-09 06:00 PST: Web69 is still down, having suffered a serious RAID controller failure. We have recovered all of the data from the server and are currently restoring it to a new standby server that will replace Web69.

2009-03-09 06:16 PST: Web69 is now back online with all its data. We decided to move the data onto a new server to give us more time to check the hardware on the failing server. We copied all the data from just before the crash so no data has been lost.
