[Fixed] Web77 Slow performance, intermittent 502 errors

Posted in Downtime by

We are looking into the issue and will update this post when we have more information.

Update/Fixed [02:30 GMT]: The problems were caused by an enormous amount of traffic being sent to the server. We have isolated and blocked the IPs responsible for the increased traffic, and the server is working normally.
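
For readers wondering how the IPs behind a traffic spike can be picked out, one common approach is to count requests per client address in the web server's access log. The sketch below is illustrative only and is not necessarily the tooling used in this incident. It assumes log lines that begin with the client IP (as in the common log format); the function name `top_talkers`, the threshold, and the sample lines are all made up for the example.

```python
from collections import Counter


def top_talkers(log_lines, threshold):
    """Return (ip, request_count) pairs for clients with at least
    `threshold` requests, busiest first.

    Assumes each non-blank line starts with the client IP address.
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split(None, 1)  # first whitespace-delimited field
        if parts:  # skip blank lines
            counts[parts[0]] += 1
    return [(ip, n) for ip, n in counts.most_common() if n >= threshold]


# Made-up sample data using documentation-only IP ranges.
sample = [
    '203.0.113.5 - - [timestamp] "GET / HTTP/1.1" 200 512',
    '203.0.113.5 - - [timestamp] "GET / HTTP/1.1" 200 512',
    '198.51.100.7 - - [timestamp] "GET /about HTTP/1.1" 200 1024',
    '203.0.113.5 - - [timestamp] "GET / HTTP/1.1" 502 0',
]

print(top_talkers(sample, threshold=2))  # prints [('203.0.113.5', 3)]
```

Addresses returned this way are only candidates for blocking; a busy proxy or search crawler can also show up at the top of the list.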

-
-

[Done] Scheduled work on Web38 on Jan 28th, 2010

Posted in Scheduled downtime by

We will be investigating the cause of visual alarms on web38.webfaction.com on Jan 28th, 2010 at 6am GMT.

Depending on the severity, we might have to take the server down. If so, it will be down for anywhere between a few minutes and a few hours. We will update this ticket as soon as we have more information tomorrow.

Update: Web38 was taken down for 5 minutes, and it is now fully operational again.

-
-

[Done] Scheduled work on Web11 tomorrow

Posted in Scheduled downtime by

We will be investigating the cause of visual alarms on web11.webfaction.com on Jan 17th, 2010 at 4pm GMT.

Depending on the severity, we might have to take the server down. If so, it will be down for anywhere between a few minutes and a few hours. We will update this ticket as soon as we have more information tomorrow.

Update (6:20pm GMT): We replaced one of the drives in the RAID and the server is now back online.

-
-

[Fixed] Web22 down

Posted in Downtime by

Web22 is currently down. We’re investigating the cause and hope to have service restored soon.

Update [10:20 GMT]: The problem appears to be filesystem-related. We’re running fsck on the server now.

Update [13:50 GMT]: fsck is still running.

Update [17:15 GMT]: fsck is still running. We’ll update ASAP.

Update [20:06 GMT]: Unfortunately, fsck didn’t fix the errors on the filesystem, so we are going to reinstall the server and restore all data from backup.

Update [21:28 GMT]: We are currently restoring files on web22.

Update [22:26 GMT]: We are still restoring files on web22.

Update [12:53 GMT]: We are restoring user files on web22.

Update [02:10 GMT]: We are still restoring user files on web22.

Update [02:53 GMT]: We are still restoring user files on web22.

Update [02:53 GMT]: Web22 is up and we are verifying files.

Update [05:30 GMT]: Web22 is up and fully functional.

-
-

[Fixed] Web48 down.

Posted in Downtime by

Web48 is currently down. We’re investigating the cause and hope to have service restored soon.

Update [07:20 GMT]: Web48 is back online and the problem has been resolved. The cause appears to have been bad memory, which has been replaced. The server is working normally at this time.

-
-

[Fixed] Web40 down

Posted in Downtime by

Web40 is currently down. We’re investigating the cause and hope to have service restored soon.

Update [21:22 GMT]: Web40 is back online. The problem appears to have been an extremely high spike in system load. The server is working normally at this time.

-
-

[Fixed] Web1 undergoing emergency maintenance

Posted in Downtime by

Web1 is down for emergency software maintenance. We are working to resolve the issues and bring the machine back up.

Update [12:45 GMT]: Web1 is fully operational again. The problem was a misconfiguration in the boot manager.

-
-

[Reminder] Scheduled down time tomorrow.

Posted in Scheduled downtime by

See http://statusblog.webfaction.com/2009/12/31/scheduled-down-time-on-sunday/.

-
-

[Done] Scheduled down time on Sunday.

Posted in General by

On Sunday at…

  • …4 PM GMT we are going to replace Web 1’s /dev/sdc with a 300 GB HDD. Expected down time is an hour.
  • …5 PM GMT we are going to reboot Dweb 30, Dweb 57, Dweb 58, Dweb 59, Mail 2, and Mail 3 to install a kernel update. Expected down time is 15 minutes.

Update [4:45 PM GMT]: Due to an unexpected delay, Web 1 has only just been taken down. We apologize for the inconvenience.
Update [6:10 PM GMT]:

  • At the rate the data is rsyncing, Web 1 will likely be down for another two and a half hours.
  • Dweb 30, Dweb 57, Dweb 58, Dweb 59, Mail 2, and Mail 3 have been rebooted.

Update [7:00 PM GMT]: At the rate the data is rsyncing, Web 1 will likely be down for another hour and 15 minutes.
Update [8:40 PM GMT]: The rsyncing is complete.
Update [10:00 PM GMT]: Done
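
The remaining-time estimates in the updates above follow from simple arithmetic: divide the data still to be copied by the average transfer rate so far. A minimal sketch of that calculation is below. The function name `remaining_minutes` and the figures in the example are assumptions for illustration; the actual data volume and transfer rate for Web 1 were not published.

```python
def remaining_minutes(total_gb, copied_gb, elapsed_minutes):
    """Estimate minutes left in a copy, assuming the average rate
    observed so far continues unchanged."""
    if copied_gb <= 0 or elapsed_minutes <= 0:
        raise ValueError("need some progress before estimating")
    rate_gb_per_minute = copied_gb / elapsed_minutes
    return (total_gb - copied_gb) / rate_gb_per_minute


# Hypothetical figures: 120 GB of 300 GB copied in the first 60 minutes.
print(remaining_minutes(300, 120, 60))  # prints 90.0
```

Estimates like this tend to shift as the copy proceeds, because the rate varies with file sizes and disk load, which is why successive updates can give different figures.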

-
-

[Fixed] DNS problem affecting multiple servers

Posted in Problems by

A problem with one of our DNS servers briefly interrupted access to most of our servers. The issue has been resolved and all servers are now accessible.

-
-