Scheduled work on Web99 on Wednesday

We will be investigating the cause of visual alarms on web99.webfaction.com on Feb 10th, 2010 at 10am GMT.

Depending on the severity, we might have to take the server down. If so, it will be down for anywhere between a few minutes and a few hours. We will update this ticket as soon as we have more information on Wednesday.

[Fixed] Web84 through Web122 down

Web84 through Web122 are currently down. We are investigating the issue at this time and will post an update shortly.

[UPDATE 2010-02-07 16:54 UTC] Web84 through Web122 are back online. The cause of the problem is still under investigation.

[UPDATE 2010-02-08 15:05 UTC] The outage was caused by the failure of a network router in the data center.

[Fixed] Web113 under DDoS attack

Web113 is under a Distributed Denial of Service attack, which is causing occasional delays and timeouts for all sites hosted there. We are working to mitigate the effects of the attack.

No other servers are affected.

Update [4:42 GMT]: What we believed to be a DDoS attack turned out to be an excessive amount of traffic directed at an extremely large page hosted on the server. We have now taken measures to reduce the impact of this traffic, and the server is back to normal.

[Fixed] Web89 down

Web89 is currently down. We’re investigating the cause and hope to have service restored soon.

[UPDATE] We are still working with our datacenter to bring the server back online. There is currently no ETA. The issue is network related.

[UPDATE 19:52 GMT] Analysis of the server itself has shown no defect so far. The issue still seems to be within the network. There is still no ETA on a fix.

[UPDATE 21:20 GMT] Datacenter personnel are still investigating the issue. They are currently booting the system to a live CD. There is still no ETA on a fix.

[UPDATE 22:19 GMT] Web89 is back up and serving content. If you are still experiencing issues, please let us know so we can resolve them. Datacenter personnel are still unsure as to the root cause of the issue.

[Fixed] Web90 down

Web90 is currently down. We’re investigating the cause and hope to have service restored soon.

Update [05:00 GMT]: The problem appears to be file-system related. We’re running an FSCK on the server now.
Update [05:35 GMT]: The automatic FSCK failed. We are now running a manual FSCK on the server.
Update [06:35 GMT]: The manual FSCK is at 38% and still running.
Update [07:05 GMT]: The first FSCK has passed. We are currently running a second FSCK and it is at 86%.
Update [07:37 GMT]: The server is back online and serving requests.

[Done] Scheduled downtime on Web62 on Sunday

We will be taking down Web62 on Sunday Jan 31st at 4pm GMT to add some new drives to the server. We expect the work to take around 30 minutes, and we will update this post when the work is complete.

Update 5.45pm GMT: The work is now complete.

[Fixed] Web77 Slow performance, intermittent 502 errors

We are looking into the issue and will update this post when we have more information.

Update/Fixed [02:30 GMT]: The problems were caused by an enormous amount of traffic being sent to the server. We have isolated and blocked the IPs responsible for the increased traffic, and the server is working normally.

[Done] Scheduled work on Web38 on Jan 28th 2010

We will be investigating the cause of visual alarms on web38.webfaction.com on Jan 28th, 2010 at 6am GMT.

Depending on the severity, we might have to take the server down. If so, it will be down for anywhere between a few minutes and a few hours. We will update this ticket as soon as we have more information tomorrow.

Update: Web38 was taken down for 5 minutes, and it is now fully operational again.

[Done] Scheduled work on Web11 tomorrow

We will be investigating the cause of visual alarms on web11.webfaction.com on Jan 17th, 2010 at 4pm GMT.

Depending on the severity, we might have to take the server down. If so, it will be down for anywhere between a few minutes and a few hours. We will update this ticket as soon as we have more information tomorrow.

Update (6.20pm GMT): We replaced one of the drives in the RAID and the server is now back online.

[Fixed] Web22 down

Web22 is currently down. We’re investigating the cause and hope to have service restored soon.

Update [10:20 GMT]: The problem appears to be file-system related. We’re running an FSCK on the server now.

Update [13:50 GMT]: FSCK still running.

Update [17:15 GMT]: FSCK still running. We’ll update ASAP.

Update [20:06 GMT]: Unfortunately, the FSCK didn’t fix the errors on the filesystem, so at this point we are going to re-install the server and restore all data from backup.

Update [21:28 GMT]: We are currently restoring files on web22.

Update [22:26 GMT]: We are still restoring files on web22.

Update [12:53 GMT]: We are restoring user files on web22.

Update [02:10 GMT]: We are still restoring user files on web22.

Update [02:53 GMT]: We are still restoring user files on web22.

Update [02:53 GMT]: Web22 is up and we are verifying files.

Update [05:30 GMT]: Web22 is up and fully functional.