Mail platform load issue (fixed)

Posted in Service change by

We’ve been experiencing intermittent high load on our mail platform, which has made some mail services slow.

In the past few hours we have added more mail servers to our platform. We are now spreading the load across these new servers, so mail services should speed up soon.

Update (Feb 9th, 10.30pm GMT): We have started spreading mailboxes across our new mail servers, but the load on the platform remains high for now. The migration is continuing.
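For readers curious about what "spreading mailboxes" means in practice, here is a minimal sketch of one possible approach. It is not our actual migration tooling: the server names, mailbox names and the `plan_migration` function are hypothetical, and it assumes every mailbox costs the same amount of load, which is rarely true on a real platform.

```python
def plan_migration(assignments, new_servers):
    """Plan mailbox moves that even out mailbox counts across servers.

    assignments: dict mapping server name -> list of mailbox names
    new_servers: names of freshly added (empty) servers
    Returns a list of (mailbox, source, destination) tuples.
    """
    # Work on a copy so the caller's data is left untouched.
    servers = {name: list(boxes) for name, boxes in assignments.items()}
    for name in new_servers:
        servers.setdefault(name, [])

    moves = []
    while True:
        src = max(servers, key=lambda s: len(servers[s]))  # fullest server
        dst = min(servers, key=lambda s: len(servers[s]))  # emptiest server
        # Stop once no server holds more than one mailbox above another.
        if len(servers[src]) - len(servers[dst]) <= 1:
            break
        box = servers[src].pop()
        servers[dst].append(box)
        moves.append((box, src, dst))
    return moves


# Hypothetical example: two existing servers and one new one.
current = {"mail1": ["a", "b", "c", "d"], "mail2": ["e", "f"]}
plan = plan_migration(current, ["mail3"])
for box, src, dst in plan:
    print(f"move {box}: {src} -> {dst}")
```

Each planned move still has to be read from the source server, so a plan like this runs slowly while the source servers are busy.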

Update (Feb 10th, 5pm GMT): The load on the mail platform remains high. We are still moving mail accounts to the new mail servers, but the process is slow because the servers we are moving them from are overloaded. We also have more mail servers on their way, so that these load problems don’t recur once we have the load under control.

Update (Feb 11th, 4.30pm GMT): We have now migrated a good number of accounts to our new mail servers and the load on the platform is much better. We will continue to migrate more accounts.

Update (Feb 23rd, 6.40pm GMT): The load across our mail platform has now been low for over a week and all services have been responding quickly during that time, so we’re marking this issue as resolved.

-
-

Web4 Down (fixed)

Posted in Downtime by

Web4 is down with a file system issue. We’ll update this post as soon as it is back up.

2009-02-03 10:46 CST – The file system check on Web4 is still in progress. We hope to have service on Web4 restored soon.

2009-02-03 11:18 CST – Web4 is back online.

-
-

Web46 DOS’d (fixed)

Posted in Downtime by

Web46 is under a denial-of-service (DoS) attack. We’re hoping to restore service as soon as possible and we’ll update this post when it’s done.

Update: We were able to stop the attack, so Web46 is now behaving normally.

-
-

Problems on Krait, Taipan, Viper and Mamba (fixed)

Posted in Downtime by

We’ve had some issues on these servers since yesterday. A misconfiguration in our memory watchdog script (two characters, to be exact) caused it to kill some root processes that it shouldn’t have. As a result, we had SSH, DNS and database issues on these servers: these services kept dying and were only restarted later.
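To illustrate the kind of check involved, here is a minimal sketch of how a memory watchdog might choose which processes to kill. It is not our actual script, and it does not reproduce the actual two-character mistake: the `Proc` record, the `select_victims` function, the 512 MB limit and the sample processes are all hypothetical. The point is only that a small slip in the owner filter is enough to put root-owned daemons on the kill list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proc:
    pid: int
    user: str
    rss_mb: int  # resident memory in megabytes


def select_victims(procs, limit_mb, protected_users=frozenset({"root"})):
    """Return processes over the memory limit, skipping protected owners."""
    return [
        p for p in procs
        if p.rss_mb > limit_mb and p.user not in protected_users
    ]


# Hypothetical process table.
procs = [
    Proc(101, "root", 900),    # a root-owned daemon that must survive
    Proc(202, "alice", 1200),  # over the limit, owned by a normal user
    Proc(303, "bob", 150),     # under the limit
]

victims = select_victims(procs, limit_mb=512)
print([p.pid for p in victims])  # prints [202]
```

If the protected list ends up empty or mistyped, process 101 is selected as well, which is the sort of behaviour described above.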

Fortunately we’ve been able to track down the problem and everything should be back to normal now.

Remi.

-
-

Welcome

Posted in General by

Welcome to our new status blog.

This is where we will keep you informed about what’s happening on our servers:

  • Scheduled downtime
  • Upgrades
  • Problems (yes, we’re only human)

Remi.

-
-