[Done] Scheduled maintenance on Web111, February 29th.

Posted in Scheduled downtime by

Web111 will be taken offline for a disk swap on Wednesday, 29th February 2012, between 10:00 UTC and 12:00 UTC. We’ll update this post as maintenance progresses.

2012-02-29 11:09 UTC: The disk has been replaced and the server is back in operation.

-
-

[Complete] Scheduled Web99 migration

Posted in Downtime by

As scheduled, we are starting the migration of Web99 to new hardware. During the migration, services on the machine will be unavailable. We will update this post once the migration is over.

2012-02-27 09:33 UTC: The migration has finished and the server is back online and working normally.

-
-

[Complete] Scheduled Web34 migration

Posted in Downtime by

As scheduled, we are starting the migration of Web34 to new hardware. During the migration, services on the machine will be unavailable. We will update this post once the migration is over.

2012-02-27 09:41 UTC: The migration has finished and the server is back online and working normally.

-
-

[Done] Scheduled Web105 migration

Posted in Downtime by

As scheduled, we are starting the migration of Web105 to new hardware. During the migration, services on the machine will be unavailable. We will update this post once the migration is over.

2012-02-21 17:17 UTC: The migration is finished and the server is now back online and functioning normally.

-
-

[Done] Scheduled Web102 migration

Posted in Downtime by

As scheduled, we are starting the migration of Web102 to new hardware. During the migration, services on the machine will be unavailable. We will update this post once the migration is over.

2012-02-21 16:20 UTC: The migration is finished. The new server is online and functioning normally.

-
-

[Done] Scheduled maintenance on Web168, February 21st.

Posted in Scheduled downtime by

Web168 will be taken offline for a disk swap on Tuesday, February 21st, 2012, between 09:00 UTC and 13:00 UTC. We’ll update this post as maintenance progresses.

2012-02-21 09:56 UTC: The disk has been replaced and the array is now rebuilding. The server is back in operation.

-
-

Emergency maintenance on web28

Posted in Downtime by

For the last few hours, the server has been intermittently becoming unresponsive.

We think the problem is due to faulty RAM and have scheduled an immediate RAM replacement to solve it.

We will update this post regularly with more information.

2012-02-20 19:44 UTC: The server’s RAM was replaced, but the problem persists. We’ll continue to troubleshoot and will update this post when we have more information.

2012-02-20 01:16 UTC: We’ve narrowed the problem down to an out-of-memory (OOM) killer error that is being triggered by numerous processes (different processes each time). We’ve disabled all non-essential services on the machine. The server seems stable for now and we’re continuing to monitor it.

-
-

[Fixed] Emergency maintenance on web129

Posted in Downtime by

Earlier today, web129 suffered a hardware failure that left the server only intermittently available. We’re currently running a full scan of all the hardware to determine the exact cause.

We will update this post regularly with more information.

2012-02-19 15:45 UTC: Initial diagnostics showed bad memory. We have replaced all the memory and are continuing with the hardware tests.
2012-02-19 16:20 UTC: Web129 is back online.

-
-

[Done] Reboots on various servers on February 19th 2012

Posted in Downtime by

On February 19th, we will be rebooting the following servers for routine kernel updates at various times throughout the day:

  • web27
  • web28
  • web30
  • web35
  • web37
  • web41
  • web49
  • web53
  • web60
  • web65
  • web70
  • web72
  • web77
  • web88
  • web95
  • web100
  • web106
  • web114
  • web117
  • web129
  • web146
  • web147
  • web157
  • web166
  • web174
  • web192
  • web193
  • web197
  • web200
  • web205
  • web213

Downtime on each server is expected to be less than 20 minutes. We will update this post as maintenance progresses.

2012-02-19 11:10 UTC: All servers except web129 have been rebooted and are back online. We’re investigating the issue with web129.
2012-02-19 13:30 UTC: Web129 is suffering from a hardware fault. We have created a new post specifically for this server.

-
-

[Complete] Scheduled Web148 migration

Posted in Downtime by

As scheduled, we are starting the migration of Web148 to new hardware. During the migration, services on the machine will be unavailable. We will update this post once the migration is over.

2012-02-17 17:29 UTC: The migration is complete and the server is now back online.

-
-