Trial by Fire

So after not getting to sleep for ages last night, I got woken up at 1:30 by a whole stack of pages in a row. Most of my servers were down. At first I thought something had freaked out the UPSes and they'd all shut down in error. But I looked more closely at the logs of the servers that I *could* reach. Sure enough: "UPS on battery" on the ones connected with smart signalling cables. So power had gone out in just one section of the building. Checked with Jim what we should do. Decided to just go in early in the morning to turn everything back on. Watched as the UPSes drained themselves and turned the servers off, then went back to bed.

Woke up at 5:15. Looked at the two machines that aren't connected by serial cables to UPSes. They were still up. I assumed that the power had been restored before their UPSes died, so I got ready and went into work.

Total blackness. Everything was still dead. Even the machines that I'd checked before I left were down - their UPSes had lasted over four hours!!

Sigh.

So I called security, who got onto the contractors, who eventually came out. Watched them as they flicked some very big switches - so big they needed to use a metal bar wedged in them to turn them. Very cool. Seems that some big device in the building took out the earth leakage circuit for the whole section of the building. Yay. So power was out for most of the night. Hope people's freezers didn't warm up too much. Kicking myself that I didn't call security at 1:30. But I blame that one on only two hours' sleep and not thinking straight.

But the coolest thing is, all of the servers I connected up with their serial cables on Friday shut themselves down nicely. Still a couple of little issues to work out, but it was a good real-life test of the shutdown system.

I'm *very* sleepy now.