Archive for June, 2009

Red lights of impending doom

June 22nd, 2009

I walked in this morning and got logged in. Shortly after that, a coworker came by to tell me that the server room was really loud. I wandered down and heard the server fans screaming away before I even got to the door. As soon as I opened the door I was blasted by a wave of heat as hot as some of our midsummer heat waves here in eastern PA. At first, I felt like I was walking into some crazy bizarro server room. Every server was flashing amber or red and there was a slight odor of burning electronics.

I quickly realized the 1-year-old AC unit had crapped out. The server room has two doors, so I opened both and grabbed a few box fans to start moving air through the room. I then started shutting down most of the servers and a large chunk of the network gear. Somewhere during that, I called our facilities department and alerted them to the situation. They went to work on the AC unit and determined that a fuse had blown. It has since been replaced and the unit is back on. The server room is still cooling down, and I have slowly been bringing servers and other equipment back online.

There are a few issues here. Ultimately, things break, and we need some sort of control over the environment in that room or, at the very least, a way to monitor it. First of all, I should have an independent temperature sensor that will alert me when it gets too hot in there. If someone hadn’t alerted me to the problem when they did, there could have been a shitload of issues once equipment started to fail.
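The alerting side of a temperature sensor doesn't need to be fancy. Here is a rough Python sketch of the idea, not anything I actually have running: the function names, the 30 C alarm limit, and the 27 C "all clear" limit are all made up for illustration, and the readings are just a list standing in for whatever a real sensor would report. It alerts once when the room crosses the limit and stays quiet until things have cooled back down, so I wouldn't get a message on every single reading.

```python
from typing import Callable, Iterable, List


def watch_temperature(
    readings: Iterable[float],
    alert: Callable[[str], None],
    high_c: float = 30.0,   # assumed alarm threshold, not a recommendation
    clear_c: float = 27.0,  # assumed "all clear" threshold
) -> None:
    """Alert once when a reading reaches high_c, then re-arm at clear_c."""
    alarmed = False
    for temp in readings:
        if not alarmed and temp >= high_c:
            alarmed = True
            alert(f"Server room at {temp:.1f} C (limit {high_c:.1f} C)")
        elif alarmed and temp <= clear_c:
            alarmed = False
            alert(f"Server room back to {temp:.1f} C")


def demo() -> List[str]:
    """Run the watcher over hypothetical readings and collect the alerts."""
    messages: List[str] = []
    # Made-up readings: room heats up, then cools after the AC is fixed.
    watch_temperature([22.0, 29.5, 31.2, 33.0, 28.0, 26.5], messages.append)
    return messages
```

In real use the `alert` callback would send an email or a page instead of appending to a list, and the readings would come from the sensor on a timer.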

The other issue is that I’ve been toying around with network monitoring systems for a long time, but I’ve never been able to settle on anything. I’ve mucked around with Zenoss and Nagios, but I’ve never stuck with either. I need to just find a proper system, knuckle down, and get it configured.
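Whatever system I end up with, a temperature check could plug into it fairly easily. As a sketch only: Nagios plugins report their state through the exit code (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN) plus one line of text. The function below follows that convention, but `check_temperature` and both thresholds are hypothetical names and numbers I picked for the example.

```python
from typing import Optional, Tuple

# Exit codes used by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_temperature(
    temp_c: Optional[float],
    warn_c: float = 27.0,  # assumed warning threshold
    crit_c: float = 32.0,  # assumed critical threshold
) -> Tuple[int, str]:
    """Map a temperature reading to a (status code, message) pair."""
    if temp_c is None:
        # No reading at all is its own state, not an OK.
        return UNKNOWN, "TEMP UNKNOWN - no reading from sensor"
    if temp_c >= crit_c:
        return CRITICAL, f"TEMP CRITICAL - {temp_c:.1f} C"
    if temp_c >= warn_c:
        return WARNING, f"TEMP WARNING - {temp_c:.1f} C"
    return OK, f"TEMP OK - {temp_c:.1f} C"
```

A small wrapper script would print the message and call `sys.exit()` with the status code, which is all the monitoring system needs to decide whether to bug me.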

There are lessons to be learned from this morning!