July 2015 Archives

I pulled the wrong power. The only explanation is that I was tired and not being careful, which is no excuse at all.

55 customers were rebooted; all have received a small credit.

Yeah. So this was in part due to me working on servers in the data center during a non-emergency situation. I was putting more RAM in a test server and just taking it in and out of the rack every time I needed to adjust something. In non-emergency situations, I should do this in the lab, not the data center.

branch down as of 6:10 PDT

UPDATE: The VMs were back up as of about 8:15 PDT.

Branch had a hardware failure and the 14 VMs still on branch are being moved to new hardware.

On wilson, despite it having a 1Gbps card, the link is running at 100Mbps. I am looking into why the switch reports the speed as 1Gbps with no dropped packets; probably the port is mislabeled in the switch configuration.

I am scheduling downtime at 20:00 PDT on Saturday, July 11th for someone to change the physical port and the cable. The downtime should be minimal, probably a minute or less.
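
One way to cross-check what the host itself negotiated, independent of what the switch reports, is to read the speed the kernel exposes. The following is only a minimal sketch, assuming the host runs Linux (which reports negotiated speed in Mbps under `/sys/class/net/<iface>/speed`); the interface name `eth0`, the helper names, and the 1000Mbps expectation are placeholders for illustration, not details of wilson's actual setup.

```python
from pathlib import Path

EXPECTED_MBPS = 1000  # assumption: the NIC is a 1Gbps card


def read_link_speed(iface, sysfs_root="/sys/class/net"):
    """Return the negotiated speed in Mbps, or None if it cannot be read."""
    try:
        raw = (Path(sysfs_root) / iface / "speed").read_text().strip()
        speed = int(raw)
    except (OSError, ValueError):
        # Interface missing, link down, or the driver does not report a speed.
        return None
    # Some drivers report -1 when the speed is unknown.
    return speed if speed > 0 else None


def link_is_degraded(speed_mbps, expected_mbps=EXPECTED_MBPS):
    """True only when a known speed is below what the card should negotiate."""
    return speed_mbps is not None and speed_mbps < expected_mbps


if __name__ == "__main__":
    speed = read_link_speed("eth0")  # "eth0" is a placeholder interface name
    if speed is None:
        print("link speed unknown")
    elif link_is_degraded(speed):
        print(f"link negotiated at {speed}Mbps, expected {EXPECTED_MBPS}Mbps")
    else:
        print(f"link OK at {speed}Mbps")
```

A check like this only shows the host side of the negotiation; a bad cable or port can still be the cause, which is why the physical swap is scheduled.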

wilson packet loss

There was intermittent packet loss on wilson between approximately 16:10 and 16:25 PDT, likely related to inbound traffic. One particular VM was the destination; however, I believe it should have been possible for wilson to manage that amount of traffic, so we'll review the full data path.

knife down

UPDATE 12:18PM: everyone is back up as of a few minutes ago on much newer hardware.

UPDATE 10:23AM: knife hardware has bit the dust. Luke is figuring out an alternative.

As of 9:16AM PDT. Luke is looking at it. 25 VMs are affected.

About this Archive

This page is an archive of entries from July 2015 listed from newest to oldest.
