luke: February 2012 Archives

coins crash last night

So coins crashed while a drive was being rebuilt last night. It came right back up, and I'm now rebuilding the drive with a much lower min_rebuild_speed value, which I hope will help keep it from crashing again. There is a new kernel in the works that I will write about later if it passes testing.
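
For anyone wanting to throttle a rebuild the same way, here is a rough sketch. It assumes the array is Linux software RAID (md) and that "min_rebuild_speed" corresponds to the kernel's `speed_limit_min` setting under `/proc/sys/dev/raid`; the post does not say which knob was actually changed, and the function name below is made up for illustration.

```python
from pathlib import Path

# Assumption: Linux md exposes the rebuild floor here, in KiB/s per device.
# The post's "min_rebuild_speed" is taken to mean this setting.
MD_SPEED_LIMIT_MIN = Path("/proc/sys/dev/raid/speed_limit_min")


def set_rebuild_floor(kib_per_sec: int, sysctl: Path = MD_SPEED_LIMIT_MIN) -> int:
    """Write a new minimum rebuild speed and return the previous value."""
    if kib_per_sec < 0:
        raise ValueError("rebuild speed must be non-negative")
    previous = int(sysctl.read_text().strip())  # remember the old value
    sysctl.write_text(f"{kib_per_sec}\n")       # apply the new one
    return previous
```

Run as root, something like `set_rebuild_floor(100)` would lower the floor and hand back the old value so it can be restored once the rebuild finishes. The `sysctl` argument exists only so the function can be pointed at a scratch file for a dry run.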

mares down again.

Definitely a hardware problem. I'm going to reboot it now and replace the hardware with a spare tonight.

Yeah, uh, it's not rebooting, even goosed from the PDU. I've gotta swap the hardware. Expect another 2 hours of downtime.

Update: ugh, that was a lot longer than 2 hours. We are almost done. A comedy of errors: I left my rescue CD at home, and the DHCP/PXE boot server I normally use to get rescue images is, in this case, on mares. Expect a good debriefing on how I'm going to prevent these problems going forward once I have it up.

Note: yes, the DHCP server is down. Customers that are on DHCP (which should be none of the customers I set up) may be down now. You can fix this by configuring your IP address statically.
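
As a rough illustration of what a static setup looks like, the sketch below renders a Debian-style `/etc/network/interfaces` stanza. Everything in it is an assumption: the guest may use a different distro or config format, the helper name is hypothetical, and the addresses are documentation placeholders (192.0.2.0/24), not real assignments. Use the address and gateway you were actually given.

```python
import ipaddress


def static_stanza(iface: str, cidr: str, gateway: str) -> str:
    """Render a Debian-style static IPv4 stanza for one interface."""
    addr = ipaddress.IPv4Interface(cidr)   # e.g. "192.0.2.10/24"
    gw = ipaddress.IPv4Address(gateway)
    # A gateway outside the interface's subnet is almost always a typo.
    if gw not in addr.network:
        raise ValueError(f"gateway {gw} is not inside {addr.network}")
    lines = [
        f"auto {iface}",
        f"iface {iface} inet static",
        f"    address {addr.ip}",
        f"    netmask {addr.network.netmask}",
        f"    gateway {gw}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values only.
    print(static_stanza("eth0", "192.0.2.10/24", "192.0.2.1"))
```

The subnet check is there because a mistyped gateway is an easy way to lock yourself out of a box you can only reach over the network.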

I am kind of in and out of the #prgmr channel as I work.

It is done.  Everyone is back.   I am cleaning up at the data center, then I will stop for a burger and clean up the support queue.   If you are still down, complain loudly and I will help.

The mares hardware is completely replaced, save for two hard drives that I needed to keep; the RAID is rebuilding now. I will figure out what the problem was later.

About this Archive

This page is an archive of recent entries written by luke in February 2012.
