February 2012 Archives

coins crash last night

Coins crashed last night while a drive was being rebuilt. It came right back up, and I'm now rebuilding the drive with a much lower min_rebuild_speed value, which I hope will keep it from crashing again. There is a new kernel in the works that I will write about later if it passes testing.

New update on prgmr.com servers

blog.prgmr.com and wiki.prgmr.com may now be accessed over IPv6. 

We also now have SSHFP records in DNS for all of the dom0s. If you notice any missing, please let support know.
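If you want to check one of those SSHFP records by hand, here is a minimal Python sketch. It assumes you already have the host's public key line (the contents of a `.pub` file, or one line of `ssh-keyscan` output with the hostname stripped) and the record data your resolver returned. The function names `sshfp_rdata` and `matches` are made up for this example; they are not part of any tool we run. An SSHFP record is just an algorithm number, a fingerprint type (1 for SHA-1, 2 for SHA-256), and the hex digest of the raw key blob.

```python
import base64
import hashlib

# SSHFP algorithm numbers from the IANA registry.
KEY_ALGORITHMS = {
    "ssh-rsa": 1,
    "ssh-dss": 2,
    "ecdsa-sha2-nistp256": 3,
    "ecdsa-sha2-nistp384": 3,
    "ecdsa-sha2-nistp521": 3,
    "ssh-ed25519": 4,
}


def sshfp_rdata(pubkey_line, fp_type=2):
    """Build 'ALGO TYPE HEXDIGEST' from one OpenSSH public key line.

    pubkey_line looks like 'ssh-rsa AAAA... optional-comment'.
    fp_type is 1 (SHA-1) or 2 (SHA-256).
    """
    if fp_type not in (1, 2):
        raise ValueError("fp_type must be 1 (SHA-1) or 2 (SHA-256)")
    key_type, b64_blob = pubkey_line.split()[:2]
    blob = base64.b64decode(b64_blob)  # the raw key blob is what gets hashed
    hasher = hashlib.sha1 if fp_type == 1 else hashlib.sha256
    digest = hasher(blob).hexdigest()
    return "%d %d %s" % (KEY_ALGORITHMS[key_type], fp_type, digest)


def matches(pubkey_line, published_records):
    """Return True if any published SSHFP record matches the key.

    published_records is a list of strings such as '1 2 ABCD 1234 ...';
    hex case and spaces inside the digest are ignored.
    """
    for record in published_records:
        parts = record.split()
        if len(parts) < 3:
            continue  # not a well-formed record, skip it
        algo, fp_type = int(parts[0]), int(parts[1])
        if fp_type not in (1, 2):
            continue  # fingerprint type this sketch does not handle
        published = "%d %d %s" % (algo, fp_type, "".join(parts[2:]).lower())
        if published == sshfp_rdata(pubkey_line, fp_type):
            return True
    return False
```

This compares the key only against what DNS returned to you, so it is only as trustworthy as that DNS answer. OpenSSH can do the same lookup itself if you set `VerifyHostKeyDNS` in your ssh_config.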

mares down again.

Definitely a hardware problem. I'm going to reboot it now, and replace the hardware with a spare tonight.

Yeah, uh, it's not rebooting, even goosed from the PDU. I've gotta swap the hardware. Expect another 2 hours of downtime.

Update: ugh, that was a lot longer than 2 hours. We are almost done. A comedy of errors: I left my rescue CD at home, and the dhcp/pxeboot server I normally use to get rescue images is, in this case, on mares. Once I have it up, expect a good debriefing on how I'm going to prevent these problems going forward.

Note: yes, the DHCP server is down. Customers that are on DHCP (which should be none of the customers I set up) may be down now. You can fix this by configuring your IP address statically.

I am kind of in and out of the #prgmr channel on irc.freenode.net as I work.

It is done.  Everyone is back.   I am cleaning up at the data center, then I will stop for a burger and clean up the support queue.   If you are still down, complain loudly and I will help.

The mares hardware is completely replaced (save for two hard drives that I needed to keep; RAID is rebuilding now). I will figure out what the problem was later.