replacing now, will report back before the reboot

update: Woo!  okay, got it working without a reboot.  expect some degraded performance as it rebuilds.

bad disk on marshall

the bathtub curve strikes again.  I've replaced the disk and it's rebuilding.  expect degraded performance for a while (the thing claims four hours right now, but I think it rebuilds from the outside in, so it will slow down as it moves inward, or if disk I/O load increases).
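to put rough numbers on why a four-hour estimate taken at the start tends to be optimistic, here is a minimal sketch. it assumes throughput falls linearly from the outer edge of the disk to the inner edge. the function names, the 1000 GB size, and the 100 and 50 MB/s speeds are all made up for illustration; none of them were measured on marshall.

```python
import math


def naive_eta_hours(remaining_gb, current_mb_s):
    """ETA if the rebuild kept running at the speed it has right now."""
    return remaining_gb * 1024 / current_mb_s / 3600


def tapered_eta_hours(total_gb, done_fraction, outer_mb_s, inner_mb_s):
    """ETA assuming speed drops linearly from outer_mb_s to inner_mb_s.

    With speed(x) = outer - (outer - inner) * x for position x in [0, 1],
    the remaining time is the integral of total_mb / speed(x) from
    done_fraction to 1, which works out to a logarithm.
    """
    total_mb = total_gb * 1024
    drop = outer_mb_s - inner_mb_s
    if drop == 0:
        # No slowdown across the disk: same as the naive estimate.
        return total_mb * (1 - done_fraction) / outer_mb_s / 3600
    speed_now = outer_mb_s - drop * done_fraction
    seconds = total_mb / drop * math.log(speed_now / inner_mb_s)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical disk: 1000 GB, 100 MB/s at the outside, 50 MB/s inside.
    print("naive:   %.1f hours" % naive_eta_hours(1000, 100.0))
    print("tapered: %.1f hours" % tapered_eta_hours(1000, 0.0, 100.0, 50.0))
```

the naive number is what you get by dividing what's left by the current speed; the tapered one is always longer when the inner tracks are slower. neither accounts for customer I/O competing with the rebuild.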

Anyhow, I anticipate no downtime.  

coral reboot

it was hung up this morning;  this is a placeholder;  I'll update when I figure out what happened.

gah.  I don't have a logging serial console set up at that location.

expect degraded performance for the next day or so as the disk rebuilds.  

there is a bad disk on coral as well, but the box doesn't see the replacement: either my replacement drive was bad, the backplane was bad, or there is some kernel issue preventing hot swap even though it's an AHCI SATA card.  all things are possible;  I'll look more at coral tomorrow.

I got it to see the disk on coral;   the problem is not the backplane; coral is rebuilding.

unplanned reboot of birds.prgmr.com

disk issues.  And this is one of the ancient nvidia mcp55 servers where hot swap doesn't work with the kernel we use, so you will see a reboot.  Sorry.  We have some new stuff to move people onto, so if you want to be guinea pigs and are willing to re-IP (we don't have more servers at this data center), we can move you to a more modern server.  email support if you're interested.

Birds is back.   Disk is rebuilding.  expect degraded performance for the next day or so. 

crock soft lockup. again.

BUG: soft lockup detected on CPU#0!

Call Trace:
[Sun Oct  2 08:56:03 2011] <IRQ> [<ffffffff8025758a>] softlockup_tick+0xce/0xe0
 [<ffffffff8020df48>] timer_interrupt+0x3a0/0x3fa
 [<ffffffff80257874>] handle_IRQ_event+0x4e/0x96
 [<ffffffff80257960>] __do_IRQ+0xa4/0x105
 [<ffffffff8020bd5c>] do_IRQ+0x44/0x4d
 [<ffffffff8034c980>] evtchn_do_upcall+0x19e/0x250
 [<ffffffff80209d8e>] do_hypervisor_callback+0x1e/0x2c
 <EOI> [<ffffffff803581ea>] show_rd_sect+0x0/0x68
 [<ffffffff802ebbf9>] __read_lock_failed+0x5/0x14
 [<ffffffff80343f3e>] get_device+0x17/0x20
[Sun Oct  2 08:56:03 2011] [<ffffffff803fc3fd>] .text.lock.spinlock+0x53/0x8a
 [<ffffffff80358211>] show_rd_sect+0x27/0x68
 [<ffffffff802bc351>] sysfs_read_file+0xa5/0x12e
 [<ffffffff8027e3f5>] vfs_read+0xcb/0x171
 [<ffffffff8027e7d4>] sys_read+0x45/0x6e
 [<ffffffff802097b2>] tracesys+0xab/0xb5
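
for anyone who wants to pick a trace like this apart, here is a small sketch that pulls the symbol names out of the frame lines. the regex and the parse_frames helper are my own illustration, not a tool we run; the sample lines are copied from the trace above. the frames after <EOI> are what the CPU was doing when the timer interrupt fired, which here is a sysfs read of show_rd_sect sitting in the read-lock path.

```python
import re

# Matches frames like "[<ffffffff8025758a>] softlockup_tick+0xce/0xe0".
# The leading "[Sun Oct ...]" console timestamps do not match because
# they lack the "<" right after the bracket.
FRAME = re.compile(r"\[<([0-9a-f]+)>\]\s+(\S+?)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_frames(trace):
    """Return (symbol, offset, size) for every stack frame in the text."""
    frames = []
    for match in FRAME.finditer(trace):
        _addr, symbol, offset, size = match.groups()
        frames.append((symbol, int(offset, 16), int(size, 16)))
    return frames


SAMPLE = """\
[Sun Oct  2 08:56:03 2011] <IRQ> [<ffffffff8025758a>] softlockup_tick+0xce/0xe0
 <EOI> [<ffffffff803581ea>] show_rd_sect+0x0/0x68
 [<ffffffff802ebbf9>] __read_lock_failed+0x5/0x14
[Sun Oct  2 08:56:03 2011] [<ffffffff803fc3fd>] .text.lock.spinlock+0x53/0x8a
"""

if __name__ == "__main__":
    for symbol, offset, size in parse_frames(SAMPLE):
        print("%-24s +0x%x of 0x%x" % (symbol, offset, size))
```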

All users on crock will be getting an SLA credit this month.