BUG: soft lockup on dish, rebooted

BUG: soft lockup detected on CPU#0!

Call Trace:
  [] softlockup_tick+0xce/0xe0
 [] timer_interrupt+0x3a8/0x402
 [] handle_IRQ_event+0x4e/0x96
 [] __do_IRQ+0xa4/0x105
 [] do_IRQ+0x44/0x4d
 [] evtchn_do_upcall+0x19e/0x256
 [] do_hypervisor_callback+0x1e/0x2c
  [] show_rd_sect+0x0/0x68
 [] __read_lock_failed+0x8/0x14
 [] get_device+0x17/0x20
 [] .text.lock.spinlock+0x53/0x8a
 [] show_rd_sect+0x27/0x68
 [] sysfs_read_file+0xa5/0x12c
 [] vfs_read+0xcb/0x171
 [] sys_read+0x45/0x6e
 [] tracesys+0xab/0xb5

I will be tracking my debugging process here. (As of this moment, the server has been rebooted, and all domains should be back within 10 minutes or so.)

Everyone ought to be back up now; please complain to support@ if you still have issues.

Edit: we're now having an 'infinite retry' disk error:

SCSI device sda: drive cache: write back
ata1.00: limiting speed to UDMA/16
ata1.00: exception Emask 0x40 SAct 0x1 SErr 0x800 action 0x2
ata1.00: (irq_stat 0x40000008)
ata1.00: tag 0 cmd 0x60 Emask 0x41 stat 0x41 err 0x4 (internal error)
SCSI device sda: 1953525168 512-byte hdwr sectors (1000205 MB)
sda: Write Protect is off
SCSI device sda: drive cache: write back
ata1.00: limiting speed to PIO4
ata1.00: exception Emask 0x40 SAct 0x1 SErr 0x800 action 0x2
ata1.00: (irq_stat 0x40000008)
ata1.00: tag 0 cmd 0x60 Emask 0x41 stat 0x41 err 0x4 (internal error)
end_request: I/O error, dev sda, sector 603497953
SCSI device sda: 1953525168 512-byte hdwr sectors (1000205 MB)
sda: Write Protect is off
SCSI device sda: drive cache: write back
ata1.00: limiting speed to PIO3
ata1.00: exception Emask 0x40 SAct 0x0 SErr 0x800 action 0x2
ata1.00: (irq_stat 0x40000001)
ata1.00: tag 0 cmd 0x24 Emask 0x41 stat 0x41 err 0x4 (internal error)
SCSI device sda: 1953525168 512-byte hdwr sectors (1000205 MB)
sda: Write Protect is off
SCSI device sda: drive cache: write back
That is weird, as I'd bet money that's an 'enterprise grade' drive, which ought to fail outright rather than loop like that. I'm heading down now.
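For anyone wanting to spot this kind of retry loop without eyeballing the console, here is a minimal sketch in Python. It counts the libata exception lines per device and records each speed step-down. The sample text is a few lines copied from the log above; the function name `summarize` and the two regexes are illustration only, not a tool that was actually run on dish, and they assume the message format stays exactly as shown here.

```python
import re
from collections import Counter

# Sample lines copied from the log above. On a live box you would feed in
# the kernel log instead; this string is only a stand-in for illustration.
LOG = """\
ata1.00: limiting speed to UDMA/16
ata1.00: exception Emask 0x40 SAct 0x1 SErr 0x800 action 0x2
ata1.00: tag 0 cmd 0x60 Emask 0x41 stat 0x41 err 0x4 (internal error)
ata1.00: limiting speed to PIO4
ata1.00: exception Emask 0x40 SAct 0x1 SErr 0x800 action 0x2
end_request: I/O error, dev sda, sector 603497953
ata1.00: limiting speed to PIO3
ata1.00: exception Emask 0x40 SAct 0x0 SErr 0x800 action 0x2
"""

# Assumed message shapes, taken from the lines above.
EXCEPTION = re.compile(r"^(ata\d+\.\d+): exception Emask")
SPEED = re.compile(r"^(ata\d+\.\d+): limiting speed to (\S+)")


def summarize(text):
    """Return (exception count per device, list of speed steps per device)."""
    exceptions = Counter()
    speeds = {}
    for line in text.splitlines():
        m = EXCEPTION.match(line)
        if m:
            exceptions[m.group(1)] += 1
            continue
        m = SPEED.match(line)
        if m:
            speeds.setdefault(m.group(1), []).append(m.group(2))
    return exceptions, speeds


exceptions, speeds = summarize(LOG)
for dev, count in exceptions.items():
    steps = " -> ".join(speeds.get(dev, []))
    print(f"{dev}: {count} exceptions, speed steps: {steps}")
```

A device that keeps racking up exceptions while stepping down through transfer modes, as ata1.00 does here, is retrying rather than failing out.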

About this Entry

This page contains a single entry by luke published on August 1, 2010 2:49 PM.
