Taft rebooted itself

Update: When it came back up, I tried to finish the update and it rebooted again. To get the kernel update that should fix the problem, I booted into the non-Xen kernel and finished the update there. I then rebooted the host into the Xen kernel, and it hung during the reboot. Luke is on the way to the datacenter to look at it.

Taft had a kernel panic and rebooted itself while I was installing updates with yum. Guests are coming back up now.

PCI-DMA: Out of SW-IOMMU space for 4096 bytes at device 0000:04:00.0
mpt2sas0: chain buffers not available
PCI-DMA: Out of SW-IOMMU space for 4096 bytes at device 0000:04:00.0
mpt2sas0: chain buffers not available
Feb 14 18:05:34 taft kernel: PCI-DMA: Out of SW-IOMMU space for 4096 bytes at device 0000:04:00.0
Feb 14 18:05:34 taft kernel: mpt2sas0: chain buffers not available
Feb 14 18:05:34 taft kernel: PCI-DMA: Out of SW-IOMMU space for 4096 bytes at device 0000:04:00.0
[Thu Feb 14 18:05:34 2013]Feb 14 18:05:34 taft kernel: mpt2sas0: chain buffers not available
PCI-DMA: Out of SW-IOMMU space for 4096 bytes at device 0000:04:00.0
Unable to handle kernel paging request at ffff88002fc89010 RIP:
 [<ffffffff880e66ee>] :mpt2sas:_scsih_qcmd+0x476/0x6e4
PGD 13eb067 PUD 13ec067 PMD 156b067 PTE 0
Oops: 0000 [1] SMP
last sysfs file: /devices/system/cpu/cpu0/topology/physical_package_id
CPU 0
Modules linked in: xt_physdev ipt_MASQUERADE netloop netbk iptable_nat blktap ip_nat blkbk bridge lockd sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i libcxgbi cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi loop dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi ac parport_pc lp parport joydev sg pcspkr e1000e i2c_i801 i2c_core serio_raw tpm_tis serial_core tpm tpm_bios dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ahci libata raid10 shpchp mpt2sas scsi_transport_sas sd_mod scsi_mod raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 3, comm: ksoftirqd/0 Not tainted 2.6.18-308.4.1.el5xen #1
[Thu Feb 14 18:05:50 2013]RIP: e030:[<ffffffff880e66ee>]  [<ffffffff880e66ee>] :mpt2sas:_scsih_qcmd+0x476/0x6e4
RSP: e02b:ffffffff8079bd38  EFLAGS: 00010002
RAX: ffff88003e9004f8 RBX: 0000000000000009 RCX: ffffffff880d7057
RDX: ffff88003e9004f8 RSI: 0000000014000030 RDI: ffff88003e009da8
RBP: ffff88002fc89000 R08: ffff880006624000 R09: 0000000000000000
R10: 00000008e8ccc000 R11: 0000000000000000 R12: ffff88000c676e00
R13: ffff88003e8f2978 R14: ffff88003e009db0 R15: 00000000fffa74fa
FS:  00002af38b4a0170(0000) GS:ffffffff80634000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000
Process ksoftirqd/0 (pid: 3, threadinfo ffff880006728000, task ffff88000670b7e0)
[Thu Feb 14 18:05:50 2013]Stack:  ffff88003e233c00  000007322f173fc0  ffff88003e9b9940  ffff88003e9004f8
 ffff88003e9b9900  d50000008023f468  fffffff494000000  0000000300000732
 140000000000000f  ffffffff8021d1d6
Call Trace:
 <IRQ>  [<ffffffff8021d1d6>] __mod_timer+0xff/0x10e
 [<ffffffff88084db2>] :scsi_mod:scsi_dispatch_cmd+0x2ac/0x366
 [<ffffffff8808a4de>] :scsi_mod:scsi_request_fn+0x2c7/0x39e
 [<ffffffff8025e5ba>] blk_run_queue+0x41/0x73
 [<ffffffff880893b5>] :scsi_mod:scsi_next_command+0x2d/0x39
 [<ffffffff88089536>] :scsi_mod:scsi_end_request+0xbf/0xcd
[Thu Feb 14 18:05:50 2013] [<ffffffff880896b4>] :scsi_mod:scsi_io_completion+0x170/0x329
 [<ffffffff880b67ce>] :sd_mod:sd_rw_intr+0x21e/0x258
 [<ffffffff8808992e>] :scsi_mod:scsi_device_unbusy+0x67/0x81
 [<ffffffff80238f47>] blk_done_softirq+0x67/0x75
 [<ffffffff80212eb8>] __do_softirq+0x8d/0x13b
 [<ffffffff8025fda4>] call_softirq+0x1c/0x278
 <EOI>  [<ffffffff8029132c>] ksoftirqd+0x0/0xbf
 [<ffffffff8026db89>] do_softirq+0x31/0x90
 [<ffffffff8029138b>] ksoftirqd+0x5f/0xbf
 [<ffffffff802338c6>] kthread+0xfe/0x132
[Thu Feb 14 18:05:50 2013] [<ffffffff8025fb2c>] child_rip+0xa/0x12
 [<ffffffff802337c8>] kthread+0x0/0x132
 [<ffffffff8025fb22>] child_rip+0x0/0x12


Code: 48 8b 55 10 48 8b 88 c0 03 00 00 75 06 8b 74 24 30 eb 04 8b
RIP  [<ffffffff880e66ee>] :mpt2sas:_scsih_qcmd+0x476/0x6e4
 RSP <ffffffff8079bd38>
CR2: ffff88002fc89010
 <0>Kernel panic - not syncing: Fatal exception
[Thu Feb 14 18:05:51 2013] (XEN) Domain 0 crashed: rebooting machine in 5 seconds.
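
For anyone who wants to pull the faulting function out of a console capture like the one above, here is a rough Python sketch. It is not something we ran during the incident. The helper names (`clean_line`, `faulting_frame`) are made up for illustration, and the regular expressions assume the bracketed console-logger timestamps and the `[<address>] :module:function+offset/size` frame layout seen in this particular log.

```python
import re

# Console-logger timestamp such as "[Thu Feb 14 18:05:50 2013]" (assumed format,
# based on the prefixes that appear in the log above).
CONSOLE_TS = re.compile(
    r"\[[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}\]"
)

# Stack frame such as "[<ffffffff880e66ee>] :mpt2sas:_scsih_qcmd+0x476/0x6e4".
# The ":module:" part is optional because core-kernel frames do not have one.
FRAME = re.compile(
    r"\[<([0-9a-f]{16})>\]\s+(?::(\w+):)?([\w.]+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)"
)


def clean_line(line):
    """Strip console timestamps and carriage-return debris from one log line."""
    line = CONSOLE_TS.sub("", line)
    return line.replace("^M", "").rstrip("\r\n")


def faulting_frame(log_text):
    """Return (module, function, offset) from the first RIP line that carries
    a symbolized frame, or None if no such line is found."""
    for raw in log_text.splitlines():
        line = clean_line(raw)
        if "RIP" in line:
            match = FRAME.search(line)
            if match:
                return match.group(2), match.group(3), match.group(4)
    return None
```

Fed the log above, `faulting_frame` should skip the bare "RIP:" at the end of the paging-request line and report the `mpt2sas` frame from the next RIP line. The module element of the result is `None` for frames that live in the core kernel.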


About this Entry

This page contains a single entry by Nicholas Bebout published on February 14, 2013 6:18 PM.
