the IPv6 crash. Again. boutros

so, there is a problem in CentOS6/xen: for some reason, IPv6 sometimes stops working. The bridge doesn't forward the multicast packets that the neighbour discovery protocol needs in order to work.

in the places where IPv4 would use broadcast (FF:FF:FF:FF:FF:FF) ethernet frames to map IPs to MAC addresses, IPv6 uses multicast ethernet frames (any destination with the low bit of the first octet set, but specifically 33:33:xx:xx:xx:xx for IPv6 neighbour discovery), which a bridge should treat the same way but apparently doesn't always.
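To make the address mapping concrete, here is a minimal Python sketch of how a neighbour solicitation ends up on a 33:33 ethernet address. It follows the usual scheme (solicited-node group ff02::1:ffXX:XXXX, then 33:33 plus the low 32 bits of the group); the function names and the 2001:db8:: example address are made up for illustration and are not anything from the box in question.

```python
import ipaddress


def solicited_node_multicast(addr):
    """Return the solicited-node multicast group for a unicast IPv6 address.

    The group is ff02::1:ff00:0 with the low 24 bits of the unicast
    address filled in.
    """
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


def multicast_mac(group):
    """Return the ethernet destination for an IPv6 multicast group.

    The MAC is 33:33 followed by the low 32 bits of the group address.
    """
    low32 = int(ipaddress.IPv6Address(group)) & 0xFFFFFFFF
    octets = b"\x33\x33" + low32.to_bytes(4, "big")
    return ":".join("%02x" % b for b in octets)


def is_multicast_mac(mac):
    """True if the group bit (low bit of the first octet) is set."""
    return int(mac.split(":")[0], 16) & 1 == 1


if __name__ == "__main__":
    group = solicited_node_multicast("2001:db8::1234:5678")
    print(group, multicast_mac(group))
```

The broadcast address FF:FF:FF:FF:FF:FF also has the group bit set, which is why a bridge ought to flood both kinds of frame the same way.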

so the idea is that when IPv6 doesn't work, you run brctl setportmcrouter xenbr0 [interface not seeing multicast] 2
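If you want to script that, a rough Python sketch might look like the following. It only wraps the brctl invocation quoted above; the helper names and the vif1.0 port name are hypothetical, and the assumption that the last argument takes 0, 1 or 2 (with 2 forcing the port to be treated as a multicast router port) is my reading of how the bridge code numbers its modes, not something verified against this kernel. It defaults to a dry run, given that the real command can take the box down.

```python
import subprocess

# Assumed set of accepted modes; the post only ever uses 2.
VALID_MODES = (0, 1, 2)


def setportmcrouter_argv(bridge, port, mode=2):
    """Build the argv for: brctl setportmcrouter <bridge> <port> <mode>."""
    if mode not in VALID_MODES:
        raise ValueError("mode must be one of %r" % (VALID_MODES,))
    return ["brctl", "setportmcrouter", bridge, port, str(mode)]


def apply_mcrouter(bridge, port, mode=2, dry_run=True):
    """Print the command, and run it only when dry_run is False."""
    argv = setportmcrouter_argv(bridge, port, mode)
    print(" ".join(argv))
    if not dry_run:
        # check=True raises CalledProcessError if brctl exits non-zero
        subprocess.run(argv, check=True)
    return argv


if __name__ == "__main__":
    # "vif1.0" is a hypothetical guest interface name
    apply_mcrouter("xenbr0", "vif1.0")
```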

running that brctl command usually fixes it. But sometimes it crashes the box, with the following backtrace:


BUG: soft lockup - CPU#0 stuck for 60s! [swapper:0]
CPU 0:
Modules linked in: xt_physdev ipt_MASQUERADE iptable_nat ip_nat netloop netbk blktap blkbk bridge lockd sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i libcxgbi cxgb3 libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi ac parport_pc lp parport joydev sg pcspkr i2c_i801 igb serio_raw serial_core i2c_core 8021q dca tpm_tis tpm tpm_bios dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ahci libata raid10 shpchp mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 0, comm: swapper Not tainted 2.6.18-308.20.1.el5xen #1
RIP: e030:[<ffffffff886df191>]  [<ffffffff886df191>] :bridge:br_dev_queue_push_xmit+0x1fc/0x200
RSP: e02b:ffffffff8079fcc0  EFLAGS: 00000206
RAX: 0000000000000001 RBX: ffff8800219e1280 RCX: 000000000000020b
RDX: 0000000000000000 RSI: ffff8800219e1280 RDI: 0000000000000000
RBP: ffffffff886df1e6 R08: 0000000000000000 R09: ffffffff886def95
R10: 0000000000000000 R11: 0000000000000001 R12: ffff88002931b000
R13: ffff88002931b220 R14: ffff880018803a80 R15: ffffffff886df1e6
FS:  00002aaf6e0c26e0(0000) GS:ffffffff80637000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000

Call Trace:
 <IRQ>  [<ffffffff886df1e4>] :bridge:br_forward_finish+0x4f/0x51
 [<ffffffff886df24f>] :bridge:__br_forward+0x69/0x9c
 [<ffffffff886ded7e>] :bridge:deliver_clone+0x36/0x3d
 [<ffffffff886deda9>] :bridge:maybe_deliver+0x24/0x35
 [<ffffffff886dee20>] :bridge:br_multicast_flood+0x66/0x106
 [<ffffffff886dfd41>] :bridge:br_handle_frame_finish+0x0/0x1d3
 [<ffffffff886dfe21>] :bridge:br_handle_frame_finish+0xe0/0x1d3
 [<ffffffff886e0099>] :bridge:br_handle_frame+0x185/0x1a4
 [<ffffffff8022143d>] netif_receive_skb+0x3a8/0x4c4
 [<ffffffff88286e60>] :igb:igb_poll+0x73e/0xb55
 [<ffffffff8020d00b>] net_rx_action+0xb4/0x1c6
 [<ffffffff80212f44>] __do_softirq+0x8d/0x13b
 [<ffffffff80260da4>] call_softirq+0x1c/0x278
 [<ffffffff8026eb89>] do_softirq+0x31/0x90
 [<ffffffff802608d6>] do_hypervisor_callback+0x1e/0x2c
 <EOI>  [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
 [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
 [<ffffffff8026ffc8>] raw_safe_halt+0x87/0xab
 [<ffffffff8026d573>] xen_idle+0x38/0x4a
 [<ffffffff8024ad6d>] cpu_idle+0x97/0xba
 [<ffffffff80762b11>] start_kernel+0x21f/0x224
 [<ffffffff807621e5>] _sinittext+0x1e5/0x1eb


About this Entry

This page contains a single entry by luke published on December 7, 2012 5:51 PM.
