5 Replies Latest reply on Dec 14, 2011 1:52 PM by cwyborny

    Kernel panic when removing a VLAN on an 82576 that is enslaved to a bond

    nalinaly

      Hi all,
      A kernel panic occurs when I perform the VLAN operations below.

       

      The steps to reproduce are:
      ifconfig eth2 up
      modprobe bonding
      modprobe 8021q
      ifconfig bond0 up
      ifenslave bond0 eth2
      vconfig add eth2 3300
      vconfig add bond0 33
      vconfig rem eth2.3300

       

      The panic stack trace is:
      [<ffffffffa002f1c9>] panic_event+0x49/0x70 [ipmi_msghandler]
      [<ffffffff80378917>] notifier_call_chain+0x37/0x70
      [<ffffffff80372122>] panic+0xa2/0x195
      [<ffffffff80376ed8>] oops_end+0xd8/0x140
      [<ffffffff8001bea7>] no_context+0xf7/0x280
      [<ffffffff8001c1a5>] __bad_area_nosemaphore+0x175/0x250
      [<ffffffff80376318>] page_fault+0x28/0x30
      [<ffffffffa039dabd>] igb_vlan_rx_kill_vid+0x4d/0x100 [igb]
      [<ffffffffa044045f>] bond_vlan_rx_kill_vid+0x9f/0x290 [bonding]
      [<ffffffffa047e636>] unregister_vlan_dev+0x136/0x180 [8021q]
      [<ffffffffa047ed20>] vlan_ioctl_handler+0x170/0x3f0 [8021q]
      [<ffffffff802c1d3f>] sock_ioctl+0x21f/0x280
      [<ffffffff800e6d7f>] vfs_ioctl+0x2f/0xb0
      [<ffffffff800e726b>] do_vfs_ioctl+0x3cb/0x5a0
      [<ffffffff800e74e1>] sys_ioctl+0xa1/0xb0
      [<ffffffff80007388>] system_call_fastpath+0x16/0x1b
      [<00007f108a2b8bd7>] 0x7f108a2b8bd7

       

      The NIC is:
      [root@localhost ~]# ethtool -i eth2
      driver: igb
      version: 3.0.6-k2
      firmware-version: 1.2-1
      bus-info: 0000:04:00.0

       

      # lspci | grep Eth
      01:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)

       

      Kernel version:
      2.6.32.12-0.7; the panic also happens on 2.6.32-131

       

      I tried the same steps on other NICs, such as tg3 and bnx2, and they do not panic.
      The difference I found is that igb implements two more netdev_ops (ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid) than tg3 and bnx2.
      I think the panic happens because the correct vlgrp is not found when ndo_vlan_rx_kill_vid is called.
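To make the suspicion above concrete, here is a minimal user-space sketch of the kind of lookup that could fault if the group (or the slot table covering the VLAN ID) is missing when an entry is cleared. It is not the igb, bonding, or 8021q code: every name (`model_vlan_group`, `model_group_set_device`, `MODEL_PART_LEN`, and so on) is a hypothetical stand-in, and the lazily allocated "parts" layout is an assumption made for illustration only. The guarded setter returns an error instead of dereferencing a NULL pointer.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; these are NOT the kernel definitions. */
#define MODEL_VLAN_IDS 4096
#define MODEL_PART_LEN 8
#define MODEL_PARTS    (MODEL_VLAN_IDS / MODEL_PART_LEN)

struct model_dev {
    int ifindex;
};

struct model_vlan_group {
    /* Assumption: a part exists only once a VLAN ID in its range was added. */
    struct model_dev **parts[MODEL_PARTS];
};

/* Allocate the part covering vid. Returns 0 on success, -1 on failure. */
static int model_group_prealloc(struct model_vlan_group *grp, unsigned vid)
{
    unsigned idx;

    if (grp == NULL || vid >= MODEL_VLAN_IDS)
        return -1;

    idx = vid / MODEL_PART_LEN;
    if (grp->parts[idx] == NULL) {
        grp->parts[idx] = calloc(MODEL_PART_LEN, sizeof(struct model_dev *));
        if (grp->parts[idx] == NULL)
            return -1;
    }
    return 0;
}

/*
 * Guarded setter: stores dev in the slot for vid.
 * Returns 0 if the slot was written, -1 if the group or the part
 * covering vid is missing (the unguarded version would fault here).
 */
static int model_group_set_device(struct model_vlan_group *grp, unsigned vid,
                                  struct model_dev *dev)
{
    struct model_dev **part;

    if (grp == NULL || vid >= MODEL_VLAN_IDS)
        return -1;

    part = grp->parts[vid / MODEL_PART_LEN];
    if (part == NULL)
        return -1;

    part[vid % MODEL_PART_LEN] = dev;
    return 0;
}
```

In this model, a group that only ever had VLAN 33 added has no part for VLAN 3300, so clearing 3300 through that group is rejected rather than written through a NULL pointer. Whether the real drivers behave this way on these kernel versions would need to be checked against the actual source.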

       

      My question is: what is the purpose of "vlan_group_set_device(adapter->vlgrp, vid, NULL);" in igb_vlan_rx_kill_vid?

      unregister_vlan_dev (vlan.c) already does the same thing after ops->ndo_vlan_rx_kill_vid is called.

      If I delete that line, would it cause any problems?