1 Reply Latest reply on Jan 27, 2014 11:41 AM by Patrick_Kutch

    ixgbe: reply to IXGBE_VF_RESET is lost, because of ixgbe_ping_all_vfs()

    alex@zadarastorage.com

      Greetings all,

      We are seeing the following issue:

       

      A VF sends IXGBE_VF_RESET to its PF. This message is successfully accepted by the PF and handled in ixgbe_vf_reset_msg():

          e_info(probe, "VF Reset msg received from vf %d\n", vf);

          ...

          /* reply to reset with ack and vf mac address */

          msgbuf[0] = IXGBE_VF_RESET | IXGBE_VT_MSGTYPE_ACK;

          ...

          ixgbe_write_mbx(hw, msgbuf, IXGBE_VF_PERMADDR_MSG_LEN, vf);

          return 0;

       

      However, immediately after that, the PF detects that its link has come up, and ixgbe_watchdog_link_is_up() does:

          ...

          e_info(drv, "NIC Link is Up %s, Flow Control: %s\n",

                 (link_speed == IXGBE_LINK_SPEED_10GB_FULL ?

                 "10 Gbps" :

                 (link_speed == IXGBE_LINK_SPEED_1GB_FULL ?

                 "1 Gbps" :

                 (link_speed == IXGBE_LINK_SPEED_100_FULL ?

                 "100 Mbps" :

                 "unknown speed"))),

                 ((flow_rx && flow_tx) ? "RX/TX" :

                 (flow_rx ? "RX" :

                 (flow_tx ? "TX" : "None"))));

          ...

          /* ping all the active vfs to let them know link has changed */

          ixgbe_ping_all_vfs(adapter);

       

      and ixgbe_ping_all_vfs() does the following for each VF:

              ping = IXGBE_PF_CONTROL_MSG;

              if (adapter->vfinfo[i].clear_to_send)

                  ping |= IXGBE_VT_MSGTYPE_CTS;

              ixgbe_write_mbx(hw, &ping, 1, i);

       

      So the VF receives a reply, which is:

      IXGBE_PF_CONTROL_MSG|IXGBE_VT_MSGTYPE_CTS = 0x20000100

      and not the expected

      IXGBE_VF_RESET|IXGBE_VT_MSGTYPE_ACK = 0x80000001
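      To make the bit arithmetic concrete, here is a minimal sketch of how the two words are composed and how a VF-side check would tell them apart. The constant values are assumed from the ixgbe mailbox header (they are consistent with the two words quoted above), and the model_* helpers are hypothetical names used only for illustration, not driver functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; they reproduce the two words quoted in the post. */
#define IXGBE_VT_MSGTYPE_ACK 0x80000000u
#define IXGBE_VT_MSGTYPE_CTS 0x20000000u
#define IXGBE_VF_RESET       0x01u
#define IXGBE_PF_CONTROL_MSG 0x0100u

/* Compile-time check that the constants combine to the logged words. */
static_assert((IXGBE_PF_CONTROL_MSG | IXGBE_VT_MSGTYPE_CTS) == 0x20000100u,
              "ping word differs from the value in the log");
static_assert((IXGBE_VF_RESET | IXGBE_VT_MSGTYPE_ACK) == 0x80000001u,
              "reset ACK word differs from the value in the log");

/* Hypothetical helper: is word 0 exactly the reset ACK the VF waits for? */
static int model_is_reset_ack(uint32_t word0)
{
    return word0 == (IXGBE_VF_RESET | IXGBE_VT_MSGTYPE_ACK);
}

/* Hypothetical helper: does word 0 carry the PF control (ping) bit? */
static int model_is_pf_ping(uint32_t word0)
{
    return (word0 & IXGBE_PF_CONTROL_MSG) != 0;
}
```

      With these helpers, 0x80000001 is recognized as the reset ACK, while 0x20000100 is recognized as a ping and fails the reset-ACK comparison.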

       

      The ping appears to overwrite the reset reply in the mailbox before the VF has read it, so the VF thinks that the PF is "still resetting".
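      The sequence described above can be modeled in a few lines. This is only a sketch under the assumption that each VF mailbox is a single buffer with no queueing, so a second PF write replaces an unread first one. The struct and the model_* functions are hypothetical and do not exist in ixgbe; the message constants are assumed values that match the words in the log.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed values; they reproduce the two words quoted in the post. */
#define IXGBE_VT_MSGTYPE_ACK 0x80000000u
#define IXGBE_VT_MSGTYPE_CTS 0x20000000u
#define IXGBE_VF_RESET       0x01u
#define IXGBE_PF_CONTROL_MSG 0x0100u

#define MODEL_MBX_WORDS 16  /* hypothetical buffer size for this model */

/* Hypothetical model of one VF mailbox: a single buffer, no queue. */
struct model_mbx {
    uint32_t buf[MODEL_MBX_WORDS];
    int pending;           /* 1 while a message waits for the VF */
    unsigned overwritten;  /* unread messages replaced by a later write */
};

/* PF side: write a message, replacing whatever was there. */
static void model_pf_write(struct model_mbx *m, const uint32_t *msg, size_t len)
{
    assert(len <= MODEL_MBX_WORDS);
    if (m->pending)
        m->overwritten++;  /* the previous message was never read */
    memset(m->buf, 0, sizeof(m->buf));
    memcpy(m->buf, msg, len * sizeof(*msg));
    m->pending = 1;
}

/* VF side: consume the message and return its first word. */
static uint32_t model_vf_read_word0(struct model_mbx *m)
{
    m->pending = 0;
    return m->buf[0];
}

/* Reset reply followed by a link-up ping, then a late VF read. */
static uint32_t model_reset_then_ping(void)
{
    struct model_mbx m = {0};
    /* Word 0 is the ACK; the remaining words stand in for the MAC data. */
    uint32_t reply[3] = { IXGBE_VF_RESET | IXGBE_VT_MSGTYPE_ACK, 0, 0 };
    uint32_t ping = IXGBE_PF_CONTROL_MSG | IXGBE_VT_MSGTYPE_CTS;

    model_pf_write(&m, reply, 3);   /* as in ixgbe_vf_reset_msg() */
    model_pf_write(&m, &ping, 1);   /* as in ixgbe_ping_all_vfs() */
    return model_vf_read_word0(&m); /* the VF reads only now */
}
```

      In this model the VF reads 0x20000100, the same word that shows up in the ixgbevf debug print, because the ping lands before the VF has consumed the reset reply.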

       

      In this particular case, we were trying to add the VF to the bond, and this operation failed due to the above problem.

       

      How can this issue be addressed?

       

      Driver versions:

       

      ixgbe: 3.11.33

       

      ixgbevf: the version that ships with Ubuntu Precise 3.2.0-29.46, plus the patch "[net] ixgbevf: fix VF untagging when 802.1 prio is set" from here:

      http://permalink.gmane.org/gmane.linux.network/237391

       

      Thanks!

      Alex.

       

      Below is the relevant part of the kernel log from ixgbe and ixgbevf (with added debug prints):

      Jan 26 10:38:09 01-04 kernel: [   11.541962] bonding: febond: Adding slave fe10G2.

      Jan 26 10:38:09 01-04 kernel: [   11.542058] ixgbevf_open: === hw->adapter_stopped

      Jan 26 10:38:09 01-04 kernel: [   11.542059] ixgbevf_reset: === ENTER

      Jan 26 10:38:09 01-04 kernel: [   11.542060] ixgbevf_reset_hw_vf: === ENTER

      Jan 26 10:38:09 01-04 kernel: [   11.542130] ixgbe 0000:03:00.1: eth10G2: VF Reset msg received from vf 0

      Jan 26 10:38:09 01-04 kernel: [   11.542374] Send VF Reset msg REPLY to vf 0

      Jan 26 10:38:09 01-04 kernel: [   11.542428] ixgbe 0000:03:00.1: eth10G2: NIC Link is Up 10 Gbps, Flow Control: RX/TX

      Jan 26 10:38:09 01-04 kernel: [   11.542433] ixgbe_ping_all_vfs: ping VF 0

      Jan 26 10:38:09 01-04 kernel: [   11.542435] ixgbe_ping_all_vfs: ping VF 1

      Jan 26 10:38:09 01-04 kernel: [   11.542437] ixgbe_ping_all_vfs: ping VF 2

      Jan 26 10:38:09 01-04 kernel: [   11.542443] ixgbe_ping_all_vfs: ping VF 3

      Jan 26 10:38:09 01-04 kernel: [   11.542445] ixgbe_ping_all_vfs: ping VF 4

      Jan 26 10:38:09 01-04 kernel: [   11.542447] ixgbe_ping_all_vfs: ping VF 5

      Jan 26 10:38:09 01-04 kernel: [   11.542449] ixgbe_ping_all_vfs: ping VF 6

      Jan 26 10:38:09 01-04 kernel: [   11.542452] ixgbe_ping_all_vfs: ping VF 7

      Jan 26 10:38:09 01-04 kernel: [   11.542454] ixgbe_ping_all_vfs: ping VF 8

      Jan 26 10:38:09 01-04 kernel: [   11.542456] ixgbe_ping_all_vfs: ping VF 9

      Jan 26 10:38:09 01-04 kernel: [   11.542459] ixgbe_ping_all_vfs: ping VF 10

      Jan 26 10:38:09 01-04 kernel: [   11.542461] ixgbe_ping_all_vfs: ping VF 11

      Jan 26 10:38:09 01-04 kernel: [   11.542463] ixgbe_ping_all_vfs: ping VF 12

      Jan 26 10:38:09 01-04 kernel: [   11.542465] ixgbe_ping_all_vfs: ping VF 13

      Jan 26 10:38:09 01-04 kernel: [   11.542468] ixgbe_ping_all_vfs: ping VF 14

      Jan 26 10:38:09 01-04 kernel: [   11.542470] ixgbe_ping_all_vfs: ping VF 15

      Jan 26 10:38:09 01-04 kernel: [   11.542472] ixgbe_ping_all_vfs: ping VF 16

      Jan 26 10:38:09 01-04 kernel: [   11.542474] ixgbe_ping_all_vfs: ping VF 17

      Jan 26 10:38:09 01-04 kernel: [   11.542477] ixgbe_ping_all_vfs: ping VF 18

      Jan 26 10:38:09 01-04 kernel: [   11.542479] ixgbe_ping_all_vfs: ping VF 19

      Jan 26 10:38:09 01-04 kernel: [   11.542481] ixgbe_ping_all_vfs: ping VF 20

      Jan 26 10:38:09 01-04 kernel: [   11.542483] ixgbe_ping_all_vfs: ping VF 21

      Jan 26 10:38:09 01-04 kernel: [   11.542486] ixgbe_ping_all_vfs: ping VF 22

      Jan 26 10:38:09 01-04 kernel: [   11.542488] ixgbe_ping_all_vfs: ping VF 23

      Jan 26 10:38:09 01-04 kernel: [   11.542490] ixgbe_ping_all_vfs: ping VF 24

      Jan 26 10:38:09 01-04 kernel: [   11.542493] ixgbe_ping_all_vfs: ping VF 25

      Jan 26 10:38:09 01-04 kernel: [   11.542495] ixgbe_ping_all_vfs: ping VF 26

      Jan 26 10:38:09 01-04 kernel: [   11.542497] ixgbe_ping_all_vfs: ping VF 27

      Jan 26 10:38:09 01-04 kernel: [   11.542499] ixgbe_ping_all_vfs: ping VF 28

      Jan 26 10:38:09 01-04 kernel: [   11.542502] ixgbe_ping_all_vfs: ping VF 29

      Jan 26 10:38:09 01-04 kernel: [   11.542504] ixgbe_ping_all_vfs: ping VF 30

      Jan 26 10:38:09 01-04 kernel: [   11.542506] ixgbe_ping_all_vfs: ping VF 31

      Jan 26 10:38:09 01-04 kernel: [   11.557228] ixgbevf_reset_hw_vf: === msgbuf[0](0x20000100)!=0x80000001

      Jan 26 10:38:09 01-04 kernel: [   11.557892] ixgbevf_reset: === PF still resetting!

      Jan 26 10:38:09 01-04 kernel: [   11.558385] ixgbevf_reset: === DONE

      Jan 26 10:38:09 01-04 kernel: [   11.558386] Unable to start - perhaps the PF Driver isn't up yet
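
      The timestamps in the log give a rough idea of the size of the window. The sketch below uses a hypothetical helper to convert the bracketed kernel timestamps to microseconds. It assumes that the added debug prints sit close to the corresponding mailbox accesses, which the log itself does not confirm.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: "[sec.usec]" kernel timestamp to microseconds. */
static int64_t model_ts_usec(int64_t sec, int64_t usec)
{
    assert(usec >= 0 && usec < 1000000);
    return sec * 1000000 + usec;
}

/* Gap between "Send VF Reset msg REPLY" and "ping VF 0". */
static int64_t model_reply_to_ping_usec(void)
{
    return model_ts_usec(11, 542433) - model_ts_usec(11, 542374);
}

/* Gap between "Send VF Reset msg REPLY" and the VF's msgbuf[0] check. */
static int64_t model_reply_to_vf_check_usec(void)
{
    return model_ts_usec(11, 557228) - model_ts_usec(11, 542374);
}
```

      By these numbers the ping to VF 0 is logged 59 microseconds after the reset reply, while the VF inspects msgbuf[0] 14854 microseconds (about 14.9 ms) after it, so the ping arrives well before the VF looks at the mailbox.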