2 Replies Latest reply on Nov 2, 2016 7:51 AM by Smilery

    X520 DP SFP+ causing network loop?

    chertel

      Hi,

       

      Yesterday one of our VMware ESXi 5.5 servers (Supermicro with an X520 DP SFP+ NIC) crashed with a PSOD (Purple Screen Of Death).

       

      At the exact moment the PSOD occurred, the whole VMware cluster lost network connectivity. Both 10G switches, to which the two ports of that server's NIC are connected, ran into serious spanning tree trouble: both considered themselves the spanning tree root bridge, network ports were flapping, there were broadcast storms, and so on.

       

      This outage lasted *exactly* until the moment when we pushed the reset button on that server.

      We already saw the same behavior in November 2015 on the same server, with the same bad consequences, and also resolved it by resetting the server.

       

      I know this sounds really weird, but the only explanation that seems reasonable to us is that, after ESXi crashed with the PSOD, the X520 NIC somehow switched into a kind of "bridge all traffic between the two ports" mode, causing a network bridging loop.
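
      To make the hypothesis concrete, here is a toy sketch in Python. It is not a model of the X520 or of our switches. It assumes (this is my assumption, not something measured) that the two 10G switches are linked to each other, that spanning tree is not blocking any port, and that every bridging node floods a broadcast frame out of all ports except the one it arrived on. The names `swA`, `swB`, `nic` and `frames_in_flight` are made up for illustration.

```python
from collections import defaultdict


def frames_in_flight(nic_bridges, hops=10):
    """Count copies of one broadcast frame still circulating after `hops` steps.

    Hypothetical topology: swA <-> swB, swA <-> nic, swB <-> nic.
    No spanning tree is modeled; every bridge simply floods.
    """
    links = [("swA", "swB"), ("swA", "nic"), ("swB", "nic")]
    neighbors = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # A host attached to swA sends one broadcast; swA floods it to its neighbors.
    in_flight = [("swA", n) for n in sorted(neighbors["swA"])]

    for _ in range(hops):
        next_round = []
        for src, dst in in_flight:
            if dst == "nic" and not nic_bridges:
                # Normal NIC: the frame is handed to the host and not forwarded.
                continue
            # Bridging node: flood out of every port except the ingress port.
            next_round.extend((dst, n) for n in sorted(neighbors[dst]) if n != src)
        in_flight = next_round

    return len(in_flight)


if __name__ == "__main__":
    print("normal NIC:  ", frames_in_flight(nic_bridges=False))
    print("bridging NIC:", frames_in_flight(nic_bridges=True))
```

      In this toy model the frame dies out after a few hops with a normal NIC, while with a bridging NIC copies keep circulating in both directions around the triangle for as long as the simulation runs. That would at least be consistent with a storm that stops the moment the server is reset.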

       

      Has anyone ever heard of such behavior, or can anybody at least imagine how this could have happened?

       

      I think it would be possible to manually and intentionally achieve that behavior by directly configuring the network card, but could it happen accidentally?

       

      Please let me know your thoughts about it.

       

      Best regards,

      Christian Hertel

       

       

       

      ----- Some additional NIC information ------

       

      ~ # esxcfg-nics -l

      Name    PCI           Driver      Link Speed     Duplex MAC Address       MTU    Description

      vmnic0  0000:02:00.00 igb         Down 0Mbps     Half   00:25:90:a4:28:56 1500   Intel Corporation 82576 Gigabit Network Connection

      vmnic1  0000:02:00.01 igb         Down 0Mbps     Half   00:25:90:a4:28:57 1500   Intel Corporation 82576 Gigabit Network Connection

      vmnic2  0000:04:00.00 ixgbe       Up   10000Mbps Full   90:e2:ba:3a:04:2c 9000   Intel Corporation 82599 10 Gigabit Dual Port Network Connection

      vmnic3  0000:04:00.01 ixgbe       Up   10000Mbps Full   90:e2:ba:3a:04:2d 9000   Intel Corporation 82599 10 Gigabit Dual Port Network Connection

       

      ~ # ethtool vmnic2

      Settings for vmnic2:

        Supported ports: [ FIBRE ]

        Supported link modes:   1000baseT/Full

        Supports auto-negotiation: Yes

        Advertised link modes:  1000baseT/Full

        Advertised auto-negotiation: Yes

        Speed: Unknown! (10000)

        Duplex: Full

        Port: FIBRE

        PHYAD: 0

        Transceiver: external

        Auto-negotiation: on

        Supports Wake-on: d

        Wake-on: d

        Current message level: 0x00000007 (7)

        Link detected: yes

       

      ~ # ethtool -i vmnic2

      driver: ixgbe

      version: 3.21.4iov

      firmware-version: 0x61c10001

      bus-info: 0000:04:00.0

       

      ~ # ethtool -k vmnic2

      Offload parameters for vmnic2:

      Cannot get device udp large send offload settings: Function not implemented

      Cannot get device generic segmentation offload settings: Function not implemented

      rx-checksumming: on

      tx-checksumming: on

      scatter-gather: on

      tcp segmentation offload: on

      udp fragmentation offload: off

      generic segmentation offload: off
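
      In case it helps anyone checking their own hosts: a small Python sketch that groups the uplinks from an `esxcfg-nics -l` listing by PCI device, so that ports sitting on the same physical card are easy to spot. The sample text is copied from the listing above with the description column shortened. The function name `ports_per_device` and the assumption that the PCI function number follows the last dot are mine.

```python
from collections import defaultdict

# Rows taken from the `esxcfg-nics -l` output above (descriptions shortened).
LISTING = """\
Name    PCI           Driver      Link Speed     Duplex MAC Address       MTU    Description
vmnic0  0000:02:00.00 igb         Down 0Mbps     Half   00:25:90:a4:28:56 1500   Intel 82576
vmnic1  0000:02:00.01 igb         Down 0Mbps     Half   00:25:90:a4:28:57 1500   Intel 82576
vmnic2  0000:04:00.00 ixgbe       Up   10000Mbps Full   90:e2:ba:3a:04:2c 9000   Intel 82599
vmnic3  0000:04:00.01 ixgbe       Up   10000Mbps Full   90:e2:ba:3a:04:2d 9000   Intel 82599
"""


def ports_per_device(listing, only_up=True):
    """Map each PCI device (address without function number) to its vmnic names."""
    devices = defaultdict(list)
    for line in listing.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a vmnic line.
        if len(fields) < 8 or not fields[0].startswith("vmnic"):
            continue
        name, pci, link = fields[0], fields[1], fields[3]
        if only_up and link != "Up":
            continue
        # "0000:04:00.01" -> "0000:04:00": drop the PCI function suffix.
        devices[pci.rsplit(".", 1)[0]].append(name)
    return dict(devices)


if __name__ == "__main__":
    for device, ports in ports_per_device(LISTING).items():
        print(device, ports)
```

      For the listing above, the only device with active links is 0000:04:00, carrying vmnic2 and vmnic3, which are the two ports I suspect of being bridged.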