3 Replies Latest reply on Apr 21, 2016 2:43 AM by Sandy_Intel

    ixl problems on FreeBSD (XL710)

    jailbird

      I'm having some strange issues with ixl(4) and an X710-DA4 card in a new-ish Intel-based server.  I'm pretty much replicating an existing setup from an older AMD machine that used 2 x X520-DA2 cards and ixgbe(4).  This is all on -CURRENT.

       

      It's meant to be a bhyve server, so the 4x10GE ports are put into a LACP-based lagg(4), then vlan(4) interfaces are bound to the lagg, and then if_bridge(4) interfaces are created to bind the vlan and tap interfaces together.
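
      For anyone who wants to reproduce the layering by hand rather than through rc.conf, the stack described above can be sketched as plain ifconfig(8) commands. This is only an illustrative sketch: the run and build_stack helpers and the RUN variable are made up for this example, the interface names and VLAN IDs are taken from the configs in this post, and the offload flags and the inet address on bridge0 are left out. By default it only prints the commands; RUN=1 would execute them (as root, on the FreeBSD host).

```shell
#!/bin/sh
# Hypothetical helper: print each command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Hypothetical helper: build lagg -> vlan -> bridge (+ tap) layer by layer.
build_stack() {
    ports="ixl0 ixl1 ixl2 ixl3"

    # Layer 1: LACP lagg(4) over the four 10GE ports.
    run ifconfig lagg0 create
    run ifconfig lagg0 laggproto lacp
    for p in $ports; do
        run ifconfig "$p" up
        run ifconfig lagg0 laggport "$p"
    done
    run ifconfig lagg0 up

    # Layers 2 and 3: one vlan(4) per VLAN ID on top of the lagg, then one
    # if_bridge(4) per VLAN joining the vlan interface and a tap interface.
    # VLAN 1 pairs with tap0/bridge0, VLAN 2 with tap1/bridge1.
    for id in 1 2; do
        n=$((id - 1))
        run ifconfig "vlan$id" create vlan "$id" vlandev lagg0
        run ifconfig "tap$n" create
        run ifconfig "bridge$n" create
        run ifconfig "bridge$n" addm "vlan$id" addm "tap$n" up
    done
}

build_stack
```

      Running the layers one at a time like this makes it easier to see at which step connectivity starts to misbehave.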

       

      The X710-DA4 is running the latest NVM from Intel (5.02):

       

       

      dev.ixl.3.fw_version: nvm 5.02 etid 80002284 oem 0.0.0

       

      dev.ixl.2.fw_version: nvm 5.02 etid 80002284 oem 0.0.0

      dev.ixl.1.fw_version: nvm 5.02 etid 80002284 oem 0.0.0

      dev.ixl.0.fw_version: nvm 5.02 etid 80002284 oem 0.0.0

       

      I've tried both the ixl driver that comes with -CURRENT (1.4.3?) and the 1.4.27 driver from Intel, and I see the same problem with both.  The exact problem is this (sorry it's taken me so long to get to it!):

       

      With just one interface, one interface + VLANs, the lagg without VLANs, etc., everything works perfectly fine.  As soon as I combine lagg+vlan+bridge, all hell breaks loose.  One machine can ping one alias on the server but not the other, while other machines can ping both.  The server itself can't ping the DNS server or the default route, but can ping things through the default route, etc.  The behavior is very unpredictable.  ssh can take a few attempts to get in, and then once in, "svn update" will work for a few seconds and then bomb out, etc.  This same config (except using a normal lagg instead of LACP) seems to work on ESXi, so it looks like a driver issue.

       

      Here is the working config from the X520-DA2 system:

       

       

      ifconfig_ix0="-lro -tso -txcsum up"

       

      ifconfig_ix1="-lro -tso -txcsum up"

      ifconfig_ix2="-lro -tso -txcsum up"

      ifconfig_ix3="-lro -tso -txcsum up"

      cloned_interfaces="lagg0 tap0 tap1 bridge0 bridge1 vlan1 vlan2"

      ifconfig_lagg0="laggproto lacp laggport ix0 laggport ix1 laggport ix2 laggport ix3"

      ifconfig_vlan1="vlan 1 vlandev lagg0"

      ifconfig_vlan2="vlan 2 vlandev lagg0"

      ifconfig_bridge0="inet 192.168.1.100/24 addm vlan1 addm tap0"

      ifconfig_bridge1="addm vlan2 addm tap1"

      defaultrouter="192.168.1.1"

       

      Here is the "broken" config from the X710-DA4 system:

       

       

      ifconfig_ixl0="-rxcsum -txcsum -lro -tso -vlanmtu -vlanhwtag -vlanhwfilter -vlanhwtso -vlanhwcsum up"

       

      ifconfig_ixl1="-rxcsum -txcsum -lro -tso -vlanmtu -vlanhwtag -vlanhwfilter -vlanhwtso -vlanhwcsum up"

      ifconfig_ixl2="-rxcsum -txcsum -lro -tso -vlanmtu -vlanhwtag -vlanhwfilter -vlanhwtso -vlanhwcsum up"

      ifconfig_ixl3="-rxcsum -txcsum -lro -tso -vlanmtu -vlanhwtag -vlanhwfilter -vlanhwtso -vlanhwcsum up"

      cloned_interfaces="lagg0 tap0 tap1 bridge0 bridge1 vlan1 vlan2"

      ifconfig_lagg0="laggproto lacp laggport ixl0 laggport ixl1 laggport ixl2 laggport ixl3"

      ifconfig_vlan1="vlan 1 vlandev lagg0"

      ifconfig_vlan2="vlan 2 vlandev lagg0"

      ifconfig_bridge0="inet 192.168.1.101/24 addm vlan1 addm tap0"

      ifconfig_bridge1="addm vlan2 addm tap1"

      defaultrouter="192.168.1.1"

       

      I've tried changing the various flags in the ifconfig_ixl# lines without any obvious difference.  Both machines are connected to the same HPe 5820X switch with the exact same config, so I don't believe it's a switch issue.

       

      Any ideas? Has anybody seen something like this before?