First, thanks for using Intel Ethernet. :-)
I asked around among our experts and came up with the following:
We believe this issue occurs because the load-balance bonding mode may send traffic with a different MAC address than the one presented by the VF driver. This triggers anti-spoofing, and the traffic is halted.
The best thing to do is to disable anti-spoofing for these VFs. We supplied a kernel patch which implements per-VF anti-spoofing control (enable/disable).
The ability to disable anti-spoofing on a per-VF basis was recently added in Linux kernel 3.2. It also requires the latest iproute2 package from kernel.org.
If you have the kernel and iproute2 support then the feature is also in our latest sourceforge drivers.
I don’t think the feature has been backported to any common distros at this time. It takes time for these types of features (that require kernel/driver/user changes) to filter out to distros.
If you have the correct iproute2 package, the command-line help will show it.
In any case, it is:
# ip link set <ethx> vf <n> spoofchk [on|off]
where <ethx> is the PF interface, <n> is the VF index, and [on|off] sets the spoof-check state.
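For example, assuming the PF shows up as eth2 and we want spoof checking off for VF 0 (the interface name and VF index here are just placeholders):

```shell
# Turn spoof checking off for VF 0 on PF eth2
# (requires a kernel and iproute2 with spoofchk support)
ip link set eth2 vf 0 spoofchk off

# 'ip link show' on the PF lists each VF; recent iproute2 versions
# include the spoof-checking state in that output
ip link show eth2
```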
Give that a try!
Thanks for your reply. We use the 2.6.38 kernel and don't intend to move up to 3.2 soon.
We temporarily downloaded the ixgbe-3.7.17 driver and manually set
adapter->antispoofing_enabled = false;
in ixgbe_main.c. We presume this disables anti-spoofing entirely (we tried this ixgbe patch primarily so we could keep working with the current ip tools).
We are running into some very strange issues with this config. Let me explain the configuration we are trying to achieve; we need your input on what we are missing:
# We have multiple physical servers, and each runs two SR-IOV NICs connected to a switch (possibly two separate switches).
# We bond the two SR-IOV PFs on each physical server. This bond carries an IP through which the server can be reached. We tried active load-balancing as well as active-backup mode for this bond.
# We have multiple VMs running on these physical servers. We expose one SR-IOV VF from each PF into a VM, so a VM has 2 VFs. Within the VM we bond the 2 VFs (across the two different PFs) in active-backup mode. This bond carries an IP that represents the VM.
As you can see, the above config aims to get the performance of SR-IOV VFs for the VM while at the same time providing redundancy through bonding.
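For concreteness, a host-side sketch of the setup above using the sysfs bonding interface (interface names, the placeholder IP, and the max_vfs value are assumptions, and exact steps vary by distro and driver version):

```shell
# Enable VFs on each ixgbe PF (module parameter; kernel/driver dependent)
modprobe ixgbe max_vfs=2

# Build the PF bond that carries the host's own IP
modprobe bonding
echo +bond0 > /sys/class/net/bonding_masters
echo active-backup > /sys/class/net/bond0/bonding/mode   # set before enslaving
echo 100 > /sys/class/net/bond0/bonding/miimon
echo +eth0 > /sys/class/net/bond0/bonding/slaves   # PF on NIC 1
echo +eth1 > /sys/class/net/bond0/bonding/slaves   # PF on NIC 2
ip addr add 192.0.2.10/24 dev bond0                # placeholder host IP
ip link set bond0 up

# One VF from each PF is then PCI-passed-through to the VM, and the same
# bonding steps are repeated inside the guest over the two VF interfaces.
```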
Now, the following are the strange issues we run into:
# If within the VM a VF is the active_slave and its corresponding PF underneath is not the active_slave of the host bond, then for some reason the SR-IOV card seems to drop packets between the VM and the IP of the physical server it is running on. NOTE that traffic to other physical servers/VMs appears to be fine. We don't understand why, but tcpdump at several points shows that NIC is the only place in the chain where the packet doesn't flow out. If we change the active_slave within the VM to be the VF whose underlying PF is the active_slave, then the packets flow through fine. We are clueless about how to debug this at the SR-IOV level. Are there other reasons (after spoof checking is disabled) that the SR-IOV card can drop packets? Is it possible to enable debug logs for them?
# There are several times when ARP packets or IP packets are simply dropped at the SR-IOV card level. We don't know why, but this randomly causes IPs of VMs or physical servers to become unreachable. It is very erratic, and we don't know what exactly triggers it. Again, any pointers on how to debug this further with the SR-IOV card and driver would be helpful.
We initially tried balance-tlb, but to keep the oddities to a minimum we resorted to active-backup. We tried disabling L2LBen (the L2 switch between PF and VF), but this doesn't help either.
Apart from this, do you know if anyone has tried something similar to the configuration I listed above? Primarily we wanted to know whether this is a workable solution.
Are there some specific options to be set out of the several options that the driver has for bonding to work?
Thanks again. Looking for your reply.
I was informed that doing what you are looking for will require the 3.2 kernel. Intel submitted changes to the kernel and the networking stack to enable fuller teaming capabilities with SR-IOV.
Wish I had an easier answer for you. Remember that SR-IOV technology lives in both hardware and software; it takes a while for the software side of the OS to catch up.
The issues you are running into are based on how Linux deals with bonding: you are essentially running two active bonds on the same interface as seen from the switch, so things get confused there.
To see what I'm talking about, configure a mirror port on the switch so you can sniff the traffic over these links.
You will have to use a non-active bond in one location, either the PF or the VF, to keep the switch from becoming confused.
Or you could use ip link set <interface> vf <vf number> vlan <vlan id> in the host, and this may help keep them separated.
Each pair of VFs would need to be in a separate VLAN. I have not personally tried this configuration as of yet, but it at least passes the eyeball test as something that might work. Running them all in the same VLAN won't, regardless of anti-spoofing.
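A sketch of the per-VF VLAN idea above (the PF names and VLAN IDs are made up; the point is that each VM's VF pair gets its own VLAN):

```shell
# VM 1: both of its VFs (one per PF) go into VLAN 100
ip link set eth0 vf 0 vlan 100
ip link set eth1 vf 0 vlan 100

# VM 2: its VF pair goes into VLAN 200
ip link set eth0 vf 1 vlan 200
ip link set eth1 vf 1 vlan 200
```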
Thanks Robgar for your insight.
Yes, we realized this. We have now switched to "active-backup" bonding mode and got it working.
One strange thing is that even with "active-backup" mode we ran into a couple of issues. With a pair of VFs bonded (active-backup) within a VM, we couldn't reach a pair of PFs bonded (active-backup) on the host that was running that VM. This depended primarily on which slave of each bond was active: if the active slaves of the two bonds were on opposite NICs, the network worked; otherwise it did not. (We still don't understand the reason for this.)
To get over this we finally resorted to using VFs at the physical host level (i.e. at the hypervisor level) for the host bonds. With this everything works fine.
Doing the bonding at the host level would be, at least at this point in time, my preferred configuration.
Why active-backup only works when the active slave is on a different port is an interesting question, and one that I don't have a good answer for. I will have to duplicate this type of configuration to attempt to figure that out, so let me just say I'll get back to you on that.
We cannot stay with bonding only at the host level, as we would then have to set up a bridge/tap interface for the VM (we basically lose the SR-IOV capability). So we do bonding at the host level (as we need VMs to reach hosts) and also bonding at the VM level. Using VFs at both the host and VM level seems to work for all cases with "active-backup" mode.
Yes this was published back in July. Here is the blog announcement that has the link in it:
Hope you find it of use.
I was working on the same problem for one of my customers.
Basically we were presenting 2 VFs (each from a separate PF) to the guest VM. Within the guest VM the intention was to use bonding to make networking resilient. Based on the tests I did, only a single configuration worked for me.
I used the following options (active/standby bonding mode):
BONDING_OPTS="mode=1 miimon=200 fail_over_mac=1"
The important option is fail_over_mac=1. Failover works perfectly OK, and no "spoofed packets detected" messages are generated anymore.
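For reference, the equivalent setup via the sysfs bonding interface (interface names are assumptions, and fail_over_mac must be set before slaves are added). With fail_over_mac=1 ("active"), the bond's MAC follows the currently active slave instead of rewriting the slaves' MACs, so each VF keeps the MAC the PF assigned it and anti-spoofing is not triggered:

```shell
modprobe bonding
echo +bond0 > /sys/class/net/bonding_masters
echo 1 > /sys/class/net/bond0/bonding/mode               # 1 = active-backup
echo 200 > /sys/class/net/bond0/bonding/miimon
echo active > /sys/class/net/bond0/bonding/fail_over_mac # same as fail_over_mac=1
echo +eth0 > /sys/class/net/bond0/bonding/slaves         # VF 1
echo +eth1 > /sys/class/net/bond0/bonding/slaves         # VF 2
```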
Yes, I can confirm that, thanks Greg W. It looks like fail_over_mac is the only way you can make this work without disabling spoofing, since the current iproute2 package in Ubuntu 13.04(!) hasn't been updated to support the spoofchk option, and the drivers don't offer it either. I think FPP is a better option for bandwidth-starved apps using virtio-net, but for latency-sensitive applications, active-backup bonding with PCI pass-through is about all we need.