Thanks for using Intel Ethernet adapters. Thanks even more for playing around with SR-IOV!
I'm no longer the guy supporting virtualization questions, however the nice guy who is, happens to be swamped, so I'll step in and see what I can do.
While something logical, such as all of the 1st 63 VFs being on the 1st port and the subsequent 63 being on the 2nd port, would make sense and make the world a better place, that is not how it works.
From what I've learned it's all a matter of timing - when the OS boots and starts enumerating devices, much of it is done in parallel threads, so it's a bit of a 'race' if you will. That means there is no fixed relationship between the VF bus #'s and the PF they belong to. Fun stuff!
Hopefully in the future the OS vendors will add some better logic to this. In the meantime, what I do (on Red Hat 6.1 anyhow) is look in the /sys/class/net/ethX/device directory, then do an ls -ld on the virtfn entries listed there, which lets me see the bus # associated with each VF on the PF for ethX.
Here is a script that some smart guys gave me. It is NOT guaranteed to work for you (so please don't give me grief if it doesn't :-) ).
if [ -z "$1" ]; then
    echo "usage: lsvf <etherdev> [vf]"
    exit 1
fi
if [ ! -d "/sys/class/net/$1" ]; then
    echo "lsvf: interface $1 not found"
    exit 1
fi
if [ -z "$2" ]; then
    ls -ld /sys/class/net/"$1"/device/virt* | cut -f 11 -d ' ' | cut -b 4-
else
    ls -ld /sys/class/net/"$1"/device/virtfn"$2" | cut -f 11 -d ' ' | cut -b 4-
fi
The 'meat' of the script is: ls -ld /sys/class/net/"$1"/device/virtfn"$2"
I'm sure somebody who knows scripting can probably expand on this to make life easier for themselves when assigning traffic to specific VFs, until the distros decide to make it easier for all of us.
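One way such an expansion might look is sketched below. It is not from the original script: the function name list_vfs and the SYSFS_ROOT variable are made up for illustration (SYSFS_ROOT exists only so the function can be pointed at a test directory), and it reads the virtfn symlinks with readlink rather than cutting fields out of ls output, on the assumption that each virtfn link target ends in the VF's PCI address.

```sh
# Hypothetical helper, not part of the original lsvf script.
# SYSFS_ROOT is an assumption added for testing; it defaults to the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print one "<vf index> <PCI address>" line per VF of the given PF interface.
list_vfs() {
    pf="$1"
    dev_dir="$SYSFS_ROOT/class/net/$pf/device"
    if [ ! -d "$dev_dir" ]; then
        echo "list_vfs: interface $pf not found" >&2
        return 1
    fi
    for link in "$dev_dir"/virtfn*; do
        # Skip the unexpanded glob pattern when the PF has no VFs.
        [ -L "$link" ] || continue
        # virtfn12 -> 12
        idx="${link##*virtfn}"
        # The link target looks like ../0000:01:10.0; keep the last component.
        addr="$(basename "$(readlink "$link")")"
        echo "$idx $addr"
    done | sort -n
}
```

Calling list_vfs ethX then gives the VF number to use with ip link next to the bus address to hand to the VM.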
Hoping this will help.
Your answer was really helpful!
The 'virtfn' entries do the trick. They also help with commands like
ip link set dev ethX vf <NUM>
because I can use the number glued to virtfn (virtfn0 etc.)
Also, I found that there are 'physfn' entries on the VFs which provide the back reference.
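For anyone who wants to follow that back reference from a script, here is a rough sketch. The function name vf_to_pf and the SYSFS_ROOT variable are my own inventions for illustration, and it assumes the VF shows up under /sys/bus/pci/devices/<address> with a physfn symlink whose target ends in the PF's PCI address.

```sh
# Hypothetical helper; SYSFS_ROOT is an assumption added so it can be tested
# against a fake directory tree. It defaults to the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the PCI address of the PF that owns the VF at the given PCI address.
vf_to_pf() {
    vf_addr="$1"
    physfn="$SYSFS_ROOT/bus/pci/devices/$vf_addr/physfn"
    if [ ! -L "$physfn" ]; then
        echo "vf_to_pf: $vf_addr has no physfn link (not a VF?)" >&2
        return 1
    fi
    # The link target looks like ../0000:01:00.0; keep the last component.
    basename "$(readlink "$physfn")"
}
```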
I would like to add that I really enjoyed watching your SR-IOV and VMDq videos and reading your SR-IOV primer and driver companion document.
By now I have played with SR-IOV for quite some time. I like it, but there are also many things that I don't understand. I will ask some questions; perhaps if not you, then somebody else can help me.
1) Is there a way to check whether a particular VF has been consumed by a VM or is free to use?
2) When I create the max amount of VFs (63), I see a strange phenomenon. The Ubuntu sudo command gets stuck each time that I issue it. Strange?
3) When SR-IOV is not enabled and no VFs are created, I see that 24 tx/rx queues are opened for the NIC. It looks like one tx/rx queue is opened per core (and I have 24 cores). When I enable SR-IOV and create even a single VF, I see that only two queues are opened for the PF: one tx queue and one rx queue. Is that an expected behaviour?
I am asking because, in this case, when I run a network performance test (iperf), the rx queue generates all its interrupts on the same core, and this core is pegged at 100% in softirq. So the performance of the PF is quite poor in this case. I tried setting the IRQ affinity for the rx queue to, e.g., 0xFFF, but it has no effect...only the least-significant bit of the mask takes effect. In the kernel documentation I read that this is expected behavior for the IA64 architecture, but in my case this is x64.
Also, a slightly off-topic question: as we bought several Intel SR-IOV NICs, are we entitled to receive Intel support, or is this forum the place to ask such questions?
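In case somebody wants to reproduce the queue counts before and after creating VFs, a small sketch follows. It assumes the kernel exposes per-queue directories named rx-N and tx-N under /sys/class/net/ethX/queues (older kernels may not); the function name count_queues and the SYSFS_ROOT variable are hypothetical and exist only for illustration and testing.

```sh
# Hypothetical helper; SYSFS_ROOT is an assumption added for testing and
# defaults to the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print how many rx and tx queues the given interface exposes in sysfs.
count_queues() {
    dev="$1"
    qdir="$SYSFS_ROOT/class/net/$dev/queues"
    if [ ! -d "$qdir" ]; then
        echo "count_queues: no queues directory for $dev" >&2
        return 1
    fi
    rx=0
    tx=0
    for q in "$qdir"/rx-* "$qdir"/tx-*; do
        # Skip unexpanded glob patterns when a queue type is absent.
        [ -d "$q" ] || continue
        case "${q##*/}" in
            rx-*) rx=$((rx + 1)) ;;
            tx-*) tx=$((tx + 1)) ;;
        esac
    done
    echo "rx=$rx tx=$tx"
}
```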
All good questions. Wish I had better answers than what I'm about to give you.
There is no user space tool or setting that I know of that keeps track of which VF's have been assigned to a VM.
I have it on good authority (my SR-IOV guru) that the KVM VMM will keep track of this internally, but there is nothing we know of in user space.
I likewise have no answer for the behavior in Ubuntu when you create the max # of VFs.
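A rough heuristic, offered as an assumption rather than anything confirmed in this thread: sysfs shows which driver, if any, is bound to each VF's PCI device, and a VF bound to a pass-through stub driver (pci-stub or vfio-pci, depending on the setup) has usually been set aside for a guest. This does not prove a VM is actually running with it. The function name vf_driver and the SYSFS_ROOT variable below are hypothetical.

```sh
# Hypothetical helper; SYSFS_ROOT is an assumption added for testing and
# defaults to the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the name of the driver bound to a PCI device, or "none" if unbound.
vf_driver() {
    link="$SYSFS_ROOT/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The link target ends in the driver name, e.g. .../drivers/pci-stub
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}
```

Feeding it the bus addresses listed under the virtfn entries gives a quick view of which VFs are still held by the host's VF driver.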
This leads to the support issue. SR-IOV is a feature that requires all the pieces to work nicely together: BIOS, platform, OS and device. Each open source distro is really your 1st line of support. I try to answer general questions on this forum, but the fact is that each distro picks which pieces to add. For example, one distro may not have the latest drivers for the 82599, or the kernel patches for MSI-X interrupt support, while another will.
We try to get our support into the kernel, but different distros may add various voodoo ingredients that we have neither control over nor insight into.
SR-IOV is still kind of in its beginning stages - all the basic building blocks are there, now the OS needs to add more robust support to make the lives of people like you and me much easier.
Do you think you can look at one more question?
I have been experimenting with using VFs on the host (not on the guest). And I see that in principle it is possible to use a VF on the host like an extra network interface, meaning assign an IP address to it.
My question is: can I bond the PF together with some of its VFs using Linux 'bonding' driver? Do you think it should work?
Basically what I am trying to look at is dividing networking resources between the VMs and the host itself. So, say, if I spawn 63 VFs, I can use like 40 VFs for my VMs, and keep the rest of the VFs bonded together with the PF, in order to ensure that the host itself gets a fair part of the NIC's resources. Does this make sense?
Yes!! You can do this.
We call this Flexible Port Partitioning. I'm working on a video right now demonstrating this, and I'm giving a session at IDF in San Francisco next week in addition to manning the Intel booth doing a demo :-)
Check back here soon and you will see an announcement for my video as well as a white paper on the topic that I'm trying to get finished up.