SR-IOV is not just for virtualization anymore!

 

In my blog post last week, I pointed you to a video I made discussing and demonstrating Intel Flexible Port Partitioning (FPP).  If you haven’t watched the video yet, I think it is worth a view (though I am admittedly biased).

 

 

FPP is a new way to look at using SR-IOV to partition a single, discrete Intel Ethernet connection into multiple Ethernet devices within an open source operating system.  The quickest and easiest way to understand what I’m talking about is to watch the video.

 

We have also finished and published a whitepaper that discusses FPP in more detail.  It covers what FPP is, how it works, and how it is useful for both Intel 10 Gigabit and 1 Gigabit controllers.

 

The whitepaper, An Introduction to Intel Flexible Port Partitioning Using SR-IOV Technology Technical Brief, is intended to be the first part of a two-part series.  We are already working on the second part, which will detail the usage of the different tools used to configure FPP, along with some additional usage models.
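
To give you a taste of what those tools look like, here is a minimal sketch of creating Virtual Functions on a Linux host.  The interface name (eth0) and VF count are assumptions for illustration, and the exact method depends on your kernel and driver version.

```
# Create four SR-IOV Virtual Functions on an Intel Ethernet port.
# eth0 is an assumed interface name; adjust for your system.

# Option 1: driver module parameter (older ixgbe/igb releases)
modprobe ixgbe max_vfs=4

# Option 2: sysfs interface (kernels that expose sriov_numvfs)
echo 4 > /sys/class/net/eth0/device/sriov_numvfs

# Each VF shows up as its own PCIe function and its own network device.
lspci | grep -i "Virtual Function"
ip link show
```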

 

 

 

Not sure what SR-IOV is?  Click on the link to see a video explanation.  I also have the PCI-SIG SR-IOV Primer you might want to check out.

 

 

We hope you find the paper of use.  Enjoy!

As former employees of Fulcrum Microsystems, we are very pleased to now be part of Intel Corporation. Our group has remained largely intact within Intel. Our 10GbE and 40GbE switch silicon provides an ideal complement to Intel Ethernet controllers and embedded processors used in applications such as large data center networks, broadband access and network security.

 

Fulcrum switches are now a key part of Intel’s data center strategy. Large data centers require efficient scalability that cannot be provided with traditional enterprise networking gear. There has been a lot of press recently about the importance of flat networks as more and more east-west traffic starts to dominate the data center. In addition, a single client transaction can spawn multiple server-to-server workflows making latency important in networking equipment purchasing decisions. New data center networking requirements such as virtualization and data center bridging (DCB) elevate the need for coordination between Ethernet controllers and network switches.

 

By adding Fulcrum switch silicon to its data center product portfolio, Intel owns a complete family of silicon components to enable these new large and efficient data centers. Fulcrum switches have been designed from the ground up to provide low latency data center fabrics that can scale up to tens of thousands of 10GbE server ports. In addition, Fulcrum switches can be optimized with Intel Ethernet controllers to provide superior virtualization and DCB features compared to those found in competing solutions. As part of Intel, we will see our vision of future data center networks become a reality.

So, were you one of the folks who were able to visit me at booth #905 at the Intel Developer Forum in San Francisco last week and watch the Intel Flexible Port Partitioning demonstration?  Well, if you weren’t, or if you are looking to see the demonstration again, it’s your lucky day!

 

Intel Flexible Port Partitioning (FPP) takes SR-IOV Virtual Functions, which until now have been thought of as strictly a virtualization technology, and uses them in a bare-metal (or mixed) open source OS.  This provides a way to carve up your Ethernet ports very flexibly and efficiently.
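
For a concrete flavor of what that carving up looks like, here is a hedged sketch using the standard Linux ip utility; the interface names (eth0 for the physical function, eth2 for the VF’s network device) and the addresses are assumptions for illustration.

```
# Assumed setup: eth0 is the physical function with VFs already created.
# Give VF 0 its own MAC address and VLAN, then use its netdev directly.
ip link set dev eth0 vf 0 mac 02:01:02:03:04:05
ip link set dev eth0 vf 0 vlan 100

# The VF appears as an ordinary network interface (the name varies by
# system; eth2 is assumed here) that can be addressed and used like any
# other port, no hypervisor required.
ip addr add 192.168.100.10/24 dev eth2
ip link set dev eth2 up
```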

 

I was so taken aback by the overwhelmingly positive response to the demonstration and the two chalk talks, as well as the standard session Brian Johnson and I presented last week, that when I came home from San Francisco, I combined our session material with the demonstration and produced a “from-the-hip” video explaining the technology and showing a demonstration.

 

I have a 15-minute cap on my YouTube account; the video is 14 minutes and 59 seconds.  Hope you will excuse the occasional ‘uhm’.

 

I hope you find the demo as interesting as the folks that visited me in the booth did.  Here is a sample of some of the comments I received:

  • “Wow – you guys are killing everybody else!  Nobody else is doing this kind of thing!”

  • “Hey – I saw some cool video a while back with little Ethernet packets and guys moving all over the screen that explained SR-IOV very, very well – do you know where I can find that?” (He was referring to my YouTube video.)

 

  • “Holy cow, that’s cool!  I’ve got to hook you guys up with our Ethernet architect so we can get even more support for that!”

  • “Oh my gosh!  That is exactly what we need!  You just made my day and will make my architects very, very happy.”

 

  • “Very nice demo, on a great OS.  Really shows the power of FPP; we need to get a joint paper out.”

  • “Wow, that is pretty slick – I’m going to have to go talk to some folks when I get back to the office.”

 

The whitepaper I promised is in the final stages as well and should be published within the next couple of weeks, so keep checking back here for the announcement.

 

Not sure what SR-IOV is?  Click on the link to see a video explanation.  I also have the PCI-SIG SR-IOV Primer you might want to check out.

 

Enjoy,

 

- Patrick

 

dougb

Maximum Teaming

Posted by dougb Sep 16, 2011

A question I get on occasion is how to make teaming go faster.  This webpage outlines a TON about teaming.  It's a great “go to” reference and lists the capabilities the infrastructure must have.

 

In terms of throughput, I'm looking at a test pass report in which we ran some teaming testing.  In a two-adapter Intel® Ethernet Server Adapter X520 team under Windows*, our testing saw 18Gb doing just TX, almost 19Gb doing just RX, and almost 32Gb bi-directional (BX).  That's 18 out of 20, 19 out of 20, and 32 out of 40 (10TX1 + 10TX2 + 10RX1 + 10RX2).  Where did the 8Gb go?  Overhead!  Each time somebody has to act on a packet, it takes time, CPU cycles, and trips to memory.  Each step and packet touch slows things down enough for it to add up.  Also remember that the usable data throughput of 10GbE unidirectional is about 9.49Gb; headers, checksums, and the like consume the rest.
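
If you are curious where that 9.49Gb figure comes from, a rough back-of-the-envelope calculation (assuming a standard 1500-byte MTU and TCP/IPv4 headers) looks like this:

```
# Each full-size frame carries 1500 bytes of payload plus 38 bytes of
# Ethernet overhead (preamble 8 + header 14 + FCS 4 + inter-frame gap 12).
# Of that 1500-byte payload, 40 bytes are IPv4 + TCP headers, leaving 1460
# bytes of application data per 1538 bytes on the wire.
echo "scale=2; 1460 * 10 / (1500 + 38)" | bc    # ~9.49 Gb/s of TCP payload
```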

 

We used sixteen 1Gb clients per port for this test.  That's 32Gb RX and 32Gb TX, so that should provide saturation.  With 10Gb clients you can lower the number needed.  The ultimate goal of over-saturating the traffic load is to ensure that traffic generation isn’t the bottleneck for your benchmark.
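
The report doesn't go into the traffic generator itself, but if you want to drive a similar saturation load yourself, a minimal sketch using the open source iperf3 tool (the address, ports, and stream counts are assumptions) might look like this:

```
# On the server that hosts the team under test, start one listener per
# client, each on its own port (5201, 5202, ... are assumed here):
iperf3 -s -p 5201 &
iperf3 -s -p 5202 &

# On each load-generating client, drive sustained TCP traffic at the team's
# address; -P opens parallel streams and -t sets the test length in seconds.
iperf3 -c 192.168.1.10 -p 5201 -P 4 -t 120      # client transmits (server RX)
iperf3 -c 192.168.1.10 -p 5201 -P 4 -t 120 -R   # reversed (server TX)
```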

 

Make sure your switch is up to the task.  Most people assume the switch can handle it, but some switches don’t have enough backplane throughput to handle saturation traffic levels.  And for some teaming options, like 802.3ad, you will need specific switch configurations.  These may require an additional license.  Consult your switch vendor for more details.
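
For reference, here is what the host side of an 802.3ad (LACP) team looks like on Linux with the standard bonding driver; this is an illustration only (the test above used Windows teaming), and the interface names are assumptions.

```
# Minimal 802.3ad (LACP) bond on Linux; eth0 and eth1 are assumed port names.
modprobe bonding
ip link add bond0 type bond mode 802.3ad miimon 100
ip link set dev eth0 down
ip link set dev eth1 down
ip link set dev eth0 master bond0
ip link set dev eth1 master bond0
ip link set dev bond0 up

# The switch ports facing eth0 and eth1 must be configured as a matching
# LACP aggregation group, or the team will not pass traffic correctly.
```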

 

On the system side, in the BIOS we get aggressive with the power management related settings.  On high performance runs, we don’t care about power.  We usually turn processor C-states and SpeedStep off so the processor doesn't try to sleep during the test.  The time lost coming out of the sleep states will cost you performance.  Performance at its core isn’t about bandwidth; it truly is about time.  The processor guys are epic at keeping these transitions as fast as possible, but in Ethernet performance land, we can’t spare time for anything.  At 10Gb, that can be 11 million packets per second.  Even with a 4 Gigahertz CPU, nanoseconds lost can mean packets lost.
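
The BIOS knobs vary by vendor, but if you want a rough equivalent from the operating system side on Linux, here is a hedged sketch using the standard cpupower utility and kernel boot parameters (availability depends on your distribution and kernel).

```
# Pin the frequency governor to performance and keep the CPU out of deep
# idle states for the duration of the benchmark.
cpupower frequency-set -g performance
cpupower idle-set -D 0     # disable every idle state deeper than polling

# Alternatively, on the kernel command line (takes effect at next boot):
#   intel_idle.max_cstate=0 processor.max_cstate=1 idle=poll
```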

 

In our testing, a dynamic team seems faster than a static one, and dynamic is the mode we use for most of our testing.  It also keeps you away from proprietary solutions.  Make sure your server has LOTS of RAM; ours have 12GB.  Obviously you’ll need an O/S that can address all that memory.  The clients had only 2GB each.

 

Sorry I can't share the test report (it includes information we can’t share about third-party adapters), but I think I captured the high points to help you get your teams tuned.  Let me know if you have questions in the comments section.

 

And, as always, thanks for using Intel® Ethernet.

My time as the virtualization guy in the Intel LAN group has all but ended.  My last major task will be over in just one more week.  I will be co-teaching a course at IDF in San Francisco next Thursday (September 15th)  entitled Using Industry Standards to Get the Most Out of 10 Gigabit Ethernet in Linux* Virtualization and Cloud Environments.

 

 

The bulk of this session will detail new usage models for SR-IOV, including solutions for migration of VMs using SR-IOV.  I will also be discussing Flexible Port Partitioning, which is where we use SR-IOV Virtual Functions in the standard Linux kernel to provide a highly flexible mechanism to carve up your Ethernet connection.
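
As a preview of what that carving up can mean in practice, here is a hedged sketch of per-VF bandwidth limiting with the standard ip utility; the interface name and rates are assumptions, and the exact option name depends on your iproute2 version.

```
# Cap VF 0 at 2 Gb/s and VF 1 at 500 Mb/s of transmit bandwidth on eth0.
# Older iproute2 releases use "rate"; newer ones prefer "max_tx_rate".
ip link set dev eth0 vf 0 rate 2000
ip link set dev eth0 vf 1 rate 500

# Verify the per-VF settings.
ip link show dev eth0
```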

 

 

My partner in crime (Brian Johnson) and I gave a version of this presentation at IDF in Beijing earlier in the year, and it was very well received.  We have been working very hard to improve it since then, so we are hoping it is even better and even easier to follow.

 

So if you are going to be at IDF, please join us.

 

 

We will also be doing a live demonstration of Flexible Port Partitioning in the Intel booth (#903) in the Data Center Zone.  I will be in the booth most of the time to answer questions.

dougb

VMworld*: The After Report

Posted by dougb Sep 8, 2011

VMworld was a neat show to go to.  Here is virtualization virtuoso Waseem at the booth showing off our demo. He’s the one on the left without the backpack.
More about the demo later.

VMworldFollowupBlogPicture.JPG

In case you didn’t make it to the show, here are the highlights from an Intel® Ethernet viewpoint:


1)      People are deploying 10G and want to know the tips and tricks.  Brian Johnson had a great session that was attended by almost 1500 people.  It was so popular they needed a second session.  Almost 10% of the crowd went to one of the sessions.  If not for Hurricane Irene cutting attendance, it would have been more.


2)      Ecosystem partners are taking notice of Intel as a best-in-class networking partner.  We received nice comments from partners during their sessions.  Steve Herrod from VMware said they achieved 1M IOPS on vSphere* 5 “with software FCoE and Intel.”  Pat Gelsinger from EMC had this to say:


“And we’re also very happy – and Steve may have stolen my thunder at little bit this morning, Mr. Herrod – our 4x more bandwidth through VNX, that we were able to show 10 GB per second through a single VM utilizing the FCoE technology from Intel running as a software stack of FCoE, you know.  No funky ASICs are required, right, able to run truly in a full software stack riding the Intel technology curve.  And taking advantage of a single VNX 7500 for approximately a 4x increase in the bandwidth of a single array into a single VM.  Truly, a powerful result.”


Mr. Gelsinger was kind enough to stop by our booth later and watch our demo.


3)      We met a great bunch of people: Felix, Avram, Larry, Jean, and many others who just stopped by to see what we were there to talk about.  Blogger Roger Lund stopped by for quite a bit to kick the tires on our setup.


4)      What they came to the booth to see was our demo of 10 Gigabit and Unified Networking.  We had two servers, each running vSphere 5.  One had 6 VMs, with RAM sizes ranging from 1GB to 32GB, and FCoE and iSCSI storage.  We even booted one of the server boxes via iSCSI, allowing it to be diskless.  For the coup de grâce of our demo, we used vMotion* to move all 6 VMs at the same time from one box to another.  As we started the vMotion, we would ask the visitor to guess how long it would take.  We got answers ranging from 2 minutes all the way to 10 minutes.


Using both ports of the Intel® Ethernet Server Adapter X520-DA2, it took around 20 seconds to move them all.  At the same time!


We had several people who were so excited by this performance that they started planning their X520 deployment right there on the show floor.  You’ll see some of those in future whitepapers.


5)      Waseem (he’s so camera shy, he’ll probably get after me for showing that much of him in the picture!) knows so much about virtualization and virtualization deployments that he was able to diagnose BIOS problems and driver incompatibilities, and even corrected a couple of installs, right from the demo!  I learned a ton from him about virtualization, and you’ll read about some of that in the months ahead here on the blog.


6)      On a personal note, I got to do some fun stuff.  If you look deep into the background, you’ll see another show vendor that brought along a world record holder to compete against show attendees.  She beat them all, including me!


If you were at VMworld, share your story in the comments section!  Hope to see you soon at another Intel® Ethernet event!
