
Any Requests from the Audience?

Posted by dougb Jan 27, 2012

The Wired blog will be adding videos this year!  In the spirit of making sure they meet the needs of the community, I figured I’d open it up to requests.  (No Free Bird requests, please!)  We can do explanations of technologies (like FCoE or flow control), how-tos (like teaming setup), video datasheets, or even interviews with the people who help shape Intel® Ethernet products.

 

But unless you comment, I can’t supply the videos you need to help you understand and deploy Ethernet.

 

Thanks for visiting!

Our 1 Gigabit and 10 Gigabit products have some specific access rules.  The manuals outline the details, but they are all tucked into the definitions chapter, and I’ve skipped over my fair share of those types of chapters.  This post will cover the when, where, and why of these rules, and where to watch out in your implementation.

 

Why is usually a late question to ask, but we’re starting with it.  All the MAC registers are 32 bits wide and are accessed as dwords.  Some registers (like the VLAN and multicast hash tables) are much bigger, but you still access them one dword at a time.  Use whatever access primitive you like, but make sure it does an atomic 32-bit read of the register.  Anything bigger or smaller will not latch the internal logic, and the access won’t complete.  Why 32 bits?  It makes the access engine very straightforward and keeps the MAC register logic simple.  By not offering byte or qword accesses, we keep the register decode logic small.  That means fewer gates, and that means smaller products and less heat to dissipate.
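
To make that concrete, here is a minimal sketch of a dword register accessor in C, assuming BAR0 has already been memory-mapped into the driver’s address space; the REG32 macro name is mine, not something from the datasheets or drivers.

    #include <stdint.h>

    /* One full, atomic 32-bit access per register touch.  'base' is the
     * already-mapped BAR0 address; 'off' is a register offset from the
     * datasheet.  The volatile qualifier keeps the compiler from splitting,
     * merging, or dropping the access. */
    #define REG32(base, off) \
        (*(volatile uint32_t *)((volatile uint8_t *)(base) + (off)))

    static inline uint32_t read_reg32(volatile void *bar0, uint32_t offset)
    {
        return REG32(bar0, offset);      /* exactly one dword read  */
    }

    static inline void write_reg32(volatile void *bar0, uint32_t offset,
                                   uint32_t value)
    {
        REG32(bar0, offset) = value;     /* exactly one dword write */
    }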

 

So when accessing a value that “spans” two 32-bit registers (like the RAR and some statistics registers), read the lower dword first.  This helps avoid losing a carry from the lower register into the upper register in the time between the two reads.  When writing to a register that spans, write the latching part last.  In the case of the RAR there is an “Address Valid” bit; if you write that half first, there is a window (albeit small) in which a half-written address is marked valid.
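
Here is a sketch of that ordering in C; the register offsets and the bit position of the “Address Valid” flag below are placeholders for illustration, so take the real values from your device’s datasheet.

    #include <stdint.h>

    #define REG32(base, off) \
        (*(volatile uint32_t *)((volatile uint8_t *)(base) + (off)))

    /* Placeholder offsets -- not taken from a datasheet. */
    #define RAL0     0x5400u      /* Receive Address Low  (lower dword of RAR[0]) */
    #define RAH0     0x5404u      /* Receive Address High (upper dword of RAR[0]) */
    #define RAH_AV   (1u << 31)   /* assumed position of the "Address Valid" bit  */
    #define STAT_LO  0x0000u      /* hypothetical spanning statistics register    */
    #define STAT_HI  0x0004u

    /* Spanning read: lower dword first, so a carry from the lower register
     * into the upper one between the two reads isn't a problem. */
    static uint64_t read_stat64(volatile void *bar0)
    {
        uint32_t lo = REG32(bar0, STAT_LO);    /* low half first */
        uint32_t hi = REG32(bar0, STAT_HI);
        return ((uint64_t)hi << 32) | lo;
    }

    /* Spanning write: write the half without the latching "Address Valid"
     * bit first and the half containing it last, so the address is never
     * marked valid while only half written. */
    static void write_rar0(volatile void *bar0, const uint8_t mac[6])
    {
        uint32_t ral = (uint32_t)mac[0]         | ((uint32_t)mac[1] << 8) |
                       ((uint32_t)mac[2] << 16) | ((uint32_t)mac[3] << 24);
        uint32_t rah = (uint32_t)mac[4]         | ((uint32_t)mac[5] << 8);

        REG32(bar0, RAL0) = ral;             /* non-latching half first */
        REG32(bar0, RAH0) = rah | RAH_AV;    /* latching half last      */
    }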

 

That’s it for the MAC registers.  Now to deal with the NVM/EEPROM.  This is where bits, bytes and words really get messy.  Most NVM devices have their size quoted in bits.  Most of our documentation uses words.  Tools like ethtool access it in bytes.  Ouch!  Most engineers can do the math in their head, but with some of our features (iSCSI, management) requiring certain sizes of NVM, getting the size wrong means missing the feature.  Not pretty.  To make matters worse, there are byte-swapping issues when you start mixing words and bytes: the bytes X Y of a word show up as Y X when you read them byte by byte.  Since the manuals are in words, you need to keep track.  The Intel-provided tools do it all in words, but ethtool and the Linux guys like it in bytes.  A script should be able to keep track of it for you.
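
As a stand-in for that script, here is a small C sketch of the bookkeeping; the 4 Mbit part size is just an example, and the word reassembly assumes a low-byte-first (little-endian) byte dump.

    #include <stdint.h>
    #include <stdio.h>

    /* A word offset from the manual becomes a byte offset for byte-based tools. */
    static inline uint32_t word_to_byte_offset(uint32_t word_offset)
    {
        return word_offset * 2;
    }

    /* An NVM size quoted in kilobits, expressed in 16-bit words. */
    static inline uint32_t kbits_to_words(uint32_t kbits)
    {
        return (kbits * 1024) / 16;
    }

    /* Rebuild the word at a given word offset from a byte dump.  Assumes the
     * dump is low byte first; swap the two terms if your tool does it the
     * other way (this is the "X Y becomes Y X" problem). */
    static inline uint16_t nvm_word(const uint8_t *byte_dump, uint32_t word_offset)
    {
        uint32_t b = word_to_byte_offset(word_offset);
        return (uint16_t)(byte_dump[b] | (byte_dump[b + 1] << 8));
    }

    int main(void)
    {
        printf("4 Mbit NVM = %u words\n", kbits_to_words(4096));    /* 262144 */
        printf("word offset 0x3F = byte offset 0x%X\n",
               word_to_byte_offset(0x3F));                          /* 0x7E   */
        return 0;
    }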

 

Being aware of the access-size requirements of Intel® Ethernet devices will keep you from misreading the NVM or being unable to get data out of the device.  If you have any tips for the NVM or the registers, let me know in the comments!

 

Thanks for using Intel® Ethernet.


Go with the Flow Control

Posted by dougb Jan 13, 2012

Flow control is a key part of keeping your 1 Gigabit and faster network running smoothly.  Somewhere along the line some websites started telling people to turn off flow control so their network would go faster.  In the short term this might be fine; in the long term you’re going to see bigger problems and probably drop more packets than you’ll make up for by being able to send whenever you want.  The complaint is that flow control stops the traffic, and that costs performance.  Absolutely it stops traffic.  But it stops the traffic the receiver doesn’t have room for!  Flow control is like a stop light controlling access to the highway.  Instead of letting everything in at once when there is no room for it, gumming up the works even further, flow control protects the receiver.  That protection is what gives you long-term speed and fewer dropped packets.
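
Before taking any website’s advice, you can check what your NIC is actually set to.  Here is a rough C sketch of what ethtool -a eth0 does under the hood, using the standard ethtool pause-parameter ioctl; the interface name is just an example.

    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/socket.h>
    #include <net/if.h>
    #include <linux/ethtool.h>
    #include <linux/sockios.h>

    int main(int argc, char **argv)
    {
        const char *ifname = (argc > 1) ? argv[1] : "eth0";  /* example name */
        struct ethtool_pauseparam pause = { .cmd = ETHTOOL_GPAUSEPARAM };
        struct ifreq ifr;

        memset(&ifr, 0, sizeof(ifr));
        strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
        ifr.ifr_data = (char *)&pause;

        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        if (fd < 0 || ioctl(fd, SIOCETHTOOL, &ifr) < 0) {
            perror("SIOCETHTOOL");
            return 1;
        }

        /* autoneg / rx_pause / tx_pause are 1 when enabled */
        printf("%s: autoneg=%u rx_pause=%u tx_pause=%u\n",
               ifname, pause.autoneg, pause.rx_pause, pause.tx_pause);
        return 0;
    }

If rx_pause and tx_pause come back 0, ethtool -A eth0 rx on tx on (or the matching ETHTOOL_SPAUSEPARAM call) is the quick way to turn flow control back on, assuming your driver and link partner support it.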

 

Consider the following data set that was gathered using ethtool -S eth0 from a real system.

NIC statistics:

      rx_packets: 329461
      tx_packets: 302120
      rx_bytes: 34897969
      tx_bytes: 32293428
      rx_no_buffer_count: 39147
      rx_missed_errors: 1097931
      rx_flow_control_xon: 0
      rx_flow_control_xoff: 0
      tx_flow_control_xon: 228
      tx_flow_control_xoff: 1098233

 

Let’s look at it in detail.  tx_flow_control_xoff is the NIC telling the link partner, “I’m overwhelmed, stop the packets”; rx_flow_control_xoff is the link partner telling the NIC the same thing.  Note the difference: TX flow control frames are transmitted TO the partner, RX flow control frames are received FROM the partner.  In this case the NIC is basically screaming that it’s overwhelmed (more XOFF frames than received packets), and rx_no_buffer_count and rx_missed_errors confirm it.  What this means is the NIC has no resources and is actively dropping packets.  But flow control is on!  Why are we still dropping/missing packets?  The link partner is not honoring the flow control frames!  In this case the link partner sent about 1.4 million frames, but only around 330K got through because the link partner didn’t care about flow control.  With flow control honored, the packets might take a little longer to get there, but they will get there.
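
As a rough illustration of that reading of the counters, here is a small C sketch; the counter names match the ethtool -S output above, but the threshold logic is my own heuristic, not anything the driver does.

    #include <stdint.h>
    #include <stdio.h>

    struct fc_stats {
        uint64_t rx_packets;
        uint64_t rx_missed_errors;
        uint64_t tx_flow_control_xoff;   /* we told the partner to stop  */
        uint64_t rx_flow_control_xoff;   /* the partner told us to stop  */
    };

    /* Crude diagnosis: if we have sent more XOFF frames than we have
     * received packets and we are still missing packets, the link partner
     * is almost certainly ignoring our flow control frames. */
    static void diagnose(const struct fc_stats *s)
    {
        if (s->tx_flow_control_xoff > s->rx_packets && s->rx_missed_errors > 0)
            puts("Overwhelmed, and the link partner appears to ignore our XOFF.");
        else if (s->tx_flow_control_xoff > 0)
            puts("Pushing back with XOFF; the partner seems to honor it.");
        else
            puts("No local pressure: the NIC has not sent any XOFF frames.");
    }

    int main(void)
    {
        /* The numbers from the ethtool -S capture above. */
        struct fc_stats example = {
            .rx_packets           = 329461,
            .rx_missed_errors     = 1097931,
            .tx_flow_control_xoff = 1098233,
            .rx_flow_control_xoff = 0,
        };
        diagnose(&example);
        return 0;
    }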

 

Looking at the data, see the 228 XONs?  The NIC only caught up 228 times.  That’s not so good.  So what was causing all these missed packets?  The most likely cause is a slow PCI Express and/or slow memory implementation.  Packets arrive all the time, and slow memory, combined with another busy device sharing a few narrow lanes, can leave you without enough PCI Express bandwidth, for example behind something like the ESB or behind a switch cascaded off another switch.

Moving to 10 Gigabit, it is, well, ten times worse.  You have one tenth the time, and the ripple effect of delaying a packet spreads that much faster.  It was so problematic that Data Center Bridging (DCB) and DCBx came out to make flow control work end to end.  Instead of just link partner to link partner, DCBx allows one overwhelmed end point to tell the overwhelming source to chill out.  This moves the delay caused by flow control to the point most able to deal with it.  While some switch backplanes can temporarily store terabits of data, having the originating node simply not send right now is the best result.  We’ll do a deeper dive on DCBx another time, but with it you get effectively lossless Ethernet, and that is what lets you do FCoE and other storage technologies.

 

Thanks for using Intel Ethernet and turn on your Flow Control!!

The e1000 driver has been around for many years, but the products it supports are starting to reach end of life.  That means keeping the driver in an active state becomes less and less useful.  We remain committed to the driver, though, and here is how things will be different moving forward.  The e1000 driver on SourceForge.net* will not be changing.  Any bug fixes or changes will be made only to the kernel.org version.  (It’s so important I’ve colored it red! ;))  We’ll still have the e1000 driver on SourceForge.net, and thanks to it being open source, you can make your own fixes should you run into any issues.  We have no plans around other drivers at this time, and we’ll update you here on the blog should anything change.

 

NOTE: The e1000 driver will move to a kernel-only support model on or around Q2 2012.  If you have open issues you would like to see resolved, please submit them through http://sourceforge.net/projects/e1000/ as soon as possible.
