
"Play it again, Sam."  "Luke, I am your father."  "If you build it, they will come."  These are all often-quoted misquotes.  Great quotes, but not what the original says.  Looking into the high tech world, we have a list of misquotes of our own.  The one for today is "PCI Express* is 2.5 Gigabits per second".  This is a dangerous quote because it creates confusion about the nature of the bus.  Let's look at why.

 

The PCI Express protocol is based on transfers per second.  The speed is better expressed as 2.5 Giga Transfers per second (GT/s).  PCI Express is also an 8bit/10bit symbol based system: every 8 bits of data travel on the wire as a 10-bit symbol.  So the 2.5 GT/s comes down to 2 Gigabits per second of transactional data, a functional speed of 250 MB/s per lane in each direction.  When combined with the overhead of the networking device, it comes down just a smidgen more.  Now let's talk PCI Express 2.0, also known as PCIe v2.0, AKA Gen 2, AKA Miguel Sanchez.
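The arithmetic above can be sketched in a few lines.  This is only an illustration: the function name `lane_data_rate` is made up for this post, and it ignores packet and device overhead entirely.

```python
def lane_data_rate(gt_per_s, lanes=1):
    """Usable data rate of an 8b/10b-encoded link, one direction only.

    Hypothetical helper for illustration; ignores protocol overhead.
    """
    raw_bits_per_s = gt_per_s * 1e9 * lanes   # one bit per transfer per lane
    data_bits_per_s = raw_bits_per_s * 8 / 10  # 8 data bits per 10-bit symbol
    return {
        "gbit_per_s": data_bits_per_s / 1e9,
        "mbyte_per_s": data_bits_per_s / 8 / 1e6,
    }


if __name__ == "__main__":
    print(lane_data_rate(2.5))           # a single 2.5 GT/s lane
    print(lane_data_rate(2.5, lanes=4))  # a x4 link at 2.5 GT/s
```

A single 2.5 GT/s lane works out to 2 Gigabits, or 250 MB, of data per second, which is where the "2.5 Gigabits" misquote goes wrong.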

 

Okay, the last one isn't true, and only the "PCI Express® Base Specification Revision 2.0" name is actually embraced by the PCI Special Interest Group.  The base specification for 2.0 includes both a 2.5 Giga Transfers mode and a 5.0 Giga Transfers mode.  You can be PCIe v2.0 compliant and have only 2.5 Giga Transfers.  The first Intel Networking part that supports PCIe v2.0 at 5.0 GT/s is the Intel® 82599 Ethernet Controller.  Our Intel® 82576 Ethernet Controller is PCIe v2.0 compliant, and runs at 2.5 GT/s.  For a dual 1 Gigabit port part (like the 82576), the higher speed is just not necessary.  A dual 1 Gigabit part on a x4 link would not come close to saturating the bus.  5.0 GT/s isn't needed for 1 Gigabit unless you're going to try to do dual port over a x1 lane width.  But the x1 physical connector doesn't hold a card in the slot very well given the physical size required to do a dual port.  That does make x2 a very attractive interface, but x2 is fairly rare so far.  Most vendors (including us) are staying with x4 and 2.5 GT/s for now.
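To put rough numbers on that, here is a back-of-the-envelope sketch comparing what two 1 Gigabit ports need against a few link configurations.  The helper names are invented for this example, and it counts only 8b/10b encoding, not the extra protocol overhead mentioned earlier, so real headroom is a little smaller.

```python
def link_gbit(gt_per_s, lanes):
    """Data rate in Gbit/s, one direction, after 8b/10b encoding."""
    return gt_per_s * lanes * 8 / 10


def headroom_gbit(gt_per_s, lanes, ports=2, port_gbit=1.0):
    """Link capacity minus what the Ethernet ports can carry (hypothetical helper)."""
    return link_gbit(gt_per_s, lanes) - ports * port_gbit


if __name__ == "__main__":
    for gt, lanes in [(2.5, 1), (5.0, 1), (2.5, 2), (2.5, 4)]:
        print(f"x{lanes} at {gt} GT/s: "
              f"{link_gbit(gt, lanes):.1f} Gbit/s link, "
              f"{headroom_gbit(gt, lanes):.1f} Gbit/s headroom for dual 1 Gigabit")
```

On these assumptions a x1 link at 2.5 GT/s has no headroom at all for a dual port part, while x4 at 2.5 GT/s has plenty.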

 

Let's go for the big wrap up.

1)  PCI Express speed is defined in terms of Transfers, not Bytes or bits.

2)  PCI Express v2.0 includes both 2.5 Giga Transfers per second and 5.0 Giga Transfers per second modes.

3)  The 82576 does PCIe v2.0 at 2.5 GT/s and the 82599 does v2.0 at 5.0 GT/s (and 2.5 GT/s if that's all you got)

4)  Thanks for using Intel Networking products.

 

Computers do wacky things at times.  Because of their nature, they don't naturally have negative numbers.  There are a couple of choices for doing signed numbers: two's complement and one's complement.  And I won't even get started on the wacky ones (look down further at the one's complement link.  If you dare!)  Why this comp sci talk on a networking blog?

 

Checksums.

 

Data movement can be dangerous work.  It often goes over a lossy medium.  Things get dropped, there are bit error rates, and all this adds up to the potential for bad data, missing data or general mischief with the data.  In Ethernet land, we have a packet CRC to help make sure things are okay, but UDP, TCP and other data movers might go over modems, serial cables, ISDN, ATM, wireless, avian carriers or signal flags.  So the stacks have their own checksums to make sure that, no matter the medium they travel over, the data can be checked to be valid.  Valid enough, I should say, since the checksum isn't a guarantee.  But that's a whole 'nother post.

 

For a UDP packet, a checksum value of 0000h means that the packet does not include a checksum. Thus, if the packet is supposed to include a checksum, and the checksum calculation happens to result in a value of 0000h, then the actual value used for the checksum field is FFFFh.  Whoops!  Why?  In the one's complement math system that the stacks use, there is a concept of "negative zero".  FFFFh is negative zero and 0000h is positive zero.  Really.
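Here is a minimal sketch of that sender-side rule.  The function names are made up for this post, and `covered` is assumed to already hold everything the checksum covers (pseudo-header, UDP header with the checksum field zeroed, and payload); building that buffer is left out.

```python
def ones_complement_sum16(data: bytes) -> int:
    """Sum 16-bit big-endian words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"              # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:               # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp_checksum_field(covered: bytes) -> int:
    """Value to place in the UDP checksum field (hypothetical helper)."""
    checksum = ~ones_complement_sum16(covered) & 0xFFFF
    # 0000h on the wire means "no checksum", so send negative zero instead.
    return 0xFFFF if checksum == 0 else checksum


if __name__ == "__main__":
    print(f"{udp_checksum_field(bytes.fromhex('12345678')):04X}h")
    print(f"{udp_checksum_field(bytes.fromhex('FFFE0001')):04X}h")
```

The second call is the interesting one: the data sums to FFFFh, the computed checksum would be 0000h, and so FFFFh goes in the field instead.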

 

The preferred method of verifying the IP, UDP or TCP checksum is to sum the covered data, checksum field included (in 16-bit quantities), using one's complement arithmetic and compare the result to FFFFh. If it matches then the checksum is valid; otherwise it is invalid. This method is described in RFC 1071 (section 1, item 3) and in RFC 1624 (section 5).
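As a rough sketch of that receive-side check (function names are invented for this post, and the buffer is assumed to contain the covered data with the checksum field left in place):

```python
def ones_complement_sum16(data: bytes) -> int:
    """Sum 16-bit big-endian words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"              # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:               # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum_is_valid(covered_with_checksum: bytes) -> bool:
    """Valid when everything, checksum included, sums to FFFFh."""
    return ones_complement_sum16(covered_with_checksum) == 0xFFFF


if __name__ == "__main__":
    good = bytes.fromhex("12345678" "9753")   # data followed by its checksum
    bad = bytes.fromhex("12345679" "9753")    # one bit flipped in the data
    print(checksum_is_valid(good), checksum_is_valid(bad))
```

Notice the receiver never recomputes the checksum and compares it against the field; it just sums everything and looks for FFFFh.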

 

This allows the UDP checksum to be verified without the end system worrying about converting a checksum of 0000h into FFFFh.

It also allows IP and TCP checksums to use the same replacement. Because the values 0000h and FFFFh are equivalent in one's complement arithmetic, the sum will still be FFFFh for a valid checksum.
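A quick way to convince yourself is to try both zeros.  In this sketch (helper names are hypothetical, and the packet is reduced to a list of 16-bit words) the data is picked so that the computed checksum comes out as 0000h:

```python
def fold16(total):
    """Fold carries above bit 15 back in (end-around carry)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def verifies(words):
    """True when the 16-bit words, checksum included, sum to FFFFh."""
    return fold16(sum(words)) == 0xFFFF


if __name__ == "__main__":
    data = [0xFFFE, 0x0001]            # sums to FFFFh, so the checksum is 0000h
    print(verifies(data + [0x0000]))   # positive zero in the checksum field
    print(verifies(data + [0xFFFF]))   # negative zero in the checksum field
```

Both calls report a valid checksum, because adding FFFFh to FFFFh and folding the carry lands right back on FFFFh.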

 

In order to simplify the design, our hardware always converts a checksum value of 0000h into FFFFh. Because the RFC method accounts for this, there should be no conflict with well-written TCP/IP stacks.  Older DOS stacks and some stripped-down embedded stacks are typically the ones that cause trouble.  If you can't correct the behavior of the stack, just turn off checksum offloads and let the stack do it.  Most stacks will use positive zero, not negative zero, when the result is zero.
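Here is one plausible way a stack could trip over this.  It is an assumption for illustration, not a description of any particular stack: the receiver recomputes the checksum over the data and compares it to the field for exact equality, instead of summing everything and looking for FFFFh.  All names below are made up.

```python
def fold16(total):
    """Fold carries above bit 15 back in (end-around carry)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def computed_checksum(words):
    """One's complement of the one's complement sum of the data words."""
    return ~fold16(sum(words)) & 0xFFFF


def naive_check(words, field):
    """Assumed shortcut: recompute and compare for exact equality."""
    return computed_checksum(words) == field


def rfc_check(words, field):
    """RFC-style check: data plus checksum field must sum to FFFFh."""
    return fold16(sum(words) + field) == 0xFFFF


if __name__ == "__main__":
    data = [0xFFFE, 0x0001]   # computed checksum is 0000h
    field = 0xFFFF            # what the hardware puts on the wire instead
    print("naive:", naive_check(data, field))
    print("rfc:  ", rfc_check(data, field))
```

The naive comparison rejects a perfectly good packet, while the RFC-style sum accepts it.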

 

Time for the big review:

1)  In the one's complement world of stack checksums, 0000h and FFFFh are both zero.

2)  Older and stripped down stacks might get that wrong, so watch out.

3)  Thanks for using Intel networking products.
