This article is out of date!!  Use this article instead:  http://communities.intel.com/community/wired/blog/2012/03/16/even-more-ecc-updates

 

     ECC, FCC, GCC: it's all one big soup of acronyms, most of which you would think don't apply to networking.  Sure, GCC compiles our drivers under Linux*, and the FCC might someday monitor the interwebs more (but doesn't today), but ECC still stands out as the one that might not apply.

But it does.

     ECC has many real "definitions": error correcting circuits, error correcting code, or error correction code, but they all mean the same thing: a way to keep data intact within the chip's memory.  ECC uses a special algorithm to store extra check bits alongside a block of data bits, with enough detail to recover from a single-bit error in the protected data.  It will not only detect single-bit errors, but will transparently correct them on the fly.  Double-bit errors cannot be corrected; they are flagged as an error and the device will try to get software's attention about it.  Related to ECC is parity.  Parity keeps track of whether the number of 1 bits in the data is even or odd.  Should that parity change while the data is in the chip memory, it will be flagged as an error.  Since you can't tell which bit went rogue, the error can't be corrected, which makes parity a poor man's protection.  Also, if an even number of bits change, the parity check will miss it entirely.  (Warning!  HTML Table!)

 

 

Product                  Packet Buffer        Manageability
                         (in-band traffic)    (out-of-band traffic)

82546                    None                 None
82571                    ECC                  Parity
82580 **new**            ECC                  ECC
82575 and 82576          ECC                  ECC
82598 and 82599          ECC                  ECC
82573 / 82574 / 82583    None                 None
82559 to 82551           None                 None
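
For the curious, here is a rough sketch of the difference between parity and ECC as described above the table. It is only an illustration and makes no claim about how the silicon actually does it: the real hardware's word width and code are not stated here, so this assumes a tiny extended Hamming code over 4 data bits, and every function name is made up for the example.

```python
def parity_bit(bits):
    """Return the even-parity bit for a list of 0/1 values."""
    return sum(bits) % 2


def parity_check(bits, stored_parity):
    """Return True if the data still matches the stored parity bit."""
    return sum(bits) % 2 == stored_parity


def secded_encode(data):
    """Encode 4 data bits as an 8-bit extended Hamming codeword (illustrative).

    Layout: index 0 is an overall parity bit, indexes 1..7 are Hamming(7,4):
    [p0, p1, p2, d1, p4, d2, d3, d4]
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    word = [0, p1, p2, d1, p4, d2, d3, d4]
    word[0] = sum(word) % 2  # make the total number of 1s even
    return word


def secded_decode(word):
    """Return (data, status); status is "ok", "corrected" or "double_error"."""
    w = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if w[pos]:
            syndrome ^= pos  # XOR of set positions is 0 for a clean codeword
    overall_ok = sum(w) % 2 == 0
    if syndrome == 0 and overall_ok:
        status = "ok"
    elif not overall_ok:
        # Odd number of flips: treat as a single-bit error and repair it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        w[syndrome] ^= 1
        status = "corrected"
    else:
        # Nonzero syndrome but overall parity looks fine: two bits flipped.
        return None, "double_error"
    return [w[3], w[5], w[6], w[7]], status


if __name__ == "__main__":
    data = [1, 0, 1, 1]

    # Parity: one flipped bit is noticed, two flipped bits slip through.
    p = parity_bit(data)
    print(parity_check([0, 0, 1, 1], p))  # one flip
    print(parity_check([0, 1, 1, 1], p))  # two flips

    # ECC: one flipped bit is repaired, two flipped bits are flagged.
    word = secded_encode(data)
    one_flip = list(word)
    one_flip[5] ^= 1
    print(secded_decode(one_flip))
    two_flips = list(word)
    two_flips[2] ^= 1
    two_flips[6] ^= 1
    print(secded_decode(two_flips))
```

The overall parity bit at index 0 is what lets this sketch tell a single-bit error (fix it) from a double-bit error (raise a flag), which mirrors the correct-one, detect-two behavior described above.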

 


     Both ECC and parity have a basic limitation: if enough bits go bad at once, the damaged data can still look okay.  ECC is far more resistant to this than parity.  We try to make sure bad things don't happen to your data, but they still might.  And while the hardware will try to tell you when data does go bad, sometimes it won't notice.  That's why Intel lawyers get edgy around articles like this.  Multiple-bit errors are very rare, and will probably cause other problems on the machine.  Data integrity isn't achieved with a single safety net.  If you want to guarantee your data, use a multi-layered approach, since it's unlikely that all of the layers will fail at once.
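
Here is a small sketch of why layering helps. It is an assumption-laden toy, not a description of any Intel part: layer 1 is a per-byte parity bit, layer 2 is a CRC-32 over the whole payload (standing in for whatever end-to-end check a higher layer might use), and the helper names are hypothetical.

```python
import zlib


def byte_parity(value):
    """Return 0 if the byte has an even number of 1 bits, else 1."""
    return bin(value).count("1") % 2


def protect(payload):
    """Record per-byte parity (layer 1) and a CRC-32 of the payload (layer 2)."""
    return [byte_parity(b) for b in payload], zlib.crc32(payload)


def check(payload, parities, crc):
    """Return (parity_layer_ok, crc_layer_ok) for a possibly damaged payload."""
    parity_ok = all(byte_parity(b) == p for b, p in zip(payload, parities))
    crc_ok = zlib.crc32(payload) == crc
    return parity_ok, crc_ok


if __name__ == "__main__":
    payload = bytes([0x12, 0x34, 0x56, 0x78])
    parities, crc = protect(payload)

    damaged = bytearray(payload)
    damaged[2] ^= 0b00000011  # two bits flip inside the same byte

    # Parity sees an even number of flips and is fooled; the CRC is not.
    print(check(bytes(damaged), parities, crc))
```

In this toy case the two-bit flip sails past the parity layer and is caught by the second layer, which is the whole point of not relying on a single safety net.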

Big finish!
1)  ECC can help you understand what happens to your data
2)  Our more recent products support it
3)  Thanks for reading the blog