12 Replies Latest reply on Oct 29, 2012 3:09 PM by mark_h_@intel

    OFED Drivers for 10Gig-E cards?

    Eric Theriault

      Hi--

       

       

      I've read that the OFED environment supports 10Gig-E, but that I need plug-in drivers to extend the functionality.  Do all of Intel's 10Gig-E cards support OFED drivers on Linux, or only a subset?  If the latter, which ones are supported?

       

      For context, I'd like to compare SDP over Infiniband to 10Gig-E performance for our application.  If there is another type of environment that you recommend instead, please let me know.  Thanks!

       

       

       

      Eric

        • 1. Re: OFED Drivers for 10Gig-E cards?
          mark_h_@intel

          The adapters you will want to look at for comparison are the NetEffect™ Ethernet Server Cluster Adapters. OFED support is available for those adapters. The adapters are available for CX4, Direct Attach (DA) twinax copper, or SFP+ SR optics.

           

          Once you get the adapters you want, get the latest drivers at http://downloadcenter.intel.com/Detail_Desc.aspx?DwnldID=20867.

           

          Mark H

          • 2. Re: OFED Drivers for 10Gig-E cards?
            Eric Theriault

            Hi--

             

             

             

            Thanks Mark for the answer.  I've purchased two of these cards and have installed them into two CentOS 6.3 machines.  The software installed without a hitch.  Once it was installed, I did manual configuration similar to what I had done with some Infiniband cards we had; however, the two machines only intermittently speak to each other.

             

            Basically, what I did to configure them after the software was installed was to execute an "ifconfig ethX 192.168.168.X netmask 255.255.255.0" on both of the machines.  When I ping the other machine, sometimes I'm lucky and I get a packet over, but usually it experiences 100% packet loss ("796 packets transmitted, 0 received, +597 errors, 100% packet loss, time 795909ms").

             

            Looking at TCPDump on those interfaces, I see a bunch of these:

             

            14:13:39.673494 STP 802.1w, Rapid STP, Flags [Proposal], bridge-id 8000.78:fe:3d:57:98:ae.820c, length 43

             

            and a few of these:

             

            14:13:55.891721 ARP, Request who-has 192.168.168.43 tell 192.168.168.42, length 28

             

            If I look at the arp table, the HWaddress was originally unknown.  If I force it to a hardware address and then ping, I get:

             

            === ping

            PING 192.168.168.42 (192.168.168.42) 56(84) bytes of data.

            64 bytes from 192.168.168.42: icmp_seq=22 ttl=64 time=0.145 ms

            64 bytes from 192.168.168.42: icmp_seq=23 ttl=64 time=0.145 ms

            64 bytes from 192.168.168.42: icmp_seq=29 ttl=64 time=0.146 ms

            ...

            --- 192.168.168.42 ping statistics ---

            33 packets transmitted, 3 received, 90% packet loss, time 32413ms

            ===

             

            And on the other machine I see some packets getting there:

             

            === tcpdump

            [eyt@machine2 ~]$ sudo /usr/sbin/tcpdump -i eth4

            tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

            listening on eth4, link-type EN10MB (Ethernet), capture size 65535 bytes

            14:17:46.411535 STP 802.1w, Rapid STP, Flags [Proposal], bridge-id 8000.78:fe:3d:57:98:ae.820c, length 43

            14:17:48.359763 STP 802.1w, Rapid STP, Flags [Proposal], bridge-id 8000.78:fe:3d:57:98:ae.820c, length 43

            14:17:50.344836 STP 802.1w, Rapid STP, Flags [Proposal], bridge-id 8000.78:fe:3d:57:98:ae.820c, length 43

            14:17:52.176824 STP 802.1w, Rapid STP, Flags [Proposal], bridge-id 8000.78:fe:3d:57:98:ae.820c, length 43

            14:17:53.621149 IP 192.168.168.43 > 192.168.168.42: ICMP echo request, id 14618, seq 22, length 64

            14:17:53.621167 IP 192.168.168.42 > 192.168.168.43: ICMP echo reply, id 14618, seq 22, length 64

            14:17:54.152565 STP 802.1w, Rapid STP, Flags [Proposal], bridge-id 8000.78:fe:3d:57:98:ae.820c, length 43

            14:17:54.621008 IP 192.168.168.43 > 192.168.168.42: ICMP echo request, id 14618, seq 23, length 64

            14:17:54.621019 IP 192.168.168.42 > 192.168.168.43: ICMP echo reply, id 14618, seq 23, length 64

            14:17:56.035934 STP 802.1w, Rapid STP, Flags [Proposal], bridge-id 8000.78:fe:3d:57:98:ae.820c, length 43

            14:17:57.879230 STP 802.1w, Rapid STP, Flags [Proposal], bridge-id 8000.78:fe:3d:57:98:ae.820c, length 43

            14:17:59.838295 STP 802.1w, Rapid STP, Flags [Proposal], bridge-id 8000.78:fe:3d:57:98:ae.820c, length 43

            14:18:00.622136 IP 192.168.168.43 > 192.168.168.42: ICMP echo request, id 14618, seq 29, length 64

            14:18:00.622147 IP 192.168.168.42 > 192.168.168.43: ICMP echo reply, id 14618, seq 29, length 64

            14:18:01.658207 STP 802.1w, Rapid STP, Flags [Proposal], bridge-id 8000.78:fe:3d:57:98:ae.820c, length 43

            14:18:03.542941 STP 802.1w, Rapid STP, Flags [Proposal], bridge-id 8000.78:fe:3d:57:98:ae.820c, length 43

            14:18:05.543580 STP 802.1w, Rapid STP, Flags [Proposal], bridge-id 8000.78:fe:3d:57:98:ae.820c, length 43

            ===

             

            But then moments after, I can't ping it anymore.  I don't see any interesting messages in the syslog of the machines.

             

            This feels like a configuration issue of some sort; however, I was unable to find further documentation.  Any thoughts?  Thanks.

             

             

             

            Eric

            • 3. Re: OFED Drivers for 10Gig-E cards?
              mark_h_@intel

              Can you verify which version of OFED you are using? 1.5.4.1? You might also post whatever details you get from Ethtool on the Ethernet interfaces, e.g. ethtool -i.

               

              The messages make me think that the issue is with the configuration of RSTP on the switch, so that you are being blocked at the switch ports. Is there anything in the switch logs that might offer some clues?
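
One rough way to see how much of a capture is spanning-tree traffic versus ARP and ICMP is to tally the frame types in tcpdump's text output. The sketch below assumes the line format shown in the capture earlier in this thread; `summarize_capture` is a hypothetical helper name, not part of tcpdump or OFED. Since BPDUs are sent by bridges, a steady STP count on a link would hint that a switch is in the path.

```shell
#!/bin/sh
# Hypothetical helper: tally frame types in tcpdump text output read from stdin.
# Assumes the default (non-verbose) tcpdump line format seen in this thread.
summarize_capture() {
    awk '
        / STP /             { stp++ }   # spanning-tree BPDUs from a bridge
        / ARP, /            { arp++ }   # ARP requests and replies
        /ICMP echo request/ { req++ }   # pings sent
        /ICMP echo reply/   { rep++ }   # pings answered
        END {
            printf "stp=%d arp=%d echo_request=%d echo_reply=%d\n", stp, arp, req, rep
        }'
}

# Example usage (eth4 is the interface name used in this thread):
#   sudo /usr/sbin/tcpdump -i eth4 -c 50 2>/dev/null | summarize_capture
```

Many echo requests with few replies, alongside unanswered ARP requests, would be consistent with a port that is intermittently blocked rather than with a driver fault, though the switch logs remain the more direct evidence.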

               

              I don't have any experience in troubleshooting this type of issue, so I am looking for someone else who might be able to come up with some ideas.

               

              Mark H

              • 4. Re: OFED Drivers for 10Gig-E cards?
                Eric Theriault

                Hi--

                 

                 

                 

                I've installed 1.5.4.1.  The two machines are directly connected to each other.  I am trying to get these machines downgraded to CentOS 6.2 to see whether that changes the problem.

                 

                ethtool returns the following on both machines:

                 

                ===

                /sbin/ethtool -i eth4

                driver: iw_nes

                version: 1.5.0.0

                firmware-version: 76.85

                bus-info: 0000:04:00.0

                ===

                 

                Another tool which was fruitful in IB-land was ibv_devinfo which says:

                 

                ===

                hca_id: nes0

                        transport:                      iWARP (1)

                        fw_ver:                         76.85

                        node_guid:                      0012:5503:52d0:0000

                        sys_image_guid:                 0012:5503:52d0:0000

                        vendor_id:                      0x1255

                        vendor_part_id:                 256

                        hw_ver:                         0x5

                        board_id:                       NES020 Board ID

                        phys_port_cnt:                  1

                                port:   1

                                        state:                  PORT_ACTIVE (4)

                                        max_mtu:                4096 (5)

                                        active_mtu:             1024 (3)

                                        sm_lid:                 0

                                        port_lid:               1

                                        port_lmc:               0x00

                                        link_layer:             Ethernet

                 

                ===

                 

                The only difference on the other machine is the node_guid and sys_image_guid, which are both "0012:5503:5324:0000".
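
For comparing the two hosts quickly, the fields of interest can be pulled out of the ibv_devinfo output with a short filter. This is only a sketch: `devinfo_summary` is a hypothetical helper name, and it assumes a single device with a single port, as in the output above (with several devices it would report only the last one).

```shell
#!/bin/sh
# Hypothetical helper: reduce ibv_devinfo text (stdin) to one summary line.
# Assumes one HCA with one port, matching the output posted in this thread.
devinfo_summary() {
    awk '
        $1 == "hca_id:"    { hca = $2 }        # device name, e.g. nes0
        $1 == "transport:" { transport = $2 }  # e.g. iWARP or InfiniBand
        $1 == "state:"     { state = $2 }      # e.g. PORT_ACTIVE
        END { printf "%s transport=%s state=%s\n", hca, transport, state }'
}

# Example usage:
#   ibv_devinfo | devinfo_summary
```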

                 

                Thanks for any insight or ideas.

                 

                 

                 

                Eric

                • 5. Re: OFED Drivers for 10Gig-E cards?
                  Eric Theriault

                  Hi--

                   

                  Just to close the loop -- it turns out that the two machines weren't directly connected together and that the issue was something on the switch configuration.  It has been addressed and so my basic validation appears to work -- I'm now going to start performance testing this configuration and so I'll let you know if this observation changes.  Thanks for your assistance.

                   

                  Eric

                  • 6. Re: OFED Drivers for 10Gig-E cards?
                    mark_h_@intel

                    Hi Eric,

                    Thanks for the update. I'm looking forward to your testing results.

                     

                    Mark H

                    • 7. Re: OFED Drivers for 10Gig-E cards?
                      Eric Theriault

                      Hi Mark--

                       

                      How do you configure SDP for these cards?  I had installed choice #4 (see below) and I do not get a libsdp.so or any kernel modules which suggest SDP, and the "/usr/bin/sdpnetstat" OFED utility is not installed.

                       

                      4) Install Intel driver, RDMA verbs, and MPI components
                      Provides accelerated network,
                      Intel Multicast Acceleration, DAPL, and
                      MPI capability for clusters.

                       

                      In the Infiniband world, the installer would have installed an ib_sdp kernel module and provided a libsdp.so.  From there, I could add "LD_PRELOAD=libsdp.so" before the server and client commands and be able to create an SDP connection between them.
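
For reference, the InfiniBand-side check described above can be sketched as follows. `module_listed` is a hypothetical helper name, and `./server` and `./client` stand in for the real application binaries; none of this implies that SDP is available on the NetEffect cards.

```shell
#!/bin/sh
# Hypothetical helper: read lsmod-style output on stdin and succeed
# only if the named kernel module appears in the first column.
module_listed() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Example usage on an InfiniBand host where SDP was installed:
#   /sbin/lsmod | module_listed ib_sdp && echo "ib_sdp is loaded"
#   /sbin/ldconfig -p | grep libsdp        # is the preload library present?
#   LD_PRELOAD=libsdp.so ./server          # placeholder program names
#   LD_PRELOAD=libsdp.so ./client
```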

                       

                      Any insight or pointers?  Thanks!

                       

                      Eric

                      • 8. Re: OFED Drivers for 10Gig-E cards?
                        mark_h_@intel

                        I have not been involved directly with any OFED setups, so I am passing on second-hand information that I am getting when I ask the Intel experts. With the NetEffect adapters, you have RDMA (iWARP) support that would help your performance. There is no SDP support in the NetEffect adapter. If you are not going to use RDMA, the X520 adapters might perform better in comparison tests with InfiniBand.

                         

                        If I find out anything else, I will pass it on.

                         

                        Mark H

                        • 9. Re: OFED Drivers for 10Gig-E cards?
                          Eric Theriault

                          Thanks Mark.  I guess my question is less about SDP and more about RDMA.

                           

                          Under Infiniband, my understanding is that RDMA is only exposed in a few programming models, including the IB-Verbs, MPI, and SDP.  Conceptually, SDP can be thought of as a TCP stack where, if the source and destination ports are capable of the SDP protocol, it will use RDMA to execute the transfer (glossing over lots of detail).  The real beauty of SDP is that it doesn't require changes to the applications to be used.

                           

                          With that in mind, I was expecting that iWARP would somehow need to know who is RDMA capable and who isn't, either explicitly (via configuration) or implicitly.  If it is explicit, I'm clearly not configuring this so I would love to know how to do it.  If it is implicit, then I would like to know how it is being created and how I could verify that my two hosts are actually iWARPing away.

                           

                          So far, my performance tests aren't great (i.e., worse than our 82599EB cards), so I suspect that I'm doing something wrong -- I'm just trying to figure out what that could be.

                           

                          Eric

                          • 10. Re: OFED Drivers for 10Gig-E cards?
                            mark_h_@intel

                            I checked with the expert that I know, and I am told that SDP is not supported by the NetEffect adapter, so no matter how the adapter is setting up RDMA connections, SDP is not going to work.

                             

                            To check if the NE020 cards are ‘iwarping’ (i.e. running RDMA traffic), you can look at the extended counters in ethtool -S. RDMA is running if QPs created/destroyed increase.

                            ethtool -S ethX

                            CreateQPs: ####

                            DestroyQPs: ####
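
As a rough way to apply this, one could snapshot the counters before and after a test run and compare them. The sketch below assumes the counter names shown above appear verbatim in the ethtool -S output; `qp_counter` and `rdma_active` are hypothetical helper names, not part of ethtool.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one named counter from
# `ethtool -S` text supplied on stdin (lines look like "  Name: value").
qp_counter() {
    awk -F': *' -v name="$1" '
        { key = $1; sub(/^[ \t]+/, "", key) }   # strip leading indentation
        key == name { print $2; exit }'
}

# Hypothetical helper: succeed if CreateQPs grew between two saved snapshots.
rdma_active() {
    before=$(qp_counter CreateQPs < "$1")
    after=$(qp_counter CreateQPs < "$2")
    [ "${after:-0}" -gt "${before:-0}" ]
}

# Example usage (eth4 is the interface name used in this thread):
#   /sbin/ethtool -S eth4 > before.txt
#   ... run the workload ...
#   /sbin/ethtool -S eth4 > after.txt
#   rdma_active before.txt after.txt && echo "QPs were created during the run"
```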

                             

                            I hope this helps.

                             

                            Mark H

                            • 11. Re: OFED Drivers for 10Gig-E cards?
                              Eric Theriault

                              Hi Mark--

                               

                              Thanks for the response.  Based on your answer, it would seem that iWARP would "just happen"; is that right?  Sadly, those values are presently all 0:

                               

                              $ sudo /sbin/ethtool -S eth4 | grep QP

                                   ModifyQP Timeouts: 0

                                   CreateQPs: 0

                                   SW DestroyQPs: 0

                                   DestroyQPs: 0

                              $

                               

                              Is there any information to suggest why this could be the case?  Please let me know.  Thanks.

                               

                              Eric

                              • 12. Re: OFED Drivers for 10Gig-E cards?
                                mark_h_@intel

                                In order to run RDMA, your application must be RDMA-aware, either by going through an MPI library or by being written to take advantage of RDMA connections. UNH has a training course on writing applications for RDMA that might be helpful: https://www.openfabrics.org/resources/training/training-offerings.html. If you are running an RDMA application on the NetEffect adapter and need help configuring your environment, a good place to start would be the RDMA latency and bandwidth tests packaged in the OFED distribution.
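
As an illustration only, a basic RDMA connectivity check between two hosts could use rping, a ping-pong utility from the librdmacm examples. This assumes rping is present in your OFED install; the bandwidth and latency tools and their flags vary by version, so check their help output before relying on them. `rping_cmd` is a hypothetical helper that prints the command line for each side rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) an rping command line for one side.
# Assumes rping from the librdmacm utilities is installed on both hosts.
rping_cmd() {
    role=$1
    addr=$2
    case "$role" in
        server) echo "rping -s -a $addr -v" ;;        # listen on the server's own address
        client) echo "rping -c -a $addr -v -C 10" ;;  # connect to the server, 10 iterations
        *) echo "usage: rping_cmd server|client ADDR" >&2; return 1 ;;
    esac
}

# Example usage with the addresses from this thread:
#   on 192.168.168.42:  $(rping_cmd server 192.168.168.42)
#   on 192.168.168.43:  $(rping_cmd client 192.168.168.42)
```

If the exchange completes, one would expect the CreateQPs counter reported by ethtool -S to have increased on both hosts.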

                                 

                                I am going to send you an email to continue the troubleshooting.

                                 

                                Mark H