1 Reply Latest reply on Oct 18, 2012 4:47 PM by mark_h_@intel

    Possible ALB/RLB Bug with Microsoft Failover Clustering in 2008R2

    kylebrandt

      A few of us on Server Fault (and at Stack Exchange, the company that runs that site and Stack Overflow) seem to have run into a bug with Microsoft failover clustering when ALB/RLB is enabled. What seems to happen is that sometimes, after the virtual IP (VIP, aka floating IP) is removed from one server and transferred to another, the server that no longer has the VIP keeps sending out ARP messages claiming that it still has it (I saw this behavior in Wireshark).
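
A minimal sketch of how the symptom could be flagged from capture data, assuming the ARP frames have already been exported from Wireshark as (timestamp, sender MAC, sender IP) tuples. The function name `hosts_claiming_ip` and the `mac_to_host` mapping are hypothetical, introduced only for illustration. MACs are grouped by host because a team with RLB enabled may advertise more than one member MAC for the same IP, so two MACs alone do not indicate a conflict.

```python
from collections import defaultdict


def hosts_claiming_ip(observations, mac_to_host, since):
    """Return {ip: [hosts]} for IPs advertised by more than one host.

    observations: iterable of (timestamp, sender_mac, sender_ip) tuples
                  taken from ARP frames (hypothetical export format).
    mac_to_host:  dict mapping each team member MAC (lowercase) to the
                  cluster node that owns it (assumed to be known).
    since:        only consider frames at or after this timestamp,
                  e.g. the moment the VIP was moved.
    """
    hosts_by_ip = defaultdict(set)
    for timestamp, mac, ip in observations:
        if timestamp < since:
            continue
        mac = mac.lower()
        # Unknown MACs are kept visible instead of being dropped.
        host = mac_to_host.get(mac, "unknown:" + mac)
        hosts_by_ip[ip].add(host)
    # An IP is suspicious only if more than one host advertises it.
    return {ip: sorted(hosts)
            for ip, hosts in hosts_by_ip.items()
            if len(hosts) > 1}
```

If the old node keeps showing up for the VIP when `since` is set to well after the failover, that would match the behavior described above.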

       

      You can read more about what has been going on in the Server Fault question "Windows Server 2008 R2 - Cluster failover and strange gratuitous ARP behavior". The difficulty for us is that, according to our theory, the problem lies somewhere between failover clustering and teaming, so we are caught between two very large companies. We are reaching out in the hope of getting the attention of developers at one of them. Any ideas?

        • 1. Re: Possible ALB/RLB Bug with Microsoft Failover Clustering in 2008R2
          mark_h_@intel

          Hi Kyle,

          Windows Server cluster failover and adapter teaming are tested together by Intel, and this configuration should work. I notice that the driver versions recorded in the thread on serverfault.com are a few years old. Are you using the latest drivers and ANS (teaming driver)? You can download the latest drivers and software, version 17.4, from http://downloadcenter.intel.com/Detail_Desc.aspx?DwnldID=21228.

           

          The thread on serverfault.com appears to be related to using IPv4, so I will assume that is the IP version. If you are using IPv6, let me know.

           

          What are the models of your Intel adapters? This might not matter since the issue seems related to teaming, but then again, there could be something specific to the NIC or NIC driver.

           

          The whitepaper at http://www.intel.com/network/connectivity/resources/doc_library/white_papers/254031.pdf explains details of how the ALB/RLB team works. For receive load balancing, the driver uses ARPs to balance responses to the MAC addresses of the team members. If upgrading to the latest drivers (from the 17.4 package) does not resolve this issue, try turning off the Receive Load Balancing (RLB) option that is available as part of the ALB team configuration. Let me know if disabling RLB makes the problem go away.
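
To compare which MAC address is advertised for the virtual IP before and after disabling RLB, the ARP payloads from a capture can be decoded. The following is a minimal sketch, assuming the 28-byte Ethernet/IPv4 ARP payload has already been extracted from each frame. The function name `parse_arp` is hypothetical, and this is not how the teaming driver itself processes ARP.

```python
import socket
import struct

# Ethernet/IPv4 ARP payload: htype, ptype, hlen, plen, oper,
# sender MAC, sender IP, target MAC, target IP (28 bytes total).
ARP_FORMAT = "!HHBBH6s4s6s4s"
ARP_LENGTH = struct.calcsize(ARP_FORMAT)


def format_mac(raw):
    """Render 6 raw bytes as a lowercase colon-separated MAC."""
    return ":".join("{:02x}".format(b) for b in raw)


def parse_arp(payload):
    """Decode an Ethernet/IPv4 ARP payload into a dict of fields."""
    if len(payload) < ARP_LENGTH:
        raise ValueError("ARP payload too short")
    (htype, ptype, hlen, plen, oper,
     sha, spa, tha, tpa) = struct.unpack(ARP_FORMAT, payload[:ARP_LENGTH])
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        raise ValueError("not an Ethernet/IPv4 ARP payload")
    return {
        "oper": oper,  # 1 = request, 2 = reply
        "sender_mac": format_mac(sha),
        "sender_ip": socket.inet_ntoa(spa),
        "target_mac": format_mac(tha),
        "target_ip": socket.inet_ntoa(tpa),
        # A gratuitous ARP announces the sender's own address,
        # so sender IP and target IP are the same.
        "gratuitous": spa == tpa,
    }
```

Collecting `sender_mac` for every gratuitous frame whose `sender_ip` is the virtual IP, once with RLB on and once with it off, would show whether the stale announcements come from a team member on the node that gave up the address.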

           

          Mark H