27 Replies · Latest reply on Sep 4, 2012 4:17 PM by mark_h_@intel

    VMQ on a team which is trunked - 10 GbE

    martius

      Hi,

       

      I'm running into a strange situation when I enable VMQ on my two NIC team used for trunking in Hyper-V.

       

I use VMLB as the teaming type. When I do a Live Migration with VMQ enabled on the team I lose more than the usual 1 ping to a VM.

       

The NICs that I'm using are: Intel(R) Ethernet Server Adapter X520-2

      Driver version is: 16.6 from Intel (http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=18725&ProdId=3153&lang=eng)

       

The servers are Dell PowerEdge R810s and they are connected to Dell PowerConnect 8024F switches.

       

When I disable VMQ everything is as it should be. I lose at most 1 ping during a Live Migration.
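(For reference, disabling VMQ per adapter can typically be done from the adapter's Advanced properties tab, or via the standardized `*VMQ` registry keyword under the network class key. This is a sketch, not driver-specific guidance: the `00NN` instance subkey varies per adapter, so match on `DriverDesc` first, and disable/re-enable the NIC or reboot afterwards.)

```
HKLM\SYSTEM\CurrentControlSet\Control\Class\{4d36e972-e325-11ce-bfc1-08002be10318}\00NN
    DriverDesc = "Intel(R) Ethernet Server Adapter X520-2"   <- identify the right 00NN instance
    *VMQ       = 0                                           <- 0 = VMQ disabled, 1 = enabled
```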

       

In the release notes (http://downloadmirror.intel.com/18725/eng/readme.txt) there is some text regarding teaming and VMQ:

       

      In the Teaming Known Issues

      --------------------

       

        Teaming VMQ-enabled devices may disable VMQ on the NICs

        -------------------------------------------------------

        If you create a team out of VMQ-enabled devices, VMQ may become disabled on

        all devices in the team. To work around this issue, create the team first,

        then enable VMQ on an adapter. If all adapters in the team are capable of VMQ,

        VMQ will become enabled on the team.

       

I tried creating the team with VMQ both disabled and enabled. No difference.

       

      Any tips?

       

      Regards

        • 1. Re: VMQ on a team which is trunked - 10 GbE
          Patrick_Kutch

          Thanks for using Intel Ethernet and visiting our forum.

           

          I've sent your question off to our virtualization team and will post a response when they get back to me.

          • 2. Re: VMQ on a team which is trunked - 10 GbE
            martius

            Any answer yet?

             

I'm now getting the same behavior with 1 GbE Intel Quad ET cards.

             

Same driver, same symptoms.

            • 3. Re: VMQ on a team which is trunked - 10 GbE
              Patrick_Kutch

              The eval team is going to try to reproduce this issue in our lab soon.

              • 4. Re: VMQ on a team which is trunked - 10 GbE
                Blackduke77

Hi, I have just spent weeks trying to fix an issue with my two three-node clusters where servers could not communicate between nodes. I have had cases open with both Microsoft and Dell (I use Dell M610s), to no avail.

                 

I found this post tonight and it seems to have resolved my issue. I was using a single NIC (no team) with the Hyper-V profile and it did not work. I have turned off VMQ on all nodes as per this post and it seems to have resolved my issue. I am using driver version 15.5.2.

                 

Is there a fix coming for this soon? I could really use VMQ.

                 

                thanks

                • 5. Re: VMQ on a team which is trunked - 10 GbE
                  Patrick_Kutch

Our validation team is still trying to reproduce the issue you reported. Thus far they have been unable to, and are requesting some additional information:

                   

“Is everything required for Live Migration (VM traffic, LM traffic, iSCSI traffic, management traffic) running on the trunk, or does he have separate networks/connections for management and iSCSI?”

                   

                  If you could provide details on your configuration it may help us to make progress.

                   

                  thanx,

                   

                  Patrick

                  • 6. Re: VMQ on a team which is trunked - 10 GbE
                    Blackduke77

I have separate network cards for the other networks. I had two dedicated Intel NICs for VM traffic and could reproduce the issue both with a team and without.

                     

I am more than happy to work with your team remotely if they want to run any tests or see it happening on my network.

                    • 7. Re: VMQ on a team which is trunked - 10 GbE
                      brycel

                      Hello,

                       

                      We are experiencing the same issue across two different types of NIC.

                       

The first issue we have with VMQ is on our 10Gb Intel(R) Ethernet Server Adapter X520-2. When we do a migration to a target host that doesn't have any VMs currently running on it, we get the network drop-out as discussed here. If the target has running VMs, then the network traffic doesn't drop, which is very bizarre!

                       

We also have some 1Gb Intel(R) Gigabit ET Quad Port Server Adapters, and when VMQ is enabled the VMs always have network drops when migrating, even if the target hosts have running VMs.

                       

I have opened a ticket with Microsoft and they have been trying the basics around getting VMQ set up. I don't think they understand that VMQ is running and working; the problem is the way the network stack is dealing with it during a migration.

                       

I hope this helps isolate the issue and we can get a fix soon, as we are having to disable VMQ on certain setups to prevent any issues.

                       

                      Regards,

                      Bryan

                      • 8. Re: VMQ on a team which is trunked - 10 GbE
                        martius

                        Hi,

                         

                        I have 2 Hyper-V clusters with the same issue.

                         

I test everything with a VM with at least 16GB of RAM, because then the Live Migration period is long enough to reproduce the issue.
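A simple way to quantify the loss during that window is to run a continuous ping against the VM for the duration of the migration and count the timeouts. A rough sketch of that measurement (the VM address and the English-locale "Request timed out" marker are assumptions; adjust both for your environment):

```python
# Sketch: ping the VM once a second during the Live Migration window and
# count the probes that were dropped, based on the textual ping output.
import subprocess

def count_losses(ping_lines):
    """Count output lines that indicate a dropped probe."""
    drop_markers = ("Request timed out", "Destination host unreachable")
    return sum(1 for line in ping_lines
               if any(marker in line for marker in drop_markers))

def monitor(vm_ip="192.0.2.10", probes=120):
    # Windows ping: -n <count> sends one probe per second, so the loss
    # count approximates seconds of outage during the migration.
    out = subprocess.run(["ping", "-n", str(probes), vm_ip],
                         capture_output=True, text=True).stdout
    return count_losses(out.splitlines())

if __name__ == "__main__":
    print("lost pings during migration:", monitor())
```

Start it just before kicking off the Live Migration; with VMQ disabled the count should stay at 0 or 1, matching the behavior described above.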

                         

Basically the setup of cluster 1 is as follows:

                         

                        Windows Hyper-V Server 2008 R2 SP1

                         

                        3 x Dell R810 with:

                        2 x Intel Xeon E7 2860, 10 core

                        256 GB of RAM

                        4 x OnBoard Broadcom 5709C

                        3 x Dual Intel Ethernet Server Adapter X520-2

                        Intel driver version: 16.6

                        Broadcom driver: 14.4.8.4

                        Broadcom Management Software: 14.4.11.3

                         

                        Switches:

                        2 x Dell PowerConnect 8024F

                        2 x Dell PowerConnect 6248

                         

The Intel Ethernet Server Adapter X520-2 cards are connected to the PowerConnect 8024F switches. These switches are "stacked" by use of VRRP. I have tested Live Migration with VMQ enabled and disabled, with the switches both stacked and non-stacked. Same result.

                         

First test was as follows:

1. Created a team of two Intel Ethernet Server Adapter X520-2 adapters in VMLB

2. Created External Virtual Switches in Hyper-V Manager, no parent partition connection

3. Configured all the settings as advised here: http://blogs.technet.com/b/cedward/archive/2011/04/13/hyper-v-networking-optimizations-part-2-of-6-vmq.aspx

4. Did some tests with a VM between the 3 nodes, all with the same result: after 5-10% of the migration, pings start dropping and the RDP connection to the VM is lost.

                         

Second test was as follows:

1. Single Intel Ethernet Server Adapter X520-2

2. Created External Virtual Switches in Hyper-V Manager, no parent partition connection

3. Configured all the settings as advised here: http://blogs.technet.com/b/cedward/archive/2011/04/13/hyper-v-networking-optimizations-part-2-of-6-vmq.aspx

4. Did some tests with a VM between the 3 nodes, all with the same result: after 5-10% of the migration, pings start dropping and the RDP connection to the VM is lost.

                         

Third test:

1. Created a team of two Intel Ethernet Server Adapter X520-2 adapters in VMLB

2. Configured all the settings as advised here: http://blogs.technet.com/b/cedward/archive/2011/04/13/hyper-v-networking-optimizations-part-2-of-6-vmq.aspx

3. Created External Virtual Switches in Hyper-V Manager, no parent partition connection

4. Did some tests with a VM between the 3 nodes, all with the same result: after 5-10% of the migration, pings start dropping and the RDP connection to the VM is lost.

                         

Fourth test:

1. Single Intel Ethernet Server Adapter X520-2

2. Configured all the settings as advised here: http://blogs.technet.com/b/cedward/archive/2011/04/13/hyper-v-networking-optimizations-part-2-of-6-vmq.aspx

3. Created External Virtual Switches in Hyper-V Manager, no parent partition connection

4. Did some tests with a VM between the 3 nodes, all with the same result: after 5-10% of the migration, pings start dropping and the RDP connection to the VM is lost.

                         

In all tests the Live Migration itself was successful, BUT the RDP connection to the VM and pings were lost during the Live Migration process. Also, sometimes the RDP connection wasn't re-established after the Live Migration finished successfully.

                         

When disabling VMQ, in both the single NIC and the teamed NIC configuration, Live Migration was still successful, BUT the biggest difference was that the RDP connection stayed up and there was no ping loss during the Live Migration.

                         

As mentioned, I have a second cluster where I'm experiencing this issue. The main difference is that this environment only uses Intel Ethernet Server Adapter Quad ET cards. Same drivers, same OS. The hosts in this cluster are 2 Dell R710s, each with one Xeon X5670 (6 cores) and 64GB of RAM. Same test setups, same result: disable VMQ and Live Migration is successful in all areas.

                         

Hopefully this helps. If you have any more questions, please reply!

                        • 9. Re: VMQ on a team which is trunked - 10 GbE
                          martius

“Is everything required for Live Migration (VM traffic, LM traffic, iSCSI traffic, management traffic) running on the trunk, or does he have separate networks/connections for management and iSCSI?”


                          To answer the questions you asked.

                           

1. I have separate networks for Live Migration, Virtual Machine, iSCSI and Management traffic.

                           

                          • Live Migration Traffic: Single Intel Ethernet Server Adapter X520-2
• Virtual Machine Traffic: Two Intel Ethernet Server Adapter X520-2 ports, on different cards in different risers, in a VMLB team. This team is then used to create a virtual network without a connection to the parent partition.
• iSCSI Traffic: Two Intel Ethernet Server Adapter X520-2 ports, on different cards in different risers
                          • Parent Partition Traffic (management traffic): Two onboard Broadcom 5709C cards in a SFT team (1/4 and 3/4)
                          • Cluster / Heartbeat Traffic: Two onboard Broadcom 5709C cards in a SFT team (2/4 and 4/4)

                           

2. The Virtual Machine traffic connection is configured to be used as a trunk.

                           

3. Yes, I have separate networks/connections for Management and iSCSI.

                          • 10. Re: VMQ on a team which is trunked - 10 GbE
                            Blackduke77

Hi, I have the same setup as Martius.

                            • 11. Re: VMQ on a team which is trunked - 10 GbE
                              martius

Tried the latest driver, 16.7. Unfortunately no success.

                               

The issue starts when the Live Migration enters the "blackout" part of the migration. With a VM with 16GB of RAM I'm still getting 5-6 lost pings when executing a Live Migration.

                              • 12. Re: VMQ on a team which is trunked - 10 GbE
                                brycel

                                Any update on this?  The issue is quite a show stopper for VMQ and live migration.

                                • 13. Re: VMQ on a team which is trunked - 10 GbE
                                  Patrick_Kutch

Sorry for the long delay - I am afraid this issue got lost in the confusion of the holidays, plus the fact that I was away in Antarctica for a month.

                                   

                                  Back now and am happy to report that our eval team has reproduced this issue.  A defect has been filed and the engineering team has added it to their list of tasks to work on.  At this time I do not have an E.T.A. on a release; when I have one I will pass it along.

                                   

                                  Make sure to keep poking me to keep me honest though.

                                  • 14. Re: VMQ on a team which is trunked - 10 GbE
                                    brycel

That's good news that you have been able to recreate the issue. Hopefully the issue is not too complex to resolve.

                                     

                                    Will keep bumping this occasionally.
