6 Replies Latest reply on Oct 3, 2017 8:47 AM by browniecake Branched from an earlier discussion.

    Re: Intel X710 vs VMware ESX: crash and reboot - ESXi logs suddenly stop

    browniecake

      Hello, when you were running into this issue, did the ESXi logs suddenly stop, or did they provide any information? I may have a similar setup, but my ESXi logs go dark immediately with no helpful information.

        • 1. Re: Intel X710 vs VMware ESX: crash and reboot - ESXi logs suddenly stop
          Intel Corporation
          This message was posted on behalf of Intel Corporation

          Hi Browniecake,

            Thank you for posting in Wired Communities. Can you share more information about your setup, such as the OS version, NIC model, driver version, and any other relevant details?

          Thanks,
          Sharon
           

          • 2. Re: Intel X710 vs VMware ESX: crash and reboot - ESXi logs suddenly stop
            browniecake

            Nothing in the logs indicates why my host keeps rebooting.

             

             

            1x Dell R640 running ESXi 6.5. CPUs are 2x Gold 6138, with 768 GB RAM. The host seems to reboot during any View Horizon Suite linked-clone activity (vMotion, provisioning, cloning, rebooting).

             

             

            Firmware Inventory

            ESXi version 5969303

            Component                                                             FW Version
            Power Supply.Slot.1                                                   00.24.7D
            Power Supply.Slot.2                                                   00.24.7D
            Integrated Remote Access Controller                                   3.00.00.00
            Lifecycle Controller                                                  3.00.00.00
            Dell 64 Bit uEFI Diagnostics, version 4301, 4301X09, 4301.10          4301X09
            Dell OS Driver Pack, 17.05.21, A00                                    17.05.21
            OS COLLECTOR, 3.0, A00                                                3.0
            iDRAC Service Module Installer, 3.0.1, A00                            3.0.1
            System CPLD                                                           1.0.1
            Identity Module                                                       1.02
            Intel(R) Ethernet Converged Network Adapter X710 - 3C:FD:FE:29:24:C2  18.0.16
            Intel(R) Ethernet Converged Network Adapter X710 - 3C:FD:FE:29:24:C0  18.0.16
            Intel(R) Ethernet Converged Network Adapter X710 - 3C:FD:FE:27:A7:C2  18.0.16
            Intel(R) Ethernet 10G X710 rNDC - 24:6E:96:76:30:22                   18.0.16
            Intel(R) Ethernet 10G X710 rNDC - 24:6E:96:76:30:26                   18.0.16
            Intel(R) Ethernet 10G X710 rNDC - 24:6E:96:76:30:24                   18.0.16
            Intel(R) Ethernet 10G 4P X710 SFP+ rNDC - 24:6E:96:76:30:20           18.0.16
            Intel(R) Ethernet Converged Network Adapter X710 - 3C:FD:FE:27:A7:C0  18.0.16
            BIOS                                                                  1.1.7
            Dell HBA330 Mini                                                      13.17.03.00

             

             

            This is all the ESXi logs show, and the last entries are different for every reboot. The first thing the kernel logs during a boot appears to be "VMB: 112: mbMagic: 2badb002, mbInfo 0x101688", so for each reboot I am pasting the last vmkernel log entries before that line.

             

             

            1st reboot:

             

             

                 2017-09-29T09:07:47.463Z cpu25:67940 opID=cb5d515b)FDS: 586: Enabling IO coalescing on driver 'deltadisks' device '143a14-rer0970_2-checkpoint-digest-sesparse.vmdk'

                VMB: 112: mbMagic: 2badb002, mbInfo 0x101688

             

            2nd reboot:

             

             

                2017-09-28T20:17:06.483Z cpu14:79323)BC: 5028: Failed to flush 1 buffers of size 8192 each for object 'vmware.log' f530 28 3 59ca35aa 20549898 6e246686 60187696 2cc04984 652 0 0 0 0 0: No connection

                2017-09-28T20:17:06.484Z cpu14:79323)WARNING: BC: 6285: failed to flush buffer cache pool 3

                2017-09-28T20:17:06.484Z cpu14:79323)WARNING: UserFile: 1856: Error forcing buffered writes to disk: No connection

                VMB: 112: mbMagic: 2badb002, mbInfo 0x101688

             

             

            3rd reboot:

             

             

                2017-09-28T18:29:32.302Z cpu34:101276)Deactivating Daemon ESXShell.

                2017-09-28T18:29:32.705Z cpu9:101276)Daemon ESXShell deactivated.

                VMB: 112: mbMagic: 2badb002, mbInfo 0x101688

             

             

             

             

            4th reboot:

             

             

                2017-09-28T10:50:46.059Z cpu24:105395)Swap: vm 105376: 5175: Finish swapping in migration swap file. (faulted 0 pages, pshared 0 pages). Success.

                2017-09-28T10:50:46.209Z cpu78:68721)Config: 706: "SIOControlFlag2" = 0, Old Value: 1, (Status: 0x0)

                2017-09-28T11:57:56.702Z cpu29:66139)BC: 3571: Pool 2: Blocking due to no free buffers. nDirty = 26 nWaiters = 1

                VMB: 112: mbMagic: 2badb002, mbInfo 0x101688
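The manual extraction used for the excerpts above (take the last vmkernel entries before each boot marker) could be automated roughly as follows. This is only a sketch: the function name `last_entries_before_boot` and the `context` parameter are hypothetical, and it assumes, as observed in this post, that a line containing "VMB: 112: mbMagic" is the first thing logged on every boot.

```python
from collections import deque
from typing import Iterable, List

# Assumption: this marker is the first line logged on each boot,
# as observed in the post; other builds may log something different.
BOOT_MARKER = "VMB: 112: mbMagic"


def last_entries_before_boot(lines: Iterable[str], context: int = 3) -> List[List[str]]:
    """For each boot marker, return up to `context` non-blank lines that preceded it."""
    recent = deque(maxlen=context)  # rolling window of the most recent log lines
    tails: List[List[str]] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines introduced by copy and paste
        if BOOT_MARKER in line:
            if recent:  # ignore a marker with nothing logged before it
                tails.append(list(recent))
            recent.clear()
        else:
            recent.append(line)
    return tails


# Sample input taken from the "3rd reboot" excerpt in this thread.
sample = [
    "2017-09-28T18:29:32.302Z cpu34:101276)Deactivating Daemon ESXShell.",
    "",
    "2017-09-28T18:29:32.705Z cpu9:101276)Daemon ESXShell deactivated.",
    "VMB: 112: mbMagic: 2badb002, mbInfo 0x101688",
]
tails = last_entries_before_boot(sample, context=2)
for i, tail in enumerate(tails, start=1):
    print(f"reboot {i}:")
    for entry in tail:
        print("   ", entry)
```

For a real log, an open file handle can be passed in place of `sample`. The vmkernel log is typically found at /var/log/vmkernel.log on an ESXi host, but treat that path as an assumption for any particular setup. If the last entries before each marker are unrelated to one another, as they are here, that pattern is consistent with an abrupt reset rather than a software fault that gets logged.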

             

             

            Happy to provide any additional info. I checked iDRAC, but the only entry about the reboot says "System CPU Resetting." There are no cooling, thermal, or health warnings.

            • 3. Re: Intel X710 vs VMware ESX: crash and reboot - ESXi logs suddenly stop
              Intel Corporation
              This message was posted on behalf of Intel Corporation

              Hi Browniecake,

              Thank you for the information. Just to clarify, do you mean that the host reboots when the X710 is present? Is there any chance you could remove the NIC to isolate the issue?

              Thanks,
              Sharon


               

              • 4. Re: Intel X710 vs VMware ESX: crash and reboot - ESXi logs suddenly stop
                browniecake

                This has been confirmed to be a power issue. The UPS was overloaded, which caused this host to reboot unexpectedly.