5 Replies Latest reply on Jul 11, 2014 1:50 AM by cehrhardt

    Need help in decoding a reproducible machine check  exception

    cehrhardt

      I have a Fujitsu E734 laptop with a Haswell CPU and HD4600 graphics. The machine runs a rather unusual setup that includes virtualization (VT-x and VT-d) and graphics card passthrough. The guest is Windows 7 with the most recent Intel driver for the graphics card. Things mostly work fine, including 3D.

       

      However, if I plug in an external monitor, I get a machine check exception. The machine check details are reported through MSR 0x411 (MC4_STATUS), and the following value is returned when reading the MSR: 0xba00000011000402.

       

      AFAICS this decodes to an internal unclassified error (the 0x0402 part, bits 15:0), and the details of the error are given by the model-specific code 0x1100 (bits 31:16). However, this error code does not seem to be listed in the programmer's manual for the CPU model in question. CPUID reports Family 0x6 Model 0x3c.
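For readers following along, here is a minimal Python sketch of the decoding described above. It assumes the architectural IA32_MCi_STATUS layout from the Intel SDM (flag bits 63:57, model-specific code in bits 31:16, MCA error code in bits 15:0). The function names are made up for illustration, and the sketch does not attempt to interpret the model-specific value 0x1100.

```python
# Flag bits of IA32_MCi_STATUS (architectural layout, Intel SDM Vol. 3B).
MC_STATUS_FLAGS = {
    63: "VAL",    # register holds valid error information
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected
    60: "EN",     # reporting for this error was enabled
    59: "MISCV",  # MCi_MISC holds additional information
    58: "ADDRV",  # MCi_ADDR holds a valid address
    57: "PCC",    # processor context may be corrupt
}


def mc_status_msr(bank: int) -> int:
    """MSR address of IA32_MCi_STATUS for a given bank (0x401 + 4 * bank)."""
    return 0x401 + 4 * bank


def decode_mc_status(status: int) -> dict:
    """Split a raw 64-bit MCi_STATUS value into its main fields."""
    return {
        "flags": [name for bit, name in MC_STATUS_FLAGS.items()
                  if (status >> bit) & 1],
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


def is_internal_unclassified(mca_error_code: int) -> bool:
    """Simple error code pattern 0000 01xx xxxx xxxx with at least one x set.

    The all-zero case (0x0400) is the separate "internal timer error" code.
    """
    return (mca_error_code & 0xFC00) == 0x0400 and mca_error_code != 0x0400


fields = decode_mc_status(0xBA00000011000402)
print(hex(mc_status_msr(4)))                  # 0x411
print(fields["flags"])                        # ['VAL', 'UC', 'EN', 'MISCV', 'PCC']
print(hex(fields["model_specific_code"]))     # 0x1100
print(hex(fields["mca_error_code"]))          # 0x402
```

With the value from this post, the flags indicate a valid, uncorrected error with processor context corrupt, MISCV set and ADDRV clear, so MC4_MISC may be worth reading as well while MC4_ADDR carries no valid address.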

       

      Is it possible that the Haswell machine check error codes are not yet listed in the developer's manual?

       

      Can anybody explain the exact meaning of this machine check error code?

       

          regards    Christian

        • 1. Re: Need help in decoding a reproducible machine check  exception
          kevin_intel

          Hi cehrhardt,

           

          I am going to verify more of this but in the meantime, I will need to get the following information:

           

          1. Is this happening when connecting an external monitor only?
          2. Is this error a BSOD?
          3. Can you attach here your dxdiag report?

           

          Kevin M

          • 2. Re: Need help in decoding a reproducible machine check  exception
            cehrhardt

            Hi kevin_intel,

             

            thanks for looking into this. Regarding your questions:

             

            1. It happens reproducibly when I connect an external VGA monitor for the first time. I haven't seen this happen on any other occasion, i.e. bad RAM or a bad CPU is most definitely not the issue here.
            2. No, it is not a BSOD. As mentioned in the original post, Windows 7 runs in a virtual machine environment (with direct access to the IGD hardware, though). The exception shows up in the VMM host and generates a trap there. The VMM software is not available to the general public at this time.
            3. I'll get back to you with a dxdiag report tomorrow. However, I'm not confident that it will be of much use, given that the problem happens outside of Windows.

             

            regards   Christian

            • 3. Re: Re: Need help in decoding a reproducible machine check  exception
              cehrhardt

              Hi,

               

              Attached is the dxdiag output as promised.

              regards   Christian

              • 4. Re: Need help in decoding a reproducible machine check  exception
                kevin_intel

                Hi cehrhardt,

                 

                Thanks for the information. It is indeed a strange situation. As you say, this is not a hardware-level issue but a graphics driver issue. I am still searching for the possible cause, but in the meantime, can you please provide screenshots? I would really like to know how the graphics adapter is shown in Device Manager (in the virtual machine's operating system).

                 

                Kevin M

                • 5. Re: Need help in decoding a reproducible machine check  exception
                  cehrhardt

                  Hi kevin_intel,

                   

                  I'm not sure I understand you correctly. Do you want me to open the graphics device in the Windows Device Manager and screenshot the results? If so, which tabs and views are you interested in? Note that the screenshots will show text in German, so it is probably better if I extract the information manually...

                   

                  Here's some of the information in text form:

                  • Intel(R) HD Graphics 4600 connected to PCI-Bus 0, Device 2, Function 0.
                  • Latest Intel beta driver (other drivers don't seem to make a difference, though):
                    • Driver date: 2014/06/16
                    • Version: 10.18.10.3652
                  • Resources:
                    • MMIO (BAR): at 00000000FE400000 - 00000000FE7FFFFF
                    • MMIO (BAR): at 00000000E0000000 - 00000000EFFFFFFF
                    • I/O Ports: at C080 - B0BF
                    • IRQ: 0x16 (decimal 22)
                    • Legacy VGA: I/O Ports at 03B0 - 03BB
                    • Legacy VGA: I/O Ports at 03C0 - 03DF
                    • Legacy VGA: MMIO at A0000 - BFFFF
                  • No resource conflicts detected.
                  • Additional HW resources that are accessible in the virtual machine at their native locations but not really shown in the device manager:
                    • BIOS operation region
                    • VGA stolen memory
                    • Several graphics related PCI config space registers in device 00:00.0.
                  • Let me know which properties from the "Details" tab might be interesting.
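As a quick cross-check of the resource list above, the following Python sketch computes the size of each range as posted. The helper name `range_size` is made up for illustration. The I/O port range is listed as C080 - B0BF, which ends below its start, so the sketch reports that range instead of guessing the intended value.

```python
def range_size(start_hex: str, end_hex: str) -> int:
    """Size in bytes of an inclusive address range given as hex strings."""
    start, end = int(start_hex, 16), int(end_hex, 16)
    if end < start:
        raise ValueError(f"range {start_hex} - {end_hex} ends below its start")
    return end - start + 1


# Ranges copied verbatim from the resource list in this post.
RESOURCES = [
    ("MMIO (BAR)", "00000000FE400000", "00000000FE7FFFFF"),
    ("MMIO (BAR)", "00000000E0000000", "00000000EFFFFFFF"),
    ("I/O Ports", "C080", "B0BF"),
    ("Legacy VGA I/O", "03B0", "03BB"),
    ("Legacy VGA I/O", "03C0", "03DF"),
    ("Legacy VGA MMIO", "A0000", "BFFFF"),
]

for name, start, end in RESOURCES:
    try:
        print(f"{name}: {range_size(start, end):#x} bytes")
    except ValueError as err:
        print(f"{name}: inconsistent as posted ({err})")
```

The two MMIO BARs come out as 0x400000 bytes (4 MiB) and 0x10000000 bytes (256 MiB), and the legacy VGA memory window as 0x20000 bytes (128 KiB).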

                   

                  One thing that might be triggering unusual code paths in the VMM driver is the fact that the VM simply hides MSI/MSI-X capabilities of the graphics card, i.e. the graphics driver has to fall back to the legacy IRQ.
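To illustrate what hiding MSI/MSI-X capabilities can look like, here is a small Python sketch that unlinks those entries from the capability list of a PCI configuration space snapshot. This is not the VMM's actual code, which is not public. It only assumes the standard PCI layout: capabilities-list bit 4 in the status register at offset 0x06, the capabilities pointer at offset 0x34, and capability IDs 0x05 (MSI) and 0x11 (MSI-X). Function names are hypothetical.

```python
STATUS = 0x06               # PCI status register (low byte)
STATUS_CAP_LIST = 1 << 4    # set if the device has a capability list
CAP_PTR = 0x34              # offset of the first capability pointer
HIDDEN_CAP_IDS = frozenset({0x05, 0x11})  # MSI and MSI-X


def list_capabilities(config: bytes) -> list:
    """Return the capability IDs in list order."""
    ids = []
    if not config[STATUS] & STATUS_CAP_LIST:
        return ids
    seen = set()
    ptr = config[CAP_PTR] & 0xFC
    while ptr and ptr not in seen:      # guard against malformed loops
        seen.add(ptr)
        ids.append(config[ptr])
        ptr = config[ptr + 1] & 0xFC
    return ids


def hide_capabilities(config: bytes, hidden=HIDDEN_CAP_IDS) -> bytearray:
    """Return a copy of the config space with the given capabilities unlinked."""
    view = bytearray(config)
    if not view[STATUS] & STATUS_CAP_LIST:
        return view
    link = CAP_PTR                      # byte that points at the current entry
    seen = set()
    ptr = view[link] & 0xFC
    while ptr and ptr not in seen:
        seen.add(ptr)
        cap_id, nxt = view[ptr], view[ptr + 1] & 0xFC
        if cap_id in hidden:
            view[link] = nxt            # predecessor now skips this entry
        else:
            link = ptr + 1              # this entry's next pointer is the new link
        ptr = nxt
    return view
```

A guest that walks the filtered list never sees the MSI or MSI-X entry, so its driver falls back to the legacy interrupt line, which matches the behaviour described above.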

                   

                  Some additional information regarding the setup that might be of help:

                  • I have complete source code for the VMM host software. I can do modifications in order to extract interesting information if that is of any help.
                  • The VMM host software traps into the kernel debugger at the time of the exception, i.e. I can extract interesting information from there, too.
                  • AMT is configured and a serial console is active for debugging purposes. However, I'm pretty sure that the crash happens with AMT disabled, too.
                  • I can continue the machine after the machine check exception. If I disconnect the VGA adapter, Windows is sometimes able to recover from the situation (by terminating and restarting the driver). No output shows up on the external screen.

                   

                  Do you have any information on the exact meaning of the error code given in the original message?

                   

                     regards      Christian