4 Replies Latest reply on Nov 1, 2015 2:35 PM by joe_intel

    Trying to diagnose MCE

    Harland.Corbin

      I am getting repeated lockups and reboots on a laptop, and these machine check exceptions (MCEs) are being logged:

       

      Hardware event. This is not a software error.
      MCE 0
      CPU 0 BANK 0
      TIME 1444394565 Fri Oct  9 08:42:45 2015
      MCG status:
      MCi status:
      Error overflow
      Uncorrected error
      Error enabled
      Processor context corrupt
      MCA: BUS Level-0 Local-CPU-originated-request Generic Memory-access Request-did-not-timeout Error
      BQ_DCU_READ_TYPE BQ_ERR_HARD_TYPE BQ_ERR_HARD_TYPE
      timeout BINIT (ROB timeout). No micro-instruction retired for some time
      failure that caused IERR
      STATUS f200084000000800 MCGSTATUS 0
      MCGCAP 806 APICID 0 SOCKETID 0
      CPUID Vendor Intel Family 6 Model 23

      Hardware event. This is not a software error.
      MCE 1
      CPU 0 BANK 5
      TIME 1444394565 Fri Oct  9 08:42:45 2015
      MCG status:
      MCi status:
      Error overflow
      Uncorrected error
      Error enabled
      Processor context corrupt
      MCA: BUS Level-3 Generic Generic Other-transaction Request-did-not-timeout Error
      BQ_DCU_READ_TYPE BQ_ERR_AERR2_TYPE BQ_ERR_AERR2_TYPE
      received parity error on response transaction
      MCE driven
      STATUS f200001014000e0f MCGSTATUS 0
      MCGCAP 806 APICID 0 SOCKETID 0
      CPUID Vendor Intel Family 6 Model 23

      Hardware event. This is not a software error.
      MCE 2
      CPU 1 BANK 5
      TIME 1444394565 Fri Oct  9 08:42:45 2015
      MCG status:
      MCi status:
      Error overflow
      Uncorrected error
      Error enabled
      Processor context corrupt
      MCA: BUS Level-3 Generic Generic Other-transaction Request-did-not-timeout Error
      BQ_DCU_READ_TYPE BQ_ERR_HARD_TYPE BQ_ERR_HARD_TYPE
      received parity error on response transaction
      MCE driven MCE is observed
      STATUS f200001030000e0f MCGSTATUS 0
      MCGCAP 806 APICID 1 SOCKETID 0
      CPUID Vendor Intel Family 6 Model 23
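
      For anyone reading the raw STATUS values above, here is a rough Python sketch of how the flag lines and the "MCA: BUS ..." lines can be recovered from them. It assumes the architectural IA32_MCi_STATUS layout from the Intel Software Developer's Manual (flag bits 63 down to 57, MCA error code in bits 15:0) and the bus/interconnect compound code format 000F 1PPT RRRR IILL. The function names decode_status and decode_bus_error are made up for this illustration and are not part of mcelog or any other tool. The model-specific bits (the BQ_* and BINIT text) are not decoded here.

```python
# Flag bits of IA32_MCi_STATUS (bit number, short name), per the Intel SDM layout.
STATUS_FLAGS = [
    (63, "VAL"),    # register holds a valid error
    (62, "OVER"),   # error overflow
    (61, "UC"),     # uncorrected error
    (60, "EN"),     # error enabled
    (59, "MISCV"),  # IA32_MCi_MISC contents valid
    (58, "ADDRV"),  # IA32_MCi_ADDR contents valid
    (57, "PCC"),    # processor context corrupt
]


def decode_status(status):
    """Split a 64-bit MCi_STATUS value into its flags and error codes."""
    flags = [name for bit, name in STATUS_FLAGS if (status >> bit) & 1]
    return {
        "flags": flags,
        "mca_code": status & 0xFFFF,            # bits 15:0
        "model_code": (status >> 16) & 0xFFFF,  # bits 31:16, model specific
    }


def decode_bus_error(mca_code):
    """Decode a bus/interconnect compound code (000F 1PPT RRRR IILL).

    Returns None if the code is not in that format. Bit 12 (F) is ignored.
    """
    if (mca_code & 0xE800) != 0x0800:
        return None
    return {
        "PP": (mca_code >> 9) & 0x3,    # participation: 0 = local CPU originated, 3 = generic
        "T": (mca_code >> 8) & 0x1,     # 0 = request did not time out
        "RRRR": (mca_code >> 4) & 0xF,  # request type: 0 = generic error
        "II": (mca_code >> 2) & 0x3,    # 0 = memory access, 3 = other transaction
        "LL": mca_code & 0x3,           # memory hierarchy level: 3 = generic
    }


if __name__ == "__main__":
    # The three STATUS values from the log above.
    for raw in (0xF200084000000800, 0xF200001014000E0F, 0xF200001030000E0F):
        decoded = decode_status(raw)
        print(f"{raw:016x}", decoded["flags"], decode_bus_error(decoded["mca_code"]))
```

      All three values have the same top byte (f2), so each decodes to VAL, OVER, UC, EN and PCC, which lines up with the "Error overflow / Uncorrected error / Error enabled / Processor context corrupt" lines in each record.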

       

      What hardware component is giving this error?

      I have updated the system BIOS, which did not help.

      I have run memtest, which showed no errors.

      Are there any utilities available to help troubleshoot/diagnose this issue?

       

      The machine is a Dell Latitude D830 laptop.

      It was running the A15 BIOS, but is showing the same behavior with the A17 BIOS.

      I am currently running Ubuntu 14.04.3 LTS.

      I can boot it to Windows 7 for troubleshooting/diagnostics.

       

      Thanks,

      Harland