4 Replies Latest reply on Jun 27, 2015 12:14 PM by djl47

    What is the next diagnostic step?

    djl47

      tl;dr version: Bought a used laptop. Installed Fedora 22. The system logs machine check events in the Fedora journal and fails IPDT.

       

      I recently bought a used Toshiba Satellite P75-A with an Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz. I brought the system home, replaced the HDD with an SSD, and installed Fedora 22. Over the next few days I saw some machine check events and found that they were a mix of thermal events and other errors. I wasn't interested in disassembling and reassembling a laptop, so I had a PC shop run a thermal diagnostic and reseat the heat sink with high-quality thermal paste. This resolved the thermal issues, although the processor can still run hotter than I like (assuming lm_sensors is reporting accurate temperatures). I did some investigation, downloaded the Linux bootable IPDT ISO and ran it per the instructions. The first test failed without any output. Through a process of elimination I found that the processor fails the PCH test. With this test disabled, all of the remaining tests pass.

      Where do I go from here? Is this a bad processor, a bad motherboard, or both?

       

      Here is a representative example of the machine check events that mcelog wrote to the Fedora journal (viewed with journalctl).

      Jun 18 22:23:35 xanadu mcelog[857]: Hardware event. This is not a software error.

      Jun 18 22:23:35 xanadu mcelog[857]: MCE 0

      Jun 18 22:23:35 xanadu mcelog[857]: CPU 0 BANK 5

      Jun 18 22:23:35 xanadu mcelog[857]: MISC 38a0000086 ADDR ff882100

      Jun 18 22:23:35 xanadu mcelog[857]: TIME 1434691415 Thu Jun 18 22:23:35 2015

      Jun 18 22:23:35 xanadu mcelog[857]: MCG status:

      Jun 18 22:23:35 xanadu mcelog[857]: MCi status:

      Jun 18 22:23:35 xanadu mcelog[857]: Uncorrected error

      Jun 18 22:23:35 xanadu mcelog[857]: MCi_MISC register valid

      Jun 18 22:23:35 xanadu mcelog[857]: MCi_ADDR register valid

      Jun 18 22:23:35 xanadu mcelog[857]: Processor context corrupt

      Jun 18 22:23:35 xanadu mcelog[857]: MCA: corrected filtering (some unreported errors in same region)

      Jun 18 22:23:35 xanadu mcelog[857]: Generic CACHE Level-2 Generic Error

      Jun 18 22:23:35 xanadu mcelog[857]: STATUS ae0000000040110a MCGSTATUS 0

      Jun 18 22:23:35 xanadu mcelog[857]: MCGCAP c09 APICID 0 SOCKETID 0

      Jun 18 22:23:35 xanadu mcelog[857]: CPUID Vendor Intel Family 6 Model 60

      Jun 18 22:23:35 xanadu mcelog[857]: Hardware event. This is not a software error.

      Jun 18 22:23:35 xanadu mcelog[857]: MCE 1

      Jun 18 22:23:35 xanadu mcelog[857]: CPU 0 BANK 6

      Jun 18 22:23:35 xanadu mcelog[857]: MISC 38a0000086 ADDR ff881e40

      Jun 18 22:23:35 xanadu mcelog[857]: TIME 1434691415 Thu Jun 18 22:23:35 2015

      Jun 18 22:23:35 xanadu mcelog[857]: MCG status:

      Jun 18 22:23:35 xanadu mcelog[857]: MCi status:

      Jun 18 22:23:35 xanadu mcelog[857]: Uncorrected error

      Jun 18 22:23:35 xanadu mcelog[857]: MCi_MISC register valid

      Jun 18 22:23:35 xanadu mcelog[857]: MCi_ADDR register valid

      Jun 18 22:23:35 xanadu mcelog[857]: Processor context corrupt

      Jun 18 22:23:35 xanadu mcelog[857]: MCA: corrected filtering (some unreported errors in same region)

      Jun 18 22:23:35 xanadu mcelog[857]: Generic CACHE Level-2 Generic Error

      Jun 18 22:23:35 xanadu mcelog[857]: STATUS ae0000000040110a MCGSTATUS 0

      Jun 18 22:23:35 xanadu mcelog[857]: MCGCAP c09 APICID 0 SOCKETID 0

      Jun 18 22:23:35 xanadu mcelog[857]: CPUID Vendor Intel Family 6 Model 60

      Jun 18 22:23:35 xanadu mcelog[857]: Hardware event. This is not a software error.

      Jun 18 22:23:35 xanadu mcelog[857]: MCE 2

      Jun 18 22:23:35 xanadu mcelog[857]: CPU 0 BANK 8

      Jun 18 22:23:35 xanadu mcelog[857]: MISC 38a0000086 ADDR ff881fc0

      Jun 18 22:23:35 xanadu mcelog[857]: TIME 1434691415 Thu Jun 18 22:23:35 2015

      Jun 18 22:23:35 xanadu mcelog[857]: MCG status:

      Jun 18 22:23:35 xanadu mcelog[857]: MCi status:

      Jun 18 22:23:35 xanadu mcelog[857]: Error overflow

      Jun 18 22:23:35 xanadu mcelog[857]: Uncorrected error

      Jun 18 22:23:35 xanadu mcelog[857]: MCi_MISC register valid

      Jun 18 22:23:35 xanadu mcelog[857]: MCi_ADDR register valid

      Jun 18 22:23:35 xanadu mcelog[857]: Processor context corrupt

      Jun 18 22:23:35 xanadu mcelog[857]: MCA: corrected filtering (some unreported errors in same region)

      Jun 18 22:23:35 xanadu mcelog[857]: Generic CACHE Level-2 Generic Error

      Jun 18 22:23:35 xanadu mcelog[857]: STATUS ee0000000040110a MCGSTATUS 0

      Jun 18 22:23:35 xanadu mcelog[857]: MCGCAP c09 APICID 0 SOCKETID 0

      Jun 18 22:23:35 xanadu mcelog[857]: CPUID Vendor Intel Family 6 Model 60
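For anyone cross-checking the decoded lines against the raw STATUS values in the log, here is a minimal Python sketch. It assumes the IA32_MCi_STATUS bit layout and the cache-hierarchy compound error code format (000F 0001 RRRR TTLL) described in the Intel Software Developer's Manual; the function and table names are made up for illustration and are not part of mcelog.

```python
# Flag bits in IA32_MCi_STATUS (assumed layout, per the Intel SDM).
STATUS_FLAGS = {
    63: "VAL",    # register contents valid
    62: "OVER",   # error overflow
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting enabled
    59: "MISCV",  # MCi_MISC register valid
    58: "ADDRV",  # MCi_ADDR register valid
    57: "PCC",    # processor context corrupt
}

# LL field of a cache-hierarchy error code (bits 1:0).
CACHE_LEVELS = {0: "L0", 1: "L1", 2: "L2", 3: "generic"}


def decode_mci_status(status):
    """Hypothetical helper: split a raw MCi_STATUS value into its fields."""
    flags = [name for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
             if status & (1 << bit)]
    mca_code = status & 0xFFFF             # bits 15:0, MCA error code
    model_code = (status >> 16) & 0xFFFF   # bits 31:16, model-specific code
    result = {"flags": flags, "mca_code": mca_code, "model_code": model_code}
    # Cache-hierarchy errors match 000F 0001 RRRR TTLL (F = filtering bit).
    if mca_code & 0xEF00 == 0x0100:
        result["cache_level"] = CACHE_LEVELS[mca_code & 0x3]
        result["corrected_filtering"] = bool(mca_code & 0x1000)
    return result


if __name__ == "__main__":
    # The two distinct STATUS values from the log above.
    for raw in (0xAE0000000040110A, 0xEE0000000040110A):
        print(hex(raw), decode_mci_status(raw))
```

Under those assumptions, the only difference between the value reported for banks 5 and 6 and the one for bank 8 is the OVER bit, which lines up with the extra "Error overflow" line in the third event.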

       

       

      Output of the Intel Processor Diagnostic Tool (IPDT) from the Linux bootable ISO, run with the PCH test disabled:

      --- IPDT64 - rev  2.20.0.0.L.MP ---

      --- Start Time: 06/20/2015 14:02:13---

      --- Skipping Config ---

      --- Reading CPU Manufacturer ---

      Expected --> GenuineIntel

      Detected --> GenuineIntel

      Found --- Genuine Intel Processor ---

      --- Temperature Test ---

      Temperature Test Passed!!!

      Temperature = 40 degrees C below maximum.

      --- Reading Brand String ---

      Detected Brand String:

      Intel Core i7-4700MQ   2.40GHz

      Brand String Test Passed!!!

      --- Reading CPU Frequency ---

      Expected CPU Frequency is --> 2.40

      Detected CPU Frequency is --> 2.39223

      CPU Frequency Test Passed!!!

      --- FSB NOT Supported on this Processor ---

      --- Running Base Clock test ---

      Detected Base Clock --> 104

      Base Clock test Pass ---

      ..QPI rate Test not supported..

      ..Skipping QPI rate Test..

      Skipping QPI rate Test

      --- Running Floating Point test ---

      Million Floating Points per Second, MFLOPS --> 403.2

      Floating Point Test Pass ---

      --- Running Prime Number Generation Test ---

      Operation Per Second--> 4.65205e+06

      Prime Number Generation Test Pass ---

      --- Reading Cache Size --- 

      - Detected L1 Data Cache Size --> 4 x 32

      - Detected L1 Inst Cache Size --> 4 x 32

      - Detected L2 Cache Size --> 1024

      - Detected L3 Cache Size --> 6144

      Cache Size Test Passed!!!

      --- Determining MMX - SSE capabilities ---

               --- CPU FEATURES DETECTION FOR ---

                      ---        MMX SSE       ---

      MMX         - MMX Supported -->    Yes

      SSE         - SSE Supported -->    Yes

      SSE2         - SSE2 Supported -->    Yes

      SSE3         - SSE3 Supported -->    Yes

      SSSE3         - SSSE3 Supported -->    Yes

      SSE4.1         - SSE4.1 Supported -->    Yes

      SSE4.2         - SSE4.2 Supported -->    Yes

              --- MMX SSE - capabilities check complete ---

      MMX Test Result --- PASS

      SSE Test Result --- PASS

      SSE2 Test Result --- PASS

      SSE3 Test Result --- PASS

      SSSE3 Test Result --- PASS

      SSE4.1 Test Result --- PASS

      SSE4.2 Test Result --- PASS

      MMX SSE Testing Passed !!

      --- Determining AVX AES PCLMULQDQ capabilities ---

               --- CPU FEATURES DETECTION FOR ---

                     --- AVX/AES/PCLMULQDQ ---

      AVX         - Advanced Vector Extensions Supported -->    Yes

      AVX OS Support   - AVX Operating System Supported -->        Yes

      AES         - Advanced Encryption Standard Supported -->    Yes

      PCLMULQDQ     - Polys Carry-Less Multiply Supported -->    Yes

          --- AVX AES PCLMULQDQ capabilities check complete ---

      AVX Compare Test Result --- PASS

      AES Test Result --- PASS

      PCLMULQDQ Test Result --- PASS

      AVX AES PCLMULQDQ Testing Passed !!

      --- Reading Memory Size ---

      Detected Memory Size is --> 15.60GB

      --- Integrated Memory Controller Stress Test ---

      --- Integrated Memory Controller Stress Test Pass!!! ---

      Integrated Memory Controller Test Pass!!!

      --- Platform Controller Hub Test Disabled ---

      --- Querying for Intel(R) Integrated Graphics Device (IGD) ---

      ..Detected 8086 as Vendor ID on Device 2 on Intel(R) processor..

      ..Intel(R) Integrated Graphics Device Presence Detection Passed..

      ..2D Graphics Visual Display Passed..

      ..Graphics Visual Display Passed..

      ..Rotating Display Passed..

      --- CPU Load ---

      --- Load Level = 8

      CPU Load Passed!!!

      --- Temperature Test ---

      Temperature Test Passed!!!

      Temperature = 2 degrees C below maximum.

      --- Test End Time: 06/20/2015 14:06:12---
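The two temperature readings in the IPDT output above are easy to miss. As a rough sketch, the Python below pulls the "degrees C below maximum" margins out of a saved copy of the log; the helper name and the idea of saving the output to a string are assumptions for illustration, not something IPDT provides.

```python
import re

# Excerpt of the IPDT output above, trimmed to the relevant lines.
IPDT_EXCERPT = """\
--- Temperature Test ---
Temperature Test Passed!!!
Temperature = 40 degrees C below maximum.
--- CPU Load ---
--- Load Level = 8
CPU Load Passed!!!
--- Temperature Test ---
Temperature Test Passed!!!
Temperature = 2 degrees C below maximum.
"""


def temperature_margins(text):
    """Hypothetical helper: return every reported margin, in log order."""
    pattern = r"Temperature = (\d+) degrees C below maximum"
    return [int(value) for value in re.findall(pattern, text)]


if __name__ == "__main__":
    margins = temperature_margins(IPDT_EXCERPT)
    print("margins (degrees C below maximum):", margins)
    print("drop during the run:", margins[0] - margins[-1])
```

For this run the margin goes from 40 to 2 degrees C below maximum between the first and last temperature tests, so the test still passes but with very little headroom under load.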