2 Replies Latest reply on Dec 10, 2011 10:11 PM by benlbit

    Intel HD Graphics Blue Screen




      I've got an issue with 2 of the 5 IEI NANO-HM551 boards I have in use so far, all of which use Intel HD Graphics. I'm running Windows Embedded Standard 2009.


      The IEI boards have a Core i5 installed and 2GB of RAM. Three systems use the LVDS port only (48-bit, 1920x1080); two of these are OK whilst the third is not. The other two run LVDS (48-bit, 1920x1080) and the HDMI port (1920x1020) simultaneously, with the driver set up to run the desktop in Extended mode. One of these is also experiencing the problem.


      The symptom is: our WPF .NET 4.0 application suddenly stops updating the UI.


      It's been difficult to trace the problem because it happens at random, usually no sooner than 19 hours after the OS and application have started. In other words, when I try to reproduce the problem, either it doesn't happen or it takes at least 24 hours. On the system running 2 displays it can happen every few hours or every 24+ hours. Unfortunately I don't have the luxury of using this system as a test bed, given it's deployed to a remote location. All I can do is remotely reboot it to get the application running again.


      I've managed to get the problem to occur once in the lab with one of the systems that is connected via LVDS to a single monitor, and I poked around to try to discover what was going on. Our WPF application wasn't responding to input. In fact, I couldn't open other applications either, although I could get into Task Manager and terminate our application. Terminating our application caused the system to blue-screen and reboot. It happened so quickly that I couldn't see the exception code.


      Luckily, however, the blue screen recorded a dump file, which I've since analysed. Here's the output:


      Microsoft (R) Windows Debugger Version 6.12.0002.633 X86
      Copyright (c) Microsoft Corporation. All rights reserved.

      Loading Dump File [c:\Windows\DUMP180c.tmp]
      Mini Kernel Dump File: Only registers and stack trace are available

      WARNING: Inaccessible path: 'c:\windows\i386'
      Symbol search path is: srv*c:\symbols*http://msdl.microsoft.com/download/symbols
      Executable search path is: c:\windows\i386
      Windows XP Kernel Version 2600 (Service Pack 3) MP (4 procs) Free x86 compatible
      Product: WinNt, suite: EmbeddedNT
      Built by: 2600.xpsp.080413-2111
      Machine Name:
      Kernel base = 0x804d7000 PsLoadedModuleList = 0x805634c0
      Debug session time: Sun Dec  4 12:16:31.359 2011 (UTC + 11:00)
      System Uptime: 0 days 19:22:29.862
      Loading Kernel Symbols
      Loading User Symbols
      Loading unloaded module list
      *                                                                             *
      *                        Bugcheck Analysis                                    *
      *                                                                             *

      Use !analyze -v to get detailed debugging information.

      BugCheck 100000EA, {889c25e8, 88c4db68, f78cecbc, 1}

      Unable to load image igxpdx32.DLL, Win32 error 0n2
      *** WARNING: Unable to verify timestamp for igxpdx32.DLL
      *** ERROR: Module load completed but symbols could not be loaded for igxpdx32.DLL
      Probably caused by : igxpdx32.DLL ( igxpdx32+8785 )

      Followup: MachineOwner

      2: kd> !analyze -v
      *                                                                             *
      *                        Bugcheck Analysis                                    *
      *                                                                             *

      The device driver is spinning in an infinite loop, most likely waiting for
      hardware to become idle. This usually indicates problem with the hardware
      itself or with the device driver programming the hardware incorrectly.
      If the kernel debugger is connected and running when watchdog detects a
      timeout condition then DbgBreakPoint() will be called instead of KeBugCheckEx()
      and detailed message including bugcheck arguments will be printed to the
      debugger. This way we can identify an offending thread, set breakpoints in it,
      and hit go to return to the spinning code to debug it further. Because
      KeBugCheckEx() is not called the .bugcheck directive will not return bugcheck
      information in this case. The arguments are already printed out to the kernel
      debugger. You can also retrieve them from a global variable via
      "dd watchdog!g_WdBugCheckData l5" (use dq on NT64).
      On MP machines it is possible to hit a timeout when the spinning thread is
      interrupted by hardware interrupt and ISR or DPC routine is running at the time
      of the bugcheck (this is because the timeout's work item can be delivered and
      handled on the second CPU and the same time). If this is the case you will have
      to look deeper at the offending thread's stack (e.g. using dds) to determine
      spinning code which caused the timeout to occur.
      Arg1: 889c25e8, Pointer to a stuck thread object.  Do .thread then kb on it to find
      the hung location.
      Arg2: 88c4db68, Pointer to a DEFERRED_WATCHDOG object.
      Arg3: f78cecbc, Pointer to offending driver name.
      Arg4: 00000001, Number of times "intercepted" bugcheck 0xEA was hit (see notes).

      Debugging Details:

      FAULTING_THREAD:  889c25e8




      LAST_CONTROL_TRANSFER:  from 804f0a8c to 804f0f08

      a9d19444 804f0a8c 89b13000 ff000011 00000253 nt!RtlFindClearBits+0x230
      a9d1945c 8054e1b5 89b13000 00000011 00000253 nt!RtlFindClearBitsAndSet+0x14
      a9d194a8 80551858 00000001 00010018 00000000 nt!MiAllocatePoolPages+0x4c0
      a9d19510 bf802b0c 00000001 00010018 43544e49 nt!ExAllocatePoolWithTag+0x109
      a9d19530 bf85b6c4 00010018 43544e49 00000000 win32k!HeavyAllocPool+0x74
      a9d1954c bf375785 00000000 00010008 43544e49 win32k!EngAllocMem+0x33
      WARNING: Stack unwind information not available. Following frames may be wrong.
      a9d195a4 bf556728 a9d195d0 e228b8b8 e228b8b8 igxpdx32+0x8785
      00000000 00000000 00000000 00000000 00000000 igxpdx32+0x1e9728

      STACK_COMMAND:  .thread 0xffffffff889c25e8 ; kb

      bf375785 ??              ???


      SYMBOL_NAME:  igxpdx32+8785

      FOLLOWUP_NAME:  MachineOwner

      MODULE_NAME: igxpdx32

      IMAGE_NAME:  igxpdx32.DLL


      FAILURE_BUCKET_ID:  0xEA_IMAGE_igxpdx32.DLL_DATE_2009_11_10

      BUCKET_ID:  0xEA_IMAGE_igxpdx32.DLL_DATE_2009_11_10

      Followup: MachineOwner
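
For anyone collecting these dumps from several machines, the summary line can be pulled apart with a few lines of script. The following is only a rough Python sketch: `parse_bugcheck` and `describe` are hypothetical helper names, not part of WinDbg, and the argument labels are copied from the `!analyze -v` text above. It also assumes the low 16 bits of the code are the base bugcheck number, which agrees with the 0xEA shown in the bucket ID.

```python
import re

# Hypothetical helper (not a WinDbg feature): matches a summary line such as
# "BugCheck 100000EA, {889c25e8, 88c4db68, f78cecbc, 1}".
BUGCHECK_RE = re.compile(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}")

# Labels taken from the Arg1..Arg4 descriptions printed by !analyze -v above.
ARG_LABELS = (
    "stuck thread object",
    "DEFERRED_WATCHDOG object",
    "offending driver name",
    "times intercepted bugcheck was hit",
)


def parse_bugcheck(line):
    """Return (code, [arg1..argN]) as integers parsed from hexadecimal."""
    match = BUGCHECK_RE.search(line)
    if match is None:
        raise ValueError("no BugCheck summary found in line")
    code = int(match.group(1), 16)
    args = [int(part.strip(), 16) for part in match.group(2).split(",")]
    return code, args


def describe(line):
    """Return a dict with the full code, the assumed base code, and labelled args."""
    code, args = parse_bugcheck(line)
    return {
        "code": code,
        # Assumption: the base bugcheck number lives in the low 16 bits.
        "base_code": code & 0xFFFF,
        "args": dict(zip(ARG_LABELS, args)),
    }


if __name__ == "__main__":
    info = describe("BugCheck 100000EA, {889c25e8, 88c4db68, f78cecbc, 1}")
    print(hex(info["code"]), hex(info["base_code"]))
    for label, value in info["args"].items():
        print(f"{label}: {value:#010x}")
```

Grouping dumps by `base_code` and the driver named in the "Probably caused by" line would be one way to check whether every affected board fails in the same place.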


      I'm 80% confident the issue is not related to our application because I have systems which are working perfectly fine on the same hardware using the same OS image with the same drivers and they have never had a problem. At the other extreme I have a system that's experiencing the issue every few hours and the only way we can resolve it is with a reboot.


      One system has the problem repeatedly. For this one I tried wiping the disk and reinstalling everything (thinking maybe .NET 4.0 was corrupt), but this only lasted a short while (about 28 hours) before the issue reappeared. Meanwhile the system running next to it continues to fire away like it should, not skipping a beat.


      The closest thing of any significance I have found so far is this Microsoft article - http://support.microsoft.com/kb/293078 - which indicates it's a driver issue. If this is the case, I don't understand why the problem isn't consistent across all 5 systems, which run the same motherboard, same CPU, same RAM and same hard drive as well as the exact same XP Embedded image, same version of .NET and same WPF .NET application. If the drivers were the issue, surely it would be affecting more than 2 of the 5 systems (I have at least 5 more to be built), especially given the identical hardware and software configurations.


      Intel - are you able to help me resolve this?




        • 1. Re: Intel HD Graphics Blue Screen

          Hello Ben,


          Thank you for the detailed explanation that you have provided about the issue. Regrettably, we only provide technical support for Desktop Versions of Windows, as embedded versions are not supported in our department.


          We do hope that you can obtain assistance on the community from other users with similar configurations.

          • 2. Re: Intel HD Graphics Blue Screen

            I haven't 100% confirmed this yet, but here's what I have found.


            The WPF Application has a number of animations.


            To get those animations to appear smooth, I was using the 'BitmapCache' option, which causes the graphics card to cache the visuals in GPU memory.


            Turns out this Cache isn't being flushed and over time the GPU is running out of memory.


            Interestingly, the HD Graphics allows up to 1024MB, but as soon as usage reaches about 950MB - 960MB it's all over and the OS needs to be restarted.
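
If GPU memory use is logged at intervals, a straight-line fit gives a rough idea of how long a box has left before it reaches that ceiling. This is only a sketch under assumptions: `hours_until_ceiling` is a hypothetical helper, the sample readings in the example are made up for illustration (they are not measurements from these systems), growth is assumed to be roughly linear, and how the readings are collected is left open.

```python
def hours_until_ceiling(samples, ceiling_mb):
    """Estimate hours from the last sample until usage reaches ceiling_mb.

    samples: list of (hours_since_start, used_mb) pairs, oldest first.
    Returns None when the fitted trend is flat or falling.
    Assumes roughly linear growth (ordinary least-squares fit).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    # Slope in MB per hour and the fitted usage at t = 0.
    slope = sum((t - mean_t) * (m - mean_m) for t, m in samples) / var_t
    if slope <= 0:
        return None
    intercept = mean_m - slope * mean_t
    t_hit = (ceiling_mb - intercept) / slope
    # Clamp at zero if the fit says the ceiling has already been passed.
    return max(0.0, t_hit - samples[-1][0])


if __name__ == "__main__":
    # Made-up readings: 100 MB at start, growing by 50 MB per hour.
    readings = [(0, 100), (2, 200), (4, 300)]
    print(hours_until_ceiling(readings, ceiling_mb=950))
```

A remote system could use an estimate like this to schedule a restart at a quiet time instead of waiting for the UI to freeze.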


            My development machine uses an nVidia graphics chipset, and it does flush the cache (albeit the screen blinks, similar to when you change resolution).


            I've turned off almost all BitmapCaching and the problem appears to be resolved (testing for the past 6 hours and counting - so far so good - only 250MB of GPU memory consumed).


            Unfortunately the animations are choppy, but the app is no longer consuming all the GPU memory, which means it shouldn't crash.


            I owe a huge amount of thanks to Pete Brown from Microsoft for his time in helping me find the problem as well as giving me some pointers in my application.


            Now it would be nice if Intel could fix the driver so it flushes the GPU cache.


            It should be noted that the behaviour of the graphics driver is independent of the OS installed: the issue is unchanged on a full copy of Windows XP Professional (which I ended up needing to install so I could profile the application on the Intel hardware in Visual Studio, because VS wouldn't install on Windows Embedded).


            Maybe for the next release of the Intel HD Graphics Driver you could get the GPU Cache to flush?