Just adding my two cents worth to yours...
Never seeing the temperature go over 100c is always going to be the case. Each of the processor's Digital Thermal Sensors (DTS) provides a measure of the die temperature's offset from the processor's Maximum Junction Temperature (Tjmax; the temperature at which the processor begins throttling performance to protect itself from thermal damage; it is typically 100c). That is, it tells you how many degrees the temperature is below Tjmax. If the actual temperature is at or above Tjmax, the sensor returns an offset of 0 (zero). Bottom line: if your processor's Tjmax is 100c, you will never see a reading above 100c, even if the processor is actually hotter.
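To make the clamping concrete, here is a minimal Python sketch of how a displayed reading falls out of the DTS offset. The function name and the 100c default are illustrative assumptions, not Intel's actual interface; real monitoring tools obtain the offset and Tjmax from the processor itself.

```python
def reported_temperature(actual_temp_c, tjmax_c=100):
    """Model of a DTS-based reading (hypothetical helper, not a real API).

    The sensor only reports how far the die is below Tjmax, and that
    offset bottoms out at 0 once the die reaches or exceeds Tjmax.
    """
    offset = max(0, tjmax_c - actual_temp_c)  # degrees below Tjmax, never negative
    return tjmax_c - offset                   # what monitoring software displays


print(reported_temperature(83))   # 83
print(reported_temperature(100))  # 100
print(reported_temperature(112))  # 100, even though the die is hotter
```

In this model, a steady reading of exactly Tjmax says only that the die is at least that hot, not how much hotter it might be.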
Your processor should not be getting this hot. If it is, something is wrong. This can be caused by a number of issues:
- The cooling solution is insufficient for the system.
- The cooling solution is improperly attached to the processor (physically, and/or the Thermal Interface Material (TIM) is insufficient), so it cannot sufficiently extract the heat being generated by the processor.
- The configuration of the cooling solution is incorrect/insufficient.
In this case, as I have a couple of KY units, I can say that item 3 is definitely true and, unfortunately, I believe that either item 1 or item 2 is also true (I lean towards it being item 1). The default configuration of the cooling solution is (for lack of a better word) poor (I will detail a better configuration in a moment) but, even with this better configuration in place, I can still see temperatures rising to levels that they should not (a clear indication of the solution being insufficient).
As I said, I don't like their default configuration. They increment the blower's (it's not a fan!) speed too slowly and then jump to 100% very abruptly and very near the Tjmax limit. What I use is as follows. This ensures that the response stays linear and that there is headroom to protect against the Thermal Load Line being exceeded too often, using an approximation of the Tcontrol temperature implemented in the Desktop versions of this processor.
Note the following:
- If you want the minimum blower speed to be lower than this, you need to adjust the Minimum Temperature (and possibly Duty Cycle Increment) so that the Duty Cycle reaches 100% at ~83c.
- If you want to use a Minimum Duty Cycle of 30%, set the Minimum Temperature to 60c. If you want to use a Minimum Duty Cycle of 25% (the lowest you should ever use, based upon recommendations from the blower manufacturers!), set the Minimum Temperature to 58c.
- Alternatively, if you want to delay the response to increased heat and are willing to have the blower speed increase at a higher (and slightly more noticeable) rate, you could switch the Duty Cycle Increment to 4%. Then, for Minimum Duty Cycle values of 40, 30 and 25, the Minimum Temperature should be set to 68, 65 and 64, respectively.
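The numbers in these notes follow from simple arithmetic, sketched below in Python. This is an assumed model, not the BIOS's documented behaviour: the blower holds the Minimum Duty Cycle up to the Minimum Temperature, then adds the Duty Cycle Increment for every degree above it. The 3% increment used for the first set of values is also an assumption, inferred from the 60c and 58c figures above. The function names are hypothetical.

```python
def min_temperature(min_duty, increment, full_speed_temp=83.0):
    """Temperature where the ramp must start so that the duty cycle
    reaches 100% at full_speed_temp (assumed linear ramp)."""
    return full_speed_temp - (100.0 - min_duty) / increment


def duty_cycle(temp, min_temp, min_duty, increment):
    """Blower duty cycle (%) at a given temperature under the same model."""
    if temp <= min_temp:
        return float(min_duty)
    return min(100.0, min_duty + (temp - min_temp) * increment)


# Tabulate the starting temperature for each combination mentioned above.
for increment in (3, 4):
    for min_duty in (40, 30, 25):
        start = min_temperature(min_duty, increment)
        print(f"increment {increment}%, minimum duty {min_duty}% "
              f"-> Minimum Temperature {start:.2f}c")
```

The results land within a degree of the values quoted above; the settings take whole numbers, so some rounding is unavoidable.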
So, all this said, is this the cause of your spontaneous resets? I do not believe so. If it happens within 30 seconds of the start of Prime95, the thermal situation has not had sufficient time to degrade to the point where it could cause something like this. It is happening when you are at idle as well. Still, improving the cooling configuration will help avoid some of the high thermals (alas, not all)...
I'll look into this when I get home tonight. I will post a video of what I'm doing, possibly also showing the innards of the NUC while I'm at it, and I'll post it on youtube.
In the meantime, a couple of random thoughts:
>> "So, all this said, is this the cause of your spontaneous resets? I do not believe so. If it happens within 30 seconds of the start of Prime95, the thermal situation has not had sufficient time to degrade to the point where it could cause something like this."
I disagree -- overclockers everywhere use Prime95 as a torture test to ensure CPU stability when running hot, and temps rise extremely fast when I look at the monitor. It's not surprising at all that it would become too hot in the blink of an eye (well, in a 30-second blink, that is).
>> "It is happening when you are at idle as well."
It's not the same thing. I will get a complete freeze at idle (well, running an OpenGL screen saver), but Prime95 causes a hard reset instead.
>> "Still, improving the cooling configuration will help avoid some of the high thermals (alas, not all)..."
BTW, as I said, I will report and post pics/videos of the problems I'm seeing. I also ran MEMTEST for a couple of days and there did not seem to be a problem with the memory.
I'm starting to wonder if the ambient temperature inside the NUC becomes hot enough to affect the Samsung NVMe to the point where it would cause a fault?
Also, I have some Arctic Silver thermal paste at home. How easy would it be to change the TIM on the Skull Canyon?
I also might prop open the Skull Canyon and blow some air on it, see if it helps...
The temperature misreadings might come from the BIOS. Make sure you have updated to the latest version, 0042.
Adding thermal solution to your processor would be similar as you were adding more solution to a laptop computer. You will need to remove the fan.
I believe I already have BIOS 0042.
Otherwise, not sure what you're saying with regards to "adding thermal solution would be similar as you were adding more solution to a laptop"...
Oops, I was pretty sure I had upgraded to BIOS 0042, but apparently I was wrong. I had BIOS 0037.
I'm running Prime95 as I write this, and the fan is not going absolutely bonkers as it did just a while ago. Hardware monitor says that temperatures are sticking around 83 degrees, so I'm assuming it's throttling appropriately.
Upgrading to 0042 has helped tremendously, thank you for making me doubt myself and double-check.
Now that my Prime95 issue seems to be solved, I will see if I still get hangs with the OpenGL screen saver. I will report in a few days. If I don't, then you can assume my issue is fixed.
Well, Win10's "Ribbons" screen saver still hangs my Skull Canyon at idle apparently. So this is not over for me unfortunately.
I will have to perform more graphics tests.
I'm still testing the Ribbons screen saver.
Right now I am running the latest drivers (4590 -- I was previously running 4552). So far, so good, but I want to wait a couple more days, maybe another week or so, before claiming victory.
Well, yes and no, or at least not deliberately -- in the sense that the screen saver kicks in whenever I am not using the computer so not sure that counts as "testing"...
I have been using the NUC only sporadically since 15 February -- and I've not rebooted since.
Graphics driver 4590 seems stable, and I am no longer having any issues.
Solution that worked for me:
- Update all Win10 drivers using DriverEasy (gfx driver was 4552 at the time)
- Update to BIOS 0042 (I was at 0037 even though I bought the NUC in December 2016 at retail)
- Update to Graphics Driver 4590.
The only remaining issue I am having is that whenever the computer comes back from wake state, windows get resized to a small size (I am running on a 3440x1440 screen) but that is a separate issue discussed in another thread.
I think we can close this thread.
Thank you for sharing details about this matter.
If you need more assistance or have any questions, do not hesitate to contact us again.