I decided not to give up on this... sent my entire system to Intel. It seems that they know about the problem of WHEA warnings. According to the engineer working with me, he was told by another engineer that the errors are "really no big deal, i.e. the card communicated something the OS didn't understand, so it logs it." I will get a more direct response, but in the meantime, this opinion might lower customer anxiety. In a perfect world, Intel might get in touch with whoever is out of spec... meanwhile, it is hard to believe that the huge burst of WHEA 17 errors before every hang is coincidental.
Thanks for keeping us updated.
I have been using the procedure I listed a few posts back to stay ahead of the errors (reload the driver after each reboot).
It has been working perfectly for me once I do that.
Still, I would like to see a fix.
Thanks for taking the initiative and sending your box to Intel. I have the same issue on my system. I have been pursuing it through my vendor as well. I stumbled onto this problem while exploring the source of background disk activity. Hopefully, you will hear something definitive from Intel soon. Please let us all know.
I hope this goes out to everyone following this thread... After several weeks of testing, here is the verdict from Intel Engineering:
> It appears the ATI card is throwing a Replay Timer timeout error
> which is being reported through the PCI Express root bridge upstream
> from the ATI card.
> I believe it to be an issue with the ATI card+ATI driver combo
> causing a PCI Express Correctable error reported through the IIO PCI
> Express bridge device.
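
For anyone trying to map Intel's wording onto the PCI Express spec: a Replay Timer Timeout is one of the correctable error types flagged in the AER Correctable Error Status register of the reporting device. Below is a minimal Python sketch of decoding such a register value. The bit positions are the ones I know from the PCI Express Base Specification (check them against your revision), the function name is made up for illustration, and the sample value is hypothetical rather than taken from any log in this thread.

```python
# Bit positions in the PCIe AER Correctable Error Status register,
# as given in the PCI Express Base Specification (verify for your revision).
CORRECTABLE_ERROR_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
}


def decode_correctable_status(status):
    """Return the names of the correctable errors set in a raw status value.

    Hypothetical helper for illustration; not part of any Intel or ATI tool.
    """
    return [
        name
        for bit, name in sorted(CORRECTABLE_ERROR_BITS.items())
        if status & (1 << bit)
    ]


if __name__ == "__main__":
    sample = 0x1000  # hypothetical value with only bit 12 set
    print(decode_correctable_status(sample))
```

A correctable error means the link layer retransmitted and recovered, which fits the "no big deal" comment from Intel, although it does not explain the hangs.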
So, it is indeed the ATI card. I reported this to Diamond (Diamond MM 4890), and they keep closing the ticket without any resolution. Tomorrow I attack AMD and Diamond... stay tuned.
I have the NVIDIA GTX 295 quad SLI for video, and have the same WHEA 17 issue. Is it all video cards then? Can that be assumed?
Stand by. I'll see if I can't get Intel to elaborate.
Add another unhappy camper. I sold my Rampage II Gene to a friend and got this DX58SO (don't ask why). I game a lot, and I see this error as soon as I start *any* game.
The system doesn't always crash, but it does so once in a while. The odd thing is that when it does crash, I can't hard reset the machine. If I press the power button, the computer turns itself off and then on again.
BOARD: Intel DX58SO (with the latest BIOS from Intel's website)
CPU: Intel Core i7-920; D0
MEM: Corsair 1GBx3 DDR3-1600 Dominator RAM
VIDEO: ATI (XFX) 4870 1GB
Epilog: well, no good deed goes unpunished. I got my machine back. Smashed. The offending PCIe slot holding the video card split right down the middle. Cooler Master V8 bent, power supply won't cycle on, case fan blades broken off and stator coil wire all over the inside of the box. Problem passed to AMD. Two replies. One says "our competitor wants this to be our problem but it isn't" and the other "this is too technical for first line support, will pass to higher level; please expect a call" (of course, no call). And finally, Diamond Multimedia, who have twice closed the case without telling me why.
Sorry to hear this, HMF. I hope someone steps up and replaces your MB and accessories.
As I reported a while back, reloading the driver after boot solves the issue for me. I have been doing it for weeks now with 100% success.
Once I boot to the desktop, I reload the Intel driver for the PCI Express root port 3 - 340a. No matter what I do from then on, I do not receive the WHEA error, until the next boot that is. It's a pain in the a** but it works until someone takes ownership of it and fixes it. IF EVER!
Intel does have a new BIOS out for the X58 (5020). Among other changes, it also mentions:
"Fixed issue where the PCIe Compliance bit was being set
incorrectly causing Secondary Boot Request to cause some cards to
fail to train correctly. "
I have not tested the BIOS myself, but could this be the fix?
I meant to add this in my post above: I updated to the latest BIOS as well (5020, dated 2/24) and it did not fix the issue.
Intel is graciously stepping up to replace the board. I will eat the rest; dealing with the UPS process wouldn't be worth the time (anybody old enough to remember those Samsonite commercials with the gorilla?). Latest info on the message from Intel:
Vista and Win7 have native support for PCI Express so it is most likely Windows that is turning on PCIe Advanced Error Reporting (AER) and then the video card/driver is causing errors that are reported through the IOH Root Port. One possibility to work around this is to find a way to have Windows disable PCIe AER.
They are checking on the workaround. Gotta say, Intel is trying to be helpful and it looks like it's not even their problem.
I refuse to let this go, on behalf of dozens of people (some on this forum) who I know have this problem. I replied (lost my cool) to AMD after being told "we don't solve our competitors' problems" by AMD line support (inane! I wonder if they hire adults, too.). Here is what they said:
Thank you for providing us with additional information. Since the previous Service Request (SR # XXXXXXXX) is now closed, we have recreated another Service Request (SR # XXXXXXXX) for Level 2 to review with you at a deeper level. Thank you for your patience.
They requested system reports, and I re-sent the reports they requested last June (no kidding). Just as well, as I await replacement parts for the machine that got destroyed by UPS on its way back from Intel. Stay tuned.
Count me among the DX58SO owners experiencing this problem. I have three similar DX58SO systems, two of which generate this error message.
The specifications for these systems are as follows:
DX58SO with the latest Feb.XX 2010 BIOS
GIGABYTE ATI Radeon HD4850 1GB Fan-less Video Card
6 – 8 GB of RAM from 3 different manufacturers (Crucial, Corsair and Patriot)
PC Power and Cooling 500W – 750W PSU
The 1st system is my workstation running Windows 7 Ultimate x64 (referred to as wrk-w7 from here on).
The 2nd is a Media Center setup/backup workstation running Windows 7 Ultimate x64 (referred to as mc-w7 from here on).
The 3rd is my server running Windows Server 2008 R2 – x64 (referred to as svr-w8 from here on).
Wrk-w7 is where my largest problem exists. I have the latest chipset drivers and the latest ATI drivers, and the system still generates WHEA-L-17 errors frequently. The system has been running for 9 months, and for the past 3 weeks it has been experiencing random OS lock-ups (this system has 6GB of Corsair RAM) with very little log detail to point me in the correct direction.
Mc-w7 does generate some WHEA-L-17 errors, but nowhere near the volume seen on wrk-w7. I am unaware of its OS ever locking up either, but I would assume it lives an easier life than wrk-w7 does (this system has 6GB of Patriot RAM).
Svr-w8 doesn’t generate any WHEA-L-17 errors. That could be because its OS (Windows Server 2008 R2) doesn’t log them, unlike W7. Or it could be due to the lack of fancy graphics components such as Aero (this system has 8GB of Crucial RAM).
On wrk-w7 & mc-w7 I do have the theme set to Windows Classic, as I personally don’t need the fanciness that is Aero. Regardless, these systems still generate WHEA-L-17, although at different rates.
I’ve tried different ATI drivers on wrk-w7 since the system lock-ups appeared. I tried the driver from Gigabyte’s website, the Microsoft Update recommended driver, and the latest ATI-provided driver, all with no real change that I can speak of.
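
To put numbers on "different rates", one option is to export the WHEA event timestamps from each machine and count them per day. The Python sketch below assumes the timestamps have already been converted to ISO 8601 strings (the export and conversion step is not shown and depends on your locale). The function name and the sample timestamps are hypothetical, not real data from these systems.

```python
from collections import Counter
from datetime import datetime


def events_per_day(timestamps):
    """Count events per calendar day from ISO 8601 timestamp strings."""
    return Counter(
        datetime.fromisoformat(ts).date().isoformat() for ts in timestamps
    )


if __name__ == "__main__":
    # Hypothetical timestamps for illustration only.
    logs = {
        "wrk-w7": [
            "2010-03-01T09:00:05",
            "2010-03-01T09:00:06",
            "2010-03-02T14:30:00",
        ],
        "mc-w7": ["2010-03-01T20:15:00"],
    }
    for machine, stamps in logs.items():
        for day, count in sorted(events_per_day(stamps).items()):
            print(machine, day, count)
```

Comparing the daily counts side by side should show whether wrk-w7 really is an outlier, and whether the counts spike on the days it locks up.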
I’m still tinkering with wrk-w7 software, but I’m getting close to buying an NVIDIA-based video card and if need be different RAM.
I don't think changing your RAM is going to help, but the NVIDIA card may. I have solved my issues by reloading the Intel driver after I reboot. I leave my PC on most of the time, so it's not that big a deal to me. Check up a few posts, where I explain what I have done. See if that helps your OS lock-ups.