Tracking down the true cause of a Blue Screen of Death is painstaking (and sometimes painful) work. The latest version of Microsoft’s Crash Analyzer Wizard helps you decode the cryptic details in a crash dump file. Here’s how it works.
Even experienced IT pros can be baffled by the Blue Screen of Death. The official name for this dreaded condition is a STOP error (programmers may also know it as a bug check), but the informal BSOD label is far more familiar. By definition, a STOP error is a catastrophic event caused by an unrecoverable error in a kernel-mode driver or memory. When it occurs, Windows dumps the contents of memory to a file and displays a STOP error in white text on a blue screen—thus the name.
In this post, I introduce you to an obscure but enormously helpful tool that can point the finger directly at the cause of a BSOD in a matter of minutes. But before I get to those details, some background is in order.
The first step in troubleshooting is to rule out problems with the PC itself. BSODs can be caused by overheating or by defective hardware, such as a faulty memory module or even a bad cable. If the hardware checks out OK, you can assume the cause of a BSOD is related to software: a Windows component, a kernel-mode driver for a hardware device, or a program that uses file-system filter drivers to access hardware at a low level, such as antivirus or disk-burning software. If the same STOP error code occurs repeatedly, even if it can’t be easily reproduced, that’s a strong indication that you have a software problem.
You can find tantalizingly incomplete bits of evidence in the codes displayed on the blue screen itself. After the fact, you can find these same details in Event Viewer. Windows 7 also displays the information in the new Reliability Monitor, under the heading Critical Events. Here, for example, is a STOP error I encountered recently on a PC running Windows 7:
Clicking the View Technical Details link displays this description, which duplicates the information that was displayed on the blue screen as part of the original event:
The hexadecimal bug-check code, in this case 7E, provides a high-level description of the problem. The symbolic name for this code is SYSTEM_THREAD_EXCEPTION_NOT_HANDLED, and a bit of searching turns up an MSDN article that provides more details about the error, but no solution.
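To make the relationship between the short hexadecimal code and its symbolic name concrete, here is a minimal Python sketch of a lookup. The table is a small, hand-picked subset of well-known bug-check codes, not a complete list, and the function name `bug_check_name` is my own invention for illustration.

```python
# Partial table of well-known bug-check codes (illustrative subset only).
BUG_CHECK_NAMES = {
    0x0A: "IRQL_NOT_LESS_OR_EQUAL",
    0x1E: "KMODE_EXCEPTION_NOT_HANDLED",
    0x50: "PAGE_FAULT_IN_NONPAGED_AREA",
    0x7B: "INACCESSIBLE_BOOT_DEVICE",
    0x7E: "SYSTEM_THREAD_EXCEPTION_NOT_HANDLED",
    0xD1: "DRIVER_IRQL_NOT_LESS_OR_EQUAL",
}


def bug_check_name(code_text):
    """Return the symbolic name for a code such as '7E' or '0x0000007E'."""
    text = code_text.strip().lower()
    if text.startswith("0x"):
        text = text[2:]
    code = int(text, 16)  # bug-check codes are written in hexadecimal
    return BUG_CHECK_NAMES.get(code, "UNKNOWN (0x%08X)" % code)


print(bug_check_name("7E"))
print(bug_check_name("0x0000007E"))
```

Both calls resolve to the same name, because the short form shown in the text and the zero-padded form shown on the blue screen are the same number.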
In fact, the sparse information on the bug-check screen is rarely enough to solve a problem. The real keys to BSOD troubleshooting are stored in dump files that are created after the system crashes and before it restarts. If you’re willing to attach a debugger to the system and wait for it to crash, you can debug the event in real time.
If the troubled PC is sitting on a user’s desktop, there’s an easier way to crack open the saved error report: Use Microsoft’s Crash Analyzer Wizard. This tool is a component of the Microsoft Diagnostics and Recovery Toolset (DaRT), which is available as part of the Microsoft Desktop Optimization Pack (MDOP) to any organization that has deployed Windows as part of a Volume License contract with Software Assurance. But it’s also available as a free download to mere mortals with a TechNet or MSDN subscription, who can use it for test and evaluation purposes.
The Crash Analyzer Wizard uses two external components to work its magic:
- The Debugging Tools for Windows are typically used by developers during the course of creating drivers, applications, and services for Windows. They’re available in 32-bit and 64-bit versions and are required to open crash dump files.
- Symbol packages contain information that the debugger needs to interpret variables and other details in the dump file. When debugging a Windows crash dump file, you must use the symbol package that matches the operating system that created the crash dump.
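If you want to see how these two components fit together outside the wizard, the following Python sketch builds (but does not run) a command line for the kernel debugger that ships with the Debugging Tools for Windows. It is only an illustration of the moving parts, not a description of what the Crash Analyzer Wizard does internally. The dump file path and the `C:\Symbols` cache folder are hypothetical, and the helper names `symbol_path` and `analyze_command` are mine. The `srv*cache*server` syntax tells the debugger to fetch matching symbols from Microsoft's public symbol server and cache them locally.

```python
MS_SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def symbol_path(cache_dir):
    """Build a symbol-server path that caches downloaded symbols locally."""
    return "srv*%s*%s" % (cache_dir, MS_SYMBOL_SERVER)


def analyze_command(dump_file, cache_dir, debugger="kd.exe"):
    """Return the argument list for a one-shot analysis of a dump file."""
    return [
        debugger,
        "-z", dump_file,               # open this crash dump file
        "-y", symbol_path(cache_dir),  # where to find matching symbols
        "-c", "!analyze -v; q",        # run the analysis, then quit
    ]


# Hypothetical paths, for illustration only.
cmd = analyze_command(r"C:\Windows\MEMORY.DMP", r"C:\Symbols")
print(" ".join(cmd))
```

On a machine with the debugger installed, you could hand the list to `subprocess.run`. The sketch stops at building the arguments so it stays runnable anywhere.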
You can install the Crash Analyzer Wizard locally and download the most recent version of the Debugging Tools for Windows and the symbols for the current version of Windows. That’s a sensible strategy if you’re doing long-term testing on your own development PC, but it’s overkill if you’re troubleshooting a recent crash on a client’s PC. For that job, use the ERD Commander Boot Media Wizard. You supply the installation media for the operating system you’re debugging, and the wizard creates an ISO image that you can use to burn a recovery disk that includes an assortment of useful troubleshooting tools. (You’ll need to create separate recovery media for x86 and x64 versions of Windows 7.)
When you boot from the recovery disk and choose the DaRT option, you see the complete set of tools shown here:
Choose Crash Analyzer and you’re on your way. The wizard uses the PC’s network connection to download the necessary components and symbols for the target operating system and then analyzes the most recent crash dump file stored on the target PC.
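The "most recent crash dump" step is easy to picture in code. The Python sketch below looks in the two default locations Windows uses for dump files, `MEMORY.DMP` in the Windows folder and the `Minidump` subfolder, and returns whichever file was modified last. This is my own approximation of the idea, not the wizard's actual logic. Those locations can be changed in the Startup and Recovery settings, and when you boot from recovery media the target installation may be on a different drive letter, so treat the `C:\Windows` argument as an assumption.

```python
import os


def newest_dump(windows_dir):
    """Return the path of the most recently modified dump file, or None."""
    candidates = []

    # Default location of a kernel or complete memory dump.
    full_dump = os.path.join(windows_dir, "MEMORY.DMP")
    if os.path.isfile(full_dump):
        candidates.append(full_dump)

    # Default folder for small memory dumps (minidumps).
    minidump_dir = os.path.join(windows_dir, "Minidump")
    if os.path.isdir(minidump_dir):
        for name in os.listdir(minidump_dir):
            if name.lower().endswith(".dmp"):
                candidates.append(os.path.join(minidump_dir, name))

    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)


# Assumed location of the target Windows installation.
print(newest_dump(r"C:\Windows"))
```

If no dump files exist in either location, the function returns `None` instead of raising an error.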
When the wizard completes its analysis, it displays the results in a summary dialog box like the one shown here:
You can examine more complete results by clicking the Details button, but in this case the summary was enough to help me identify the root cause of the problem. The Volume Shadow Copy driver (Volsnap.sys) is part of Windows, and a quick search turned up Knowledge Base article 960038: “You receive a 0x0000007E Stop error on Windows Server 2008-based computers that host Hyper-V virtual machines when you use the Hyper-V writer to back up virtual machines.” This machine was indeed running the Hyper-V Manager tool to connect to virtual machines on a machine running Windows Server 2008. Downloading and installing the available hotfix took a minute or two, and the problem hasn’t recurred.
The Crash Analyzer Wizard can’t find the cause of every crash. Even when it does succeed in pinpointing a problem, you might be stymied if a replacement driver is unavailable or if uninstalling the offending application isn’t an option. But at least you’ll have information you can use to open a trouble ticket with Microsoft or the developer of the faulty code.
The other utilities on the ERD recovery disk are incredibly handy as well, which is why it has become an essential part of my troubleshooting toolkit.