Everything is perfect; you've upgraded to Windows 7. It's fully patched, all drivers are updated, security is tight, maybe you even have new hardware... yet the old Blue Screen of Death (BSOD) taunts you from your new high definition screen.
The good news is that you can quickly solve the problem in most cases by using the Windows debugger tool. It's simple and free.
Back in the Windows XP era (2005), we wrote a tutorial on solving Windows crashes. This is an updated version that will make you the master of system crash resolution in your home or office.
Is crash resolution different for different versions of Windows?
The same approach to resolve system crashes applies to the many variants of Windows, says Andre Vachon, principal development lead at Microsoft. "The latest releases of Microsoft Windows use the same operating system kernel, the same primary interfaces, drivers work on both server and client, and the debugger uses the same debug files. Further, we used the same code base and source tree to compile both 32- and 64-bit versions."
With that in mind and for simplicity I will refer to Windows 7. However, not only will the information apply to other current releases, much of it will apply to legacy versions back to Windows 2000.
Why Windows 7 crashes
Windows became more stable as it matured. The operating system has gone from 16-bit to 32-bit and now 64-bit, its features have become more extravagant and its footprint much larger, yet it is actually harder to bring down.
Still, it does fall over. However, the reasons for such system failures have not changed from the XP days.
Windows takes advantage of a protection mechanism that lets multiple applications run at the same time without stepping all over each other. Known now as User Mode and Kernel Mode, it was originally known as the Ring Protection scheme.
Kernel Mode (Ring 0) software has complete and unfettered access to the hardware. Software operating here is normally the most trusted because it can execute any instruction and reference any address in the system. Crashes in Kernel Mode are complete system failures requiring a reboot. This is where you find the operating system kernel code and most drivers.
User Mode (Ring 3) software cannot directly access the hardware or reference any address freely. It must pass instructions, perhaps more accurately requests, through calls to APIs. This feature enables protection for the overall operation of the system, regardless of whether an application makes an erroneous call or accesses an inappropriate address. Crashes in User Mode are generally recoverable, requiring a restart of the application but not the entire system. This is where you find most of the code running on your computer, ranging from Word to Solitaire, as well as some drivers.
So with much of the software running in User Mode these days, there is simply less opportunity for applications to corrupt system level software and, for that matter, each other. However, Kernel Mode software is not protected from other Kernel Mode software. For example, if a video driver erroneously accesses a portion of memory assigned to another program (or memory not marked as accessible to drivers), Windows will stop the entire system. This is known as a Bug Check, and it is what displays the familiar Blue Screen of Death.
Crash causes by the numbers
While the numbers vary, they do not vary much. When combining data reported from several sources, including my own 20 years dealing with crash prevention and resolution, a trend becomes clear. About 70% of Windows system crashes are caused by third party drivers operating in Kernel Mode, 15% have unknown causes, 10% come from faulty hardware (more than half of that from bad memory) and only about 5% come from faulty Microsoft code.
An important point that is not well known is that most crashes are repeat crashes. This is so because most admins are not able to resolve system crashes immediately. As a result those crashes tend unfortunately to occur again... and again. More often than not, these events recur over weeks and in many cases over months before being resolved. By using the information in this article to solve crashes when they first occur, you will prevent many subsequent crashes.
Getting Started: System Requirements
To prepare to solve Windows 7 system crashes using WinDbg you will need a PC with the following:
- 32-bit or 64-bit Windows 7/Vista/XP or Windows Server 2008/2003
- Approximately 25MB of hard disk space (this does not include storage for dump files or for symbol files)
- Live Internet connection
- Microsoft Internet Explorer 5.0 or later
- The latest version of WinDbg comes as an option in the Windows SDK. The SDK download file is called winsdk_web.exe, is 498KB in size, and can be downloaded for free. (Note that after installing the debugger you can delete the large download file thus freeing up lots of space.)
- A memory dump (the page file must be on C: for Windows to save the memory dump file)
After downloading the Windows SDK and running the Setup wizard, select the Debugging Tools for Windows option under Common Utilities.
Configure Startup and Recovery
This is annoying. Someone made it very non-intuitive to locate the dialog box needed to check that your system is set to take the appropriate actions during a Bug Check, including whether to automatically restart and what size dump files to save.
Find the Startup and Recovery dialog box:
- Select the Start button at the bottom left of your screen
- Select Control Panel
- Select System and Security
- From the options in the right column, select System
- From the left column select Advanced system settings to display the System Properties box
- In the System Properties box select the Advanced tab
- In the Startup and Recovery area select the Settings button
Ensure Startup and Recovery settings are correct
Under System failure:
- Check Write an event to the system log
- Check Automatically restart
- Select Kernel memory dump
- Ensure the dump file is written to %SystemRoot%\MEMORY.DMP
- Check Overwrite any existing file to save hard drive space
Note that with these settings your system will save both a kernel dump file and a minidump file. However, while you will have a minidump for every event, only the last kernel dump will be saved.
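If you would rather verify these choices from a script than from the dialog box, the same settings are stored in the registry under HKLM\SYSTEM\CurrentControlSet\Control\CrashControl. The following is a minimal Python sketch, assuming the standard value names found there (LogEvent, AutoReboot, CrashDumpEnabled, DumpFile, Overwrite). The function names are my own illustrative helpers, not part of any Microsoft tool, and the registry reader runs only on Windows.

```python
# Sketch: check crash-dump settings against the recommendations above.
# Value names are those stored under
# HKLM\SYSTEM\CurrentControlSet\Control\CrashControl; the function
# names are hypothetical helpers invented for this example.

CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"

# CrashDumpEnabled: 0 = none, 1 = complete, 2 = kernel, 3 = small (minidump)
RECOMMENDED = {
    "LogEvent": 1,          # Write an event to the system log
    "AutoReboot": 1,        # Automatically restart
    "CrashDumpEnabled": 2,  # Kernel memory dump
    "DumpFile": r"%SystemRoot%\MEMORY.DMP",
    "Overwrite": 1,         # Overwrite any existing file
}


def check_crash_settings(values):
    """Return a list of (name, actual, expected) for settings that differ."""
    problems = []
    for name, expected in RECOMMENDED.items():
        actual = values.get(name)
        if isinstance(expected, str):
            # Paths are compared case-insensitively, as Windows treats them.
            same = isinstance(actual, str) and actual.lower() == expected.lower()
        else:
            same = actual == expected
        if not same:
            problems.append((name, actual, expected))
    return problems


def read_crash_settings():
    """Read the current values from the registry (Windows only)."""
    import winreg  # standard library module, available only on Windows

    values = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        for name in RECOMMENDED:
            try:
                values[name], _value_type = winreg.QueryValueEx(key, name)
            except FileNotFoundError:
                values[name] = None  # value not present on this system
    return values
```

On a Windows machine, check_crash_settings(read_crash_settings()) returns an empty list when everything matches the recommendations, and otherwise lists each setting that differs along with its current and expected value.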
Launching the debugger
To launch WinDbg select the following:
Start > All Programs > Debugging Tools for Windows > WinDbg
If you are going to use it with any frequency, simplify launching the program by pinning it to the Start menu or sending a shortcut to the desktop.
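WinDbg can also be started from a command line, where its -z option names the dump file to open. The short Python sketch below builds that command. It assumes windbg.exe is on your PATH and that the dump sits at the default location; both are assumptions you may need to adjust, and the helper names are invented for illustration.

```python
import subprocess


def build_windbg_command(dump_file, windbg_exe="windbg.exe"):
    """Build the argument list that opens dump_file in WinDbg."""
    # -z tells WinDbg which crash dump file to open.
    return [windbg_exe, "-z", dump_file]


def open_dump(dump_file):
    """Launch WinDbg on the dump (Windows only; assumes WinDbg is on PATH)."""
    return subprocess.Popen(build_windbg_command(dump_file))


# Example: the kernel dump at its default location (assumed path).
print(build_windbg_command(r"C:\Windows\MEMORY.DMP"))
```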
What's the big deal about symbols?
Before you jump in to save the day by finding the miscreant module in a dump file you have to be sure the debugger is ready. Most importantly you have to be sure it will locate the symbol files for the precise version of the operating system that you are troubleshooting.
Symbol tables are a byproduct of compilation. When a program is compiled, the source code is translated from a high level language into machine code. At the same time, the compiler creates a symbol file with a list of identifiers, their locations in the program and their attributes. These identifiers include global and local variables and function names. A program doesn't require this information to execute. Therefore, it can be taken out and stored in another file, reducing the size of the final executable.
Smaller executables take up less disk space and load into memory faster than large ones. But there is a flip side: When a program causes a problem, the operating system knows only the hex address at which the problem occurred. You need something more than that to determine which program was using that memory space and what it was trying to do. Windows symbol tables hold the answer and having access to symbols specific to your system's memory is like putting place names on a map. Conversely, analysing a dump file with the wrong symbol tables would be like finding your way through San Francisco with a map of Boston.
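To make the map analogy concrete, here is a toy Python sketch of what a symbol lookup does: given a bare hex address, find the nearest symbol at or below it and report the offset. The addresses and the exampledrv module are invented for illustration, and real symbol files carry far more detail than this, so treat it as a picture of the idea and not as how the debugger is implemented.

```python
import bisect

# Hypothetical symbol table: (start address, "module!function").
# These entries are made up; real symbol files are far richer.
SYMBOLS = sorted([
    (0x82A00000, "nt!KiSystemStartup"),
    (0x82A01200, "nt!KeBugCheckEx"),
    (0x91C40000, "exampledrv!DriverEntry"),
    (0x91C40350, "exampledrv!ReadBuffer"),
])
START_ADDRESSES = [addr for addr, _name in SYMBOLS]


def resolve(address):
    """Return 'module!function+0xoffset' for the nearest symbol at or below address."""
    index = bisect.bisect_right(START_ADDRESSES, address) - 1
    if index < 0:
        return None  # address lies below every known symbol
    start, name = SYMBOLS[index]
    return f"{name}+{address - start:#x}"


# Without symbols you have only 0x91C4037A; with them you get a name.
print(resolve(0x91C4037A))  # prints exampledrv!ReadBuffer+0x2a
```

Feed the same address into a table built for a different version of the module and the lookup will still return a name, just the wrong one, which is the map of Boston problem.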