Will a Windows bug check (BSOD) turn on the monitor? - winapi

I am working on a Windows Embedded Standard 2009 project deploying on an Atom powered tablet. We have some known Windows bug check crashes (BSOD) that I am working through. We also have a bug where the tablet becomes unresponsive with the screen off; requireing a hard power cycle to recover. I am pursuing a theory that the unresponsive tablet is a BSOD crash that happened with the screen off. We have EWF turned on which prevents a memory dump from writing to know if a BSOD occurred. We turn the monitor off after user inactivity using user32.dll SendMessage(Handle, WM_SYSCOMMAND, SC_MONITORPOWER, MONITOR_OFF).
Will a Windows bug check (BSOD) turn on the monitor if it was turned off previously programmatically?
Thanks!

Can't speak for tablets, but the bug check isn't going to jump into a long series of power maangement routines, just to output the fact that it's exploded. The bug check does as little as is physically possible, since once you're in BSOD mode, the system is by definition already crashed and not stable and in a highly unknown state. Starting to call other complicated subsytems is not going to happen, as the BSOD may very well have happened in the very routine(s) it's trying to call.

No, bug checks do not turn on the monitor (doesn't matter if it went to sleep due to inactivity or your message).
Your best bet is to leave a kernel debugger attached.

While bugcheck does not go through any power management code, it does make operations that would usually wake a monitor up. Bug-check changes screen resolution and switches to text-mode. If you have a kernel-debugger attached (or just configured), the system waits for the kernel debugger response and will not display the blue-screen text until you hit "g".
In the default configuration it will also attempt to create a crash-dump and reboot. If you suspect a bugcheck look for memroy.dmp in the windows directory or connect a kernel debugger.

Related

Recommendation for logging hanging computer

A customer of mine swears that my application causes his computer to freeze after some hours.
I watched my application carefully on his computer using TaskManager for hours, I tracked GDI resources, RAM usage and CPU.
Nothing obvious.
The customer allows me to debug his computer.
The EventViewer states:
Error: 04-19-2018 13:49:30 The system has been shutdown unexpectedly at 04-19-2018 00:27:04.
That's when the user noticed that the system didn't respond anymore and shut it down by long-pressing the On / Off button.
Before that, no critical errors were logged, only an event 264 warning 3 hours before the freeze.
First, check the System event log (see comments for how). Windows logs an entry there when restarting after an abnormal shutdown. If the system was able to log anything during (or just before) the crash, you'll find it in the vicinity of the abnormal shutdown entry.
If there's nothing helpful in the event logs that points to a software problem, my first suspect would be a temperature related hardware issue. It's pretty easy to check and rule out heavy dust buildup in the heatsinks preventing enough airflow.

Detect BSOD on Windows

Is there any way to detect if a BSOD has just occurred(before the OS shutdowns)?
Thanks,
The kernel provides limited functionality for drivers to be informed of a bugcheck (i.e., BSOD) via KeRegisterBugCheckCallback. In the callback routine, you can attempt a graceful shutdown of your applications, etc. However, given that the system is bugchecking, most functionality will not be available.
Not to my knowledge. A BSOD usually means a hardware malfunction which results in the computer not being able to work/run.
If you have experienced one BSOD it would be wise to investigate the report document as the BSOD is most likely to return.
Most frequent causes are drivers not being up to date or memory malfunction. I have also seen hard drives cause BSODs.

Can a simple program be responsible for a BSOD?

I've got a customer who told me that my program (simple user-land program, not a driver) is crashing his system with a Blue Screen Of Death (BSOD). He says he has never encountered that with other program and that he can reproduce it easily with mine.
The BSOD is of type CRITICAL_OBJECT_TERMINATION (0x000000F4) with object type 0x3 (process): A process or thread crucial to system operation has unexpectedly exited or been terminate.
Can a simple program be responsible for a BSOD (even on Vista...) or should he check the hardware or OS installation?
Just because your program isn't a driver doesn't mean it won't use a driver.
In theory, your code shouldn't be able to BSOD the computer. It's up to the OS to make sure that doesn't happen. By definition, that means there's a problem somewhere either in hardware or in code other than your program. That doesn't preclude there being a bug in your code as well though.
The easiest way to cause a BSOD with a user-space program is (afaik) to kill the Windows subsystem process (csrss.exe). This doesn't need faulty hardware nor a bug in the kernel or a driver, it only needs administrator privileges1.
What is your code exactly doing? The error message ("A process or thread crucial to system operation has unexpectedly exited or been terminate.") sounds like one of the essential system processes terminated. Maybe you are killing a process and unintentionally got the wrong process?
If somehow possible you could try to get a memory dump from that customer. Using the Debugging Tools for Windows you can then further analyze that dump as described here.
1Windows doesn't prevent you from doing so because it "keeps administrators in control of their computer". So this is by design and not a bug. Read Raymond's articles and you will see why.
Short answer is yes. Long answer depends on what is you program is suppose to do and how it does it?
Normally, it shouldn't. If it does, there must be either
A bug in the Windows kernel (possible but very unlikely)
A bug in a device driver (not necessarily in a device your program uses, this could get quite complicated)
A fault in the hardware
I would bet on option number two (device driver) but it would be interesting if you could get us a more detailed dump.
Well, yes it can - but for many different reasons.
That's why we test on different machines, operating systems, hardware etc..
Have you set some requirements for your program and is your user following them?
If you can't duplicate it yourself, and your program doesn't need admin to run, I'd be a bit suspicous about
The stability of that system's hardware
The virus/malware status of that system.
If you can get physical access to the client box, it might be worth running a full virus scan with an up-to-date scanner, and running a full memtest on it.
I had a system once that seemed stable, except that a certian few programs wouldn't run on it (and would sometimes crash the box). Memtest showed my RAM had some bad bits, but they were in higer sims, so they only got accessed if a program tried to use a lot of RAM.
No, and that is pretty much by definition. The worst thing that you can say is that a user-land application may have "triggered" a Windows bug or a driver bug. But a modern desktop Operating System is fully responsible for its own integrity; a BSOD is a failure of that integrity. Therefore the OS is responsible, and only the OS.
(Example of a BSOD bug that your application alone could expose: a virus scanner implemented as a driver, that crashes when executing a file from sector 0xFFFFFFFF, a sector that on this one machine just happens to contain one DLL of your application)
I had problems when exit my app without stopping all the processes and BD connections when the program ends (I crashed the entire IDE). I place the "stopping and disconnecting" code in the "Terminate" of "Form_Closed" event of my main form and the problem wa solved, I don't know it this is your situation.
Another problem can be if the user is trying to access the same resources your app is using (databases, hardware, sockets, etc). Ask him/her about what apps he/she is using when the BSOD happens.
A virus can't be discarded.

Invoke Blue Screen of Death using Managed Code

Just curious here: is it possible to invoke a Windows Blue Screen of Death using .net managed code under Windows XP/Vista? And if it is possible, what could the example code be?
Just for the record, this is not for any malicious purpose, I am just wondering what kind of code it would take to actually kill the operating system as specified.
The keyboard thing is probably a good option, but if you need to do it by code, continue reading...
You don't really need anything to barf, per se, all you need to do is find the KeBugCheck(Ex) function and invoke that.
http://msdn.microsoft.com/en-us/library/ms801640.aspx
http://msdn.microsoft.com/en-us/library/ms801645.aspx
For manually initiated crashes, you want to used 0xE2 (MANUALLY_INITIATED_CRASH) or 0xDEADDEAD (MANUALLY_INITIATED_CRASH1) as the bug check code. They are reserved explicitly for that use.
However, finding the function may prove to be a bit tricky. The Windows DDK may help (check Ntddk.h) - I don't have it available at the moment, and I can't seem to find decisive info right now - I think it's in ntoskrnl.exe or ntkrnlpa.exe, but I'm not sure, and don't currently have the tools to verify it.
You might find it easier to just write a simple C++ app or something that calls the function, and then just running that.
Mind you, I'm assuming that Windows doesn't block you from accessing the function from user-space (.NET might have some special provisions). I have not tested it myself.
I do not know if it really works and I am sure you need Admin rights, but you could set the CrashOnCtrlScroll Registry Key and then use a SendKeys to send CTRL+Scroll Lock+Scroll Lock.
But I believe that this HAS to come from the Keyboard Driver, so I guess a simple SendKeys is not good enough and you would either need to somehow hook into the Keyboard Driver (sounds really messy) or check of that CrashDump has an API that can be called with P/Invoke.
http://support.microsoft.com/kb/244139
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters
Name: CrashOnCtrlScroll
Data Type: REG_DWORD
Value: 1
Restart
I would have to say no. You'd have to p/invoke and interact with a driver or other code that lives in kernel space. .NET code lives far removed from this area, although there has been some talk about managed drivers in future versions of Windows. Just wait a few more years and you can crash away just like our unmanaged friends.
As far as I know a real BSOD requires failure in kernel mode code. Vista still has BSOD's but they're less frequent because the new driver model has less drivers in kernel mode. Any user-mode failures will just result in your application being killed.
You can't run managed code in kernel mode. So if you want to BSOD you need to use PInvoke. But even this is quite difficult. You need to do some really fancy PInvokes to get something in kernel mode to barf.
But among the thousands of SO users there is probably someone who has done this :-)
You could use OSR Online's tool that triggers a kernel crash. I've never tried it myself but I imagine you could just run it via the standard .net Process class:
http://www.osronline.com/article.cfm?article=153
I once managed to generate a BSOD on Windows XP using System.Net.Sockets in .NET 1.1 irresponsibly. I could repeat it fairly regularly, but unfortunately that was a couple of years ago and I don't remember exactly how I triggered it, or have the source code around anymore.
Try live videoinput using directshow in directx8 or directx9, most of the calls go to kernel mode video drivers. I succeded in lots of blue screens when running a callback procedure from live videocaptureing source, particulary if your callback takes a long time, can halt the entire Kernel driver.
It's possible for managed code to cause a bugcheck when it has access to faulty kernel drivers. However, it would be the kernel driver that directly causes the BSOD (for example, uffe's DirectShow BSODs, Terence Lewis's socket BSODs, or BSODs seen when using BitTorrent with certain network adapters).
Direct user-mode access to privileged low-level resources may cause a bugcheck (for example, scribbling on Device\PhysicalMemory, if it doesn't corrupt your hard disk first; Vista doesn't allow user-mode access to physical memory).
If you just want a dump file, Mendelt's suggestion of using WinDbg is a much better idea than exploiting a bug in a kernel driver. Unfortunately, the .dump command is not supported for local kernel debugging, so you would need a second PC connected over serial or 1394, or a VM connected over a virtual serial port. LiveKd may be a single-PC option, if you don't need the state of the memory dump to be completely self-consistent.
This one doesn't need any kernel-mode drivers, just a SeDebugPrivilege. You can set your process critical by NtSetInformationProcess, or RtlSetProcessIsCritical and just kill your process. You will see same bugcheck code as you kill csrss.exe, because you set same "critical" flag on your process.
Unfortunately, I know how to do this as a .NET service on our server was causing a blue screen. (Note: Windows Server 2008 R2, not XP/Vista).
I could hardly believe a .NET program was the culprit, but it was. Furthermore, I've just replicated the BSOD in a virtual machine.
The offending code, causes a 0x00000f4:
string name = string.Empty; // This is the cause of the problem, should check for IsNullOrWhiteSpace
foreach (Process process in Process.GetProcesses().Where(p => p.ProcessName.StartsWith(name, StringComparison.OrdinalIgnoreCase)))
{
Check.Logging.Write("FindAndKillProcess THIS SHOULD BLUE SCREEN " + process.ProcessName);
process.Kill();
r = true;
}
If anyone's wondering why I'd want to replicate the blue screen, it's nothing malicious. I've modified our logging class to take an argument telling it to write direct to disk as the actions prior to the BSOD weren't appearing in the log despite .Flush() being called. I replicated the server crash to test the logging change. The VM duly crashed but the logging worked.
EDIT: Killing csrss.exe appears to be what causes the blue screen. As per comments, this is likely happening in kernel code.
I found that if you run taskkill /F /IM svchost.exe as an Administrator, it tries to kill just about every service host at once.

Terminating intermittently

Has anyone had and solved a problem where programs would terminate without any indication of why? I encounter this problem about every 6 months and I can get it to stop by having me (the administrator) log-in then out of the machine. After this things are back to normal for the next 6 months. I've seen this on Windows XP and Windows 2000 machines.
I've looked in the Event Viewer and monitored API calls and I cannot see anything out of the ordinary.
UPDATE: On the Windows 2000 machine, Visual Basic 6 would terminate when loading a project. On the Windows XP machine, IIS stopped working until I logged in then out.
UPDATE: Restarting the machine doesn't work.
Perhaps it's not solved by you logging in, but by the user logging out. It could be a memory leak and logging out closes the process, causing windows to reclaim the memory. I assume programs indicated multiple applications, so it could be a shared dll that's causing the problem. Is there any kind of similarities in the programs? .Net, VB6, Office, and so on, or is it everything on the computer? You may be able to narrow it down to shared libraries.
During the 6 month "no error" time frame, is the system always on and logged in? If that's the case, you may suggest the user periodically reboot, perhaps once a week, in order to reclaim leaked memory, or memory claimed by hanging programs that didn't close properly.
You need to take this issue to the software developer.
The more details you provide the more likely it will be that you will get an answer: explain what exact program was 'terminating'. A termination is usually caused by an internal unhandled error, and not all programs check for them, and log them before quitting. However I think you can install Dr Watson, and it will give you at least a stack trace when a crash happens.

Resources