Alternatives to NtQueryInformationProcess for detecting undead processes? - windows

I'd like to detect when someone terminates a suspended debugged process without informing the debugger. (For example, get to a breakpoint in a console app, and close the app's console window.) The process goes into a zombie-like state and cannot be interacted with further until the debugger releases its hold.
This state appears to set the PROCESS_EXTENDED_BASIC_INFORMATION::IsProcessDeleting flag when gathering information on the process via NtQueryInformationProcess, but the flag, the structure, and the function are all effectively undocumented and marked "do not use" on MSDN.
Is testing this flag reliable? Is there a better, "official" API I can use?
(Yes, I know IsProcessDeleting is also set when the process is (surprise, surprise) shutting down normally. This is not a problem from my perspective.)

No, not that I can see. NtQueryInformationProcess isn't going away anytime soon, though: if that function were removed, hundreds of apps would break.

Related

Cocoa program can't be stopped

I'm trying to write an OS X app that uses a serial port. I found an example (cocoa) and got it running in Xcode 4. On the first run, it opens the port and I'm able to exchange data with the hardware.
If I try to change the port the program goes rogue. The pinwheel starts and the UI is unresponsive. I can't stop the program from Xcode, nor can I kill it from Terminal, or Force Quit. Force Quit of Xcode doesn't do it. Although the PID goes away with a kill from Terminal, the UI is still present with the merrily spinning pinwheel.
The only way out is a reboot. Any ideas on how to track down the errant code are welcome. I'm new to Cocoa/Objective-C, so simple terms are better.
Most likely it became a zombie. It should show up in ps auxww (or similar) with a 'Z' in its status. Activity Monitor might also still show it.
This is relatively common when working with hardware, such as a serial port. Zombies can arise for either of two reasons, most likely the first in this case:
1. The process is blocked in a kernel call of some kind that's not interruptible. (Strictly speaking, ps reports this state as 'U', uninterruptible wait, rather than 'Z', but the process is just as impossible to kill.)
2. The process has exited but its parent hasn't acknowledged that (via wait() or similar).
In the first case it's usually a fundamental bug or design flaw of some kind, and you may not have any good options short of figuring out exactly what code path tickles the problem, and avoiding that.
In the second case the solution is generally simple - find the parent process of your zombie and kill it. Repeat as necessary until your zombie gets adopted by a parent process that does call wait() to reap it (launchd will do this if nothing else).
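The second case can be reproduced in a few lines. The following is a minimal sketch, assuming a POSIX system (OS X or Linux) and a Python interpreter; the helper names pid_exists and zombie_demo are made up for illustration. It shows that a child that has already exited keeps its process-table entry until the parent calls wait().

```python
import os
import subprocess
import sys


def pid_exists(pid):
    """Return True if the PID still has a process-table entry (running or zombie)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is delivered
    except ProcessLookupError:
        return False
    return True


def zombie_demo():
    """Spawn a child that exits at once; return (entry_before, exit_code, entry_after)."""
    child = subprocess.Popen(
        [sys.executable, "-c", "raise SystemExit(7)"],
        stdout=subprocess.PIPE,
    )
    # EOF on the pipe means the child has closed its end, i.e. it is exiting or gone.
    child.stdout.read()
    child.stdout.close()
    # We have not called wait() yet, so the entry must still be there.
    before_reap = pid_exists(child.pid)
    exit_code = child.wait()  # this is the wait() that reaps the zombie
    after_reap = pid_exists(child.pid)
    return before_reap, exit_code, after_reap


if __name__ == "__main__":
    print(zombie_demo())
```

If the parent never reaches the wait() call, the entry stays around until the parent itself dies and the child is adopted and reaped.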

Can aborting a process without resetting the clipboard chain cause trouble?

I've got a program that calls SetClipboardViewer at startup to register for clipboard change notifications. At shutdown time, it will call ChangeClipboardChain to remove itself from the chain correctly.
This is all great as long as the program runs normally. But that's got me wondering, what happens if the program gets aborted, either by me killing it under the debugger, by a crash, or by the user killing the process because something went wrong? Then the cleanup never happens. Can that cause trouble for the system somehow?
Specifically, I know Windows can remove my viewer without trouble because it's a handle and Windows can clean up all handles when a process terminates, but will this cause the next value downstream in the chain, that I was holding a reference to, to get lost somehow?
Yes, failure to remove yourself from the chain will break the chain. Deadly sin #2. Please read the whole list to be sure that you're following all of the rules.
http://www.clipboardextender.com/developing-clipboard-aware-programs-for-windows/6
Lots of apps suffer from this, including the Delphi IDE. i.e. if Delphi crashes in certain ways, it'll kill the clipboard chain (D2005 anyway).
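Consider using the Vista-style notification (AddClipboardFormatListener, which delivers WM_CLIPBOARDUPDATE and does not involve a viewer chain) on Vista/Windows 7.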

Hide an access violation on another application

I have an application that sometimes causes an access violation on exit. This is quite unpredictable and all attempts to locate the bug have been unsuccessful so far. The bug is harmless, as no data is lost, so I was wondering whether it might be possible to just hide it.
Is it possible to have another app launch the buggy one and catch the Access Violation exception if it occurs? If yes, how?
Thanks in advance!
Yes, if the other application is a debugger, though this is a non-trivial amount of work. To become a debugger, you create the process with the DEBUG_PROCESS | DEBUG_ONLY_THIS_PROCESS flags; see the CreateProcess flags for more information.
Once you are the debugger of the process, you will get first chance to handle all exceptions.
You could also attach to the process as a debugger just before it shuts down (assuming that you know when this is going to happen) with DebugActiveProcess.
Call SetErrorMode(SEM_NOGPFAULTERRORBOX) before launching the buggy application as a child process.
The error mode is inherited by child processes, and this particular flag will prevent the crash dialog from appearing.
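As a rough illustration of the SetErrorMode approach, here is a minimal sketch of a launcher written in Python with ctypes. The function names suppress_crash_dialog and launch_quietly are hypothetical, and the platform guard is only there so the sketch does nothing on non-Windows systems. It relies on the child inheriting the parent's error mode, which holds as long as CREATE_DEFAULT_ERROR_MODE is not passed when the process is created.

```python
import ctypes
import subprocess
import sys

SEM_NOGPFAULTERRORBOX = 0x0002  # value from the Windows SDK headers


def suppress_crash_dialog():
    """Add SEM_NOGPFAULTERRORBOX to this process's error mode.

    Returns the previous error mode, or None when not running on Windows.
    """
    if sys.platform != "win32":
        return None
    set_error_mode = ctypes.windll.kernel32.SetErrorMode
    set_error_mode.argtypes = [ctypes.c_uint]
    set_error_mode.restype = ctypes.c_uint
    # SetErrorMode replaces the whole mode and returns the old one,
    # so call it twice to OR the new flag into the existing flags.
    previous = set_error_mode(SEM_NOGPFAULTERRORBOX)
    set_error_mode(previous | SEM_NOGPFAULTERRORBOX)
    return previous


def launch_quietly(argv):
    """Start argv as a child process that inherits the adjusted error mode."""
    suppress_crash_dialog()
    return subprocess.Popen(argv)
```

A caller would use something like launch_quietly(["buggy.exe"]) and then wait on the returned object; the executable name here is a placeholder.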

Disabling Windows error reporting (Dr. Watson) for my process

I have an application that is hosting some unstable third-party code which I can't control in an external process to protect my main application from nasty errors it exhibits. My parent process is monitoring the other process and doing "the right thing (tm)" when it fails.
The problem that I have is that Dr. Watson is still detecting crashes in the isolated process and attaching to it on the way down to take a crash dump. This causes two problems:
1. It dramatically increases the time it takes me to detect a failure, because the process stays alive while the crash dump is being taken.
2. It shows annoying popups asking the user whether they want to submit the error reports to Microsoft.
Clearly I would prefer to fix the bugs in the child process, but given that it isn't an option, I would like to be able to selectively disable Dr. Watson (and Windows Error Reporting in Vista+) for that process.
I am running some of my own code in the process before handing off to the untrusted bit, so if there is an API that I can call that affects the current process that would be fine.
I am aware of: http://support.microsoft.com/default.aspx/kb/188296 which would disable Dr. Watson for the entire machine. I don't want to do that because it would make me a bad citizen to trash a machine-wide setting.
I am also aware of the WerSetFlags option in Vista+ that would seem to disable Windows Error Reporting for the current process, but I need something that will disable Dr. Watson on earlier OS versions.
The good doctor is invoked when a process does not handle a certain exception. Therefore, the common way to go would be to handle all exceptions yourself. In your case, it is much harder since you don't own the crashing process code. What you can do then, is to inject your code into the other process at runtime, and install an exception handler that will swallow the exception causing the crash. When caught, gracefully shut down the process.
There are quite a few questions here talking about injecting code into another process. As for the crash handler, you can either set an unhandled exception filter, or add a vectored exception handler. Note that for the latter, you'll have to be careful not to swallow legit exceptions that are in fact handled inside the other process, namely find a way to recognize the crashing exception and make sure it is the only one you handle.
You want to disable the GPF popup: http://blogs.msdn.com/oldnewthing/archive/2004/07/27/198410.aspx

How to reload a crashed process on Windows

How to reload a crashed process on Windows? Of course, I can run a custom monitoring Win service process. But, for example, Firefox: it doesn't seem to install such a thing, but still it can restart itself when it crashes.
On Vista and above, you can use the RegisterApplicationRestart API to have your application restarted automatically when it crashes or hangs.
Before Vista, you need to have a top level exception filter which will do the restart, but be aware that running code inside of a compromised process isn't entirely secure or reliable.
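For the Vista-and-later route, a call to RegisterApplicationRestart might look like the sketch below, written in Python with ctypes. The wrapper name register_restart is hypothetical. The command line passed to the API contains only the arguments, not the executable name, so a script run by an interpreter would presumably need to pass its own path as the first argument (an assumption about this particular setup). Note also that Windows only restarts applications that had been running for at least 60 seconds when they failed.

```python
import ctypes
import sys

S_OK = 0


def register_restart(arguments=""):
    """Ask Windows Error Reporting to restart this application after a crash or hang.

    Returns True on success, False on failure or when not running on Windows.
    """
    if sys.platform != "win32":
        return False
    register = ctypes.windll.kernel32.RegisterApplicationRestart
    register.argtypes = [ctypes.c_wchar_p, ctypes.c_uint32]
    register.restype = ctypes.c_long  # HRESULT
    # Flags value 0 means "restart in every supported case" (crash, hang, update, reboot).
    return register(arguments, 0) == S_OK
```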
Firefox constantly saves its state to the hard disk, every time you open a tab or click a link, or perform some other action. It also saves a flag saying it shut down safely.
On startup, it reads this all back, and is able to "restore" based on that info.
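The save-state-plus-flag idea can be sketched as follows. This is not Firefox's actual file format or code, just a minimal illustration in Python; the SessionStore class and the JSON field names are invented for the example.

```python
import json
import os


class SessionStore:
    """Persist application state together with a clean-shutdown flag."""

    def __init__(self, path):
        self.path = path

    def _write(self, payload):
        # Write to a temporary file and rename it into place, so a crash
        # in the middle of a save cannot leave a half-written session file.
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, self.path)

    def save(self, state):
        """Call after every user action; the flag stays False while running."""
        self._write({"state": state, "clean_shutdown": False})

    def close(self, state):
        """Call on orderly exit; marks the session as shut down safely."""
        self._write({"state": state, "clean_shutdown": True})

    def load(self):
        """Return (state, crashed); crashed is True if the last run never called close()."""
        try:
            with open(self.path, encoding="utf-8") as handle:
                payload = json.load(handle)
        except (FileNotFoundError, json.JSONDecodeError):
            return {}, False
        return payload.get("state", {}), not payload.get("clean_shutdown", False)
```

On startup the application calls load(); if crashed is True, it offers to restore the returned state.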
Structured exception handling (SEH) allows you to catch program crashes and to do something when it happens.
See: __try and __except
SEH can be very dangerous though and could lead to your program hanging instead. Please see this article for more information.
If you write your program as an NT service then you can set the first, second and subsequent failure actions to "Restart the service".
On Windows Server 2008, Windows Vista, and Windows 7 you can use the Win32 API RegisterApplicationRestart.
Please see my answer here for more information about dealing with different types of program crashes.
If I recall correctly, Windows implements at least some subset of POSIX and so "must" have the signal interface (things like SIGSEGV, SIGABRT, SIGTERM, etc.; the Microsoft C runtime does not provide SIGKILL or SIGQUIT).
I've only ever done this on Linux, but you could try installing a handler for unexpected termination with signal() (signal.h).
From a quick scan of the docs, it seems that very few things can be done safely while handling a signal; even starting a new process may be on the forbidden list.
Now that I've thought about it, I'd probably go with a master/worker pattern: a very simple parent process that does nothing but spawn the worker (which does all the UI and other work). If the worker dies without first setting a specific "I'm gonna die now" bit (the parent always gets a notification that the spawned process died), the master respawns it. The main theme is to keep the master very simple, so that it is unlikely to die from its own bugs.
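A minimal sketch of such a master, in Python, could look like this. It assumes that the worker's exit code plays the role of the "I'm gonna die now" bit (0 means an intentional shutdown, anything else counts as a crash); the supervise function and the max_restarts cap are choices made for this example.

```python
import subprocess

CLEAN_EXIT = 0  # assumption: the worker exits with 0 only when it means to stop


def supervise(worker_argv, max_restarts=5):
    """Run the worker and respawn it whenever it dies unexpectedly.

    Returns the number of restarts that were needed before a clean exit.
    """
    restarts = 0
    while True:
        # call() blocks until the worker exits and returns its exit code.
        exit_code = subprocess.call(worker_argv)
        if exit_code == CLEAN_EXIT:
            return restarts
        if restarts >= max_restarts:
            raise RuntimeError(
                "worker keeps crashing (last exit code %d)" % exit_code
            )
        restarts += 1
```

The cap keeps a worker that crashes immediately on startup from being respawned forever; the master contains nothing else that could fail.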

Resources