How to reload a crashed process on Windows? Of course, I can run a custom monitoring Win service process. But, for example, Firefox: it doesn't seem to install such a thing, but still it can restart itself when it crashes.
On Vista and above, you can use the RegisterApplicationRestart API to automatically restart when it crashes or hangs.
Before Vista, you need to have a top level exception filter which will do the restart, but be aware that running code inside of a compromised process isn't entirely secure or reliable.
Firefox constantly saves its state to the hard disk, every time you open a tab or click a link, or perform some other action. It also saves a flag saying it shut down safely.
On startup, it reads this all back, and is able to "restore" based on that info.
Structured exception handling (SEH) allows you to catch program crashes and to do something when it happens.
See: __try and __except
SEH can be very dangerous though and could lead to your program hanging instead. Please see this article for more information.
If you write your program as an NT service then you can set the first, second and subsequent failure actions to "Restart the service".
For Windows 2008 server and Windows Vista and Windows 7 you can use the Win32 API RegisterApplicationRestart
Please see my answer here for more information about dealing with different types of program crashes.
If I recall correctly Windows implements at least some subset of POSIX and so "must" have the signal interface (things like SIGKILL, SIGSEGV, SIGQUIT etc.).
I've never done this but on linux, but you could try setting the unexpected termination trap with signal() (signal.h).
From quick scan of docs it seems that very few things can be done while handling signal, it may be possible that even starting a new process is on forbidden list.
Now that I've thought about it, I'd probably go with master/worker pattern, very simple parent thread that does nothing but spawns the worker (that does all the UI / other things). If it does not set a specific "I'm gonna die now" bit but still dies (parent process always gets message / notification that spawned process died) then master respawns the worker. The main theme is keep master very simple and hard to die due to own bugs.
Related
I am performing system tests on a SAP system. From time to time, SAP crashes and I'd like to recover from those crashes by resetting the virtual machine to a previously saved state.
My problem is that I cannot detect such crashes reliably. I have created WER LocalDumps registry entries, but I don't get dumps.
It seems SAP has registered an unhandled exception handler and performs different tasks on different types of exceptions. Sometimes it shows a message box and terminates the application (e.g. in case of compression errors), sometimes it goes with a so-called Short Dump.
I am neither interested in the message box, nor in the short dump, so I am looking for a way to disable the unhandled exception handler of SAP. This should bring up WER, which writes the dump file and I can take actions to restart my system tests.
For performance reasons, I'd not like to restart the VM on every test.
I have tried:
I am basically familiar with unhandled exception handlers. I have applied them to my own .NET code successfully.
I looked at SetUnhandledExceptionFilter (MSDN) and similar but it applies to the calling process only and I cannot modify the code of SAP.
I read about DisableUserModeCallbackFilter but I don't think it is helpful for my case
I wonder whether there is a Registry Setting (e.g. in ImageFileExecutionOptions) or a Shim that I could activate.
According to Hans Passant's comment (which I take as an authorative answer),
There is no boss override switch built into the operating system to stop it from doing this.
I finally attached the debugger to SAP GUI at a time where the process was alive. Starting with all exceptions enabled, I narrowed down the conditions so that WinDbg would break when SAP GUI crashed (first chance, then second chance).
Heres a simple question - is there anyway that a non-console (ie a CWinApp) application can receive and process CTRL+BREAK, it would appear SetConsoleCtrlHandler doesnt do the job nor the installation of signal handlers?
I unfortunately am working with a legacy CDialog based app which is run under the control of Microsoft HPC and HPC uses CTRL+BREAK to cancel the program (assuming i guess that nobody in their right mind would have a non-console app running in the background)
Cheers.
Calling AttachConsole with ATTACH_PARENT_PROCESS should do the trick. This will attach your process to the HPC console so that it can receive the control-break signal. You should probably do this before calling SetConsoleCtrlHandler.
If that doesn't work, try AllocConsole instead. If HPC doesn't have a console of its own, it might be assuming that the sub-process will have created a new console group (this happens automatically for console-mode applications) in which case it will be sending a control-break signal to the sub-process PID. If so, it shouldn't matter whether the console group was created automatically or explicitly.
You may wish to start by making sure that HPC is indeed sending a control-break signal (presumably via GenerateConsoleCtrlEvent) by checking that SetConsoleCtrlHandler works as expected for a console-mode application. If it is calling TerminateProcess instead then there is nothing you can do about it.
I've got a program that calls SetClipboardViewer at startup to register for clipboard change notifications. At shutdown time, it will call ChangeClipboardChain to remove itself from the chain correctly.
This is all great as long as the program runs normally. But that's got me wondering, what happens if the program gets aborted, either by me killing it under the debugger, by a crash, or by the user killing the process because something went wrong? Then the cleanup never happens. Can that cause trouble for the system somehow?
Specifically, I know Windows can remove my viewer without trouble because it's a handle and Windows can clean up all handles when a process terminates, but will this cause the next value downstream in the chain, that I was holding a reference to, to get lost somehow?
Yes, failure to remove yourself from the chain will break the chain. Deadly sin #2. Please read the whole list to be sure that you're following all of the rules.
http://www.clipboardextender.com/developing-clipboard-aware-programs-for-windows/6
Lots of apps suffer from this, including the Delphi IDE. i.e. if Delphi crashes in certain ways, it'll kill the clipboard chain (D2005 anyway).
Consider using Vista style notification on Vista/Windows7.
I'm debugging plugins on Windows 7 and of course the plugin host (Cubase5.exe) occasionally crashes because of errors in the plugin. On XP or Vista, I could always restart it immediately and continue working. But on Windows 7, even though Cubase appears to close, it is still visible in Task Manager and I cannot kill it by any means. After a minute or two, it disappears by itself. In the mean time, I can't work because the plugin DLL is still locked by the process.
Does anyone know why this happens on Windows 7? I've already tried disabling Automatic Error Reporting but that didn't help. I've tried attaching cdb to Cubase, but I get:
Cannot debug pid 5252, NTSTATUS 0xC0000001
"{Operation Failed} The requested operation was unsuccessful."
Debuggee initialization failed, NTSTATUS 0xC0000001
"{Operation Failed} The requested operation was unsuccessful."
I tried following the instructions here but it appears this is only possible if I connect a second machine to my computer to debug it remotely.
I finally found the solution, using this article:
http://blogs.technet.com/b/markrussinovich/archive/2005/08/17/unkillable-processes.aspx
This required installing the Windows Debugging Tools for Windows (nice name) and LiveKd, but by following the steps outlined I was able to track which driver was causing the process to hang: it turned out to be the 64-bit driver for the M-Audio Oxygen 8 V2 controller I'm using. Unfortunately no driver update is available.
Anyway, if anyone encounters a similar problem, this is the way to solve it.
Have you tried Process Explorer by Mark Russinovich? It is really useful for "killing":)
If you have error reporting enabled, it's possible that werfault.exe has Cubase open to write a minidump for crash reporting purposes.
This is just a stab in the dark but it might be your problem.
One thing you can try is to check with Process Monitor what Cubase is doing. Set a filter so that everything with a process name containing "cubase" will be recorded. It could be that you are facing some timeout issue when Cubase wants to exit.
you can end the process the service is running under. You can find this process by going to the Services tab of the Task Manager, right-clicking, and selecting Go To Process(you need to click the Show processes from all users button.). Note that one process may host multiple services (especially if it's svchost.exe), and ending the process will kill all those services. Also, this is an unclean exit, and may cause data corruption depending on what the service(s) was doing when you killed it.
Depending on which specific service you are trying to stop, there may be a cleaner way to simulate failure.
I have an application that is hosting some unstable third-party code which I can't control in an external process to protect my main application from nasty errors it exhibits. My parent process is monitoring the other process and doing "the right thing (tm)" when it fails.
The problem that I have is that Dr. Watson is still detecting crashes in the isolated process and attaching to the processes on the way down to take a crash dump. This has the two problems of:
1. Dramatically slowing down the time that it takes for me to detect a failure because the process stays alive while the crash dump is being taken.
2. Showing annoying popups to the user asking if they want to submit the error reports to Microsoft.
Clearly I would prefer to fix the bugs in the child process, but given that it isn't an option, I would like to be able to selectively disable Dr. Watson (and Windows Error Reporting in Vista+) for that process.
I am running some of my own code in the process before handing off to the untrusted bit, so if there is an API that I can call that affects the current process that would be fine.
I am aware of: http://support.microsoft.com/default.aspx/kb/188296 which would disable Dr. Watson for the entire machine. I don't want to do that because it would make me a bad citizen to trash a machine-wide setting.
I am also aware of the WerSetFlags option in Vista+ that would seem to disable windows error reporting for the current process, but I need something that will disable Dr.Watson on earlier OS versions.
The good doctor is invoked when a process does not handle a certain exception. Therefore, the common way to go would be to handle all exceptions yourself. In your case, it is much harder since you don't own the crashing process code. What you can do then, is to inject your code into the other process at runtime, and install an exception handler that will swallow the exception causing the crash. When caught, gracefully shut down the process.
There are quite a few questions here talking about injecting code into another process. As for the crash handler, you can either set an unhandled exception filter, or add a vectored exception handler. Note that for the latter, you'll have to be careful not to swallow legit exceptions that are in fact handled inside the other process, namely find a way to recognize the crashing exception and make sure it is the only one you handle.
You want to disable the GPF popup: http://blogs.msdn.com/oldnewthing/archive/2004/07/27/198410.aspx