Detect UI operation which will "hang" the application if running in service mode - user-interface

Fellow experts!
I have faced the following dilemma: some of our tools (executables) are started as scheduled tasks, some are started as services and others as usual desktop apps with interactive Windows user. We are using the code sharing strategy for source management (this is not debatable for this question).
So the solution I want to find is the following:
Detect UI operation at run-time which leads to hanging service/background task (such as say call to Application.ShowException, ShowMessage, MessageDialog, TForm.Show etc.). And when such an action detected I want to raise the exception instead. Then the operation will fail, we will have stack trace etc. but the process will not hang up! The most problematic hang up is when some event processing is done in transaction and then in some of the code used to process event suddenly (because of error in code, design, whatever) there is UI code executed then the process hangs and the DB parts can be locked!
What I think I need to do is: Use DDetours library to intercept WinAPI calls to a certain routines and raise exception instead (so that the process does not hang, but just fail in some method). Also I know that the creation of forms and windows does not hang the app, but only the tries to show them to the user.
Is there some known method of handling this problem? Or maybe there is some list of WinAPI routine set which hangs in service mode?
Thank you in advance.

Related

Synchronous VB6 apparently behaving asynchronously! Crash

We have a legacy VB6 app which has started, from time to time, hangs. We thought it may be to do with a shift to Citrix, but can now replicate the behaviour on a thick client on Win10. We don't think that we have seen this before on earlier Windows versions, but are still checking logs to confirm that.
We experience the behaviour when tabbing into a text box and then tabbing out. As we pass through it, we are making a simple ado call to lookup/validate some data in a text box. As part of the correct program running we are logging
“Opening Dataset: SELECT ... FROM ... ”
“Opened Dataset”
Between these 2 log statements is simple ado data retrieval code with which we have had no problems previously. It is in an ActiveX dll and is running synchronously. Most importantly is that between these 2 log statements there is no DoEvents or api call which would yield control. As far as we can see, it should be a purely synchronous operation.
When the system crashes, which happens sporadically, we can see other logging statements appear between these 2 which can be either resource status (e.g. how much memory, gdi/user objects - which would usually be found because a timer has triggered in the main form) or focus type events - which aren’t timer driven - at least in our codebase.
“Opening Dataset: SELECT ... FROM ... ”
“Resource Status: ...”
“Opened Dataset”```
or
“Opening Dataset: SELECT ... FROM ... ”
“TextItem.OnLostFocus Item1 ...”
“TextItem.Validate ...”
“TextItem.OnGotFocus Item2 ...
“Opened Dataset”
So my initial question is, in what scenario can what should be a synchronous operation be interrupted and appear to act asynchronously.
For example, and we aren’t doing this, I could imagine writing some unsafe code whereby by using a multimedia timer (on another thread) and supplying an AddressOf parameter to the address of a function on one of our modules, that that timer initiates execution of our code, separate to the correct control flow. Other than something like that, I just can’t see how synchronous vb6 code could be interrupted in this way.
I’d be really grateful of any thoughts, suggestions or advice. I’m really sorry if this is soo vague. It perhaps reflects how I’m struggling to get my head round this problem.
Just to say, we tracked this down to Windows 10 plus an old (out of support) socket component we are using. It looks like it is pumping the message queue "at the wrong time" and hence we are seeing UI events appear in the middle of a synchronous process. We don't see this behaviour on earlier Windows versions.
I don't know what may have changed in Win10 which would result in this, but we obviously need to upgrade.
In our case we had a few long running timers to pull status/changes from the DB which caused this. We are using ADO with SQL Native Client and MARS, which worked great up until Windows 10 where intermittent lock ups occurred. Logging and Windbg confirmed this was happening when 2 requests where hitting the ADO connection at the same time. The error from ADO was "Unable to open a logical session" error number -2147467259, and actually caused SQL Server 2014 (running on another machine) to block all other client queries from multiple different applications and machines until the locked up app was killed. I could not replicate this in the IDE as apparently that forces timers to work the way they always did. The fix was to async our ADO implementation and put a connection manager on top of the SQL connections to force requestors to wait their turn (basically taking the Win10 async'd timer feature back out). My only performance impact was the additional few milliseconds of delaying the timer fired SQL query when it collided with a another query.

Why not launch external crash dump handler at the time the application crashes?

I am in the process of designing a crash handler solution for one of our applications that creates a crash dump file using the MiniDumpWriteDump() function. While reading up on the topic I have seen the recommendations to invoke MiniDumpWriteDump() from an external process to maximize the chance that the dump file contains the correct information. The common solution seems to be to run a watchdog process in parallel to the application process. When the application crashes it somehow contacts the watchdog process, providing it with the information that is required to create the crash dump. Then the application goes to sleep until it is terminated by the watchdog process.
I can imagine such a watchdog process being run continually as a background service. This has many implications, starting with "who creates the service?", but also "which user does the service run as?", and "how does the application contact the service?" etc. It seems a pretty heavy-weight solution which I don't feel is appropriate for the scope of my task.
A simpler approach is suggested by this SO answer: Launch a guard process on application startup that is tightly coupled to the application process. This is pretty good, but it still leaves me with the tasks of 1) keeping the information somewhere in the application how I can contact the guard process in case of a crash; and 2) making sure to terminate the guard process if the application process shuts down normally.
The simplest solution of all would be to launch the crash dump handler process at the time the crash occurs, passing all the information that is required to create the crash dump as arguments to the process. This information consists of
The process ID of the application process that crashed
The thread ID of the thread that crashed
The adress of the EXCEPTION_POINTERS structure that describes the exception that caused the crash
This "fire and forget" approach is compelling because it does not require any state retention, nor any complicated over-time process management. In fact, the approach seems so overwhelmingly simple that I cannot help but feel that I am overlooking something.
What are the arguments against such an approach?
The main argument against the "fire and forget" approach, as I called it, is that it is not safe to launch a new process at a time when the application is already in a state where it is about to crash.
Because of that I went for the "guard process" approach. It brings a number of challenges with it, for which Hans Passant has outlined a solution.
I also added a bit of code in this answer that should help with deep-copying the all-important EXCEPTION_POINTERS data structure.
Using WER, as proposed in the comments, also looks like a good alternative to writing your own guard process. I must admit I have not investigated this any further, though.

What process API do I need to hook to track services?

I need to track to a log when a service or application in Windows is started, stopped, and whether it exits successfully or with an error code.
I understand that many services do not log their own start and stop times, or if they exit correctly, so it seems the way to go would have to be inserting a hook into the API that will catch when services/applications request a process space and relinquish it.
My question is what function do I need to hook in order to accomplish this, and is it even possible? I need it to work on Windows XP and 7, both 64-bit.
I think your best bet is to use a device driver. See PsSetCreateProcessNotifyRoutine.
Windows Vista has NotifyServiceStatusChange(), but only for single services. On earlier versions, it's not possible other than polling for changes or watching the event log.
If you're looking for a user-space solution, EnumProcesses() will return a current list. But it won't signal you with changes, you'd have to continually poll it and act on the differences.
If you're watching for a specific application or set of applications, consider assigning them to Job Objects, which are all about allowing you to place limits on processes and manage them externally. I think you could even associate Explorer with a job object, then all tasks launched by the user would be associated with your job object automatically. Something to look into, perhaps.

Catching child process exceptions on windows

i'm developing a multi-platform C++ fuzzing application. The app spawns a child process and checks whether it stopped unexpectedly. I've already managed to do this on linux, however, windows exception handling mechanism is making things hard for me.
My code right now does the following:
- Call CreateProcess to spawn the process.
- WaitForSingleObject to wait for it to terminate.
- Then call GetExitCodeProcess and check if the exit code corresponds to an exception.
Everything works as it should, i've tested it with a null dereferencing test application, and i can catch the exception gracefully. However, each time i test this, a Windows error message box spawns telling me to Send or Not Send the error report. Since the fuzzer is supposed to be an automatic testing application, i'd need to somehow disable this notification, so that even if an exception is caught, the fuzzer can continue testing.
I've already tried installing a SEH handler, but had no luck(apparently these handlers aren't inherited by child processes). I've read something about using vectored exception handling, but suppose it would be the same, i believe vector handlers aren't inherited.
Could anybody help me with this problem? I don't know what to search for, i've already googled a lot and haven't found anyhing.
Thanks!
Debug API is one option. Here is a starting point in MSDN.
Following on frast's answer, you can spawn the process as a child of a process with a suitable SetErrorMode. This (inheritable) setting determines which errors will result in dialogs popping out - I found your question while trying to achieve the exact same thing for an automated testing application.
To avoid any error dialogs, use
SetErrorMode(
SEM_FAILCRITICALERRORS
| SEM_NOALIGNMENTFAULTEXCEPT
| SEM_NOGPFAULTERRORBOX
| SEM_NOOPENFILEERRORBOX);
Injection is probably overkill - better to use a wrapper process.
Try to inject the following code into your child process:
SetErrorMode(SEM_NOGPFAULTERRORBOX);
Lookup the details of SetErrorMode in MSDN.
Read about injection technique here:
Injective Code inside Import Table

Does Application.ApplicationExit event work to be notified of exit in non-Winforms apps?

Our code library needs to be notified when the application is exiting. So we have subscribed to the System.Window.Forms.Application.ApplicationExit event. This works nicely for Winforms apps, but does it also work for other types of applications such as console apps, services, and web apps (such as ASP.NET)? The namespace would suggest that it doesn't, and it presumably gets raised when Application.Exit() is called (explicitly or implictly), which may not be correct to call for these other cases.
Is there some other event which would be better in these other cases or which would be more universal (great if it works for Winforms, too)? For example, is there an event for when Environment.Exit() is called (console app)?
I found a mention of an Exited event in System.Diagnostic.Process, but this appears to be for monitoring the exit of another process, and it does not appear to be received by a process about itself (for example, Process.GetCurrentProcess().Exited += Process_Exited; Process.GetCurrentProcess().EnableRaisingEvents = true;). I would think it might only be raised after the process has actually exited, so that wouldn't work.
This is particularly for .NET 2.0 and C#.
We finally found more about this (but by then my machine had been rebuilt and lost the cookies to my unregistered profile here; hopefully, it will let met post this answer).
Further investigation eventually found a few more events which we have found helpful:
System.Windows.Forms.Application.ThreadExit - Fires when a message loop exits
System.Windows.Forms.Application.ApplicationExit - Fires when all message loops exit
System.AppDomain.CurrentDomain.DomainUnload - Fires when a domain other than the default exits
System.AppDomain.CurrentDomain.ProcessExit - Fires when the default app domain exits
System.AppDomain.CurrentDomain.UnhandledException - Fires when an uncaught exception occurs, ending the app.
Only one of the DomainUnload or ProcessExit events are possible for a given app domain, depending on whether it is the default (top-level) domain for the process or was created as a subdomain (eg. on a web server). If an application doesn't know which it might be (as in our case), it needs to subscribe to both if it wants to catch the actual unload for itself. Also, it appears that UnhandledException (which as of .NET2.0 is always fatal) may prevent the other two events, so that may be a third case to handle. These three events should work for any .NET application.
There is a caveat that the execution time for ProcessExit is bounded (about 4 seconds?), so it may not be possible to do extensive "final" work in that event handler. It needs to be something which can be done quickly.
The Application events only apply to WinForms applications (we suspect they may not apply in pure WPF applications, however). The naming can be misleading because they are named for their most basic normal usage which has certain assumptions. ThreadExit does not relate to the actual System.Threading.Thread but rather to the message loop (Application.Run())) of a UI thread, and ApplicationExit similarly relates to the collection of application Forms on one or more UI threads. Normally, once the call to Application.Run() returns, called from the entry method of a thread, the entry method quickly concludes and the thread itself then ends. And once all UI threads have exited, a WinForms app is usually all done and exits.
Another event of note is the System.Windows.Forms.Application.ThreadException event. A Windows message loop can be configured to catch exceptions which occur in handling a message and send this event rather than let them be uncaught (and thus fatal) exceptions. Catching these exceptions allows the message loop (and that UI thread) to continue running (after aborting the current message handler). There can be only one subscriber to this event at any time for a given thread (subscriptions overwrite any previous subscriber), and it must be configured before any Form is created and subscribed before entering the message loop. See the MSDN help for this event and System.Windows.Forms.Applicaton.SetUnhandledExceptionMode() for more info.

Resources