How do I resolve a process hanging on CoUninitialize()? - windows

I have a native Visual C++ NT service. When the service is started its thread calls CoInitialize() which attaches the thread to an STA - the service thread uses MSXML through COM interfaces.
When the service receives SERVICE_CONTROL_STOP it posts a message to the message queue; later that message is retrieved and the OnStop() handler is invoked. The handler cleans up and calls CoUninitialize(). Most of the time this works all right, but once in a while the latter call hangs. I can't reproduce this behavior reliably.
I googled for a while and found the following likely explanations:
- failing to release all COM objects owned
- repeatedly calling CoInitializeEx()/CoUninitialize() for attaching to the MTA
- failing to dispatch messages in STA threads
The first one is unlikely - the code using MSXML is well tested and analyzed and it uses smart pointers to control object lifetimes, so leaking objects is really unlikely.
The second one doesn't look like the likely reason. I attach to STA and don't call those functions repeatedly.
The third one looks more or less likely. While the thread is processing the message it doesn't run the message loop anymore - it is inside the loop already. I suppose this might be the reason.
Is the latter a likely reason for this problem? What other reasons should I consider? How do I resolve this problem easily?

Don't do anything of consequence in the thread handling SCM messages. It runs in a weird magical context, and you must answer the SCM's requests as fast as possible without taking any blocking action. Tell it you need additional time by reporting SERVICE_STOP_PENDING, queue another thread to do the real cleanup, then immediately complete the SCM message.
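As a rough illustration of that hand-off, here is a portable C++ model rather than real service code. StopCoordinator, report and cleanup are hypothetical names introduced for this sketch. In an actual service the report callback would presumably wrap SetServiceStatus with SERVICE_STOP_PENDING and later SERVICE_STOPPED. The only point is that the control handler returns at once while another thread does the slow cleanup.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <thread>
#include <utility>

// Hypothetical model of a service stop hand-off; not part of any Windows API.
class StopCoordinator {
public:
    using Report = std::function<void(const std::string&)>;
    using Cleanup = std::function<void()>;

    StopCoordinator(Report report, Cleanup cleanup)
        : report_(std::move(report)), cleanup_(std::move(cleanup)) {
        assert(report_ && cleanup_);  // both callbacks are required
    }

    ~StopCoordinator() { join(); }

    // Models the control handler: report "pending", start the cleanup
    // thread and return immediately. Assumed to be called from one thread only.
    void onStopControl() {
        if (started_) return;  // a stop is already in progress
        started_ = true;
        report_("STOP_PENDING");
        worker_ = std::thread([this] {
            cleanup_();          // may block for as long as it needs
            report_("STOPPED");  // final status, reported by the cleanup thread
        });
    }

    // Waits for the cleanup thread, for example at the end of ServiceMain.
    void join() {
        if (worker_.joinable()) worker_.join();
    }

private:
    Report report_;
    Cleanup cleanup_;
    std::thread worker_;
    bool started_ = false;
};
```

With this shape, anything that can block (COM teardown, RPC unregistration) belongs inside the cleanup callback and never inside onStopControl().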
As to the CoUninitialize, just attach WinDbg and dump all the threads - deadlocks are easy to diagnose (maybe not to fix!), you've got all of the parties to the crime right there in the stacks.

After very careful analysis and using the Visual Studio debugger (thanks to user Pall Betts for pointing out that getting evidence is important) to inspect all active threads, I discovered that the process hangs not in CoUninitialize() but in the RpcServerUnregisterIf() function, which our code calls right before CoUninitialize(). Here's a sequence diagram:
WorkerThread                              RpcThread                   OuterWorld
|----| Post "stop service" message        |                           |
|<---|                                    | SomeRpcServerMethod()     |
|      Post "process rpc request"         |<--------------------------|
|<----------------------------------------| waits
|                                         |----|Wait until
|----| Process "stop service" message     |    |request is processed
|<---| (call OnStop())                    |    |by the worker thread
|                                         |    |
|----| RpcServerUnregisterIf()            |    |
|X<--| Wait all rpc requests complete     |X<--|
|                                         |
An inbound RPC request arrives and the RPC runtime spawns a thread to service it. The request handler queues the request to the worker thread and waits.
Now the moon phase happens to be just right, so RpcServerUnregisterIf() executes in parallel with the handler on the RPC thread. RpcServerUnregisterIf() waits for all inbound RPC requests to complete, while the RPC handler waits for the worker thread to process the request. That's a plain old deadlock.
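One possible way out of a deadlock like this is to make the handler's wait cancellable, so that OnStop() can release pending handlers before it calls RpcServerUnregisterIf(). The sketch below is a portable C++ model under that assumption. PendingRequest and its methods are hypothetical names, not taken from the original code, and how the RPC method reports the cancellation back to its caller is left open.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical per-request object shared by the RPC handler and the worker thread.
class PendingRequest {
public:
    // Worker thread: the queued request has been processed.
    void markProcessed() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!processed_);  // a request is expected to complete only once
            processed_ = true;
        }
        changed_.notify_all();
    }

    // OnStop(): give up on this request so its handler can return.
    void cancel() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            cancelled_ = true;
        }
        changed_.notify_all();
    }

    // RPC handler thread: blocks until the request is processed or cancelled.
    // Returns true if it was processed, false if the service is stopping.
    bool wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        changed_.wait(lock, [this] { return processed_ || cancelled_; });
        return processed_;
    }

private:
    std::mutex mutex_;
    std::condition_variable changed_;
    bool processed_ = false;
    bool cancelled_ = false;
};
```

In this model OnStop() would call cancel() on every outstanding request first, so no handler is still blocked on the worker thread by the time the unregister call starts waiting for handlers to finish.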

Related

How can a synchronous WinHttp request be cancelled?

My service has a thread that may potentially be executing a WinHttpSendRequest when someone tries to stop my service.
The WinHttpCloseHandle documentation says:
An application can terminate an in-progress synchronous or asynchronous request by closing the HINTERNET request handle using WinHttpCloseHandle
But, then later on the same documentation seems to contradict this. It says:
An application should never WinHttpCloseHandle call on a synchronous request. This can create a race condition.
I've found this blog post that seems to agree I can't call WinHttpCloseHandle.
I'm wondering how I can cancel this operation so that my service can be stopped gracefully. I can't really wait for WinHttpSendRequest to time out naturally because that takes too long and my service doesn't stop quickly enough. I think Windows reports this as an error and then forcefully kills the service during shutdown.
Any ideas would be appreciated.
Calling WinHttpCloseHandle from a background thread to force the handle closed is perhaps not the best solution. Still, it works: the original caller receives something like a "bad handle" error code and the request is forcefully terminated.
It would possibly be unsafe to abuse this power, and one would rather implement async requests instead. However, for shutting down a stale service it is going to work fine.

Catching child process exceptions on windows

I'm developing a multi-platform C++ fuzzing application. The app spawns a child process and checks whether it stopped unexpectedly. I've already managed to do this on Linux; however, the Windows exception handling mechanism is making things hard for me.
My code right now does the following:
- Call CreateProcess to spawn the process.
- WaitForSingleObject to wait for it to terminate.
- Then call GetExitCodeProcess and check if the exit code corresponds to an exception.
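The last step above can be sketched as a small, portable check on the value returned by GetExitCodeProcess. When a process dies from an unhandled exception, its exit code is the exception code, for example 0xC0000005 for an access violation. The helper names below are hypothetical, and the test is only a heuristic, since a program can also pass such a value to ExitProcess() deliberately.

```cpp
#include <cassert>
#include <cstdint>

// Well-known exception codes, with the values used by the Windows SDK headers.
constexpr std::uint32_t kAccessViolation = 0xC0000005u;  // STATUS_ACCESS_VIOLATION
constexpr std::uint32_t kIllegalInstr    = 0xC000001Du;  // STATUS_ILLEGAL_INSTRUCTION
constexpr std::uint32_t kIntDivideByZero = 0xC0000094u;  // STATUS_INTEGER_DIVIDE_BY_ZERO
constexpr std::uint32_t kStackOverflow   = 0xC00000FDu;  // STATUS_STACK_OVERFLOW

// Heuristic: NTSTATUS values with both top bits set have "error" severity,
// which is what an unhandled exception leaves behind as the exit code.
inline bool looksLikeCrash(std::uint32_t exitCode) {
    return (exitCode >> 30) == 3u;
}

// Hypothetical helper: names a few common codes for the fuzzer's log.
inline const char* crashName(std::uint32_t exitCode) {
    assert(looksLikeCrash(exitCode));  // callers filter with looksLikeCrash() first
    switch (exitCode) {
        case kAccessViolation: return "access violation";
        case kIllegalInstr:    return "illegal instruction";
        case kIntDivideByZero: return "integer divide by zero";
        case kStackOverflow:   return "stack overflow";
        default:               return "other exception";
    }
}
```

A fuzzer loop would cast the DWORD from GetExitCodeProcess to std::uint32_t, call looksLikeCrash() on it, and save the offending input together with crashName().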
Everything works as it should: I've tested it with a null-dereferencing test application, and I can detect the exception gracefully. However, each time I test this, a Windows error message box pops up asking me to Send or Not Send the error report. Since the fuzzer is supposed to be an automatic testing application, I need to somehow disable this notification, so that even if an exception is caught, the fuzzer can continue testing.
I've already tried installing an SEH handler, but had no luck (apparently these handlers aren't inherited by child processes). I've read something about using vectored exception handling, but I suppose it would be the same; I believe vectored handlers aren't inherited either.
Could anybody help me with this problem? I don't know what to search for; I've already googled a lot and haven't found anything.
Thanks!
Debug API is one option. Here is a starting point in MSDN.
Following on frast's answer, you can spawn the process as a child of a process with a suitable SetErrorMode. This (inheritable) setting determines which errors will result in dialogs popping out - I found your question while trying to achieve the exact same thing for an automated testing application.
To avoid any error dialogs, use
SetErrorMode(
SEM_FAILCRITICALERRORS
| SEM_NOALIGNMENTFAULTEXCEPT
| SEM_NOGPFAULTERRORBOX
| SEM_NOOPENFILEERRORBOX);
Injection is probably overkill - better to use a wrapper process.
Try to inject the following code into your child process:
SetErrorMode(SEM_NOGPFAULTERRORBOX);
Look up the details of SetErrorMode in MSDN.
Read about injection technique here:
Injective Code inside Import Table

Intermittent issues with Win32 named events

We are experiencing intermittent issues with named events when processes run in different user contexts: WaitForSingleObject (and WaitForMultipleObjects too) on such an event handle fails with WAIT_FAILED (GetLastError returns 6, "Invalid handle value").
We have an application that schedules tasks on Windows machines under user accounts, and the issue happens after some tasks are completed.
The service part of the application (JobManager) starts an executable (JobLeader) under a user account (CreateProcessAsUser) to run the user task, and waits for a named event to be signaled.
A manual-reset named event is created by JobLeader in the "Global\" namespace and signaled when the user task is completed.
JobManager waits in a loop, calling WFMO (WaitForMultipleObjects) with a timeout of 10 seconds, to see whether the named event or the JobLeader process handle is signaled.
Periodically the named event handle, opened by JobManager through the OpenEvent API call, causes WFMO to return WAIT_FAILED with error code 6, "Invalid handle value" (WFSO is called afterwards to identify which handle is broken).
After reopening the event, the error may go away, or it may not: WFMO may again return WAIT_FAILED because of an invalid handle value.
Interestingly, a few dozen tasks may pass without this error, and then several tasks in a row hit it. The tasks used for testing are identical: just a cmd.exe script dumping the environment.
Anyone have ideas about this?
Regards,
Alex
Do you create the event in your JobManager and then open it in the 'JobLeader'? If not, how do you communicate the event handle (and/or name) between the two processes?
My gut tells me it's a race condition...

How can my app find the sender of a windows message?

I have an app which uses a keyboard hook procedure in a library. The wParam in the hook for one message is 255 which we think is "(reserved / OEMClear)". I'd like to work out the source of this message as it causes my application to crash in the library, and given it shouldn't be happening it would be good to identify it. The message comes in repeatedly on only one PC we have - other computers don't see the message at all.
So, is there a way to trace the source of a message sent to a window please, or all those on the system?
There is no built-in way to find out who sent the window message, not even win32k keeps track of this; you might be able to find it out with a kernel debugger and a conditional breakpoint.
However, I would argue that you don't really need this information; you need to make your app properly handle any message sent to it.
I came up with a technique for determining who is sending a win32 window message across threads/processes during one-off debugging/troubleshooting sessions. It requires making a couple of assumptions so it's not 100% reliable, but so far I haven't found a case where it didn't work.
The basic idea is to exploit the fact that, when the message arrives, the recipient window thread is typically blocked waiting in its message loop (specifically, GetMessage()). When the message is delivered, the sending thread readies the receiving thread, pulling it out of its wait state.
It turns out that Windows provides ways to precisely trace which threads are readying which other threads, using Event Tracing for Windows. Using this feature, it is often possible to determine which thread sent the message - it's the thread that readied the receiving thread. It's even possible to see what the call stack of the sending thread was at the time it sent the message, and even the kernel side (win32k) part of the stack!
The basic procedure goes like this:
1. Use the Windows Performance Recorder to start a trace. Make sure to include the "CPU usage" profile.
2. Trigger the sending of the message you are interested in.
3. Stop the trace.
4. Open the trace in the Windows Performance Analyzer.
5. In the "CPU Usage (Precise)" graph, "Stacks" graph preset, zoom in on the time the message was received.
   - One way is to locate the receiving thread and determine when it woke up.
   - If correlation is difficult, it might be worth instrumenting the receiving thread using e.g. TraceLogging to produce a clear reference time point.
6. You should be able to find a context switch event where the receiving thread is readied in GetMessage.
7. The "Readying Process", "Readying Thread Id" and "Readying Thread Stack" columns will then show the details of the readying thread, which is likely to be the sender of the message.
For example, in the below screenshot, TID 7640 receives a shell hook message originating from WindowsTerminal.exe, TID 1104:
(I originally suggested using Spy++ or winspector, but they do not hook into the sending of messages. That doesn't even make sense! A window receives messages but they don't send them, a thread does that. I'll leave my suggestion about using a debugger.)
Sometimes debugging can help. Try downloading the Windows PDB files and setting a breakpoint that hits only when one of these messages occurs. Looking at the call stack at that point can often shed some light on why things are happening. Posted messages and messages sent from other processes will foil this approach.
I'm not sure if this does what you want, but have a look at Process Monitor by Sysinternals.
http://technet.microsoft.com/en-us/sysinternals/bb896645.aspx
It shows everything that happens to a process, so I assume it catches messages as well. The site was down at the time of writing, so I couldn't check.

Does Application.ApplicationExit event work to be notified of exit in non-Winforms apps?

Our code library needs to be notified when the application is exiting. So we have subscribed to the System.Windows.Forms.Application.ApplicationExit event. This works nicely for Winforms apps, but does it also work for other types of applications such as console apps, services, and web apps (such as ASP.NET)? The namespace would suggest that it doesn't, and it presumably gets raised when Application.Exit() is called (explicitly or implicitly), which may not be correct to call for these other cases.
Is there some other event which would be better in these other cases or which would be more universal (great if it works for Winforms, too)? For example, is there an event for when Environment.Exit() is called (console app)?
I found a mention of an Exited event in System.Diagnostics.Process, but this appears to be for monitoring the exit of another process, and it does not appear to be received by a process about itself (for example, Process.GetCurrentProcess().Exited += Process_Exited; Process.GetCurrentProcess().EnableRaisingEvents = true;). I would think it might only be raised after the process has actually exited, so that wouldn't work.
This is particularly for .NET 2.0 and C#.
We finally found more about this (but by then my machine had been rebuilt and lost the cookies to my unregistered profile here; hopefully, it will let me post this answer).
Further investigation eventually found a few more events which we have found helpful:
System.Windows.Forms.Application.ThreadExit - Fires when a message loop exits
System.Windows.Forms.Application.ApplicationExit - Fires when all message loops exit
System.AppDomain.CurrentDomain.DomainUnload - Fires when a domain other than the default exits
System.AppDomain.CurrentDomain.ProcessExit - Fires when the default app domain exits
System.AppDomain.CurrentDomain.UnhandledException - Fires when an uncaught exception occurs, ending the app.
Only one of the DomainUnload or ProcessExit events is possible for a given app domain, depending on whether it is the default (top-level) domain for the process or was created as a subdomain (e.g. on a web server). If an application doesn't know which it might be (as in our case), it needs to subscribe to both if it wants to catch the actual unload for itself. Also, it appears that UnhandledException (which as of .NET 2.0 is always fatal) may prevent the other two events, so that may be a third case to handle. These three events should work for any .NET application.
There is a caveat that the execution time for ProcessExit is bounded (the .NET Framework default is two seconds in total for all ProcessExit handlers), so it may not be possible to do extensive "final" work in that event handler. It needs to be something which can be done quickly.
The Application events only apply to WinForms applications (we suspect they may not apply in pure WPF applications, however). The naming can be misleading because they are named for their most basic normal usage which has certain assumptions. ThreadExit does not relate to the actual System.Threading.Thread but rather to the message loop (Application.Run()) of a UI thread, and ApplicationExit similarly relates to the collection of application Forms on one or more UI threads. Normally, once the call to Application.Run() returns, called from the entry method of a thread, the entry method quickly concludes and the thread itself then ends. And once all UI threads have exited, a WinForms app is usually all done and exits.
Another event of note is the System.Windows.Forms.Application.ThreadException event. A Windows message loop can be configured to catch exceptions which occur in handling a message and send this event rather than let them be uncaught (and thus fatal) exceptions. Catching these exceptions allows the message loop (and that UI thread) to continue running (after aborting the current message handler). There can be only one subscriber to this event at any time for a given thread (subscriptions overwrite any previous subscriber), and it must be configured before any Form is created and subscribed before entering the message loop. See the MSDN help for this event and System.Windows.Forms.Application.SetUnhandledExceptionMode() for more info.

Resources