Diagnosing RegisterWindowMessage leak - winapi

We are seeing atom pool resource exhaustion on production servers of one of our applications.
Using the fantastic AtomTableMonitor tool, we've isolated the issue to the creation of a huge number of atoms by the RegisterWindowMessage call. They all have names like this:
ControlOfs030D000000000270
where the number at the end changes.
My question is: How do we figure out which process is creating these atoms?
Some potential resources:
https://blogs.msdn.microsoft.com/ntdebugging/2012/01/31/identifying-global-atom-table-leaks/

Atoms that begin with "ControlOfs..." are created by Borland/Embarcadero's VCL (Visual Component Library) framework in Delphi/C++Builder. These atoms are actually in the form of "ControlOfs<HInstance><ThreadID>", where <HInstance> and <ThreadID> are in hex format (so, in your case, HInstance = 0x030D0000 = 51183616, ThreadID = 0x00000270 = 624).
There is also another atom name that is created by the VCL, in the form of "Delphi<ProcessID>", where <ProcessID> is in hex format.
This means that every instance of an app that uses the VCL creates a new unique "Delphi..." atom, and its main UI thread creates a new unique "ControlOfs..." atom (these atoms are used to store TWinControl object pointers in VCL-created HWNDs via SetProp(), for use by the VCL's FindControl() and IsDelphiHandle() utility functions). Both atoms are registered with GlobalAddAtom() at app startup, and unregistered at app shutdown with GlobalDeleteAtom(), so there is no leak.
However, in Delphi/C++Builder 6 all the way up to RAD Studio XE2, there is yet another atom that uses the same "ControlOfs..." name. This atom is created with RegisterWindowMessage() (for a private RM_GetObjectInstance window message), and a registered window message cannot be unregistered. So, every time an affected VCL app is run, this unique atom is created and subsequently leaked.
This was eventually fixed by Embarcadero in RAD Studio XE3 in 2012 (Andreas Hausladen posted a patch for earlier VCL versions). But existing apps that were compiled with older versions of the VCL are still affected, and there is nothing you can do to stop them from leaking without patching them to use a static name with RegisterWindowMessage().
So, to answer your question, using a combination of AtomTableMonitor and Task Manager, you should be able to figure out which apps you are running are VCL apps, and then you can check them individually for leaking atoms. Or, use SysInternals Process Monitor with a Thread Create filter to get a list of thread IDs and their creating processes over time, then you can match up those thread IDs to the leaked atom names.
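To make the matching step concrete, here is a minimal Python sketch that splits an atom name into its two hex fields and looks the thread ID up in a table of thread owners. It assumes the name is always "ControlOfs" plus 8 hex digits of HInstance and 8 hex digits of ThreadID, as in the example from the question. The thread_owner dictionary is hypothetical: you would have to build it yourself from whatever thread-creation log you collect. Keep in mind that Windows reuses thread IDs, so a match is a hint rather than proof.

```python
import re

# Assumed layout, taken from the answer above: "ControlOfs" followed by
# 8 hex digits of HInstance and 8 hex digits of ThreadID.
ATOM_RE = re.compile(r"^ControlOfs([0-9A-Fa-f]{8})([0-9A-Fa-f]{8})$")


def parse_controlofs(atom_name):
    """Return (hinstance, thread_id) as ints, or None if the name does not match."""
    m = ATOM_RE.match(atom_name)
    if m is None:
        return None
    return int(m.group(1), 16), int(m.group(2), 16)


def match_atoms_to_processes(atom_names, thread_owner):
    """Map each leaked atom name to the process that owned its thread ID.

    thread_owner is a hypothetical dict {thread_id: process_name} built from
    a thread-creation log that you collected separately.
    """
    result = {}
    for name in atom_names:
        parsed = parse_controlofs(name)
        if parsed is None:
            continue  # not a ControlOfs atom, skip it
        _, tid = parsed
        result[name] = thread_owner.get(tid, "<unknown>")
    return result


if __name__ == "__main__":
    print(parse_controlofs("ControlOfs030D000000000270"))  # (51183616, 624)
```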

You can use a tool like API Monitor and set it up to track only RegisterWindowMessage. It will show you which process is calling this function, along with a stack trace (though that is probably not very useful without symbols).
Also, a quick Google search for ControlOfs finds https://forums.embarcadero.com/thread.jspa?threadID=47678 which matches your issue. They say it's a bug in VCL. Someone posted this fix pack if you have the code:
http://andy.jgknet.de/blog/bugfix-units/vclfixpack-10/
If you don't have the code, I suggest you look for Delphi/VCL applications on your production servers and try to update them or report the issue to their vendor.

Related

Lotus Notes - CreateMIMEEntity not releasing the control of .NSF file

I am using Interop.Domino to work with .NSF files. To generate the HTML MIME entity I used nnote, but in some cases it failed to generate it, so in those cases I took the RTFTEXT / PLAIN TEXT as output instead.
For that I used CreateMIMEEntity:
NotesMIMEEntity MIMEBody = NoteDocument.CreateMIMEEntity("Body");
It works, but it keeps a hold on the database (.nsf file): the file is marked as being used by another process.
From troubleshooting, it is clear that the statement above is what holds the file.
I have released all the Notes objects associated with it, but the problem remains.
Is there a proper way to use it or release it?
The Notes core DLLs that are underneath the COM classes keep databases open in cache. The only way that I know of to close them is to terminate the process that loaded the DLLs. One option is to design code using the COM API so that it dispatches short-term worker processes to open the database, do the work, and terminate. Yeah, it's ugly and slow, but if you need a long-running service and you're using the COM API instead of the Notes C API, it's the best way.
In any case, the cached open databases should not cause a sharing violation if you are opening the database through the Domino server. If you are using "" instead of the server name when opening the database however, it's going to be a problem -- and you shouldn't even do that in short-running worker processes.
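The short-term worker process idea can be sketched independently of Notes. The Python example below only shows the shape of the pattern: the parent hands one request to a child process, reads the result, and the child exits, which is the point at which any DLLs it loaded are released. The worker body, the JSON request format, and the name run_in_worker are all hypothetical; a real worker would call the Notes COM API where the placeholder comment is.

```python
import json
import subprocess
import sys

# Hypothetical worker body. In a real system this script would open the .nsf
# through the Notes COM API, do its work, and exit so the Notes DLLs unload.
WORKER_SOURCE = r"""
import json, sys
request = json.loads(sys.stdin.read())
# ... open the database and build the MIME entity here (not shown) ...
json.dump({"database": request["database"], "status": "done"}, sys.stdout)
"""


def run_in_worker(database_path, timeout=60):
    """Run one unit of work in a separate, short-lived Python process."""
    completed = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        input=json.dumps({"database": database_path}),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise if the worker exits with a non-zero status
    )
    return json.loads(completed.stdout)
```

The cost is one process start per request, which is the "ugly and slow" part mentioned above; batching several documents into one worker run would reduce it.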

Windows Antimalware Scan Interface thread safety

The Windows Antimalware Scan Interface (AMSI) contains abstractions which can be used to call the currently active virus scanner in Windows:
https://learn.microsoft.com/en-us/windows/desktop/amsi/antimalware-scan-interface-functions
There are two functions related to initialization:
AmsiInitialize
AmsiUninitialize
AmsiInitialize returns "A handle of type HAMSICONTEXT that must be passed to all subsequent calls to the AMSI API.".
After initialization is complete, I can use AmsiScanBuffer to scan a buffer for malware.
My question:
Can I use the same context concurrently from many threads in my application, or do I need to create one per thread from which I'm going to call the methods?
The documentation for AmsiUninitialize says "When the app is finished with the AMSI API it must call AmsiUninitialize." This tells me that the context can be used for many calls, but it doesn't tell me anything about thread safety or concurrency.
Generally, API calls that are not specifically documented as thread-safe should be assumed not to be (this is usually true for any library). The easiest solution is to open an AMSI handle per thread.
(P.S. This only works with Windows Defender, as far as I've tested.)
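The handle-per-thread approach can be sketched with thread-local storage. The Python example below is only an illustration of the pattern, not AMSI code: fake_amsi_initialize is a hypothetical stand-in for AmsiInitialize, and the class name PerThreadContext is made up for this sketch. A real program would also need to call AmsiUninitialize for each context before its thread exits.

```python
import threading


class PerThreadContext:
    """Lazily create one context per thread and hand the same one back on reuse."""

    def __init__(self, factory):
        self._factory = factory          # called once per thread, on first use
        self._local = threading.local()  # storage private to each thread

    def get(self):
        ctx = getattr(self._local, "ctx", None)
        if ctx is None:
            ctx = self._factory()
            self._local.ctx = ctx
        return ctx


# Hypothetical stand-in for AmsiInitialize; a real program would call the
# Windows API here and keep the returned HAMSICONTEXT handle.
def fake_amsi_initialize():
    return object()


contexts = PerThreadContext(fake_amsi_initialize)
```

Each thread calls contexts.get() before scanning; no lock is needed because no context is ever shared between threads.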

Detect the Application which requests "services.exe" to start a service in Windows

As a part of my project, I get an event notification every time a Service is Started or Stopped using the WMI class Win32_Service through an EventSink.
I want to detect the application which had requested "services.exe" to start a particular service.
So far, I have tried monitoring ALPC calls between any process and "services.exe", and I get a Message_ID every time a process sends information to or receives information from "services.exe" using the ALPC class. I would like to know what these messages are so that I can decode a StartService() or StopService() call.
Is there any way to detect which application starts/stops a service?
The best way to do this, in my opinion, would be from kernel-mode using the PsSetCreateProcessNotifyRoutine/Ex/Ex2 kernel-mode callback.
If you're going to be using PsSetCreateProcessNotifyRoutine, you will receive less information than if you were using the Extended version of the kernel-mode callback (the Ex one). However, you can still query information such as the image file path of the parent process (or the one being created) by using PsLookupProcessByProcessId to get a pointer to the _EPROCESS structure and then relying on SeLocateProcessImageName (undocumented, however it is accessible in WDK by default).
The SeLocateProcessImageName routine will rely internally on that _EPROCESS structure, since information like the path of the process image on-disk is all tracked by the Windows kernel there.
If you're going to be using the Ex version of the kernel-mode callback, then you eliminate the need to do what is mentioned above. The Ex version of the routine is more recent than the non-Ex version.
The routine prototype for the callback routine will be:
VOID
CreateProcessNotifyRoutineEx(
    PEPROCESS Process,
    HANDLE ProcessId,
    PPS_CREATE_NOTIFY_INFO CreateInfo  // NULL when the process is exiting
);
As seen above, you get a pointer to the _PS_CREATE_NOTIFY_INFO structure. You can then access the ImageFileName and CommandLine fields to filter for services.exe (make sure you filter properly so that you do not match a rogue copy: check that the full path points to the real one) and gain more insight into why it was being invoked (if such information is exposed via the command line; I cannot remember. Nonetheless, you can still detect its creation and be aware of who spawned it).
To determine the parent who was responsible for the process creation operation of services.exe (e.g. if it relied on the Service Manager which in turn resulted in the spawning of it), you can rely on the ParentProcessId field (under the _PS_CREATE_NOTIFY_INFO structure as well). The SeLocateProcessImageName trick will work perfectly here.
SeLocateProcessImageName is undocumented so here is the routine prototype:
NTSTATUS
NTAPI
SeLocateProcessImageName(
PEPROCESS Process,
PUNICODE_STRING *ImageName
);
At least with the latest Windows 10 WDK, it's already available by default. If you prefer, though, you can import it dynamically with MmGetSystemRoutineAddress.
Resources:
https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/content/ntddk/nf-ntddk-pssetcreateprocessnotifyroutine
https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/content/ntddk/nf-ntddk-pssetcreateprocessnotifyroutineex
https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/content/ntddk/nf-ntddk-pssetcreateprocessnotifyroutineex2
https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/content/wdm/nf-wdm-mmgetsystemroutineaddress
https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/content/ntifs/nf-ntifs-pslookupprocessbyprocessid
https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/content/ntddk/ns-ntddk-_ps_create_notify_info

Enumerate Windows "Power Availability Requests" with undocumented CallNtPowerInformation(GetPowerRequestList..)

Windows 7 introduced "Power Availability Requests". This feature allows applications to notify the OS that they require the display or whole system and therefore power management should be temporarily inhibited. The feature is documented here:
https://download.microsoft.com/download/7/E/7/7E7662CF-CBEA-470B-A97E-CE7CE0D98DC2/AvailabilityRequests.docx
The availability requests feature uses an object model and provides the functions PowerCreateRequest(), PowerSetRequest() and PowerClearRequest() to create requests, activate them and ultimately remove them. This functionality is very similar to the older SetThreadExecutionState() API available in Windows 2000, but it allows multiple requests to be created per thread and improves potential diagnostics by requiring each request to have a reason string.
The OS supplied POWERCFG.EXE utility can enumerate the current outstanding requests using the command:
POWERCFG -REQUESTS
Microsoft do not document how to enumerate requests with Windows API.
The CallNtPowerInformation() function in the SDK has been updated to support a new information level called "GetPowerRequestList". This looks very much like it could be the required API but is not documented.
Does anyone know how to call CallNtPowerInformation(GetPowerRequestList..)?
Jim
Late answer, but I found that it was easier to call this other function instead (since CallNtPowerInformation(GetPowerRequestList, ...) returned a "not supported" error):
PowerInformationWithPrivileges(GetPowerRequestList, 0, 0, bufout, 16384);
Function signature seemed to be the same, and you might have to define it and GetProcAddress from powrprof.dll yourself depending on what libs you have available.
Output format seemed to be a binary blob. If I had to guess, it's a list of int64 values (even in 32-bit apps): the first entry is the number of entries (call it x), the next x entries are offsets into the blob for the real entries, and those are some kind of variable-length blob/struct, probably corresponding to each PowerRequest and/or type of request. This is not complete info, but it should get other people started if they're serious about trying to make this work.
You need admin to call this function (you also need admin to call powercfg /requests, so this isn't too much of a surprise, though perhaps a shortcoming depending on your use case).
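If the guess about the layout is right, splitting the blob into its entries could look like the Python sketch below. Everything about the format here is an assumption carried over from the guess above (little-endian 64-bit count, then 64-bit offsets from the start of the blob, in ascending order); it has not been checked against real output, and the function name split_request_blob is made up for this example. It works on bytes you have already obtained and does not call the Windows API itself.

```python
import struct


def split_request_blob(blob):
    """Split a blob into entries, ASSUMING the guessed layout described above.

    Guessed layout (unverified): a little-endian 64-bit count, then that many
    64-bit offsets measured from the start of the blob, then the entries.
    """
    (count,) = struct.unpack_from("<Q", blob, 0)
    offsets = struct.unpack_from("<%dQ" % count, blob, 8)
    # Each entry is assumed to run until the next offset (or the end of the blob).
    ends = list(offsets[1:]) + [len(blob)]
    return [blob[start:end] for start, end in zip(offsets, ends)]
```

Because the call above passes a fixed 16384-byte buffer, the last entry returned by this sketch would include whatever unused bytes follow it; trim the buffer to the number of bytes actually written if you can determine it.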

Twisted process is huge

A Twisted app I have was constantly getting killed due to memory problems. The program grew in size, consuming all of the system's memory before being shut down by the OS. Restart and repeat.
This is on a virtual server, so I doubled the memory, and the issue resolved: the daemon stabilized at around 1.25GB of memory.
Does anyone have advice on how I can best profile this to tell what is using all the memory, and where?
If info on the app helps: I'm using the Twisted reactor and twisted.application.internet.TimerService to poll a database for items to update through three 'services'. The items to process are pushed into a twisted.internet.defer.DeferredList, and their processing occurs in a deferToThread block. In the deferred process there are a handful of blocking operations (fetching web pages, etc.) and a lot of HTML parsing (Beautiful Soup and other libraries). I've suggested a reactor thread pool size of 10, and each 'service' defers to a thread using a SemaphoreService that has 10 tokens. I really expected this daemon to max out at around 400MB of memory, not 3x that.
This is more of a general description of how I debug memory leak/usage problems in my Twisted applications.
Twisted has SSH server support, which is something I add to almost all of my projects during development.
The SSH server gives me an interactive Python interpreter inside the running process, with the Python garbage collector available and a number of helper functions that let me a) inspect the count of instances of a given class, b) start and stop tracking changes in that count over time, and c) get all references to instances of that class. The nice thing about the interactive interpreter is that it allows ad-hoc introspection of offending instances, their relation to other objects, and the state of the process they are in. So far this has always proven a valuable instrument for pinpointing the exact location where I have forgotten, or did not foresee, a reference that needed releasing.
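Helpers of the kind described in a), b) and c) can be written with nothing but the standard gc module. The sketch below is one possible version, not the exact helpers referred to above; the function names are made up for this example. Note that gc.get_objects() only returns objects tracked by the garbage collector, so plain strings and numbers will not show up in the counts.

```python
import gc
from collections import Counter


def count_instances():
    """Count live gc-tracked objects, keyed by class name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after, top=10):
    """Return the classes whose instance count grew the most between two snapshots."""
    diff = after - before  # Counter subtraction keeps only positive differences
    return diff.most_common(top)


def referrers_of(cls, limit=5):
    """Return (instance, referrers) pairs for up to `limit` exact instances of cls."""
    found = [obj for obj in gc.get_objects() if type(obj) is cls][:limit]
    return [(obj, gc.get_referrers(obj)) for obj in found]
```

A typical session would take one snapshot with count_instances(), let the daemon run for a while, take a second snapshot, and pass both to growth(). Classes that keep climbing are the ones worth feeding to referrers_of(), whose output also includes the helper's own temporary list among the referrers.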
