NetServerEnum creates worker threads that won't close - Windows

While trying to solve a previously asked SO question of mine, I found that the problem occurs even without my own threads.
What I have now is a really simple single-threaded program that calls NetServerEnum(). When that returns, it calls NetApiBufferFree() and returns from main, which is supposed to end the process.
At that point my thread does end, but the process won't exit, because 4 threads (not opened by me) are still running:
1 * ntdll.dll!TplsTimerSet+0x7c0 (stack is at ntdll.dll!WaitForMultipleObjects)
(This one is created by the call to NetServerEnum())
3 * ntdll.dll!RtValidateHeap+0x170 (stack is at ntdll.dll!ZwWaitWorkViaWorkerFactory+0xa)
(These are still open when my code returns)
UPDATE:
If I kill the thread running ntdll.dll!TplsTimerSet+0x7c0 externally (using Process Explorer) before main() returns, the program exits gracefully.
I thought it might be useful to know.
UPDATE2: (some more tech info)
I'm using:
MS Visual Studio 2010 Ultimate x64 (SP1Rel) on Win7 Enterprise SP1
Code is C (but the "compile as C++" switch is on)
Subsystem: WINDOWS
Compiler: cl.exe (using IDE)
All other parameters are default.
I'm using a custom entry point (/ENTRY:"entry"), and it is the only function in my program:
int entry(void)
{
    SERVER_INFO_101* si;
    DWORD a,b;
    NET_API_STATUS c;
    c = NetServerEnum ( NULL , 101 , (LPBYTE*) &si , MAX_PREFERRED_LENGTH , &b , &a , SV_TYPE_WORKSTATION, NULL , 0 );
    c = NetApiBufferFree (si);
    Sleep(1000);
    return 0;
}
All the tests mentioned above were performed inside a Windows domain network of about 100 machines.
UPDATE 3:
This problem does not occur when tested on a (non-virtual) WinXP 32-bit machine. (Same binary; on Win7 x64, two binaries were tested: 32-bit under WOW64 and native x64.)

When you use a custom entry point, you're bypassing the runtime library, which means you're responsible for exiting the process. The process will exit implicitly if there are no more threads running, but as you've discovered, the operating system may create threads on your behalf that you don't have control over.
In your case, all you need to do is to call ExitProcess explicitly at the end of the entry() function.
int entry(void)
{
    SERVER_INFO_101* si;
    DWORD a,b;
    NET_API_STATUS c;
    c = NetServerEnum ( NULL , 101 , (LPBYTE*) &si , MAX_PREFERRED_LENGTH , &b , &a , SV_TYPE_WORKSTATION, NULL , 0 );
    c = NetApiBufferFree (si);
    Sleep(1000);
    ExitProcess(0);
}
In the absence of a call to ExitProcess and with a custom entry point, the behaviour you're seeing is as expected.

Related

How to avoid memory leaks using ShellExecuteEx?

Minimal, Complete, and Verifiable example:
Visual Studio 2017 Pro 15.9.3
Windows 10 "1803" (17134.441) x64
Environment variable OANOCACHE set to 1.
Data/Screenshots shown for a 32-bit Unicode build.
UPDATE: Exact same behavior on another machine with Windows 10 "1803" (17134.407)
UPDATE: ZERO leaks on an old laptop with Windows 7
UPDATE: Exact same behavior (leaks) on another machine with Windows 10 "1803" (17134.335)
#include <windows.h>
#include <cstdio>
int main() {
    getchar();
    CoInitializeEx( NULL, COINIT_APARTMENTTHREADED | COINIT_DISABLE_OLE1DDE );
    printf( "Launching and terminating processes...\n" );
    for ( size_t i = 0; i < 64; ++i ) {
        SHELLEXECUTEINFO sei;
        memset( &sei, 0, sizeof( sei ) );
        sei.cbSize = sizeof( sei );
        sei.lpFile = L"iexplore.exe";
        sei.lpParameters = L"about:blank";
        sei.fMask = SEE_MASK_FLAG_NO_UI | SEE_MASK_NOCLOSEPROCESS | SEE_MASK_NOASYNC;
        BOOL bSuccess = ShellExecuteEx( &sei );
        if ( bSuccess == FALSE ) {
            printf( "\nShellExecuteEx failed with Win32 code %d and hInstApp %d. Exiting...\n",
                    GetLastError(), (int)sei.hInstApp );
            CoUninitialize();
            return 0;
        } // endif
        printf( "%d", (int)GetProcessId( sei.hProcess ) );
        Sleep( 1000 );
        bSuccess = TerminateProcess( sei.hProcess, 0 );
        if ( bSuccess == FALSE ) {
            printf( "\nTerminateProcess failed with Win32 code %d. Exiting...\n",
                    GetLastError() );
            CloseHandle( sei.hProcess );
            CoUninitialize();
            return 0;
        } // endif
        DWORD dwRetCode = WaitForSingleObject( sei.hProcess, 5000 );
        if ( dwRetCode != WAIT_OBJECT_0 ) {
            printf( "\nWaitForSingleObject failed with code %x. Exiting...\n",
                    dwRetCode );
            CloseHandle( sei.hProcess );
            CoUninitialize();
            return 0;
        } // endif
        CloseHandle( sei.hProcess );
        printf( "K " );
        Sleep( 1000 );
    } // end for
    printf( "\nDone!" );
    CoUninitialize();
    getchar();
} // main
The code uses ShellExecuteEx to launch, in a loop, 64 instances of Internet Explorer with the about:blank URL. SEE_MASK_NOCLOSEPROCESS is set so that the returned process handle can then be passed to the TerminateProcess API.
I notice two kinds of leaks:
Handle leaks: launching Process Explorer when the loop has finished but the program is still running, I see several blocks of 64 handles (process handles, and registry handles for various keys).
Memory leaks: attaching the Visual C++ 2017 debugger to the program, I took a first heap snapshot before the loop and a second one after the loop. I see 64 blocks of 8192 bytes, coming from windows.storage.dll!CInvokeCreateProcessVerb::_BuildEnvironmentForNewProcess().
You can read some information about the handle leaks here: ShellExecute leaks handles
Here are some screenshots:
First, the PIDs launched and terminated:
Second, the same PIDs, as seen in Process Explorer:
Process Explorer also shows 64*3 open registry handles, for HKCR\.exe, HKCR\exefile and HKCR\exefile\shell\open.
One of the 64 leaked "Environment" blocks (8192 bytes, with the call stack):
Last, a screenshot of Process Explorer showing the "Private Bytes" during the execution of the MCVE modified with a loop count of 1024. The running time is approximately 36 minutes; Private Bytes starts at 1.1 MB (before CoInitializeEx) and ends at 19 MB (after CoUninitialize). The value then stabilizes at 18.9 MB.
What am I doing wrong?
Do I see leaks where there are none?
This is a Windows bug in version 1803. Minimal code to reproduce:
if (0 <= CoInitialize(0))
{
    SHELLEXECUTEINFO sei = {
        sizeof(sei), 0, 0, 0, L"notepad.exe", 0, 0, SW_SHOW
    };
    ShellExecuteEx( &sei );
    CoUninitialize();
}
After executing this code, you can see open handles for the notepad.exe process and its first thread. These handles should of course have been closed. The following registry keys are also left open:
\REGISTRY\MACHINE\SOFTWARE\Classes\.exe
\REGISTRY\MACHINE\SOFTWARE\Classes\exefile
Private memory is also leaked in the process after this call.
This bug therefore causes permanent resource leaks in explorer.exe and in any process that uses ShellExecute[Ex].
A detailed investigation of this bug is here:
The underlying issue here appears to be in windows.storage.dll. In
particular, the CInvokeCreateProcessVerb object is never
destroyed, because the associated reference count never reaches 0.
This leaks all of the objects associated with
CInvokeCreateProcessVerb, including 4 handles and some memory.
The reason the reference count never reaches 0 appears to be related
to the argument change for ShellDDEExec::InitializeByShellInternal
from Windows 10 1709 to 1803, executed by
CInvokeCreateProcessVerb::Launch().
More concretely, we have a cyclic reference from an object (CInvokeCreateProcessVerb) to itself.
The error is inside the method CInvokeCreateProcessVerb::Launch(), which calls
HRESULT ShellDDEExec::InitializeByShellInternal(
IAssociationElement*,
CreateProcessMethod,
PCWSTR,
STARTUPINFOEXW*,
IShellItem2*,
IUnknown*, // !!!
PCWSTR,
PCWSTR,
PCWSTR);
with the wrong 6th argument. The CInvokeCreateProcessVerb class contains an internal ShellDDEExec sub-object. In Windows 1709, CInvokeCreateProcessVerb::Launch() passes static_cast<IServiceProvider*>(pObj) as the 6th argument to ShellDDEExec::InitializeByShellInternal, where pObj points to an instance of the CBindAndInvokeStaticVerb class. In 1803, however, it passes static_cast<IServiceProvider*>(this), that is, a pointer to itself. InitializeByShellInternal stores this pointer inside the ShellDDEExec object and adds a reference to it. Note that ShellDDEExec is a sub-object of CInvokeCreateProcessVerb, so the destructor of ShellDDEExec is not called until the destructor of CInvokeCreateProcessVerb is called. But the destructor of CInvokeCreateProcessVerb is not called until its reference count reaches 0, and that cannot happen until ShellDDEExec releases its pointer to CInvokeCreateProcessVerb, which it does only in its own destructor.
This may be easier to see in pseudocode:
class ShellDDEExec
{
    CComPtr<IUnknown> _pUnk; // owning pointer: AddRef on assignment, Release in destructor
    HRESULT InitializeByShellInternal(..IUnknown* pUnk..)
    {
        _pUnk = pUnk;
    }
};
class CInvokeCreateProcessVerb : CExecuteCommandBase, IServiceProvider /**/
{
    IServiceProvider* _pVerb;//point to static_cast<IServiceProvider*>(CBindAndInvokeStaticVerb*)
    ShellDDEExec _exec;
    TRYRESULT CInvokeCreateProcessVerb::Launch()
    {
        // in 1709
        // _exec.InitializeByShellInternal(_pVerb);
        // in 1803
        _exec.InitializeByShellInternal(..static_cast<IServiceProvider*>(this)..); // !! error !!
    }
};
ShellDDEExec::_pUnk holds a pointer to the containing CInvokeCreateProcessVerb object. This pointer is released only in the CComPtr destructor, which is called from the ShellDDEExec destructor, which is called from the CInvokeCreateProcessVerb destructor, which runs when the reference count becomes 0. That never happens, because ShellDDEExec::_pUnk holds the extra reference.
So the object stores a counted reference to itself, and after that the reference count of CInvokeCreateProcessVerb never reaches 0.
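The cycle described above can be reproduced outside of Windows with a few lines of portable C++. This is only a sketch under stated assumptions: RefCounted, Exec and Verb are hypothetical stand-ins for IUnknown, ShellDDEExec and CInvokeCreateProcessVerb, and the helper functions exist purely for illustration. It does not use or reflect the real windows.storage.dll code.

```cpp
#include <cassert>

// Hypothetical minimal COM-like interface (a stand-in, not the real IUnknown).
struct RefCounted {
    virtual void AddRef() = 0;
    virtual void Release() = 0;
    virtual ~RefCounted() = default;
};

// Stand-in for ShellDDEExec: keeps an owning reference to what it is given
// and drops that reference only in its destructor.
class Exec {
public:
    ~Exec() { if (held_) held_->Release(); }
    void Initialize(RefCounted* p) {
        if (p) p->AddRef();
        if (held_) held_->Release();
        held_ = p;
    }
private:
    RefCounted* held_ = nullptr;
};

static int g_liveVerbs = 0; // number of Verb objects currently alive

// Stand-in for CInvokeCreateProcessVerb: Exec is a sub-object of Verb.
class Verb : public RefCounted {
public:
    Verb() { ++g_liveVerbs; }
    ~Verb() override { --g_liveVerbs; }
    void AddRef() override { ++refs_; }
    void Release() override { if (--refs_ == 0) delete this; }
    void Launch(RefCounted* site) { exec_.Initialize(site); } // 1709-style
    void LaunchBuggy() { exec_.Initialize(this); }            // 1803-style
private:
    long refs_ = 1; // the creator owns the first reference
    Exec exec_;
};

// Returns how many Verb objects are left alive after the 1709-style call.
int leakedAfterCorrectLaunch() {
    const int before = g_liveVerbs;
    Verb* site = new Verb;
    Verb* verb = new Verb;
    verb->Launch(site);
    verb->Release(); // count hits 0: ~Verb runs, then ~Exec releases site
    site->Release();
    return g_liveVerbs - before;
}

// Returns how many Verb objects are left alive after the 1803-style call.
int leakedAfterSelfReference() {
    const int before = g_liveVerbs;
    Verb* verb = new Verb;
    verb->LaunchBuggy(); // count is now 2: the creator plus the object itself
    verb->Release();     // count drops to 1 and can never reach 0
    return g_liveVerbs - before;
}
```

Calling leakedAfterCorrectLaunch() leaves no object behind, while leakedAfterSelfReference() leaves exactly one: the object that holds a counted reference to itself.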

EnumProcessModulesEx and CreateToolhelp32Snapshot fail - whether 32-bit or 64-bit

Edit:
The answer to this question is here:
https://stackoverflow.com/a/27317947/996540
When you create a project in MSVC, the /DYNAMICBASE option is now enabled by
default. Because of ASLR (address space layout randomization, available since
Windows Vista), the load address of an exe is random every time you run it.
I have been doing DLL injection work recently, so I did some research on
Google and read through some projects. Getting the load address (base address)
of an exe is important.
There seem to be two simple APIs for this: EnumProcessModulesEx and
CreateToolhelp32Snapshot. But I never succeeded with either.
So this is the code sample:
void TestEnumProcessModulesEx(const char* app)
{
    std::cout << "Begin TestEnumProcessModulesEx(" << mybit() << ")" << std::endl;
    STARTUPINFOA startupInfo = {0};
    startupInfo.cb = sizeof(startupInfo);
    PROCESS_INFORMATION processInformation = {0};
    if (CreateProcessA(app, NULL, NULL, NULL, FALSE, CREATE_SUSPENDED, NULL, NULL, &startupInfo, &processInformation))
    {
        std::vector<HMODULE> buf(128);
        DWORD needed = 0;
        for (;;) {
            if (EnumProcessModulesEx(processInformation.hProcess, &buf[0], DWORD(buf.size()*sizeof(HMODULE)), &needed, LIST_MODULES_ALL) == FALSE) {
                DWORD ec = GetLastError();
                std::cout << "GetLastError() = " << ec << std::endl;
                break;
            }
            else if (needed <= buf.size() * sizeof(HMODULE)) {
                break;
            }
            else {
                const size_t oldSize = buf.size();
                buf.resize(oldSize * 2);
            }
        }
        ResumeThread(processInformation.hThread);
        WaitForSingleObject(processInformation.hProcess, INFINITE);
    }
    std::cout << "End TestEnumProcessModulesEx(" << mybit() << ")" << std::endl;
}
To keep this question short, the complete code, including the
CreateToolhelp32Snapshot test code, is not listed here, but you can get it
from:
https://dl.dropboxusercontent.com/u/235920/enum_proc_mods_sample.7z
or
https://www.mediafire.com/?cry3pnra8392099
"If this function is called from a 32-bit application running on WOW64, it can
only enumerate the modules of a 32-bit process. If the process is a 64-bit
process, this function fails and the last error code is ERROR_PARTIAL_COPY
(299)." - from MSDN.
And this is a blog post about this question:
http://winprogger.com/getmodulefilenameex-enumprocessmodulesex-failures-in-wow64/
Unfortunately, this does not make sense here, because whether the target
process is 32-bit or 64-bit, it fails with 299; and whether the calling process
is 32-bit or 64-bit, it fails with 299.
This is the output of my sample:
Begin TestEnumProcessModulesEx(32bit)
GetLastError() = 299
hello world 32bit
End TestEnumProcessModulesEx(32bit)
Begin TestEnumProcessModulesEx(32bit)
GetLastError() = 299
hello world 64bit
End TestEnumProcessModulesEx(32bit)
Begin TestEnumProcessModulesEx(64bit)
GetLastError() = 299
hello world 32bit
End TestEnumProcessModulesEx(64bit)
Begin TestEnumProcessModulesEx(64bit)
GetLastError() = 299
hello world 64bit
End TestEnumProcessModulesEx(64bit)
As you can see, every combination fails.
My OS is Windows 7 64-bit Pro and my compiler is VS2013.
So, what can I do?
I have no idea why EnumProcessModulesEx and
CreateToolhelp32Snapshot fail; let's leave that question to the experts.
My goal is to get the load address (base address) of the child process, find
the entry point and patch it. The reason for patching the entry point is explained here:
https://opcode0x90.wordpress.com/2011/01/15/injecting-dll-into-process-on-load/
Since DLL injection is my main purpose, I had to reconsider this
question. I would use the "CreateRemoteThread & LoadLibrary Technique"
http://www.codeproject.com/Articles/4610/Three-Ways-to-Inject-Your-Code-into-Another-Proces#section_2
to do the DLL injection (incidentally, ASLR is not a barrier to this technique).
There are many restrictions on what DllMain may do:
http://msdn.microsoft.com/en-us/library/windows/desktop/dn633971%28v=vs.85%29.aspx
But doing a small amount of work is OK: find the base address of the exe using
GetModuleHandleA(NULL) and save the returned HMODULE into shared memory;
then the calling process reads the shared memory and gets the HMODULE.
A synchronization mechanism is necessary, of course.
So, the answer is IPC. (Incidentally, not every IPC mechanism is safe in DllMain.)

Windows fopen and the N flag

I'm reading some code that uses fopen to open files for writing. The code needs to be able to close and rename these files from time to time (it's a rotating file logger). The author says that for this to happen the child processes must not inherit these FILE handles. (On Windows, that is; on Unix it's OK.) So the author writes a special subroutine that duplicates the handle as non-inheritable and closes the original handle:
if (!(log->file = fopen(log->path, mode)))
    return ERROR;
#ifdef _WIN32
sf = _fileno(log->file);
sh = (HANDLE)_get_osfhandle(sf);
if (!DuplicateHandle(GetCurrentProcess(), sh, GetCurrentProcess(),
                     &th, 0, FALSE, DUPLICATE_SAME_ACCESS)) {
    fclose(log->file);
    return ERROR;
}
fclose(log->file);
flags = (*mode == 'a') ? _O_APPEND : 0;
tf = _open_osfhandle((intptr_t)th, _O_TEXT | flags);
if (!(log->file = _fdopen(tf, "at"))) {
    _close(tf);
    return ERROR;
}
#endif
Now, I'm also reading MSDN docs on fopen and see that their version of fopen has a Microsoft-specific flag that seems to do the same: the N flag:
N: Specifies that the file is not inherited by child processes.
Question: do I understand it correctly that I can get rid of that piece above and replace it (on Windows) with an additional N in the mode parameter?
Yes, you can.
fopen("myfile", "rbN") creates a non-inheritable file handle.
The N flag is not mentioned anywhere in the Linux documentation for fopen, so the solution is most probably not portable, but with MSVC it works fine.
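Because the N flag is Microsoft-specific, one way to keep a single call site is to append it only when building for Windows. The following is a minimal sketch, not the original logger's code: logOpenMode and openLog are hypothetical helper names, and on other platforms the mode is left unchanged, since the question states that inheritance is not a problem on Unix.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: builds the fopen mode string for a log file.
// On Windows builds it appends the Microsoft-specific "N" flag so the
// underlying handle is not inherited by child processes.
std::string logOpenMode(bool append) {
    std::string mode = append ? "a" : "w";
#ifdef _WIN32
    mode += "N";
#endif
    return mode;
}

// Hypothetical helper: opens the log file with the mode built above.
// Returns nullptr on failure, exactly like fopen.
std::FILE* openLog(const char* path, bool append) {
    return std::fopen(path, logOpenMode(append).c_str());
}
```

With such a helper the DuplicateHandle / _open_osfhandle block from the question would no longer be needed. Note that MSVC may emit a deprecation warning for fopen itself unless _CRT_SECURE_NO_WARNINGS is defined.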

Windows API - ShellExecuteEx() doesn't wait on USB drive and CD drive

I am writing a master installer that uses the following ShellExecuteEx() wrapper to call several installers created with Advanced Installer (installing multiple products), one by one, in a loop.
// Shell Execute
bool CFileHelper::ShellExecute(CString strCommandPath, CString strOptions)
{
    CString strQCommandPath = CString(_T("\"")) + strCommandPath + CString(_T("\"")); //place the command in the quote to handle path with space
    LPWSTR szInstallerPath = strQCommandPath.GetBuffer();
    LPWSTR szOptions = strOptions.GetBuffer(MAX_PATH);
    SHELLEXECUTEINFO ShellInfo; // Name structure
    memset(&ShellInfo, 0, sizeof(ShellInfo)); // Set up memory block
    ShellInfo.cbSize = sizeof(ShellInfo); // Set up structure size
    ShellInfo.hwnd = 0; // Calling window handle
    ShellInfo.lpVerb = _T("open");
    ShellInfo.lpFile = szInstallerPath;
    ShellInfo.fMask = SEE_MASK_NOCLOSEPROCESS; //| SEE_MASK_NOASYNC | SEE_MASK_WAITFORINPUTIDLE;
    ShellInfo.lpParameters = szOptions;
    bool res = ShellExecuteEx(&ShellInfo); // Call to function
    if (!res)
    {
        //printf( "CreateProcess failed (%d).\n", GetLastError() );
        CString strMsg = CString(_T("Failed to execute command ")) + strCommandPath + CString(_T("!"));
        AfxMessageBox(strMsg);
        return false;
    }
    WaitForSingleObject(ShellInfo.hProcess, INFINITE); // wait forever for process to finish
    //WaitForInputIdle(ShellInfo.hProcess, INFINITE);
    CloseHandle( ShellInfo.hProcess);
    strQCommandPath.ReleaseBuffer();
    strOptions.ReleaseBuffer();
    return true;
}
The function works very well when the master installer and the individual product installers are on the hard drive.
However, if I move all of them to either a USB drive or a CD, ShellExecuteEx() doesn't wait for the previous product installer to complete its task. So all product installers get launched at once, giving me the error message "Another installation is in progress. You must complete that installation before continuing this one.".
One thing that puzzles me is why it works on the hard drive but not on a USB drive or CD drive. I need to distribute the products on CD.
Putting Sleep(500) before WaitForSingleObject(ShellInfo.hProcess, INFINITE) didn't help either.
Work from the assumption that this is real. The installer might have noticed it was started from a removable drive, copied itself to the hard disk, launched that copy, and quit. This avoids trouble when the user pops out the media, which produces a very low-level paging fault that the process itself cannot catch. The Windows dialog for that fault isn't great and may well run counter to the installer's request to insert the next disk.
Verify this guess by comparing the process ID of the process you started with the one you see running in Taskmgr.exe. Reliably fixing this ought to be quite a headache.

Watchdog built into the same process as the program it controls

I run a Visual C++ console test program inside the daily build. Every now and then the test calls some function that other developers have changed improperly, descends into an infinite loop and hangs, thus blocking the build.
I need a watchdog solution that is as simple as possible. Here's what I came up with. In the test program's entry point I start a separate thread that loops continuously and checks the elapsed time. If some predefined period is exceeded, it calls TerminateProcess(). Pseudocode:
DWORD WINAPI WatchDog( LPVOID )
{
    DWORD start = GetTickCount();
    while( true ) {
        Sleep( ReasonablePeriod );
        // Unsigned subtraction stays correct across GetTickCount() wraparound.
        if( GetTickCount() - start > MaxAllowed ) {
            TerminateProcess( GetCurrentProcess(), 0 );
        }
    }
    return 0;
}
Is this solution any worse than a watchdog implemented as a separate master program?
I think it's preferable to implement the watchdog as a separate process. It's easier to re-use it, it's easier to detect if your app crashed and to get its return code.
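For comparison, the in-process variant from the question can be sketched with standard C++ threads instead of the Win32 API. This is only an illustration under stated assumptions: the Watchdog class and its disarm() method are hypothetical names, and the timeout action is passed in as a callback so that the sketch can be exercised without killing the process. In a real test program the callback would end the process, for example with std::_Exit or TerminateProcess.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical in-process watchdog: runs onTimeout once if it has not been
// disarmed within the given limit.
class Watchdog {
public:
    Watchdog(std::chrono::milliseconds limit, std::function<void()> onTimeout) {
        thread_ = std::thread([this, limit, onTimeout] {
            std::unique_lock<std::mutex> lock(mutex_);
            // wait_for returns false if the limit elapsed while still armed.
            const bool disarmed =
                cv_.wait_for(lock, limit, [this] { return disarmed_; });
            lock.unlock();
            if (!disarmed) onTimeout();
        });
    }

    ~Watchdog() { disarm(); }

    // Cancels the watchdog and waits for its thread to finish.
    // Must not be called from inside the timeout callback.
    void disarm() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            disarmed_ = true;
        }
        cv_.notify_all();
        if (thread_.joinable()) thread_.join();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool disarmed_ = false;
    std::thread thread_;
};
```

Unlike the polling loop in the question, the thread here sleeps on a condition variable, so a test that finishes in time can stop the watchdog immediately. A watchdog of this kind still lives in the process it guards, which is the limitation the answer above points out.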

Resources