Environment variable propagation on Windows

Is it possible to propagate a Windows environment variable, after it has been created or modified, to applications that are already open, without having to restart them?
If so, how?
Something like SendMessage(HWND_BROADCAST, WM_SETTINGCHANGE, 0, (LPARAM)TEXT("Environment")) is your best bet. Most applications will ignore it, but Explorer should handle it. (Allow applications to pick up updates)
If you want to go into crazy undocumented land, you could use WriteProcessMemory and update the environment block in every process you have access to.

Yes, this is possible.
It is involved though. I'll outline the basic steps. The detail for each step is documented in many places on the web, including Stack Overflow.
1. Create a helper DLL. The DLL does nothing except set the environment variables you want to set. It can do this from DllMain without causing any problems. Just don't go mad with other function calls from inside DllMain. How you communicate to the DLL which variables to set, and what values to set them to, is left for you to decide (read a file, read from the registry...).
2. Enumerate all processes that you wish to update (toolhelp32 will help with this).
3. For each process you wish to update, inject your helper DLL. CreateRemoteThread() will help with this. This will fail for 2% of all apps on NT 4, rising to 5% on XP. The failure rate is most likely higher on Vista/7 and the server versions.
Things you have to live with:
If you are running a 32-bit process on a 64-bit OS, CreateRemoteThread will fail to inject your DLL into 32-bit apps 100% of the time (and it cannot inject into 64-bit apps anyway, as that is a job for a 64-bit app).
EDIT: Turns out 100% isn't correct. But it is very hit and miss. Don't rely on it.
Don't remain resident
If you don't want your helper DLL to remain resident in the target application, return FALSE for the DLL_PROCESS_ATTACH notification.
#include <windows.h>

BOOL APIENTRY DllMain(HMODULE hModule,
                      DWORD ul_reason_for_call,
                      LPVOID lpReserved)
{
    if (ul_reason_for_call == DLL_PROCESS_ATTACH)
    {
        // set our env vars here
        SetEnvironmentVariable(TEXT("weebles"), TEXT("wobble but they don't fall down"));
        // we don't want to remain resident, our work is done
        return FALSE;
    }
    return TRUE;
}

No, I'm pretty sure that's not possible.


WinDbg not showing register values

Basically, this is the same question that was asked here.
When performing kernel debugging of a machine running Windows 7 or older, with WinDbg version 6.2 and up, the debugger doesn't show anything in the registers window. Pressing the Customize... button results in a message box that reads Registers are not yet known.
At the same time, issuing the r command results in perfectly valid register values being printed out.
What is the reason for this behaviour, and can it be fixed?
TL;DR: I wrote an extension DLL that fixes the bug. Available here.
The Problem
To understand the problem, we first need to understand that WinDbg is basically just a frontend to Microsoft's Windows Symbolic Debugger Engine, implemented inside dbgeng.dll. Other frontends include the command-line kd.exe (kernel debugger) and cdb.exe (user-mode debugger).
The engine implements everything we expect from a debugger: working with symbol files, reading and writing memory and registers, setting breakpoints, etc. The engine then exposes all of this functionality through COM-like interfaces (they implement IUnknown but are not registered components). This allows us, for instance, to write our own debugger (like this person did).
Armed with this knowledge, we can now make an educated guess as to how WinDbg obtains the values of the registers on the target machine.
The engine exposes the IDebugRegisters interface for manipulating registers. This interface declares the GetValues method for retrieving the values of multiple registers in one go. But how does WinDbg know how many registers there are? That's why we have the GetNumberRegisters method.
So, to retrieve the values of all registers on the target, we'll have to do something like this:
Call IDebugRegisters::GetNumberRegisters to get the total number of registers.
Call IDebugRegisters::GetValues with the Count parameter set to the total number of registers, the Indices parameter set to NULL, and the Start parameter set to 0.
One tiny problem, though: the second call fails with E_INVALIDARG.
Ehm, excuse me? How can it fail? Especially puzzling is the documentation for this return value:
The value of the index of one of the registers is greater than the number of registers on the target machine.
But I just asked you how many registers there are, so how can that value be out of range? Okay, let's continue reading the docs anyway, maybe something will become clear:
If the return value is not S_OK, some of the registers still might have been read. If the target was not accessible, the return type is E_UNEXPECTED and Values is unchanged; otherwise, Values will contain partial results and the registers that could not be read will have type DEBUG_VALUE_INVALID.
(Emphasis mine.)
Aha! So maybe the engine just couldn't read one of the registers! But which one? Turns out that the engine chokes on the xcr0 register. From the Intel 64 and IA-32 Architectures Software Developer’s Manual:
Extended control register XCR0 contains a state-component bitmap that specifies the user state components that software has enabled the XSAVE feature set to manage. If the bit corresponding to a state component is clear in XCR0, instructions in the XSAVE feature set will not operate on that state component, regardless of the value of the instruction mask.
Okay, so the register controls the operation of the XSAVE instruction, which saves the state of the CPU's extended features (like XMM and AVX). According to the last comment on this page, this instruction requires some support from the operating system. Although the comment states that Windows 7 (that's what the VM I was testing on was running) does support this instruction, it seems that the issue at hand is related to the OS anyway, as when the target is Windows 8 everything works fine.
Really, it's unclear whether the bug is within the debugger engine, which reports more registers than it can retrieve values for, or within WinDbg, which refuses to show any values at all if the engine fails to produce all of them.
The Solution
We could, of course, bite the bullet and just use an older version of WinDbg for debugging older Windows versions. But where's the challenge in that?
Instead, I present to you a debugger extension that solves this problem. It does so by hooking (with the help of this library) the relevant debugger engine methods and returning S_OK if the only register that failed was xcr0. Otherwise, it propagates the failure. The extension supports runtime unload, so if you experience problems you can always disable the hooks.
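The decision such a hook has to make can be sketched in a few lines. This is only an illustration, not the extension's actual code: the function name OnlyToleratedRegisterFailed and the idea of passing in an array of value types are assumptions made for this example, and kValueInvalid stands in for DEBUG_VALUE_INVALID from dbgeng.h.

```cpp
#include <cstddef>

// Stand-in for DEBUG_VALUE_INVALID (defined as 0 in dbgeng.h), the type the
// engine assigns to a register whose value it could not read.
constexpr unsigned kValueInvalid = 0;

// Hypothetical helper: returns true when no register other than `tolerated`
// was left invalid. `types` holds the Type field of each returned value,
// in register-index order.
constexpr bool OnlyToleratedRegisterFailed(const unsigned* types,
                                           std::size_t count,
                                           std::size_t tolerated) {
    for (std::size_t i = 0; i < count; ++i) {
        if (types[i] == kValueInvalid && i != tolerated) {
            return false;  // some other register failed: propagate the error
        }
    }
    return true;  // at most the tolerated register failed
}
```

A hook around GetValues would presumably run a check like this over the partial results, with tolerated set to the index of xcr0 (which IDebugRegisters::GetIndexByName can look up), and report S_OK only when it returns true.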
That's it, have fun!

What is the purpose of the BeingDebugged flag in the PEB structure?

What is the purpose of this flag (from the OS side)?
Which functions use this flag besides IsDebuggerPresent?
It's effectively the same, but reading the PEB doesn't require a trip through kernel mode.
More explicitly, the IsDebuggerPresent API is documented and stable; the PEB structure is not, and could, conceivably, change across versions.
Also, the IsDebuggerPresent API (or flag) only checks for user-mode debuggers; kernel debuggers aren't detected via this function.
Why put it in the PEB? It saves some time, which was more important in early versions of NT. (There are a bunch of user-mode functions that check this flag before doing some runtime validation, and will break to the debugger if set.)
If you change the PEB field to 0, then IsDebuggerPresent will also return 0, although I believe that CheckRemoteDebuggerPresent will not.
As you have found, IsDebuggerPresent reads this flag from the PEB. As far as I know, the PEB structure is not an official API but IsDebuggerPresent is, so you should stick to that layer.
The uses of this method are quite limited if you are after copy protection that prevents debugging of your app. As you have found, it is only a flag in your process space. If somebody debugs your application, all they need to do is zero out the flag in the PEB and let your app run.
You can raise the bar by using CheckRemoteDebuggerPresent, passing in your own process handle to get an answer. This function goes into the kernel and checks for the existence of a special debug structure that is associated with your process if it is being debugged. A user-mode process cannot fake this one, but there are always ways around it, such as simply removing your check...

Programmatically registering a performance counter in the registry

I'm trying to register a performance counter and part of this process includes adding some textual descriptions to a specific registry key. For English this key is HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Perflib\009 which apparently is also known as HKEY_PERFORMANCE_TEXT. There are a pair of values under there (Counter, Help) that have REG_MULTI_SZ data, and I need to modify them to accomplish my goal.
The official way of doing this is by using a tool called lodctr along with a .h and .ini file. There is also a function for doing this programmatically, but my understanding is that it is just a simple wrapper around calling the lodctr program. I found the prospect of maintaining, distributing, and keeping synchronized 3 separate files a bit cumbersome, so I previously wrote code to do this and it worked fine under Windows XP (and possibly Vista, though I don't remember for sure).
Now I'm trying to use the same code on Windows 7 and it doesn't work. The problem is that whenever I try to set the registry values it fails with ERROR_BADKEY; even regedit fails to modify the values, so it's not a problem with my code. I ran Process Monitor against it and noticed that there was no activity at the driver level, so it seems this access must be getting blocked in user-mode code (e.g. advapi32.dll or wherever). I understand why Microsoft would try to prevent people from doing this as it is very easy to screw up, and doing so will screw up the entire performance counter collection on the machine.
I'm going to debug lodctr and see what the magic is purely out of curiosity, but I'm wondering if anybody has run into this before? Are there any alternatives other than the lodctr utility? Perhaps calling the NT registry API directly? I would really prefer to avoid the hassle of the lodctr method if possible.
A minimal example to reproduce the issue:
HKEY hKey = NULL;
LONG nResult = RegOpenKeyEx(HKEY_LOCAL_MACHINE, _T("SOFTWARE\\Microsoft\\Windows NT\\CurrentVersion\\Perflib\\009"), 0, KEY_ALL_ACCESS, &hKey);
if(ERROR_SUCCESS == nResult)
{
    LPCTSTR lpData = _T("bar");
    // byte count including the terminating null character
    DWORD cbData = (DWORD)((_tcslen(lpData) + 1) * sizeof(TCHAR));
    nResult = RegSetValueEx(hKey, _T("foo"), 0, REG_SZ, (const BYTE*)lpData, cbData);
    // here nResult == ERROR_BADKEY
    RegCloseKey(hKey);
    hKey = NULL;
}
I spent about an hour or so trying to debug the official APIs and couldn't figure it out so I tried some more Google. After a while I came across this KB article which explains the RegSetValueEx behavior. Since it mentioned modifying system files that got me to thinking that perhaps this particular registry data is backed by a mapped file. Then I came across another KB article that mentions Perfc009.dat and Perfh009.dat in the system32 folder. Opened these up in a hex editor and sure enough it is the raw REG_MULTI_SZ data I am trying to modify. Now that I know that maybe I can take another look and figure it out, though I am bored with it for now.
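For anyone who wants to inspect that data, here is a minimal sketch of walking a REG_MULTI_SZ buffer, which is a sequence of null-terminated strings ended by an extra null. The function names CountMultiSz and SplitMultiSz are made up for this example, and it assumes the buffer is well formed (properly double-null-terminated); real registry data should also be bounds-checked against the size returned by the API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Counts the strings in a REG_MULTI_SZ buffer. Each string ends with one
// L'\0'; an empty string (a second L'\0') terminates the list.
constexpr std::size_t CountMultiSz(const wchar_t* data) {
    std::size_t count = 0;
    while (*data != L'\0') {
        while (*data != L'\0') ++data;  // advance to the end of this string
        ++data;                         // step over its terminator
        ++count;
    }
    return count;
}

// Splits the same buffer into individual strings.
inline std::vector<std::wstring> SplitMultiSz(const wchar_t* data) {
    std::vector<std::wstring> strings;
    while (*data != L'\0') {
        strings.emplace_back(data);             // copies up to the next L'\0'
        data += strings.back().size() + 1;      // move past string and terminator
    }
    return strings;
}
```

With sample data such as L"2\0" L"System\0" L"4\0" L"Memory\0" (invented here for illustration, written as separate literals so that "\0" is not read as part of an octal escape), SplitMultiSz yields four strings that alternate between an index and its text.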
Never mind, I give up. It's easier to just go with the flow. Instead of trying to modify the registry directly, I will create the .h and .ini files programmatically and invoke the relevant functions.

Some Windows API calls fail unless the string arguments are in the system memory rather than local stack

We have an older, massive C++ application that we have been converting to support Unicode as well as 64-bit. The following strange thing has been happening:
Calls to registry functions and windows creation functions, like the following, have been failing:
hWnd = CreateSysWindowExW( ExStyle, ClassNameW.StringW(), Label2.StringW(), Style,
Posn.X(), Posn.Y(),
Size.X(), Size.Y(),
hParentWnd, (HMENU)Id,
AppInstance(), NULL);
ClassNameW and Label2 are instances of our own Text class which essentially uses malloc to allocate the memory used to store the string.
Anyway, when the functions fail, and I call GetLastError it returns the error code for "invalid memory access" (though I can inspect and see the string arguments fine in the debugger). Yet if I change the code as follows then it works perfectly fine:
BSTR Label2S = SysAllocString(Label2.StringW());
BSTR ClassNameWS = SysAllocString(ClassNameW.StringW());
hWnd = CreateSysWindowExW( ExStyle, ClassNameWS, Label2S, Style,
Posn.X(), Posn.Y(),
Size.X(), Size.Y(),
hParentWnd, (HMENU)Id,
AppInstance(), NULL);
SysFreeString(ClassNameWS); ClassNameWS = 0;
SysFreeString(Label2S); Label2S = 0;
So what gives? Why would the original functions work fine with the arguments in local memory, but when used with Unicode, the registry functions require SysAllocString, and when used in 64-bit, the window creation functions also require SysAllocString'd string arguments? Our window procedure functions have all been converted to Unicode, always, and yes, we use SetWindowLongW, call the correct default Unicode DefWindowProcW, etc. That all seems to work fine, and it handles and draws Unicode properly.
The documentation at http://msdn.microsoft.com/en-us/library/ms632679%28v=vs.85%29.aspx does not say anything about this. While our application is massive we do use debug heaps and tools like Purify to check for and clean up any memory corruption. Also at the time of this failure, there is still only one main system thread. So it is not a thread issue.
So what is going on? I have read that if string arguments are marshalled anywhere or passed across process boundaries, then you have to use SysAllocString/BSTR, yet we call lots of API functions and there is lots of code out there which calls these functions just using plain local strings?
What am I missing? I have tried Googling this, as someone else must have run into this, but with little luck.
Edit 1: Our StringW function does not create any temporary objects which might go out of scope before the actual API call. The function is as follows:
class Text {
public:
    const wchar_t* StringW () const { return TextStartW; }
private:
    wchar_t* TextStartW; // pointer to current start of text in DataArea
};
I have been running our application with the debug heap and memory checking and other diagnostic tools, and found no source of memory corruption, and looking at the assembly, there is no sign of temporary objects or invalid memory access.
BUT I finally figured it out:
We compile our code with /Zp1, which packs structure members on 1-byte boundaries, so our data is only byte-aligned. SysAllocString (in 64-bit) always returns a pointer that is aligned on an 8-byte boundary. Presumably a 32-bit ANSI C++ application goes through an API layer to the underlying Unicode Windows DLLs, which would also align the pointer for you.
But if you use Unicode, you do not get that incidental pointer alignment that the conversion layer gives you, and if you use 64-bit, of course the situation gets even worse.
I added a method to our Text class which shifts the string pointer so that it is aligned on an eight-byte boundary, and voilà, everything runs fine!
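The post does not show that method, so the following is only a sketch of the usual way to round an address up to an 8-byte boundary. The names AlignUp and AlignPointer are invented for this example, and it assumes the alignment is a power of two and that the underlying buffer has enough spare room after the original pointer.

```cpp
#include <cstddef>
#include <cstdint>

// Rounds `value` up to the next multiple of `alignment`.
// `alignment` must be a power of two.
constexpr std::uintptr_t AlignUp(std::uintptr_t value, std::uintptr_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

// Returns the first address at or after `p` that is aligned to `alignment`
// bytes. The caller must make sure the buffer extends that far (up to
// alignment - 1 extra bytes).
inline wchar_t* AlignPointer(wchar_t* p, std::size_t alignment = 8) {
    return reinterpret_cast<wchar_t*>(
        AlignUp(reinterpret_cast<std::uintptr_t>(p), alignment));
}
```

In practice this means reserving up to 7 extra bytes in the data area and copying the text to the aligned address before handing the pointer to the API.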
Of course the Microsoft people say it must be memory corruption and that I am jumping to the wrong conclusion, but there is evidence that this is not the case.
Also, if you use /Zp1 and include windows.h in a 64-bit application, the debugger will tell you sizeof(BITMAP)==28, but calling GetObject on a bitmap will fail and tell you it needs a 32-byte structure. So I suspect that some of Microsoft's API is inherently dependent on aligned pointers, and I also know that some optimized assembly (I have seen some from Fortran compilers) takes advantage of that and crashes badly if you ever give it unaligned pointers.
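The 28 versus 32 discrepancy can be reproduced without windows.h. The two structs below are look-alikes written for this example (fixed-width types in place of LONG, WORD and LPVOID), and #pragma pack(push, 1) is used to get the same packing that /Zp1 would apply to the declaration.

```cpp
#include <cstdint>

// Look-alike of BITMAP with default (natural) alignment.
struct NaturalBitmap {
    std::int32_t  bmType;
    std::int32_t  bmWidth;
    std::int32_t  bmHeight;
    std::int32_t  bmWidthBytes;
    std::uint16_t bmPlanes;
    std::uint16_t bmBitsPixel;
    void*         bmBits;  // on 64-bit, 4 bytes of padding precede this member
};

// The same members packed on 1-byte boundaries.
#pragma pack(push, 1)
struct PackedBitmap {
    std::int32_t  bmType;
    std::int32_t  bmWidth;
    std::int32_t  bmHeight;
    std::int32_t  bmWidthBytes;
    std::uint16_t bmPlanes;
    std::uint16_t bmBitsPixel;
    void*         bmBits;  // no padding: starts at offset 20
};
#pragma pack(pop)

// Packed: 16 + 4 + pointer size, i.e. 28 bytes on a 64-bit target.
static_assert(sizeof(PackedBitmap) == 20 + sizeof(void*), "packed layout has no padding");
// Natural: 32 bytes on a 64-bit target.
static_assert(sizeof(void*) != 8 || sizeof(NaturalBitmap) == 32, "natural layout pads the pointer");
```

Code compiled with the default packing expects the 32-byte layout, which would explain why a 28-byte buffer is rejected.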
So the moral of all of this is: don't use "funky" compiler arguments like /Zp1. In our case we have to for historical reasons, but the number of times this has bitten us...
Using a bit of psychic debugging, I'm going to guess that the strings in your application are pooled in a read-only section.
It's possible that CreateSysWindowExW is attempting to write to the memory passed in for the window class or title. That would explain why the calls work when the strings are allocated on the heap (SysAllocString) but not when they are constants.
The easiest way to investigate this is to use a low level debugger like windbg - it should break into the debugger at the point where the access violation occurs which should help figure out the problem. Don't use Visual Studio, it has a nasty habit of being helpful and hiding first chance exceptions.
Another thing to try is to enable appverifier on your application - it's possible that it may show something.
Calling a Windows API function does not cross the process boundary, since the various Windows DLLs are loaded into your process.
It sounds like whatever pointer StringW() is returning isn't valid when Windows tries to access it. I would look there: is it possible that the memory it points to goes out of scope and is deleted shortly after the pointer is returned?
If you share some more details about your string class, that could help diagnose the problem here.

How to find out caller info?

This will require some background. I am using Detours to intercept system calls. For those who don't know what Detours is: it is a tool which redirects calls to system functions to a detour function, which allows us to do whatever we want before and after the actual system call is made. What I want to know is whether it is possible to find out any info about the DLL/module which made this system call. Does any Win32 API function help me do this?
Let's say traceapi.dll makes a system call to GetModuleFileNameW() inside kernel32.dll. Detours will intercept this call and redirect control to a detour function (say Mine_GetModuleFileNameW()). Now, inside Mine_GetModuleFileNameW(), is it possible to find out that this call originated from traceapi?
Call ZwQuerySystemInformation with the first argument set to SystemProcessesAndThreadsInformation.
Once you have the returned buf, cast it to PSYSTEM_PROCESS_INFORMATION and use its fields to extract your info.
status = ZwQuerySystemInformation (
SystemProcessesAndThreadsInformation, buf, bufsize, NULL);
proc_info->ProcessName, which is a UNICODE_STRING, will give you the calling process name.
Please note that the structure and field I am talking about are not documented and might change in a future release of Windows. However, I am using them and they work fine on Windows XP and above.
I don't know how many stack frames will be on the stack that are owned by Detours code. Easy to find out in the debugger, the odds are good that there are none. That makes it easy, use the _ReturnAddress intrinsic to get the caller's address. VirtualQuery() to get the base address, cast it to HMODULE and use GetModuleFileName(). Well, the non-detoured one :)
If there are Detours stack frames then it gets a lot harder. StackWalk64() to skip them, perilous if there are FPO frames present.
