Writing a Windows NT subsystem [closed]


I'd like to try writing my own minimal NT subsystem on Windows 7 for purely educational purposes -- something like a bare-bones equivalent of the posix.exe in Microsoft's Subsystem for Unix-based Applications.
But I can't seem to find any public documentation on this topic. What API does a subsystem need to implement? How does it get registered with Windows? How does the subsystem image need to be built (what flags need to be set in the PE header, etc.)?
I'd most like to find a book or web site with an overview of the entire subject, or even the source code for a "hello world" NT subsystem that someone else has written. But anything at all would be appreciated if you can point me in the right direction here...

Here are the major components of a subsystem:
User-mode server. The server creates an (A)LPC port, then listens for and handles client requests.
User-mode client DLL. In the DLL_INIT_ROUTINE, you can connect to the port set up by the server. This DLL will expose your subsystem's API, and some functions will require communication with the server.
Kernel-mode support driver (you might not need this).
You will want to store process or thread state in either your server or driver. If you're storing it in the server, you might need something like NtRegisterThreadTerminatePort to ensure you get to clean up when a process or thread exits. If you're using a driver, you need PsSetCreateProcessNotifyRoutine.
And lastly, if you're on XP and below, you can add new system calls. You can do this by calling KeAddSystemServiceTable. To invoke the system calls from user-mode, you need to create stubs like this (for x86):
; XyzCreateFooBar(__out PHANDLE FooBarHandle, __in ACCESS_MASK DesiredAccess, ...)
mov eax, SYSTEM_CALL_NUMBER
mov edx, 0x7ffe0300
call [edx]
retn 4
On Vista and above you can no longer add new system service tables because there is only room for two: the kernel's system calls and win32k's system calls.
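To make the "two tables" point concrete, here is a minimal sketch of how an x86 system call number can be split into a table selector and an index within that table. The layout used here (low 12 bits as the index, bit 12 choosing between the kernel's table and win32k's table) is an assumption for illustration, and `syscall_parts` and `split_syscall_number` are hypothetical names, not part of any Windows header.

```c
#include <stdint.h>

/* Hypothetical helper type: the two parts of a system call number. */
typedef struct {
    unsigned table;  /* 0 = kernel's table, 1 = win32k's table (assumed) */
    unsigned index;  /* entry within the selected table */
} syscall_parts;

/* Assumption: bits 0-11 hold the index and bit 12 selects the table.
 * With only one selector bit there is room for exactly two tables. */
syscall_parts split_syscall_number(uint32_t number)
{
    syscall_parts parts;
    parts.table = (number >> 12) & 0x1u;
    parts.index = number & 0xFFFu;
    return parts;
}
```

Under that assumption, a number such as 0x1005 would land in the second (win32k) table at index 5, while 0x0025 would be index 0x25 in the kernel's table.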
After a bit of Googling I found this: http://winntposix.sourceforge.net/. I think it's very similar to what you're looking for, and uses a lot of the things I have mentioned.

I'm also obsessed with the native API. :)
And I'm glad to say that it's nowhere near as dangerous or as undocumented as some people make it seem. :]
There's no source code for "Hello, world" because the native API doesn't interact easily with the console: the console is part of the Win32 subsystem and requires client/server communication over ports. If you need to write a console application, you need to communicate directly with CSRSS, whose message formats are undocumented (although some of the formats can be found in ReactOS's source; you would benefit a lot from getting familiar with ReactOS).
I'll post an example here soon that you might find interesting; for now, be aware that your only option is to link with NTDLL.dll, and that to do so you need the Driver Development Kit (since you need the lib file).
Update: Check this out!
(I have a feeling no one else will post something quite as rebellious as this. Showing GUI with the native API?! I must be crazy!)
#include <Windows.h>
typedef LONG NTSTATUS; //NTSTATUS is a signed 32-bit value
//These are from ReactOS
typedef enum _HARDERROR_RESPONSE_OPTION
{
OptionAbortRetryIgnore,
OptionOk,
OptionOkCancel,
OptionRetryCancel,
OptionYesNo,
OptionYesNoCancel,
OptionShutdownSystem
} HARDERROR_RESPONSE_OPTION, *PHARDERROR_RESPONSE_OPTION;
typedef enum _HARDERROR_RESPONSE
{
ResponseReturnToCaller,
ResponseNotHandled,
ResponseAbort,
ResponseCancel,
ResponseIgnore,
ResponseNo,
ResponseOk,
ResponseRetry,
ResponseYes,
ResponseTryAgain,
ResponseContinue
} HARDERROR_RESPONSE, *PHARDERROR_RESPONSE;
typedef struct _UNICODE_STRING {
USHORT Length;
USHORT MaximumLength;
PWSTR Buffer;
} UNICODE_STRING, *PUNICODE_STRING;
//You'll need to link to NTDLL.lib
//which you can get from the Windows 2003 DDK or any later WDK
NTSYSAPI VOID NTAPI RtlInitUnicodeString(IN OUT PUNICODE_STRING DestinationString,
IN PCWSTR SourceString);
NTSYSAPI NTSTATUS NTAPI NtRaiseHardError(IN NTSTATUS ErrorStatus,
IN ULONG NumberOfParameters, IN ULONG UnicodeStringParameterMask,
IN PULONG_PTR Parameters,
IN HARDERROR_RESPONSE_OPTION ValidResponseOptions,
OUT PHARDERROR_RESPONSE Response);
#define STATUS_SERVICE_NOTIFICATION_2 0x50000018
int main()
{
HARDERROR_RESPONSE response;
ULONG_PTR items[4] = {0};
UNICODE_STRING text, title;
RtlInitUnicodeString(&text,
L"Hello, NT!\r\nDo you like this?\r\n"
L"This is just about as pretty as the GUI will get.\r\n"
L"This message will self-destruct in 5 seconds...");
RtlInitUnicodeString(&title, L"Native Message Box!");
items[0] = (ULONG_PTR)&text;
items[1] = (ULONG_PTR)&title;
items[2] = (ULONG_PTR)OptionYesNo;
items[3] = (ULONG_PTR)5000;
NtRaiseHardError(STATUS_SERVICE_NOTIFICATION_2, ARRAYSIZE(items),
0x1 | 0x2 /*First two parameters are UNICODE_STRINGs*/, items,
OptionOk /*This is ignored, since we have a custom message box.*/,
&response);
return 0;
}
If you have any questions, feel free to ask! I'm not scared of the native API! :)
Edit 2:
If you're trying to make your own DLL version of Kernel32 and have it load like Kernel32 does with every process (hence a new subsystem), I just wanted to let you know that I don't think it's possible. It's rather similar to this question that I asked a couple of days ago, and it seems that you can't extend the NT PE Loader to know about new subsystems, so I don't think it'll be possible.
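For context on what the loader looks at: the subsystem an image targets is recorded in the Subsystem field of the PE optional header. The sketch below reads that field from an in-memory copy of a file. The function and enum names are made up for this example; the offsets (e_lfanew at 0x3C, a 4-byte signature, a 20-byte COFF header, and Subsystem at offset 68 of the optional header for both PE32 and PE32+) follow the published PE format.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A few Subsystem values from the PE format (names are local to this sketch). */
enum {
    PE_SUBSYSTEM_NATIVE      = 1,
    PE_SUBSYSTEM_WINDOWS_GUI = 2,
    PE_SUBSYSTEM_WINDOWS_CUI = 3,
    PE_SUBSYSTEM_POSIX_CUI   = 7
};

/* Little-endian readers so the code does not depend on host byte order. */
static uint16_t read_u16(const uint8_t *p)
{
    return (uint16_t)((uint16_t)p[0] | ((uint16_t)p[1] << 8));
}

static uint32_t read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the Subsystem value, or -1 if the buffer is not a plausible PE image. */
int pe_subsystem(const uint8_t *image, size_t size)
{
    const size_t subsystem_offset = 4 + 20 + 68; /* signature + COFF header + field offset */
    uint32_t pe_offset;

    if (size < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return -1;
    pe_offset = read_u32(image + 0x3C);          /* e_lfanew */
    if (pe_offset > size || size - pe_offset < subsystem_offset + 2)
        return -1;
    if (memcmp(image + pe_offset, "PE\0\0", 4) != 0)
        return -1;
    return (int)read_u16(image + pe_offset + subsystem_offset);
}
```

Running this over a normal console program should yield 3 (Windows CUI), and over a driver or a native program such as smss.exe it should yield 1 (native).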

Related

TIB Custom Storage

After quite a bit of googling and some hints given here, I finally managed to find a layout of the FS segment (used by windows to store TIB data). Of particular interest to me is the ArbitraryUserPointer member provided in the PSDK:
typedef struct _NT_TIB {
struct _EXCEPTION_REGISTRATION_RECORD *ExceptionList;
PVOID StackBase;
PVOID StackLimit;
PVOID SubSystemTib;
union {
PVOID FiberData;
DWORD Version;
};
PVOID ArbitraryUserPointer;
struct _NT_TIB *Self;
} NT_TIB;
How safe exactly is it to use this variable (under Vista and above), and does it still exist on x64?
Secondary to that is the access of this variable. I'm using MSVC, and as such I have access to the __readfsdword and __readgsqword intrinsics; however, MSDN for some reason marks these as privileged instructions:
These intrinsics are only available in kernel mode, and the routines are only available as intrinsics.
They are of course not kernel only, but why are they marked as such, just incorrect documentation? (my offline VS 2008 docs don't have this clause).
Finally, is it safe to access ArbitraryUserPointer directly via a single __readfsdword(0x14) or is it preferred to use it via the linear TIB address? (which will still require a read from FS).

ArbitraryUserPointer is an internal field not for general use. The operating system uses it internally, and if you overwrite it, you will corrupt stuff. I concede that it has a very poor name.

In case you're still looking for an answer: I've had the same problem too and posted my own question, similar to yours:
Thread-local storage in kernel mode?
I need a TLS-equivalent in the kernel-mode driver. To be exact, I have a deep function call tree which originates at some point (driver's dispatch routine for instance), and I need to pass the context information.
In my specific case the catch is that I don't need a persistent storage, I just need a thread-specific placeholder for something for a single top-level function call. Hence I decided to use an arbitrary entry in the TLS array for the function call, and after it's done - restore its original value.
You get the TLS array by the following:
DWORD* get_Tls()
{
return (DWORD*) (__readfsdword(0x18) + 0xe10);
}
BTW I have no idea why the TIB is usually accessed by reading the contents of fs:[0x18]. It's simply what the fs selector points to. But this is how all of MS's code accesses it, hence I decided to do this as well.
Next, you choose an arbitrary TLS index, say 0.
const DWORD g_dwMyTlsIndex = 0;
void MyTopLevelFunc()
{
// prolog
DWORD dwOrgVal = get_Tls()[g_dwMyTlsIndex];
get_Tls()[g_dwMyTlsIndex] = dwMyContextValue;
DoSomething();
// epilog
get_Tls()[g_dwMyTlsIndex] = dwOrgVal;
}
void DoSomething()
{
DWORD dwMyContext = get_Tls()[g_dwMyTlsIndex];
}
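The save/restore pattern above can also be sketched in portable C. This version swaps the raw TEB slot for a C11 `_Thread_local` variable, so it only illustrates the prolog/epilog idea in user mode; it is not a kernel-mode equivalent, and `g_context`, `do_something` and `top_level` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical per-thread slot standing in for the borrowed TLS entry. */
_Thread_local uintptr_t g_context;

/* A callee deep in the call tree reads the context without it being passed down. */
uintptr_t do_something(void)
{
    return g_context;
}

/* Borrow the slot for the duration of one top-level call, then put back
 * whatever was there before so other users of the slot are unaffected. */
uintptr_t top_level(uintptr_t context)
{
    uintptr_t saved = g_context;  /* prolog: remember the original value */
    uintptr_t seen;

    g_context = context;
    seen = do_something();
    g_context = saved;            /* epilog: restore the original value */
    return seen;
}
```

If the slot held 7 before the call, `top_level(42)` lets the callee see 42 and leaves 7 in the slot afterwards.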

DRIVER_OBJECT.DriverSection

Does anyone know the structure behind the DriverSection pointer in the x64 version of Windows 7? On 32-bit I used the following:
typedef struct _KLDR_DATA_TABLE_ENTRY {
LIST_ENTRY InLoadOrderLinks;
PVOID ExceptionTable;
ULONG ExceptionTableSize;
//ULONG padding1;
PVOID GpValue;
PVOID NonPagedDebugInfo;
PVOID DllBase;
PVOID EntryPoint;
ULONG SizeOfImage;
UNICODE_STRING FullDllName;
UNICODE_STRING BaseDllName;
ULONG Flags;
USHORT LoadCount;
USHORT __Unused5;
PVOID SectionPointer;
ULONG CheckSum;
//ULONG Padding2;
PVOID LoadedImports;
PVOID PatchInformation;
} KLDR_DATA_TABLE_ENTRY, *PKLDR_DATA_TABLE_ENTRY;
And everything was working but on x64 it is crashing when trying to dereference the LIST_ENTRY. Any pointers/tips would be greatly appreciated

If you can hook up a kernel debugger, you can verify whether or not the DriverSection object matches your definition. To do this, pick a driver you wish to debug - I tend to use a simple one I have. Load its symbols by fixing the symbol path to include its pdb, then break into windbg or kd and type:
.reload
To reload the symbols. Then you can load the driver with:
sc start drivername
having created its service assuming it is a legacy driver. Type:
bu drivername!DriverEntry
to set a breakpoint on the DriverEntry for this module. The difference between bp and bu is that bu breakpoints are evaluated and set on module load. Currently, of course, DriverEntry won't be called, but if we reload the driver it will:
sc stop drivername
sc start drivername
Now your breakpoint should be hit and rcx will contain a pointer to the DRIVER_OBJECT structure, since it is the first argument and the first four pointer/integer arguments are passed in rcx, rdx, r8, r9 according to the Windows x64 ABI. So, you can print out the driver object structure with:
dt _DRIVER_OBJECT (value of rcx)
Which will give you a pointer to the driver section. Then, type:
dt _LDR_DATA_TABLE_ENTRY (driver section object pointer)
This should give you your driver section object. _LDR_DATA_TABLE_ENTRY is actually present in the Windows symbols, so this will work.
Using the debugger, you should be able to dereference the LIST_ENTRY pointers (Flink and Blink) successfully (try dps on the address of the _LDR_DATA_TABLE_ENTRY structure, for example). If you do it successfully, one of those addresses will resolve to nt!PsLoadedModuleList.
What I'm trying to say, in a roundabout way, is that:
1. Either there is a bug in your code somewhere, or
2. You've hit upon a synchronisation issue. Remember, this structure is supposed to be opaque and we're not supposed to be modifying it in any way. It's also liable to change on us, and we don't know where the lock for synchronising access to it is.
If you are certain 1 is not the case, it is likely 2 is. Luckily, Microsoft actually provided a function to get the information from these structures called AuxKlibQueryModuleInformation(). You do need to add an extra library to your driver, but that's not the end of the world. Include Aux_klib.h. There's also a code sample on the MSDN page showing how to use it - it's pretty straightforward.
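On the padding fields commented out in the question's structure: when a 32-bit field is followed by a pointer, a 64-bit compiler inserts the padding on its own, so field offsets shift relative to the 32-bit layout. The sketch below checks this with `offsetof` on a two-field excerpt. The struct is a hypothetical stand-in (`uint32_t` for ULONG, `void *` for PVOID), not the real kernel definition.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical excerpt mirroring "ULONG ExceptionTableSize; PVOID GpValue;". */
struct excerpt {
    uint32_t exception_table_size;  /* stands in for ULONG */
    void    *gp_value;              /* stands in for PVOID */
};

/* Offset at which the compiler actually places the pointer field. */
size_t gp_value_offset(void)
{
    return offsetof(struct excerpt, gp_value);
}

/* Bytes of padding the compiler inserted between the two fields. */
size_t implicit_padding(void)
{
    return offsetof(struct excerpt, gp_value) - sizeof(uint32_t);
}
```

On a typical 64-bit build `implicit_padding()` is 4 and on a typical 32-bit build it is 0, which is one reason a layout copied from 32-bit code needs to be re-verified with `dt` rather than assumed.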

Recording Syscalls in windows

I have been searching for some time now for ways to record syscalls in real time on Windows. I have looked at a couple of posts here on Stack Overflow and elsewhere but could not find anything easy enough for me to follow. I have looked at procmon, but its output has been pretty inconsistent: the same binary on two systems generated a different number of entries. Perhaps I lack the prerequisite knowledge to do such stuff. Any help/recommendation is welcome.
I have looked at these link before:
System Calls in windows & Native API?
http://www.codeguru.com/cpp/w-p/system/devicedriverdevelopment/article.php/c8035
http://technet.microsoft.com/en-us/sysinternals/bb897447.aspx
Regards

Depending on the version of Windows you are using, the answer to your question is probably Event Tracing for Windows (ETW) which can do syscall logging [link]

If you are satisfied with a sampling approach, then you could try this:
typedef struct _THREAD_LAST_SYSCALL_INFORMATION
{
PVOID FirstArgument;
USHORT SystemCallNumber;
} THREAD_LAST_SYSCALL_INFORMATION, *PTHREAD_LAST_SYSCALL_INFORMATION;
THREAD_LAST_SYSCALL_INFORMATION lastSystemCall;
NtQueryInformationThread(
hThread,
ThreadLastSystemCall,
&lastSystemCall,
sizeof(THREAD_LAST_SYSCALL_INFORMATION),
NULL
);
where ThreadLastSystemCall = 21

Simple way to hook registry access for specific process

Is there a simple way to hook registry access of a process that my code executes? I know about SetWindowsHookEx and friends, but its just too complex... I still have hopes that there is a way as simple as LD_PRELOAD on Unix...

Read up on the theory of DLL Injection here: http://en.wikipedia.org/wiki/DLL_injection
However, I will supply you with a DLL Injection snippet here: http://www.dreamincode.net/code/snippet407.htm
It's pretty easy to do these types of things once you're in the memory of an external application, upon injection, you might as well be a part of the process.
There's something called detouring, which I believe is what you're looking for. It simply hooks a function, and when the process calls it, your own function is executed instead. (To ensure that it doesn't crash, call the original function at the end of your function.)
So if you wanted to write your own function over RegCreateKeyEx
(http://msdn.microsoft.com/en-us/library/ms724844%28v=vs.85%29.aspx)
It might look something like this:
LONG WINAPI myRegCreateKeyEx(HKEY hKey, LPCTSTR lpSubKey, DWORD Reserved, LPTSTR lpClass, DWORD dwOptions, REGSAM samDesired, LPSECURITY_ATTRIBUTES lpSecurityAttributes, PHKEY phkResult, LPDWORD lpdwDisposition)
{
//check for suspicious keys being made via the parameters
//then forward to the original function and return its result
return RegCreateKeyEx(hKey, lpSubKey, Reserved, lpClass, dwOptions, samDesired, lpSecurityAttributes, phkResult, lpdwDisposition);
}
You can get a very well written detour library called DetourXS here: http://www.gamedeception.net/threads/10649-DetourXS
Here is his example code of how to establish a detour using it:
#include <detourxs.h>
typedef DWORD (WINAPI* tGetTickCount)(void);
tGetTickCount oGetTickCount;
DWORD WINAPI hGetTickCount(void)
{
printf("GetTickCount hooked!");
return oGetTickCount();
}
// To create the detour
oGetTickCount = (tGetTickCount) DetourCreate("kernel32.dll", "GetTickCount", hGetTickCount, DETOUR_TYPE_JMP);
// ...Or an address
oGetTickCount = (tGetTickCount) DetourCreate(0x00000000, hGetTickCount, DETOUR_TYPE_JMP);
// ...You can also specify the detour len
oGetTickCount = (tGetTickCount) DetourCreate(0x00000000, hGetTickCount, DETOUR_TYPE_JMP, 5);
// To remove the detour
DetourRemove(oGetTickCount);
And if you can't tell, that snippet hooks GetTickCount(): whenever the function is called, it writes "GetTickCount hooked!" and then executes GetTickCount as it was intended.
Sorry for being so scattered with info, but I hope this helps. :)
-- I realize this is an old question. --

Most WinAPI calls generate import table entries for inter-modular calls, which makes them pretty simple to hook: all you need to do is overwrite the IAT addresses. Using something such as MSDetours, it can be done safely in a few lines of code. MSDetours also provides the tools to inject a custom DLL into the target process so you can do the hooking.
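The idea of overwriting an import slot can be sketched without touching a real PE image. Below, a small array of name/pointer pairs plays the role of the IAT; everything in it (`import_slot`, `g_imports`, `patch_import`, the stand-in `real_get_tick_count`) is hypothetical and only simulates the mechanism. Real IAT patching additionally has to walk the import directory and change page protection.

```c
#include <stddef.h>
#include <string.h>

typedef unsigned long (*tick_fn)(void);

/* One simulated import slot: callers always go through 'target'. */
typedef struct {
    const char *name;
    tick_fn     target;
} import_slot;

/* Stand-in for the real imported function. */
unsigned long real_get_tick_count(void) { return 1000ul; }

import_slot g_imports[] = {
    { "GetTickCount", real_get_tick_count }
};

tick_fn  g_original;   /* saved so the hook can forward to the original */
unsigned g_hook_hits;  /* counts how often the hook ran */

/* The hook does its own work first, then calls the saved original. */
unsigned long hooked_get_tick_count(void)
{
    g_hook_hits++;
    return g_original();
}

/* Overwrite the slot named 'name'; returns the previous target, or NULL if absent. */
tick_fn patch_import(import_slot *table, size_t count,
                     const char *name, tick_fn replacement)
{
    size_t i;
    for (i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0) {
            tick_fn previous = table[i].target;
            table[i].target = replacement;
            return previous;
        }
    }
    return NULL;
}
```

After `g_original = patch_import(g_imports, 1, "GetTickCount", hooked_get_tick_count);`, any call made through `g_imports[0].target` runs the hook first and still returns the original function's result.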

SetWindowsHookEx won't help at all - it provides different functionality.
Check if https://web.archive.org/web/20080212040635/http://www.codeproject.com/KB/system/RegMon.aspx helps. SysInternals' RegMon uses a kernel-mode driver, which is a very complicated way to do it.
Update: Our company offers CallbackRegistry product, that lets you track registry operations without hassle. And BTW we offer free non-commercial licenses upon request (subject to approval on case by case basis).

How to set name to a Win32 Thread?

How do I set a name for a Win32 thread? I didn't find any Win32 API to do this. Basically I want to add the thread name to the log file. Is TLS (Thread Local Storage) the only way to do it?

Does this help?
How to: Set a Thread Name in Native Code
In managed code, it is as easy as setting the Name property of the corresponding Thread object.

http://msdn.microsoft.com/en-us/library/xcb2z8hs(VS.90).aspx
//
// Usage: SetThreadName (-1, "MainThread");
//
#include <windows.h>
const DWORD MS_VC_EXCEPTION=0x406D1388;
#pragma pack(push,8)
typedef struct tagTHREADNAME_INFO
{
DWORD dwType; // Must be 0x1000.
LPCSTR szName; // Pointer to name (in user addr space).
DWORD dwThreadID; // Thread ID (-1=caller thread).
DWORD dwFlags; // Reserved for future use, must be zero.
} THREADNAME_INFO;
#pragma pack(pop)
void SetThreadName( DWORD dwThreadID, const char* threadName)
{
THREADNAME_INFO info;
info.dwType = 0x1000;
info.szName = threadName;
info.dwThreadID = dwThreadID;
info.dwFlags = 0;
__try
{
RaiseException( MS_VC_EXCEPTION, 0, sizeof(info)/sizeof(ULONG_PTR), (ULONG_PTR*)&info );
}
__except(EXCEPTION_EXECUTE_HANDLER)
{
}
}

According to a discussion with the Microsoft debugging team leads (see the link below for details), SetThreadDescription is the API that Microsoft will use going forward to officially support thread naming in native code. By "officially" I mean an MS-supported API for naming threads, as opposed to the exception-throwing hack that currently only works while a process is running in Visual Studio.
This API became available starting in Windows 10, version 1607.
Currently, however, there is very little tooling support, so the names you set won't be visible in the Visual Studio or WinDbg debuggers. As of April 2017, however, the Microsoft xperf/WPA tools do support it (threads named via this API will have their names show up properly in those tools).
If you would like to see this gain better support, such as in WinDbg, Visual Studio, and crash dump files, please vote for it using this link:
https://visualstudio.uservoice.com/forums/121579-visual-studio-ide/suggestions/17608120-properly-support-native-thread-naming-via-the-sett

Win32 threads do not have names. There is a Microsoft convention whereby applications raise special SEH exceptions containing a thread name. These exceptions can be intercepted by debuggers and used to indicate the thread name. A couple of the answers cover that.
However, that is all handled by the debugger. Threads themselves are nameless objects. So, if you want to associate names with your threads, you'll have to develop your own mechanism. Whilst you could use thread local storage that will only allow you to obtain the name from code executing in that thread. So a global map between thread ID and the name would seem like the most natural and useful approach.

You can use a thread-local storage object to store the name. For example,
__declspec( thread ) char threadName[32];
Then you can write and read this from a thread. This might be useful in a logger application, where you want to print out the name of the thread for every message. You probably want to write this variable as soon as the thread starts, and also throw the Microsoft exception (https://stackoverflow.com/a/10364541/364818) so that the debugger also knows the thread name.

If your application runs on Windows 10, version 1607 or later, you can use SetThreadDescription().

If you want to see the name of your thread in the debugger (windbg or visual studio):
http://blogs.msdn.com/stevejs/archive/2005/12/19/505815.aspx
I'm not actually sure if there's a reverse method to get the thread name. But TLS sounds like the way to go.

Another way to do this is to store a pointer to the name in the ArbitraryUserPointer field of the TEB of the thread. This can be written to and read from at runtime.
There's a CodeProject article titled "Debugging With The Thread Information Block" that shows you how to do this.

You can always store this information for yourself in a suitable data structure. Use a hash or a map to map the thread ID (from GetCurrentThreadId(), or GetThreadId() if you have a handle) to this name. Since a thread ID is unique for as long as the thread exists, this works just fine.
Cheers !
Of course, if he's creating many threads, that hashmap will slowly fill up and use more and more memory, so some cleanup procedure is probably a good thing as well.
You're absolutely right. When a thread dies, its corresponding entry in the map should naturally be removed.
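A minimal sketch of such a map, including the cleanup step, might look like the following. It uses a small fixed-size table keyed by a plain 32-bit ID so it stays self-contained; the names (`thread_name_set`, `thread_name_get`, `thread_name_remove`) and the size limits are made up for this example, and a real version would need a lock (a critical section, for instance) around every access because the table is shared between threads.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_THREADS 64   /* arbitrary capacity for this sketch */
#define MAX_NAME    32   /* arbitrary name length, including the terminator */

typedef struct {
    uint32_t id;
    int      used;
    char     name[MAX_NAME];
} thread_name_entry;

/* NOTE: not thread-safe as written; guard with a lock in real code. */
static thread_name_entry g_names[MAX_THREADS];

/* Add or update the name for 'id'. Returns 1 on success, 0 if the table is full. */
int thread_name_set(uint32_t id, const char *name)
{
    thread_name_entry *slot = NULL;
    size_t i;

    for (i = 0; i < MAX_THREADS; i++) {
        if (g_names[i].used && g_names[i].id == id) {
            slot = &g_names[i];        /* existing entry wins over a free slot */
            break;
        }
        if (!g_names[i].used && slot == NULL)
            slot = &g_names[i];        /* remember the first free slot */
    }
    if (slot == NULL)
        return 0;
    slot->id = id;
    slot->used = 1;
    strncpy(slot->name, name, MAX_NAME - 1);
    slot->name[MAX_NAME - 1] = '\0';
    return 1;
}

/* Look up the name for 'id'; returns NULL if no name is registered. */
const char *thread_name_get(uint32_t id)
{
    size_t i;
    for (i = 0; i < MAX_THREADS; i++)
        if (g_names[i].used && g_names[i].id == id)
            return g_names[i].name;
    return NULL;
}

/* Call when the thread exits so a reused ID does not inherit a stale name. */
void thread_name_remove(uint32_t id)
{
    size_t i;
    for (i = 0; i < MAX_THREADS; i++)
        if (g_names[i].used && g_names[i].id == id)
            g_names[i].used = 0;
}
```

A logger would call `thread_name_get` with the current thread's ID for each message, and each thread would call `thread_name_remove` on its way out.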
