I just want to ask: I know that standard system calls in Linux are done by the int instruction pointing into the Interrupt Vector Table. I assume this is similar on Windows. But how do you call some higher-level, specific system routine? For example, how do you tell Windows to create a window? I know this is handled by the code in the dll, but what actually happens at the assembler-instruction level? Does the routine in the dll issue a software interrupt via an int instruction, or is there a different approach? Thanks.
Making a Win32 call to create a window is not really related to an interrupt. The client application is already linked with the .dll that provides the call, which exposes the address for the linker to use. Since you are asking about the calling mechanism, I'm limiting the discussion here to those Win32 calls that are available to any application, as opposed to kernel-level calls or device drivers. At the assembly-language level it is the same as any other function call, since most Win32 calls are user-level calls which internally make the needed kernel calls. The linker provides the address of the Win32 function as the target for some sort of branching instruction; the specifics depend on the compiler.
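For instance, in this minimal sketch the MessageBoxA call compiles to an ordinary call instruction; with MSVC it is typically an indirect call through the import address table that the loader filled in:

    #include <windows.h>

    int main(void)
    {
        /* Typically compiles to an indirect call through the import
           address table, e.g. on x64:
               call qword ptr [__imp_MessageBoxA]
           No interrupt is involved at this level. */
        MessageBoxA(NULL, "Just an ordinary function call", "Demo", MB_OK);
        return 0;
    }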
[Edit]
It looks like you are right about the interrupts and the interrupt vector table. CodeGuru has a good article with the OS details on how NT kernel calls work. Link:
http://www.codeguru.com/cpp/w-p/system/devicedriverdevelopment/article.php/c8035
Windows doesn't let you invoke system calls directly because system call numbers can change between builds: if a new system call gets added, the others can shift forward, and if a system call gets removed, others can shift backward. So, to maintain backwards compatibility, you call the Win32 or native functions in DLLs.
Now, there are two groups of system calls: those serviced by the kernel (ntoskrnl) and those serviced by the win32 kernel layer (win32k).
Kernel system call stubs are easily accessible from ntdll.dll, while the win32k ones are not exported; they're private within user32.dll. These stubs contain the system call number and the actual system call instruction which does the job.
So if you wanted to create a window, you'd call CreateWindow in user32.dll; that forwards to the extended version CreateWindowEx (kept for backwards compatibility), which calls the private system call stub NtUserCreateWindowEx, which in turn invokes code within the win32k window manager.
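To see the first group in action, here is a hedged sketch that resolves one of the documented ntdll stubs (NtClose) at runtime and calls it directly. In real code you would just call CloseHandle, which wraps it:

    #include <windows.h>
    #include <stdio.h>

    /* NTSTATUS is a LONG; NtClose is a documented ntdll export. */
    typedef LONG (NTAPI *NtClose_t)(HANDLE Handle);

    int main(void)
    {
        HMODULE ntdll = GetModuleHandleA("ntdll.dll");
        NtClose_t pNtClose = (NtClose_t)GetProcAddress(ntdll, "NtClose");

        HANDLE h = CreateEventA(NULL, FALSE, FALSE, NULL);
        if (pNtClose && h)
            /* The stub loads the build-specific system call number
               into eax and executes the syscall instruction. */
            printf("NtClose returned 0x%lx\n", (unsigned long)pNtClose(h));
        return 0;
    }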
I am following the process to create a trampoline in order to hook a dll function (in my case Direct3DCreate9 from d3d9.dll) as outlined here: https://www.malwaretech.com/2015/01/inline-hooking-for-programmers-part-1.html and https://www.malwaretech.com/2015/01/inline-hooking-for-programmers-part-2.html
My code differs slightly as I am working out the offset bytes manually using a disassembler rather than using the hde32_disasm function.
Everything seems to work fine: the victim process calls my injected dll wrapper function, the new function does some stuff, then calls the original function (Direct3DCreate9); once the original returns, the wrapper should call some other stuff before returning to the victim process.
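For reference, the wrapper is shaped roughly like this (a hedged sketch; the names and g_trampoline, which stands for the buffer holding the stolen bytes plus a jump back into the original, are hypothetical):

    #include <windows.h>

    /* Direct3DCreate9 is __stdcall and takes a UINT; the real return
       type is IDirect3D9*, shown as void* to avoid the d3d9 headers. */
    typedef void* (WINAPI *Direct3DCreate9_t)(UINT SDKVersion);
    static Direct3DCreate9_t g_trampoline; /* set when the hook is installed */

    void* WINAPI HookedDirect3DCreate9(UINT SDKVersion)
    {
        /* ...pre-call work... */
        void *d3d = g_trampoline(SDKVersion); /* call the original */
        /* ...post-call work: the part that is being skipped... */
        return d3d;
    }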
Unfortunately, when the original function is called from the hook wrapper, it returns to the victim application and not to the hook wrapper, so it skips some of the code in the wrapper.
Having stepped through the disassembly, it looks as though the call stack is being overwritten, so when Direct3DCreate9 returns it pops back to the victim application instead of my hook function that made the call.
I'm guessing I need to push the hook function manually onto the call-stack? How would I go about this?
Other potentially pertinent information: both the victim process and the hook have been built in debug mode. Direct3DCreate9 is __stdcall; I am using VS2010 for the hook dll, but the victim process was compiled with VS2015.
It turns out the call stack was being nobbled by the NVidia graphics driver nvd3d9wrap.dll. This dll was injecting into the d3d9 application in the same manner as I was trying to do, which led to the madness explained in the original post.
The solution was to open Device Manager in windows and disable the NVidia graphics driver. Thankfully my pc has an integrated graphics chip, so I am able to use that.
I had been setting my application (MASM assembly language program) entry point via Visual Studio configurations, in the Project properties as:
Linker\System\SubSystem: Windows (/SUBSYSTEM:WINDOWS)
Linker\Advanced\Entry Point: WinMain
And my main proc is called WinMain (matching the above setting). It is a basic application that makes simple Windows API calls, e.g. MessageBoxA... and it works.
Now that I'm building a windowed application (in assembly), I have read somewhere that I need to use WinMain as the entry point.
I'm now confused! Which technique do I use to set the entry point of my application (exe): the Windows 'WinMain' convention or the Visual Studio linker entry-point setting? And what is the difference, i.e. C++ runtime vs. OS?
If you are using the C runtime library (which is usually the case when programming in C) then you must not specify the linker entry point yourself. If you do, the runtime library will not be properly initialized, and any runtime library calls (including those inserted by the compiler) may fail.
Instead, your main function should correspond to the relevant standard: WinMain() for a GUI application, or main() for a console application.
In an assembly language program that is not linked to the C runtime library, you should specify an entry point of your choosing.
The signature of the native entry point is
DWORD CALLBACK RawEntryPoint(void);
Important:
Returning from the raw entry point implicitly calls ExitThread (see this answer), which is not usually the right thing to do: if the Windows API has created any threads that you don't know about, the process won't exit until they do. Note that the Windows API documentation does not always indicate when a particular API function may cause a thread to be created.
Instead, you should explicitly call ExitProcess. This is what the C runtime library does when you return from WinMain() or main().
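A minimal no-CRT sketch, assuming the project is built with /SUBSYSTEM:WINDOWS, /ENTRY:RawEntryPoint, and no default libraries:

    #include <windows.h>

    DWORD CALLBACK RawEntryPoint(void)
    {
        MessageBoxA(NULL, "Raw entry point, no CRT", "Demo", MB_OK);

        /* Do not fall off the end: that would only call ExitThread.
           Terminate the whole process explicitly, as the CRT would. */
        ExitProcess(0);
    }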
I know that software breakpoints in an executable file can work by replacing some assembler instruction at the desired place with another one which causes an interrupt. The debugger can then stop execution exactly at that place, replace the instruction with the original one, and ask the user what to do next, run some commands, etc.
But the code of such an executable file is not used by other programs and has only one copy in memory. How can software breakpoints work with shared libraries? For instance, how do software breakpoints work if I set one at some internal function of the C library (as I understand it, there is only one copy for all applications, so we cannot just replace an instruction in it)? Are there any "software breakpoint" techniques for that purpose?
The answer for Linux is that the Linux kernel implements COW (copy-on-write): if the code of a shared library is written to, the kernel first makes a private duplicate of the shared page, remaps the virtual memory of just that process to point at the copy, and allows the application to continue. This is completely invisible to userland applications and done entirely in the kernel.
Thus, until the first time a software breakpoint is put into the shared library, its code is indeed shared; afterwards it is not. The process thereafter operates on a dirty but private copy.
This kernel magic is what allows the debugger to not cause every other application to suddenly stop.
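On Linux the debugger's write is typically done through ptrace. A hedged x86 sketch (0xCC is the INT3 opcode; error handling via errno is omitted for brevity); this poke into a shared page is exactly what triggers the copy-on-write described above:

    #include <sys/ptrace.h>
    #include <sys/types.h>

    /* Plant an INT3 at addr in the traced process, saving the original
       word so the debugger can restore it later. x86-specific sketch. */
    long set_breakpoint(pid_t pid, void *addr, long *saved)
    {
        long orig = ptrace(PTRACE_PEEKTEXT, pid, addr, 0);
        *saved = orig;
        long patched = (orig & ~0xffL) | 0xcc;  /* low byte -> INT3 */
        return ptrace(PTRACE_POKETEXT, pid, addr, (void *)patched);
    }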
On OSes such as VxWorks, however, this is not possible. From personal experience, when I was implementing a GDB remote debug server for VxWorks, I had to forbid my users from ever single-stepping within semTake() and semGive() (the OS semaphore functions), since a) GDB uses software breakpoints in its source-level single-step implementation and b) VxWorks uses a semaphore to protect its breakpoints list...
The unpleasant consequence was an interrupt storm in which a breakpoint would cause an interrupt, and within this interrupt there would be another interrupt, and another and another, in an inescapable chain resistant even to Ctrl-Z. The only way out was to power off the machine.
This link describes user-space and kernel-space communication: http://www.makelinux.net/ldd3/chp-2-sect-3#chp-2-ITERM-4135
Could anyone explain it with a simple user-space application program in C that links and communicates (sends/receives values) with the kernel object?
The program insmod, available on most Linux machines (but requiring sudo privileges to run) instructs the kernel to load a specified module (kernel object) through the system call init_module.
More generally, user-space programs communicate with the kernel through these system calls, which are essentially requests to the kernel from user space. Any application you write in C must use system calls in some way to interact with the system (for example, printf uses the write system call under the hood to put characters on the screen).
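For example, you can skip printf entirely and issue the underlying call yourself through its write(2) wrapper:

    #include <unistd.h>

    int main(void)
    {
        /* The same effect printf("hi\n") eventually has: the write
           system call on file descriptor 1 (stdout). */
        write(1, "hi\n", 3);
        return 0;
    }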
Just open a file with open(2). The compiler adds code to the application for this call which puts the function arguments on the stack and makes the program "crash" in a controlled way (see: system call). The kernel catches all the crashes and handles them.
Since this is a "good" crash, the kernel looks up which function to invoke, gets the arguments from the stack, and invokes the function.
The reason for this complicated approach is security: by "crashing", the application completely relinquishes control. The CPU switches to a different mode, too. In this mode, it can access the hardware (in "application" mode, any access to the hardware leads to an "illegal access" crash which terminates your app).
The open(2) function itself can't do much. Instead, it checks which file system can handle the request and invokes the open function of that file system. File systems are implemented as kernel modules.
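Putting that together, a minimal user-space program talking to such a module might look like this (the device path /dev/mydevice is hypothetical; a real module, such as LDD3's scull example, registers its own name):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[64];

        /* open(2) traps into the kernel, which routes the request to
           the open() handler of the driver that registered this device. */
        int fd = open("/dev/mydevice", O_RDWR);
        if (fd < 0) { perror("open"); return 1; }

        write(fd, "hello", 5);                 /* lands in the driver's write() */
        ssize_t n = read(fd, buf, sizeof buf); /* driver's read() fills buf */
        if (n > 0) fwrite(buf, 1, (size_t)n, stdout);

        close(fd);
        return 0;
    }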
My question is:
What is the flow of execution of an executable file in Windows? (i.e., what happens when we start an application?)
How does the OS interact with the application to run and manage it?
Does it have a central control place that checks execution of each file or process?
Is the process registered in the execution directory? (if any)
What actually happens that makes the file perform its desired task within WINDOWS environment?
All help is appreciated...
There's plenty going on beyond the stages already stated (loading the PE, enumerating the dlls it depends on, calling their entry points and calling the exe entry point).
Probably the earliest action is the OS and the processor collaborating to create a new address-space infrastructure (I think essentially a dedicated TLB). The OS initializes a Process Environment Block with process-wide data (e.g., process ID and environment variables), and a Thread Environment Block with thread-specific data (thread ID, SEH root handlers, etc.). Once an appropriate address is selected for every dll and it is loaded there, the Windows loader fixes up the addresses of exported functions (very briefly: at compile time a dll cannot know the address at which it will be loaded by every consuming exe, so the actual call addresses of its functions are determined only at load time). There are initializations of memory pages shared between processes - for Windows messages, for example - and, I think, some initialization of disk-paging structures. There's PLENTY more going on. The main Windows component involved is indeed the Windows loader, but the kernel and executive are involved too. Finally, the exe entry point is called - by default this is BaseProcessStart.
Typically a lot of preparation still happens after that, above the OS level, depending on the frameworks used (the canonical ones being the CRT for native code and the CLR for managed code): the framework must initialize its own internal structures to be able to deliver services to the application - memory management, exception handling, I/O, you name it.
A great place to go for such in-depth discussions is Windows Internals. You can also dig a bit deeper in forums like SO, but you have to be able to break it down into more focused bits. As phrased, this is really too much for an SO post.
This is high-level and misses many details:
The OS reads the PE Header to find the dlls that the exe depends on and the exe's entry point
The DllMain function in every linked dll is called with the DLL_PROCESS_ATTACH reason (see the sketch after this list)
The entry point of the exe is called with the appropriate arguments
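A minimal sketch of the DllMain the loader invokes in step 2:

    #include <windows.h>

    BOOL WINAPI DllMain(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpvReserved)
    {
        UNREFERENCED_PARAMETER(hinstDLL);
        UNREFERENCED_PARAMETER(lpvReserved);

        switch (fdwReason) {
        case DLL_PROCESS_ATTACH:
            /* called once when the loader maps the DLL into the process */
            break;
        case DLL_PROCESS_DETACH:
            break;
        }
        return TRUE; /* FALSE from DLL_PROCESS_ATTACH aborts the load */
    }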
There is no "execution directory" or other central control other than the kernel itself. However, there are APIs that allow you to enumerate and query the currently-running processes, such as EnumProcesses.
Your question isn't very clear, but I'll try to explain.
When you open an application, it is loaded into RAM from your disk.
The operating system jumps to the entry point of the application.
The OS provides all the calls required for showing a window, connecting with other components, and receiving user input. It also manages processor time, dividing it among applications.