How are Windows API calls made at the assembly level?

I've written some high-level interpreters and a simple bytecode compiler and interpreter, and I want to start making a powerful intermediate language for my small operating system.
It has its own API just like Windows does, and the only thing that prevents me from starting this project is not knowing how these specific API calls (for example the Win32 forms API) are made at the assembly level.
Is there a way to see the assembly output of unoptimized C code, for example, and look at exactly how the calls are made? Or are there any sources on the web?
Thanks in advance

Having C documentation for the API, and knowing the calling convention / ABI, should be enough to create asm that uses it. There's no "magic" needed (no inline syscall instructions or anything like that).
Much of the Win32 API is implemented in user-space DLLs, so API calls are no different from other library function calls. (i.e. an indirect CALL with a function pointer, if I recall correctly).
Often the library function implementation will involve a syscall to interact with the kernel (or for 32-bit code, maybe an int or sysenter, I'm not sure), but this interface is not documented and is not stable across different Windows versions.

Related

Does the machine code behind Win32 API functions stay in OS kernel memory space, or is it compiled into the app?

If this question deals with too basic a matter, please forgive me.
As a somewhat-close-to-beginner-level programmer, I really wonder about this: is the underlying code of every Win32 API function compiled into an app when the app is built, or does the machine code for the Win32 API stay in memory as part of the OS from the moment the PC boots, with apps merely using it?
All the APIs of an OS are used by many apps by means of function calls. So I thought that, rather than every individual app including the API machine code on its own, apps just contain the header or signature needed to call the APIs, and the API machine code addresses are mapped when the app is launched.
I am sorry that I failed to make this question succinct due to my poor English. I really would like to get your insights. Thank you.
The implementation for (most) API calls is provided by the system by way of compiled modules (Portable Executable images). Application code only contains enough information so that the system can identify and load the required modules, and resolve the respective imports.
As an example consider the following code that shows a message box, waits for it to close, and then exits the program:
#include <Windows.h>
int main()
{
    ::MessageBoxW(nullptr, L"Foo", L"Bar", MB_OK);
}
Given the function signature (declared in WinUser.h, which gets pulled in from Windows.h) the compiler can almost generate a call instruction. It knows the number of arguments, their expected types, and the order and location the callee expects them in. What's missing is the actual target address inside user32.dll; that's only known after the process has been fully initialized and has had the user32.dll module mapped into its address space.
Clearly, the compiler cannot postpone code generation until after load time. It needs to generate a call instruction now. Since we know that "all problems in computer science can be solved by another level of indirection" that's what the compiler does, too: Instead of emitting a direct call instruction it generates an indirect call. The difference is that, while a direct call immediately needs to provide the target address, an indirect call can specify the address at which the target address is stored.
In x86 assembly, instead of having to say
call _MessageBoxW@16 ; uh-oh, not yet known
the compiler can conveniently delegate the call to the Import Address Table (IAT):
call dword ptr [__imp__MessageBoxW@16]
Disaster averted: we've bought ourselves just enough time to fix things up before the code actually executes.
Once a process object is created the system hands over control to its primary thread to finish initialization. Part of that initialization is loading dependencies (such as user32.dll here). Once that has completed, the system finally knows the load address (and ultimately the address of imported symbols, such as _MessageBoxW@16), and can overwrite the IAT entry at address __imp__MessageBoxW@16 with the imported function address.
And that is approximately how the system provides implementations for system services without requiring client applications to know where (physically) they will find them.
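The indirection described in this answer can be imitated in portable C++ without any Windows headers. What follows is only a toy model under stated assumptions: fake_message_box stands in for a function living in a DLL, imp_message_box plays the role of a single IAT slot, and fake_loader_resolve_imports mimics the loader's fix-up step. All of these names are hypothetical; the real loader and the real IAT layout are considerably more involved.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a function that lives in a DLL.
// It returns the text length so that a call has an observable result.
int fake_message_box(const std::string& text) {
    return static_cast<int>(text.size());
}

// Signature shared by the import slot and the imported function.
using MessageBoxFn = int (*)(const std::string&);

// One "IAT entry": a pointer-sized slot that stays empty until load time.
MessageBoxFn imp_message_box = nullptr;

// Conceptually what the loader does: resolve the import, then patch the slot.
void fake_loader_resolve_imports() {
    imp_message_box = &fake_message_box;
}

// What the compiled application code does: an indirect call through the slot.
int app_show_message(const std::string& text) {
    assert(imp_message_box != nullptr && "imports must be resolved first");
    return imp_message_box(text);
}
```

Calling fake_loader_resolve_imports() once and then app_show_message("Foo") goes through the patched slot, which mirrors how the application code never needs to know the target address at compile time.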
I'm saying "approximately" because things are somewhat more involved in reality. If that is something you'll want to learn about, I'll leave it up to Raymond Chen. He has published a series of blog entries covering this topic in far more detail:
How were DLL functions exported in 16-bit Windows?
How were DLL functions imported in 16-bit Windows?
How are DLL functions exported in 32-bit Windows?
Exported functions that are really forwarders
Rethinking the way DLL exports are resolved for 32-bit Windows
Calling an imported function, the naive way
How a less naive compiler calls an imported function
Issues related to forcing a stub to be created for an imported function
What happens when you get dllimport wrong?
Names in the import library are decorated for a reason
Why can't I GetProcAddress a function I dllexport'ed?

Is it possible to call the Windows API from Forth?

In C/C++, Windows executables are linked against static libraries that import DLL files containing Windows API procedures.
But how do we access those procedures from Forth code (e.g. GForth)? Is it possible at all?
I'm aware that there's Win32Forth, which is capable of doing Win32 stuff, but I'm interested in how (and if) this could be done in Forth implementations that lack this functionality out of the box (yet do run on the target OS and are potentially able to interact with it on a certain level).
What currently comes to mind is loading the DLL files in question and somehow locating the address of a procedure to execute, but then, execute how? (All I know is that the Windows API uses the stdcall convention.)
And how do we locate a procedure without a C header? (I'm very new to Forth and just a bit less new to C++. Please bear with me if my musings are nonsense.)
In the general case, to implement a foreign function interface (FFI) for dynamically loaded libraries in some Forth system as an extension (i.e., without changing its source code and recompiling), we need the dlopen and dlsym functions (LoadLibrary and GetProcAddress are the Windows counterparts), a Forth assembler, and intimate knowledge of the Forth system's organization and ABI.
Sometimes it could be done even without assembler. For example, though SP-Forth has FFI, foreign calls were also implemented in pure Forth as a result of native code generation and union of the return stack with the native hardware stack.
Regarding Gforth, it seems that version 0.7.9 (see releases) doesn't have an FFI for the stdcall calling convention out of the box (it supports cdecl only), although it has dlopen and dlsym, and an assembler. So it should be feasible to implement an FFI for stdcall.
Yes, you could do this in Gforth according to its documentation. The biggest problem will be dealing with callbacks, which the Windows API relies on rather heavily. There is an unsupported package to deal with this, see 5.25.6 Callbacks. I have not attempted this myself in Gforth, but the documentation looks adequate.
You might also want to check MPE's VFXForth. From their website:
Windows API Access
VFX Forth can access all the standard Windows API calls, as well as functions in any other DLLs. The function interface allows API calls to be defined by cut and paste from other language reference manuals, for example:
EXTERN: int PASCAL CreateDialogIndirectParam( HINSTANCE, void *,HWND, WNDPROC, LPARAM );
EXTERN: int PASCAL SetWindowText( HANDLE, LPSTR );
EXTERN: HANDLE PASCAL GetDlgItem( HANDLE, int );
This is down the page a bit at VFX Forth for Windows.
As I do my Forth on Mac and Linux, I can't work through Gforth on Windows to provide more detail, sorry.
Gforth 0.7.9 provides Windows API calls generated by Swig from the Windows header files. The C interface uses a wrapper library, which is compiled by the C compiler, to pass parameters from the Forth stack to the system functions; as the C compiler understands stdcall, and the header files declare Windows API as stdcall, this "just works".
As all pre-generated C bindings live in the directory "unix" (for historical reasons), include unix/win32.fs gives you the win32 part of the Windows API.
Callbacks in the event loop are still a problem, as Gforth is a Cygwin program, and Cygwin has its special event loop task... but I hope that problem can be fixed.
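The wrapper idea in this answer (compiled code that moves parameters from the Forth stack into an ordinary function call) can be sketched in portable C++. This is a simplified model, not Gforth's actual wrapper code: DataStack, call2 and fake_api_sub are hypothetical names, and a real binding would call a Windows API function declared as stdcall instead of the stand-in used here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// A cell on a Forth-style data stack (hypothetical model, not Gforth's own type).
using Cell = std::intptr_t;

struct DataStack {
    std::vector<Cell> cells;
    void push(Cell value) { cells.push_back(value); }
    Cell pop() {
        assert(!cells.empty() && "stack underflow");
        Cell top = cells.back();
        cells.pop_back();
        return top;
    }
};

// Stand-in for a two-argument system function. Subtraction is used because
// it makes the argument order visible in the result.
Cell fake_api_sub(Cell a, Cell b) { return a - b; }

// The wrapper: pop the arguments (the last argument is on top of the stack),
// make a normal C++ call, and push the result back for the Forth side.
void call2(DataStack& stack, Cell (*fn)(Cell, Cell)) {
    Cell second = stack.pop();
    Cell first = stack.pop();
    stack.push(fn(first, second));
}
```

Pushing 40 and then 2 and invoking call2 with fake_api_sub leaves 38 on the stack. Because the call itself is made by compiled code, the compiler takes care of the calling convention, which is the point the answer makes about stdcall.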

Do the APIs in kernel32.dll (or other DLLs) have subroutines?

I was wondering whether the APIs in kernel32.dll (or other DLLs) have subroutines.
Take the CopyFile function, for example: it should take different actions to copy a file from C: to D:, to copy from a network share path (\\HOSTNAME\SHAREDFOLDER\FILENAME) to somewhere else, or to trigger ODX, the new feature in Windows Server 2012 (Hyper-V).
So in the definition of the CopyFile function there should be some if/else branches that call sub-functions, right?
If the subroutines exist, is it possible to call these sub-functions directly, and is it possible to hook them?
Thanks.
As far as I know, the current implementation of kernel32.dll calls functions in ntdll.dll. The functions in ntdll.dll then do a syscall into the kernel somehow.
To answer your question, yes, it calls subroutines, and they probably can be hooked, but most of the logic about how specifically to read from and write to filesystems in different ways is probably buried in the kernel.
Keep in mind that you're probably not supposed to be digging into the internals of these DLLs; it's best to use the public interface. Relying on implementation details makes your code more fragile and likely to break with operating system upgrades.
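As a rough illustration of the layering and hooking discussed in this answer, here is a toy model in portable C++. It does not reflect how CopyFile is actually implemented: copy_file_like, low_level_copy and hooked_copy are hypothetical names, and the only branch shown is a guess at the kind of decision such a function might make (network path versus local path).

```cpp
#include <cassert>
#include <string>

// Hypothetical lower layer (the role ntdll.dll plays in the answer above).
// It returns a tag so that the chosen route is observable.
std::string low_level_copy(const std::string& route) {
    return "copied via " + route;
}

// The upper layer reaches the lower layer through a pointer, which is what
// makes hooking possible in this toy model.
using LowLevelCopyFn = std::string (*)(const std::string&);
LowLevelCopyFn low_level_copy_ptr = &low_level_copy;

// Hypothetical upper layer (the role kernel32.dll plays): pick a route,
// then delegate to whatever the pointer currently refers to.
std::string copy_file_like(const std::string& source) {
    assert(low_level_copy_ptr != nullptr);
    bool is_unc = source.rfind("\\\\", 0) == 0;  // starts with two backslashes
    return low_level_copy_ptr(is_unc ? "network" : "local");
}

// A hook: same signature, adds behavior, then forwards to the original.
std::string hooked_copy(const std::string& route) {
    return "[hook] " + low_level_copy(route);
}
```

Assigning &hooked_copy to low_level_copy_ptr redirects every later copy_file_like call through the hook. Real hooking of system DLLs works on import tables or on the function's machine code rather than on a convenient global pointer, and it carries the fragility the answer warns about.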

Can I mix arm-eabi with arm-elf?

I have a product whose bootloader and application are compiled using a compiler (gnuarm GCC 4.1.1) that generates "arm-elf".
The bootloader and application are segregated in different FLASH memory areas in the linker script.
The application has a feature that enables it to call the bootloader (as a simple C function with 2 parameters).
I need to be able to upgrade existing products around the world, and I can safely do this using always the same compiler.
Now I'd like to be able to compile this product application using a new GCC version that outputs arm-eabi.
Everything will be fine for new products, where both application and bootloader are compiled using the same toolchain, but what happens with existing products?
If I flash a new application, compiled with GCC 4.6.x and arm-none-eabi, will my application still be able to call the bootloader function from the old arm-elf bootloader?
Furthermore, not directly related to the above question, can I mix object files compiled with arm-elf into a binary compiled with arm-eabi?
EDIT:
I think it is good to make clear that I am building for a bare-metal ARM7, if it makes any difference...
No. An ABI is the magic that makes binaries compatible. The Application Binary Interface determines various conventions on how to communicate with other libraries/applications. For example, an ABI will define calling convention, which makes implicit assumptions about things like which registers are used for passing arguments to C functions, and how to deal with excess arguments.
I don't know the exact differences between EABI and ABI, but you can find some of them by reading up on EABI. Debian's page mentions the syscall convention is different, along with some alignment changes.
Given the above, of course, you cannot mix arm-elf and arm-eabi objects.
The above answer is given on the assumption that you talk to the bootloader code in your main application. Given that the interface may be very simple (just a function call with two parameters), it's possible that it might work. It'd be an interesting experiment to try. However, it is not guaranteed to work.
Please keep in mind you do not have to use EABI. You can generate an arm-elf toolchain with GCC 4.6 just as well as with older versions. Since you're using a binary toolchain on Windows, you may have more of a challenge. I'd suggest investigating crosstool-ng, which works quite well on Linux, and may work okay on Cygwin to build the appropriate toolchain.
There is always the option of making the call to bootloader in inline assembly, in which case you can adhere to any calling standard you need :).
However, besides the portability issue it introduces, this approach will also make two assumptions about your bootloader and application:
you are able to detect in your app that a particular device has a bootloader built with your non-EABI toolchain, as you can only call the older type bootloader using the assembly code.
the two parameters you mentioned are used as primitive data by your bootloader. Should the bootloader use them, for example, as pointers to structs then you could be facing issues with incorrect alignment, padding and so forth.
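The struct concern in the second point can be checked empirically. The sketch below assumes a hypothetical BootParams struct (not taken from the question) and reports its layout as seen by whichever toolchain compiles it. As far as I know, one difference between the two ABIs is the alignment of 64-bit members (8 bytes under EABI, 4 under the older ABI), but treat that as an assumption to verify against your toolchain documentation. The code is kept to older C++ so that an old compiler has a chance of building it.

```cpp
#include <cassert>
#include <stddef.h>
#include <stdint.h>

// Hypothetical parameter block that an application might pass to the
// bootloader by pointer.
struct BootParams {
    uint8_t  flags;
    uint64_t image_size;  // 64-bit member: its alignment depends on the ABI
    uint16_t crc;
};

// Layout of BootParams as seen by the toolchain that compiled this file.
struct Layout {
    size_t size;
    size_t image_size_offset;
    size_t crc_offset;
};

Layout boot_params_layout() {
    Layout layout = { sizeof(BootParams),
                      offsetof(BootParams, image_size),
                      offsetof(BootParams, crc) };
    // Sanity check: the last member must fit inside the struct.
    assert(layout.crc_offset + sizeof(uint16_t) <= layout.size);
    return layout;
}
```

Building this with each toolchain and comparing the three numbers shows whether the two sides agree on this particular struct. Matching values say nothing about other structs or about argument passing, so they are a hint rather than a guarantee.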
I think that this will be OK. I did a migration something like this myself, and from what I remember I only ran into a problem to do with handling division.
This is the best info I can find about the differences; it suggests that if you don't have struct alignment issues, you may be OK.

Finding undocumented APIs in Windows

I was curious as to how one goes about finding undocumented APIs in Windows.
I know the risks involved in using them but this question is focused towards finding them and not whether to use them or not.
Use a tool to dump the export table from a shared library (for example, a .dll such as kernel32.dll). You'll see the named entry points and/or the ordinal entry points. Generally for Windows the named entry points are unmangled (extern "C"). You will most likely need to do some peeking at the assembly code and derive the parameters (types, number, order, calling convention, etc.) from the stack frame (if there is one) and register usage. If there is no stack frame it is a bit more difficult, but still doable. See the following links for references:
http://www.sf.org.cn/symbian/Tools/symbian_18245.html
http://msdn.microsoft.com/en-us/library/31d242h4.aspx
Check out tools such as dumpbin for investigating export sections.
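To give a feel for what such a dump tool does first, here is a minimal sketch that locates the export directory entry inside a PE image held in memory. It only follows the header fields (the "MZ" signature, the e_lfanew offset at 0x3C, the "PE\0\0" signature, and data directory 0 of the optional header) and stops there; walking the actual name and ordinal tables additionally requires mapping RVAs to file offsets through the section table, which is omitted. The names read_le, ExportDirInfo and find_export_directory are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read an n-byte little-endian integer (n <= 4); returns 0 if out of range.
static std::uint32_t read_le(const std::vector<std::uint8_t>& bytes,
                             std::size_t offset, std::size_t n) {
    assert(n <= 4);
    if (offset + n > bytes.size()) return 0;
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < n; ++i)
        value |= static_cast<std::uint32_t>(bytes[offset + i]) << (8 * i);
    return value;
}

struct ExportDirInfo {
    bool found;          // true if a non-empty export directory entry exists
    std::uint32_t rva;   // relative virtual address of the export directory
    std::uint32_t size;  // size of the export directory in bytes
};

ExportDirInfo find_export_directory(const std::vector<std::uint8_t>& image) {
    ExportDirInfo none = {false, 0, 0};
    if (read_le(image, 0, 2) != 0x5A4D) return none;        // "MZ"
    std::uint32_t pe = read_le(image, 0x3C, 4);              // e_lfanew
    if (read_le(image, pe, 4) != 0x00004550) return none;   // "PE\0\0"
    std::size_t opt = pe + 4 + 20;  // skip the signature and the COFF header
    std::uint32_t magic = read_le(image, opt, 2);
    std::size_t dirs;               // start of the data directory array
    if (magic == 0x10B) dirs = opt + 96;        // PE32
    else if (magic == 0x20B) dirs = opt + 112;  // PE32+
    else return none;
    if (read_le(image, dirs - 4, 4) < 1) return none;  // NumberOfRvaAndSizes
    ExportDirInfo info = {false, read_le(image, dirs, 4),
                          read_le(image, dirs + 4, 4)};
    info.found = info.rva != 0 && info.size != 0;
    return info;
}
```

Feeding it the bytes of a DLL read from disk yields the RVA and size that an export-dumping tool would then follow to enumerate names and ordinals.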
There are also sites and books out there that try to keep an updated list of undocumented windows APIs:
The Undocumented Functions
A Primer of the Windows Architecture
How To Find Undocumented Constants Used by Windows API Functions
Undocumented Windows
Windows API
Edit:
These same principles work on a multitude of operating systems; however, you will need to replace the tool you're using to dump the export table. For example, on Linux you could use nm to dump an object file and list its exported symbols (among other things). You could also use gdb to set breakpoints and step through the assembly code of an entry point to determine what the arguments should be.
IDA Pro is your best bet here, but please please double please don't actually use them for anything ever.
They're internal because they change; they can (and do) even change as a result of a Hotfix, so you're not even guaranteed your undocumented API will work for the specific OS version and Service Pack level you wrote it for. If you ship a product like that, you're living on borrowed time.
Everybody here so far is missing a substantial and largely undocumented portion of the Windows OS: RPC. RPC operations (think rpcrt4.dll, lsass.exe, csrss.exe, etc.) occur very frequently across all subsystems, via LPC ports or other interfaces. Their functionality is buried in the mystic incantations of various types, sub-types, struct typedefs and so on, which are substantially more difficult to debug, either because of their asynchronous nature or because they are destined for processes which, if you were to debug them by single-stepping or the like, would lock up the entire system by blocking keyboard or other I/O from being passed ;)
ReactOS is probably the most expedient way to investigate undocumented APIs. They have a fairly mature kernel and other executive components built up. IDA is fairly time-intensive, and it's unlikely you will find anything the ReactOS people have not found already.
Here's a blurb from the linked page:
ReactOS® is a free, modern operating system based on the design of Windows® XP/2003. Written completely from scratch, it aims to follow the Windows® architecture designed by Microsoft from the hardware level right through to the application level. This is not a Linux based system, and shares none of the unix architecture.
The main goal of the ReactOS project is to provide an operating system which is binary compatible with Windows. This will allow your Windows applications and drivers to run as they would on your Windows system. Additionally, the look and feel of the Windows operating system is used, such that people accustomed to the familiar user interface of Windows® would find using ReactOS straightforward. The ultimate goal of ReactOS is to allow you to remove Windows® and install ReactOS without the end user noticing the change.
When I am investigating some rarely seen Windows construct, ReactOS is often the only credible reference.
Look at the system dlls and what functions they export. Every API function, whether documented or not, is exported in one of them (user, kernel, ...).
For user-mode APIs you can open Kernel32.dll, User32.dll, Gdi32.dll and especially ntdll.dll in Dependency Walker and find all the exported APIs. But you will not have the documentation, of course.
Just found a good article on Native APIs by Mark Russinovich
