How do different EXEs run in the right runtime? - runtime

I've always been curious how one extension, EXE, can be so versatile. If you assemble an assembly program, you get an EXE containing machine code for your processor; if you compile a C# or other .NET program, you also get an EXE, except that it runs in the proper runtime environment. I imagine this differs from OS to OS, but when an EXE is executed, how does the system determine how to execute it?
On a related note, if I were writing my own programming language, how would I tie my runtime environment into this mechanism?

When you compile a .NET program to an EXE, the result is more than just a blob of bytecode (which is what a Java class file is). A native executable is actually created that loads the .NET runtime and hands the .NET bytecode off to it, or displays a friendly-ish error message indicating that the framework is not available.
The format is even more flexible than that: every Windows EXE also includes a small DOS program at the beginning, which displays an error ("This program cannot be run in DOS mode") when executed under DOS.
You can read more details on the PE format on Wikipedia: http://en.wikipedia.org/wiki/Portable_Executable
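To make the hand-off a bit more concrete, here is a rough Python sketch of how a tool might tell a managed EXE from a native one. It follows the pointer at offset 0x3C to the PE signature and checks whether data directory 14 (the CLR runtime header) is populated. The names classify_exe and make_fake_pe are made up for this illustration, and the fake image contains just enough header bytes to exercise the check; it is not a loadable program.

```python
import struct

CLR_DIRECTORY_INDEX = 14  # data directory slot for the CLR runtime header


def classify_exe(data: bytes) -> str:
    """Classify a byte buffer as DOS, native PE, or managed (.NET) PE."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return "not an MZ executable"
    # The DOS header stores the file offset of the PE signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        return "DOS executable"
    # Optional header follows the 4-byte signature and 20-byte COFF header.
    opt = pe_offset + 24
    if len(data) < opt + 2:
        return "truncated PE image"
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:      # PE32
        dirs = opt + 96
    elif magic == 0x20B:    # PE32+
        dirs = opt + 112
    else:
        return "unknown PE variant"
    if len(data) < dirs:
        return "truncated PE image"
    # NumberOfRvaAndSizes sits immediately before the directory table.
    (count,) = struct.unpack_from("<I", data, dirs - 4)
    entry = dirs + 8 * CLR_DIRECTORY_INDEX
    if count <= CLR_DIRECTORY_INDEX or len(data) < entry + 8:
        return "native PE"
    rva, size = struct.unpack_from("<II", data, entry)
    return "managed (.NET) PE" if rva and size else "native PE"


def make_fake_pe(managed: bool) -> bytes:
    """Build a minimal, non-runnable PE32 header for demonstration only."""
    pe_offset = 0x80
    image = bytearray(pe_offset + 24 + 96 + 16 * 8)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, pe_offset)
    image[pe_offset:pe_offset + 4] = b"PE\0\0"
    opt = pe_offset + 24
    struct.pack_into("<H", image, opt, 0x10B)   # PE32 magic
    struct.pack_into("<I", image, opt + 92, 16)  # 16 data directories
    if managed:
        # Arbitrary non-zero RVA and size for the CLR header entry.
        struct.pack_into("<II", image, opt + 96 + 8 * CLR_DIRECTORY_INDEX,
                         0x2008, 0x48)
    return bytes(image)


if __name__ == "__main__":
    print(classify_exe(make_fake_pe(managed=True)))
    print(classify_exe(make_fake_pe(managed=False)))
```

On a real file you would pass in the bytes read from disk instead of the fake image. A runtime for a new language could use the same idea in reverse: ship a small native EXE whose only job is to locate and start the runtime.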

Related

Is it possible to convert minidump to core files on Windows?

I'm trying to implement the Breakpad crash handler in our application, which runs cross-platform.
I've had great success with the Linux build. On Windows, however, I've only gotten as far as getting a crash dump, extracting symbols from the .exe using the dump_syms utility, and using minidump_stackwalk to check out the crash and line number.
Unless I'm missing something, it's difficult (if not impossible) to debug further using only minidump_stackwalk (checking out local variables, etc.). On Linux I've used a tool, minidump-2-core, to convert minidumps to core files that can be loaded in gdb. On Windows we use an MSYS2/MINGW64 environment to build the application. Is it possible to convert the minidumps to core files and load them in a similar fashion?
I understand that an alternative is using WinDbg, but I can't seem to extract symbols correctly from the .exe. Any tips on how I would do that?

Converting a DOS Application to a Win32 Console Application?

Is it possible to convert a DOS application to a Win32 console application? I have an old program I wrote a long time ago and have lost the source to it. Is it possible to convert the DOS binary to an actual Windows binary that runs in the Command Prompt?
This is not possible. The DOS program will attempt to use DOS system calls that do not exist under Windows. The program will need to be updated and rebuilt for Windows. You might have some success running the original program in a DOS emulator.
See other answers about running a DOS program under Windows.
To convert a DOS program to a Win32 console application, one would have to convert the 16-bit (8086) code within the DOS program to 32-bit (i386) code. This is very hard to do right, which is probably why no converter is readily available. (Alternatively, an emulator can run 16-bit code without conversion; see the other answers and comments.)
However, not all DOS programs contain 16-bit code, for example programs using DOS extenders built with the Watcom C/C++ compiler (or, equivalently, with OpenWatcom: owcc -bdos4g prog.c) contain only 32-bit code. Windows can run 32-bit code directly, but API calls (e.g. opening and reading a file, allocating memory, getting the current time, writing colorful text to the console) have to be converted from the DOS+DPMI API to the Win32 API. Such a conversion is technically possible and feasible (even on the final .exe file, without access to the source code), and it is much easier to do correctly than the conversion of 16-bit code to 32-bit code. However, I still don't know of a converter readily available.
Please also note that converting graphics and audio code to the Win32 API is very hard, but that's out of scope for this question.

Why is "This program cannot be run in DOS mode" text present in .dll files?

Recently I opened a .dll file produced by Visual Studio 9 while compiling a native C++ DLL project and was surprised to see the "This program cannot be run in DOS mode" text near the beginning.
Why have this text in .dll files?
A DLL is very much like an executable with a different extension. The text you saw is part of the 'standard' executable header on Windows. It is (was) used to gracefully abort an attempt to run a Windows executable from DOS.
The Portable Executable format specification states the following:
The MS-DOS stub is a valid application that runs under MS-DOS. It is
placed at the front of the EXE image. The linker places a default stub
here, which prints out the message “This program cannot be run in DOS
mode.” when the image is run in MS-DOS. The user can specify a
different stub by using the /STUB linker option.
At location 0x3c, the stub has the file offset to the PE signature.
This information enables Windows to properly execute the image file,
even though it has an MS-DOS stub. This file offset is placed at
location 0x3c during linking.
Win32 programs run from DOS mode (i.e., single user, no graphics) print that text. A DLL has the same header layout and stub as an EXE, so it would presumably print that message too if you tried to run it without Windows.
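As a rough illustration of the layout the specification describes, the Python sketch below reads the offset stored at 0x3C and pulls the printable message out of the stub that sits between the DOS header and the PE signature. The function names are invented for this example, the image is a synthetic buffer rather than a real DLL, and searching for a run of printable bytes is a heuristic assumption, not how Windows itself treats the stub.

```python
import re
import struct


def read_dos_stub(data: bytes):
    """Return (pe_offset, stub_message, has_pe_signature) for an MZ image."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ image")
    # Offset 0x3C of the DOS header holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    # Everything between the 64-byte DOS header and the PE signature is stub.
    stub = data[0x40:pe_offset]
    # Heuristic: take the first run of 8+ printable ASCII bytes as the message.
    match = re.search(rb"[\x20-\x7e]{8,}", stub)
    message = match.group().decode("ascii") if match else None
    has_pe = data[pe_offset:pe_offset + 4] == b"PE\0\0"
    return pe_offset, message, has_pe


def make_stubbed_image(message: bytes) -> bytes:
    """Build a synthetic header: DOS header, message, then a PE signature."""
    pe_offset = 0x80
    if 0x4E + len(message) > pe_offset:
        raise ValueError("message too long for this toy layout")
    image = bytearray(pe_offset + 4)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, pe_offset)
    image[0x4E:0x4E + len(message)] = message
    image[pe_offset:pe_offset + 4] = b"PE\0\0"
    return bytes(image)


if __name__ == "__main__":
    fake = make_stubbed_image(b"This program cannot be run in DOS mode.")
    print(read_dos_stub(fake))
```

Running the same function over the first few hundred bytes of an .exe and a .dll should give the same kind of result, since both start with this header.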

what's in a .exe file?

So a .exe file is a file that can be executed by Windows, but what exactly does it contain? Assembly language that's processor specific? Or some sort of intermediate statement that's recognized by Windows, which turns it into assembly for a specific processor? What exactly does Windows do with the file when it "executes" it?
MSDN has an article "An In-Depth Look into the Win32 Portable Executable File Format" that describes the structure of an executable file.
Basically, a .exe contains several blobs of data and instructions on how they should be loaded into memory. Some of these sections happen to contain machine code that can be executed (other sections contain program data, resources, relocation information, import information, etc.)
I suggest you get a copy of Windows Internals for a full description of what happens when you run an exe.
For a native executable, the machine code is platform specific. The .exe's header indicates what platform the .exe is for.
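For instance, the target platform is recorded in the two-byte Machine field that follows the PE signature. The Python sketch below decodes it for a few common values; target_machine and fake_header are names invented for this example, and the table is deliberately incomplete.

```python
import struct

# A few well-known Machine values from the COFF file header (not exhaustive).
MACHINE_NAMES = {
    0x014C: "x86 (i386)",
    0x8664: "x64 (AMD64)",
    0xAA64: "ARM64",
}


def target_machine(data: bytes) -> str:
    """Return a readable name for the platform a PE image was built for."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0" or len(data) < pe_offset + 6:
        raise ValueError("no PE header found")
    # Machine is the first field of the COFF header, right after "PE\0\0".
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, f"unknown (0x{machine:04X})")


def fake_header(machine: int) -> bytes:
    """Synthetic header with only the fields this example reads."""
    pe_offset = 0x80
    image = bytearray(pe_offset + 6)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, pe_offset)
    image[pe_offset:pe_offset + 4] = b"PE\0\0"
    struct.pack_into("<H", image, pe_offset + 4, machine)
    return bytes(image)


if __name__ == "__main__":
    print(target_machine(fake_header(0x8664)))
    print(target_machine(fake_header(0x014C)))
```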
When running a native .exe the following happens (grossly simplified):
1. A process object is created.
2. The exe file is read into that process's memory. Different sections of the .exe (code, data, etc.) are mapped in separately and given different permissions (code is execute, data is read/write, constants are read-only).
3. Relocations occur in the .exe (addresses get patched if the .exe was not loaded at its preferred address).
4. The import table is walked and dependent DLLs are loaded.
5. DLLs are mapped in a similar way to .exes, with relocations occurring and their dependent DLLs being loaded. Imported functions from DLLs are resolved.
6. The process starts execution at an initial stub in NTDLL.
7. The initial loader stub runs the entry points for each DLL, and then jumps to the entry point of the .exe.
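The relocation step is probably the least intuitive of these, so here is a toy Python model of it. It assumes the list of places to patch is handed over directly as file offsets and that every fixup is a 32-bit absolute address; the real loader instead walks the blocks of the image's relocation section and handles several fixup types. The function name apply_relocations is made up for this sketch.

```python
import struct


def apply_relocations(image: bytearray, fixup_offsets, preferred_base: int,
                      actual_base: int) -> bytearray:
    """Patch 32-bit absolute addresses after loading at a different base."""
    # Every absolute address in the image was computed by the linker as
    # preferred_base + something, so each one is off by the same delta.
    delta = actual_base - preferred_base
    for offset in fixup_offsets:
        (value,) = struct.unpack_from("<I", image, offset)
        struct.pack_into("<I", image, offset, (value + delta) & 0xFFFFFFFF)
    return image


if __name__ == "__main__":
    # Toy image: bytes 4..7 hold an absolute pointer into the image itself.
    image = bytearray(16)
    struct.pack_into("<I", image, 4, 0x00401000)  # linked for base 0x00400000

    # Pretend the preferred address was taken and we landed at 0x00A50000.
    apply_relocations(image, [4], preferred_base=0x00400000,
                      actual_base=0x00A50000)
    (patched,) = struct.unpack_from("<I", image, 4)
    print(hex(patched))
```

If the image does load at its preferred base, the delta is zero and nothing needs to be patched, which is why that case is cheaper.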
Managed executables contain MSIL (Microsoft Intermediate Language) and may be compiled so they can target any CPU that the CLR supports. I am not that familiar with the inner workings of the CLR loader (what native code initially runs to bootstrap the CLR and start executing the MSIL); perhaps someone else can elaborate on that.
I can tell you what the first two bytes of a .exe file contain: the characters 'MZ'.
They are the initials of Mark Zbikowski, the guy who designed the DOS EXE file format.
http://en.wikipedia.org/wiki/Mark_Zbikowski
1's and 0's!
The Wikipedia article on the Portable Executable format, which is used for Windows applications, will give you all the info you need.
An EXE file is really a type of file known as a Portable Executable. It contains binary data that can be read by the processor and executed (essentially x86 instructions). There's also a lot of header data and other miscellaneous content. The actual executable code is located in a section called .text, and is stored as machine instructions (processor specific). This code (as well as other parts of the .EXE) is put into memory, and the CPU is sent to it, where it starts executing. (Note that there's much more actually happening; this is a simplified explanation.)

Does a cross-platform compiler that can compile a native executable that runs on both Linux and Windows exist? Could it exist?

I remember that a few years ago (2002) there was a multipartite virus that could run natively on Linux and Windows. I don't know if a compiler could specially craft an executable so that it could be read as both ELF and PE, so that each OS would start executing at a different entry point. Or whether a program could merge two programs, one compiled using MinGW and one compiled natively on Linux, into one program.
I don't know if such a program exists, or whether it could exist. I know this could be implemented in Java or some scripting language, but that's not a native program.
Imagine the possibilities: I could deploy a program with Linux and Windows (and perhaps OS X) libraries, and one main executable that could be run on any OS. The cross-platform support would compensate for the bigger size.
Windows programs have a DOS stub in the beginning, and I just ran an ELF executable through debug.com, which said that the first instruction of this exe was JG 0x147. Just maybe something could be done with this...
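That JG 0x147 is simply the ELF magic number read as 16-bit x86 code: the file starts with the bytes 7F 45 4C 46 ("\x7fELF"), and 7F is the opcode for a short JG whose 8-bit displacement is the next byte, 0x45 ('E'). The small Python sketch below reproduces the arithmetic, assuming debug placed the file at offset 0x100 as it does for .COM-style images. The function name is made up for this example.

```python
# Short conditional jumps occupy opcodes 0x70..0x7F, in this order.
SHORT_JCC = ["JO", "JNO", "JB", "JNB", "JZ", "JNZ", "JBE", "JA",
             "JS", "JNS", "JP", "JNP", "JL", "JGE", "JLE", "JG"]


def decode_short_jcc(code: bytes, load_address: int):
    """Decode a 2-byte short conditional jump; return (mnemonic, target)."""
    opcode, rel8 = code[0], code[1]
    if not 0x70 <= opcode <= 0x7F:
        raise ValueError("not a short conditional jump")
    # The displacement is a signed byte relative to the NEXT instruction.
    displacement = rel8 - 0x100 if rel8 >= 0x80 else rel8
    target = (load_address + 2 + displacement) & 0xFFFF
    return SHORT_JCC[opcode - 0x70], target


if __name__ == "__main__":
    mnemonic, target = decode_short_jcc(b"\x7fELF", load_address=0x100)
    print(mnemonic, hex(target))  # JG 0x147
```

So the jump lands 0x47 bytes into the file, which is inside the ELF header. Anything placed there would have to be valid as both header data and 16-bit code.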
No.
Windows and Linux use vastly different binary file formats. See Portable Executable (Windows) and Executable and Linkable Format (Linux).
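The difference shows up in the very first bytes of the file, which is what each OS's loader checks before doing anything else. Here is a minimal Python sketch of that kind of check, with sniff_format being a name invented for this example; it only looks at magic numbers and does not validate the rest of either header.

```python
import struct


def sniff_format(data: bytes) -> str:
    """Guess the executable format of a buffer from its magic numbers."""
    # ELF files start with 0x7F 'E' 'L' 'F'; byte 4 gives the word size.
    if data[:4] == b"\x7fELF":
        bits = {1: "32-bit", 2: "64-bit"}.get(data[4] if len(data) > 4 else 0,
                                              "unknown class")
        return f"ELF ({bits})"
    # PE files start with a DOS 'MZ' header that points at "PE\0\0".
    if data[:2] == b"MZ":
        if len(data) >= 0x40:
            (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
            if data[pe_offset:pe_offset + 4] == b"PE\0\0":
                return "PE"
        return "MZ (DOS)"
    return "unknown"


if __name__ == "__main__":
    elf_start = b"\x7fELF\x02\x01\x01" + bytes(9)   # 64-bit ELF e_ident
    print(sniff_format(elf_start))

    pe_start = bytearray(0x84)
    pe_start[0:2] = b"MZ"
    struct.pack_into("<I", pe_start, 0x3C, 0x80)
    pe_start[0x80:0x84] = b"PE\0\0"
    print(sniff_format(bytes(pe_start)))
```

Since one format requires the file to begin with "\x7fELF" and the other with "MZ", a single file cannot start with both signatures.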
Something like WINE will run Windows executables on Linux but that's not the same thing.
This is actually a really terrible idea for multiple reasons.
Cross-compiling across operating system boundaries is extremely difficult to do properly.
If you go for the second route (building separate PE binaries on Windows and ELF on Linux, and then somehow merging them) you have to maintain two machines, each running a different OS and the full build stack, and you'd have to make sure that you tested both versions separately before gluing them together.
Dynamic linking is already a pain to properly manage, on Windows and on Linux; static linking can generate binaries that are much more inconvenient to deal with than whatever imaginary benefits you get from providing one single file type to your end-user.
If you want to run the same binary executable file on multiple OSes, your options are Java, Mono, and potentially Native Client, the browser plug-in Google is developing to work around the "webapps are too slow" problem.

Resources