How does Windows execute assembly programs?

I am starting to learn assembly language and want to know how Windows executes assembly programs. Does Windows use the same procedure for executing .exe files?
At this point I am having a hard time focusing on learning assembly while constantly wondering what happens in the background.
I am also looking for a book to get a better and deeper understanding of Windows internals and of how operating systems in general perform tasks such as the one described above. Any help (such as the terminology used to describe the procedure) or reference to external resources is appreciated!

After you assemble your program, it becomes a normal executable, and Windows executes it like it would any other native executable.

A native executable contains machine code, which can be executed by the CPU directly. The operating system essentially just loads it into memory, sets up a new process, and starts that process running at the start of the program.
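As a rough illustration of where "the start of the program" comes from: a Windows executable (PE format) records its entry point in its headers, and the loader begins execution there once the file is mapped into memory. The sketch below only reads those headers; it is not a loader. The function name `read_pe_entry_point` is made up for this example, and the offsets assume the standard PE layout (DOS header, PE signature, 20-byte COFF header, optional header).

```python
import struct
from typing import Tuple


def read_pe_entry_point(image: bytes) -> Tuple[int, int]:
    """Return (machine, entry_point_rva) read from the headers of a PE image."""
    # Every PE file starts with a small DOS header whose magic is "MZ".
    if image[:2] != b"MZ":
        raise ValueError("not an MZ/PE file")
    # Offset 0x3C of the DOS header holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The COFF header follows the signature; its first field is the CPU type.
    (machine,) = struct.unpack_from("<H", image, pe_offset + 4)
    # The optional header follows the 20-byte COFF header;
    # AddressOfEntryPoint sits 16 bytes into it.
    optional_header = pe_offset + 4 + 20
    (entry_rva,) = struct.unpack_from("<I", image, optional_header + 16)
    return machine, entry_rva
```

To try it, pass the bytes of any .exe, for example `read_pe_entry_point(open(path, "rb").read())`. A machine value of 0x8664 means x86-64 and 0x14c means 32-bit x86; the entry point is an address relative to where the image gets loaded.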

Related

Running ZeroBrane from Lua

Given that ZeroBrane is all written in Lua, can it be actually started from within a Lua environment?
The main motivation would be to fully integrate it inside an existing application (running within the same thread), being able to debug locally using all the exposed C/C++ functionality.
I realize I would have to match the architecture of the clibs used by ZeroBrane with the one used in the host application. So for example, if the host app is running LuaJIT 64-bit I would then require wx.dll compiled against the same LuaJIT binaries.
Will there be any other hurdles or limitations when trying to do this?
Given that ZeroBrane is all written in Lua, can it be actually started from within a Lua environment?
The answer to this question is definitely "yes", as this is what ZeroBrane actually does. For example, on Windows it launches itself by loading src/main.lua and executing it; this code can be seen in win32_starter.c.
Will there be any other hurdles or limitations when trying to do this?
I think the issue you're likely to run into is that it's difficult (though not impossible) to debug a single-threaded application from within itself. This is why the IDE (and its debugger) is normally launched as a separate process that interacts with the application being debugged over a socket. You may want to check debugger.lua, which may be closer to what you're looking for.

Does the compiler actually produce Machine Code?

I've been reading that in most cases (like gcc) the compiler reads the source code in a high level language and spits out the corresponding machine code. Now, machine code by definition is the code that a processor can understand directly. So, machine code should be only machine (processor) dependent and OS independent. But this is not the case. Even if 2 different operating systems are running on the same processor, I can not run the same compiled file (.exe for Windows or .out for Linux) on both the Operating Systems.
So, what am I missing? Is the output of gcc (and most compilers) not machine code? Or is machine code not the lowest level of code, and the OS translates it further into a set of instructions that the processor can execute?
You are confusing a few things. A retargetable compiler like gcc, like other generic compilers, compiles source files to object files; the linker then links those objects with other libraries as needed to make a so-called binary, which the operating system can read, parse, load the loadable blocks from, and start executing.
A sane compiler author will use assembly language as the output of the compiler; then the compiler, or the user in their makefile, calls the assembler, which creates the object file. This is how gcc works. Clang works in roughly the same way, although llc can now produce object files directly rather than only assembly that then gets assembled.
It makes far more sense to generate debuggable assembly language than to produce raw machine code. You really need a good reason, like JIT, to skip that step. I would avoid toolchains that go straight to machine code just because they can; they are harder to maintain and more likely to have bugs, or to take longer to get bugs fixed.
If the architecture is the same, there is no reason why a generic toolchain can't generate code for incompatible operating systems; the GNU tools, for example, can do this. Operating system differences are not, by definition, at the machine code level. Most are at the high-level-language level: the C libraries that you call to create GUI windows and so on have nothing to do with the machine code or the processor architecture, and for some operating systems the same operating-system-specific C code can be used on MIPS, ARM, PowerPC, or x86. Where the architecture becomes specific is in the mechanism by which actual system calls are invoked; a specific instruction is often used. Machine code is eventually used, yes, but there is no reason why this can't be coded in real or inline assembly.
This leads to libraries. Even fopen and printf, which are generic C calls, eventually have to make a system call. Much of the library support code can be written in a high-level language that is compatible across systems, but there will need to be a system- and architecture-specific bit of code for the last mile. You can see this in the glibc sources, for example, or in the hooks into newlib in other library solutions.
The same is true for other languages like C++ as it is for C. Interpreted languages have additional layers, but their virtual machines are just programs that sit on similar layers.
Low-level programming doesn't mean machine or assembly language; it just means that whatever programming language you are using accesses things at a lower level: below the application, below the operating system, etc.
Compilers produce assembly code, which is a human-readable version of machine code (e.g., instead of 1s and 0s you have actual commands). However, the correct assembly/machine code needed to make your program run correctly is different depending on the operating system. So the language the processors use is the same, but your program needs to talk to the operating system, which is different.
For example, say you're writing a Hello World program. You need to print the phrase "Hello, World" onto the screen. Your program will need to go through the OS to actually do that, and different OSes have different interfaces.
I'm deliberately avoiding technical terms here to keep the answer understandable for beginners. To be more precise, your program needs to go through the operating system to interact with the other hardware on your computer (e.g., keyboard, display). This is done through system calls, which are different for each family of OS.
The machine code that is generated can run on any processor of the same type it was generated for. The challenge is that your code will interact with other modules or programs on the system, and to do that you need conventions for calling and returning. The generated code assumes a runtime environment (the OS) as well as library support (calling conventions). Those are not consistent across operating systems.
So, things break when the code needs to transition to, and depend on, other modules using conventions defined by the operating system.
Even if the machine code instructions are identical for the compiled program on two different operating systems (not at all likely, since different operating systems provide different services in different ways), the machine code needs to be stored in a format that the host OS can load into a process for execution. And those formats are frequently different between different operating systems.
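To make the point about file formats concrete, here is a minimal sketch that identifies a few common executable container formats from their leading "magic" bytes. The function name and the returned labels are made up for this example, and only a handful of well-known signatures are checked (universal/fat Mach-O files, for instance, are not handled).

```python
# Leading bytes of thin Mach-O files, in both byte orders, 32- and 64-bit.
MACH_O_MAGICS = {
    b"\xfe\xed\xfa\xce",
    b"\xce\xfa\xed\xfe",
    b"\xfe\xed\xfa\xcf",
    b"\xcf\xfa\xed\xfe",
}


def detect_executable_format(header: bytes) -> str:
    """Guess the executable format from the first few bytes of a file."""
    if header[:4] == b"\x7fELF":
        return "ELF (Linux and most Unix-like systems)"
    if header[:2] == b"MZ":
        return "MZ/PE (Windows)"
    if header[:4] in MACH_O_MAGICS:
        return "Mach-O (macOS)"
    return "unknown"
```

Reading the first four bytes of a file (`open(path, "rb").read(4)`) is enough input. The same x86-64 instructions can sit inside any of these containers, but each operating system's loader only understands its own.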

How can a program on my CPU run the same way on another CPU?

Let's say that I write a program with the Windows API and then compile it. The code is compiled to machine code for the CPU to execute. Now, my question is: if I share the executable file with someone else whose CPU has a different instruction set, how can their CPU run the code the same way, without giving errors or running different code?
"someone else with another instruction set in their CPU ... How can their CPU run the code the same way"
The code won't run. The CPUs, simply put, speak different languages.
You have two options:
1. Recompile your code for the target CPU (assuming you can use the same source language and no platform-specific API, so you're left with C/C++ and the standard library).
2. Write a script or bytecode and use a runtime available for both platforms to interpret the script (or bytecode).
That's why there are runtimes such as the JVM (for Java, Scala) and interpreters for scripting languages (Python, Lua, JavaScript, etc.), where the code is distributed as a script or as platform-independent bytecode.
And now, the next step. If you're using the Windows API, well, as the name suggests, it's an API (a set of services) provided by the Windows system. So even with the same CPU, the application won't run without Windows (e.g. on a Linux system). (OK, there is often a way to expose the Windows API on Linux, but it can be tricky.)
Conclusion: binaries are not portable between instruction sets, and if you're using any high-level API (Win32, ...), you're pretty much tied to the operating system too.
When some high-level languages are compiled into an executable, they are compiled to intermediate code rather than native machine code. This is a representation of the source code that is closer to assembly language but not specific to any CPU instruction set. It is up to a runtime on the machine running the executable to interpret this intermediate code and run it in the CPU's native instruction set.
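A toy example may help show what "interpreting intermediate code" means. The sketch below is a made-up stack machine with three invented opcodes (`PUSH`, `ADD`, `MUL`); it does not correspond to the JVM, .NET, or any real bytecode format. The program is plain data, and only the interpreter itself has to exist for a given CPU.

```python
def run_bytecode(program):
    """Interpret a list of (opcode, operand...) tuples on a tiny stack machine."""
    stack = []
    for instruction in program:
        opcode = instruction[0]
        if opcode == "PUSH":
            # Push the literal operand onto the stack.
            stack.append(instruction[1])
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    # The result is whatever is left on top of the stack.
    return stack.pop()


# Hypothetical "compiled" form of the expression (2 + 3) * 4.
PROGRAM = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)]
```

Calling `run_bytecode(PROGRAM)` evaluates the expression. The same `PROGRAM` list would run unchanged on any machine that has a working interpreter, which is the portability trade-off described above.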

How does software (either compiled or interpreted) reach the end user? [duplicate]

Possible duplicate: An executable Python app
So I have taken a little online Python course and I now have an understanding of simple programming. We made our own Scrabble game, for example. However, what I don't understand is how these .py, .c, .class or whatever files get to an .exe form.
As an end user I never have to open .py files; with Windows it is always an .exe. But how are these made? Are they batch files that merely execute the file? And what about DLLs?
I guess my question is: in any language, how is the finished code executed on the machine? When I run a Java program I don't have to fiddle with class files, I just click an .exe.
EDIT:
What I mean isn't how to make Python into an .exe, but how software gets to that stage, full stop. I know interpreted languages go to the interpreter; I guess you use an intermediate language to make an .exe which runs the code.
Generally the code would be compiled into an executable, which may or may not internally contain everything it needs to run. (If it doesn't, then it could come packaged in an installer which distributes what it needs.)
Specific to Python, a quick Google search turned up this. For interpreted languages, since there is really no "compile" step, you'd need some tool to "convert [language] to windows exe" to accomplish what you're asking.
Most software you run on Windows is not written in an interpreted language like Python, and comes with an installer ('setup.exe') which was generated by some software that creates installers for your code. The purpose of the installer is to install both your program and all the files it depends on, which you as a developer have installed but your end users have not.
See these related questions:
How can I create a directly-executable cross-platform GUI app using Python?
py2exe - generate single executable file
Very simply, and speaking generically, you would either compile or interpret your source code. An .exe or .dll would be the result of compilation (JITs are another item to learn about).
You should also learn about "server-side" and "client-side" code. A web-based application would run server-side code, which may generate HTML (and perhaps JavaScript) and send that down to the client-side browser.
There are many ways to deploy .exe files, DLLs, etc.: simply copy them to the target machine, use an installer, or go via a browser plug-in environment.
When you use a compiled language that generates native code, the compiler is responsible for generating an executable file based on your source code.
If the language is interpreted, running the program usually means launching the interpreter and passing it the main file of the program. Some languages offer tools to package the interpreter and the sources into an executable.
If the language is compiled but generates intermediate code, you need to run the virtual machine, like an interpreted language. However, if you use .NET on Windows, the compiler generates an executable that loads the virtual machine automatically.
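The interpreted case above ("launching the interpreter and passing it the main file") can be sketched in a few lines. This is only an illustration: the helper name `run_script_with_interpreter` is made up, and it assumes the Python interpreter running the sketch (`sys.executable`) stands in for whatever interpreter a launcher or shortcut would start.

```python
import os
import subprocess
import sys
import tempfile


def run_script_with_interpreter(source: str) -> str:
    """Write source to a main file, run it with the interpreter, return its output."""
    with tempfile.TemporaryDirectory() as workdir:
        main_file = os.path.join(workdir, "main.py")
        with open(main_file, "w", encoding="utf-8") as handle:
            handle.write(source)
        # The operating system starts the interpreter's native executable;
        # the script is just an argument that the interpreter reads as data.
        completed = subprocess.run(
            [sys.executable, main_file],
            capture_output=True,
            text=True,
            check=True,
        )
    return completed.stdout
```

For example, `run_script_with_interpreter('print("hello from script")')` returns the text the script printed. A packaged "exe" for an interpreted program is, in essence, a small native launcher that performs this step for the user.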

Erlang compilation - Erlang as a stand-alone executable

Is there a way to compile Erlang into a stand-alone executable?
That is, to run it as an .exe without the Erlang runtime.
While it's possible to wrap everything up in a single EXE, you're not going to get away from having an Erlang runtime. Dynamic languages like Erlang can't really be compiled to native x86 code, for instance, due to their nature. There has to be an interpreter in there somewhere.
It's possible to come up with a scheme that bundles the interpreter and all the BEAM files into a single EXE you can double-click and run directly, but that's probably more work than you were wanting to go to. I've seen it done before, but there's rarely a good reason to do it, so I won't bother going into detail on the techniques here.
Instead, I suggest you use the same technique they use for Python's py2exe and py2app programs for creating Windows and Mac OS X executables, respectively. These programs load the program's main module up into a Python interpreter, figure out which other modules it needs using the language's built-in reflection mechanisms, then write out all those compiled modules along with a copy of the language interpreter and a small wrapper program that launches the program's main module with the interpreter. The directory containing those files is then a stand-alone environment, having everything needed to run the program. The only difference in the Erlang case is that python.exe becomes erl.exe, and *.pyc becomes *.beam. The basic idea is still the same.
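The dependency-discovery step can be approximated in a few lines. Note that this sketch swaps in a different technique from the one described above: instead of loading the main module and inspecting it, it statically scans the source for import statements with Python's `ast` module. The function name `imported_modules` is made up for this example, and it is not how py2exe or py2app are actually implemented.

```python
import ast


def imported_modules(source: str) -> list:
    """Return the sorted top-level module names imported by a piece of source code."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import os.path" depends on the top-level package "os".
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute "from json import loads"; relative imports are skipped.
            found.add(node.module.split(".")[0])
    return sorted(found)
```

A bundler would repeat this for every module it finds, then copy the resulting set of files next to the interpreter. The Erlang equivalent would collect .beam files rather than Python modules.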
You can simplify this if you don't need it to work with any arbitrary Erlang program, but only yours. In that case, you just copy the Erlang interpreter and all the .beam files that make up your program into a single directory. You can make this part of your program's Makefile, for instance.
You can then use your favorite setup.exe or MSI creation method for creating a distributable package that installs this collection of files into c:\Program Files\MyProgram on the end user's system and creates a shortcut for "erl mainmodule.beam" in their Start menu. The end user doesn't care that as part of the program they also get a copy of Erlang. That's an implementation detail.
You can use Warp. I've added examples for wrapping an Erlang release.

Resources