Position code independant executable on windows - portable-executable

Is it possible to build a position independant code for windows ?
I am not talking about a dll, i am talking about an executable (PE).
What i want to do is that functions (main for example) which are in the program should be mapped at a different memory address between 2 executions.
Thanks

You should read up on the image base relocation directory in a PE file. I was wondering the same thing you are asking now a while back. From the research I had done I couldn't find any compiler that generates completely base independent code. I ended up writing my program entirely in x86 asm with some tricks to load strings etc independent form the address. I can tell you its not worth the trouble. Just remap all the addresses to your target location.
I suggest you start reading this
https://msdn.microsoft.com/en-us/library/ms809762.aspx
You should also check out Icezilon's (not sure how you spell his name) tutorials. They're very helpful

Related

How absolute code get generated

I have gone through with memory managment concepts of Operating system concept of Galvin , I have read a statment :
If you know at compile time where the process will reside in memory, then absolute code can be generated.
How at compile time processor got to know at which memory location in main memory process is going to store.
Can someone explain , what is the exactly does it means , if we know at compile time where process will reside in memory ,
As memory be allocated when program is moving from ready to running state .
Generally, machine code isn't position-independent. In order to be able to load it at an arbitrary starting address and run there one needs some extra information about the machine code (e.g. where it has addresses to the various parts of itself), so it can be adjusted to the arbitrary position.
OTOH, if the code is always going to be loaded at the same fixed address, you don't need any of that extra information and processing.
By absolute he means fixed + final, already adjusted to the appropriate address.
The processor does not "know" anything. You "tell" it.
I don't know exactly what he means with "absolute code", depending which operating system you use, the program with it's code and data will be loaded to a virtual address and executed from there.
Beside of this not the compiler but the linker sets the address where the program will be loaded to.
Modern operating systems like Linux are using Address Space Layout Randomization to avoid having one static address where every program is loaded and to avoid the possibility of exploiting software flaws.
If you're writing your own operating system maybe the osdev.org wiki could be a good ressource for you. If you can read/speak german I recommened lowlevel.eu either.

Windows Executable file structure

I know that generally the object file has code, data, heap and stack sections.
But I want to know how this is arranged in windows executables and Linux executables.
I searched on internet and found some structure.
I understood .text is for code and .data is for global variables.
I want to know here is the stack and heap in both Linux and Windows platform?
Can anybody tell me the executable file structures??
Thanks in advance...
This is the specification that Microsoft has released:
http://msdn.microsoft.com/en-us/windows/hardware/gg463119
Also this is a good reading on the subject:
http://msdn.microsoft.com/en-us/magazine/cc301805.aspx
EDIT:
Stack/Heap are in-memory structures which are created/modified during run-time so in essence they are not in the file itself - they can't be. Think of them as a special place in memory where each and every program can store run-time data and by run-time data I mean variables. function invocations, return values and all the nitty-gritty stuff that are hapening on the low level.

LoadLibrary from offset in a file

I am writing a scriptable game engine, for which I have a large number of classes that perform various tasks. The size of the engine is growing rapidly, and so I thought of splitting the large executable up into dll modules so that only the components that the game writer actually uses can be included. When the user compiles their game (which is to say their script), I want the correct dll's to be part of the final executable. I already have quite a bit of overlay data, so I figured I might be able to store the dll's as part of this block. My question boils down to this:
Is it possible to trick LoadLibrary to start reading the file at a certain offset? That would save me from having to either extract the dll into a temporary file which is not clean, or alternatively scrapping the automatic inclusion of dll's altogether and simply instructing my users to package the dll's along with their games.
Initially I thought of going for the "load dll from memory" approach but rejected it on grounds of portability and simply because it seems like such a horrible hack.
Any thoughts?
Kind regards,
Philip Bennefall
You are trying to solve a problem that doesn't exist. Loading a DLL doesn't actually require any physical memory. Windows creates a memory mapped file for the DLL content. Code from the DLL only ever gets loaded when your program calls that code. Unused code doesn't require any system resources beyond reserved memory pages. You have 2 billion bytes worth of that on a 32-bit operating system. You have to write a lot of code to consume them all, 50 megabytes of machine code is already a very large program.
The memory mapping is also the reason you cannot make LoadLibrary() do what you want to do. There is no realistic scenario where you need to.
Look into the linker's /DELAYLOAD option to improve startup performance.
I think every solution for that task is "horrible hack" and nothing more.
Simplest way that I see is create your own virtual drive that present custom filesystem and hacks system access path from one real file (compilation of your libraries) to multiple separate DLL-s. For example like TrueCrypt does (it's open-source). And than you may use LoadLibrary function without changes.
But only right way I see is change your task and don't use this approach. I think you need to create your own script interpreter and compiler, using structures, pointers and so on.
The main thing is that I don't understand your benefit from use of libraries. I think any compiled code in current time does not weigh so much and may be packed very good. Any other resources may be loaded dynamically at first call. All you need to do is to organize the working cycles of all components of the script engine in right way.

how to store assembly in memory

I have a question about how to store the assembly language in memory, when I compile the C-code in assembly, and run by "step", I can see the address of each instruction, but is there a way to change the start address of the code in the memory?
Second question is, can I break the assembly code into two?
That is, how to store the two parts in separate memory sections?
is there a way to do that?
I am curious about how the machine store the assembly code.
I am working on a MACBOOK Pro, duo core.
For the first question, can we change the offset value? or the linker and loader can not be controlled by the user? I am a litter confuse with your answer, it seems that we can not change it?
For the second question, I think what you are talking about is "input section", even if your have many ".text" input sections in your codes, after being assembled, they will become one ".text" "output section".
And the output section is the actual code stored in memory.
And I am wondering if I can control its position.
I am using the knowledge of DSP assembly, I think the mechanisms are same.
I'm not completely following either of your questions, but I'll guess.
For the first one, you're asking how to change where the executable is positioned in memory? ELF files have a preferred offset that the linker will try to use first, but the loader is usually free to position the sections anywhere if the base offset isn't available. If the image is non-relocatable and the preferred offset is unavailable, the loader will fail and the program won't run
As for your second question, you want to modify the assembly so code will be in different sections? How to do that depends on the assembler you're using; in gas you use the section pseudo-op:
.section new-section-name
The code following that directive will be in the specified section

Patching an EXE using IDA

Say there is a buggy program that contains a sprintf() and i want to change it to a snprintf so it doesn't have a buffer overflow.. how do I do that in IDA??
You really don't want to make that kind of change using information from IDA pro.
Although IDA's disassembly is relatively high quality, it's not high quality enough to support executable rewriting. Converting a call to sprintf to a call to snprintf requires pushing a new argument on to the stack. That requires the introduction of a new instruction, which impacts the EA of everything that follows it in the executable image. Updating those effective addresses requires extremely high quality disassembly. In particular, you need to be able to:
Identify which addresses in the executable are data, and which ones are code
Identify which instruction operands are symbolic (address references) and which instruction operands are numeric.
Ida can't (reliably) give you that information. Also, if the executable is statically linked against the crt, it may not contain snpritnf, which would make performing the rewriting by hand VERY difficult.
There are a few potential workarounds. If there is sufficient padding available in (or after) the function making the call, you might be able to get away with only rewriting a single function. Alternatively, if you have access to object files, and those object files were compiled with the /GY switch (assuming you are using Visual Studio) then you may be able to edit the object file. However, editing the object file may still require substantial fix ups.
Presumably, however, if you have access to the object files you probably also have access to the source. Changing the source is probably your best bet.

Resources