Consider an instruction like CALL DWORD PTR 44244100 that imports and uses a DLL function within an assembly program. We know the address used by the instruction is a Relative Virtual Address (RVA).
1. So why do I see a different VA in the Thunk value field of LordPE when I trace that piece of code with it?
2. Are DLLs such as User32 or Kernel32 always loaded at a specific VA, or not necessarily?
If not, how does the loader recognize which DLL the address mentioned above belongs to? By searching the Name Table?
I mean, this address is invariant, so if the loaded DLL's location is fixed too, then another VA should be assigned to this address first.
Thanks all.
I don't understand the first question. If you mean thunks for function imports, those aren't RVAs, those are flat addresses. Also, the address used by an instruction for direct code addressing (a near call or jump) is relative to the current instruction pointer value. RVAs are pretty much only used by the loader (and by functions like LoadLibrary, GetProcAddress and the like), I think. The x86 processor does not know the concept of an RVA, that's for sure. Maybe you knew that, it wasn't very clear to me; if that's the case, sorry for lecturing.
Question two! No! It is not fixed! The loader actually goes through the import table of your exe and fills in the placeholders. Fixed load addresses are no longer a thing since Windows XP SP3. Hope this helps. If not, this helped me when I was a little potato: https://msdn.microsoft.com/en-us/library/ms809762.aspx
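To make the "fills in placeholders" step concrete, here is a toy model in Python. It is only a sketch, not how the real loader is written: the base addresses, export RVAs and the `resolve_imports` helper are all made up for illustration. The point it shows is that the import table names each DLL explicitly, so the loader never has to guess which DLL a slot belongs to; it looks the function up in that DLL's exports and writes base + RVA into the slot.

```python
# Toy model of import resolution. Every base address and export RVA below is
# hypothetical; a real loader reads them from the mapped PE images.
loaded_modules = {
    "user32.dll": {"base": 0x75A00000, "exports": {"MessageBoxA": 0x0007EA11}},
    "kernel32.dll": {"base": 0x76300000, "exports": {"ExitProcess": 0x00015A20}},
}

# The exe's import table: DLL name -> functions imported from that DLL.
import_table = {
    "user32.dll": ["MessageBoxA"],
    "kernel32.dll": ["ExitProcess"],
}


def resolve_imports(import_table, loaded_modules):
    """Return the filled-in slots: (dll, function) -> flat virtual address."""
    iat = {}
    for dll_name, function_names in import_table.items():
        module = loaded_modules[dll_name]  # the DLL is identified by name
        for name in function_names:
            # VA = where the DLL was actually loaded + the export's RVA
            iat[(dll_name, name)] = module["base"] + module["exports"][name]
    return iat


iat = resolve_imports(import_table, loaded_modules)
for (dll, func), va in iat.items():
    print(f"{dll}!{func} -> {va:#010x}")
```

An instruction like the CALL DWORD PTR in the question then reads the flat address out of one of these slots at run time, which is why the value seen in a tool can differ from what is stored in the file on disk.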
Related
I know from the Microsoft documentation that the image base is set to 0x140000000 for 64-bit images and it is the base address where the executable file is first loaded into the memory.
So my questions are as follows:
What comes between the start of the virtual address space (the first page at 0x0000000) and address 0x140000000?
What does it mean that the executable is "first loaded"? Is it the entry point of the program (which is of course not the main function) or something else?
What comes between the start of the virtual address space (the first page at 0x0000000) and address 0x140000000?
Whatever happens to be allocated there, such as DLLs, file mappings, or heap memory; this memory can also be free. The first page is always inaccessible.
What does it mean that the executable is "first loaded"? Is it the entry point of the program (which is of course not the main function) or something else?
Loaded means mapped into memory. After it is mapped into memory, its imports are resolved, statically linked DLLs are mapped into memory, their entry points are executed, and only then it comes to the executable entry point. Executable entry point is not really the first function to execute from the executable if it has TLS callbacks.
I don't know the technical reason why the 64-bit default is so high, perhaps just to make sure your app does not have 32-bit pointer truncation bugs with data/code in the module? And it is important to note that this default comes from the Microsoft compiler, Windows itself will accept a lower value. The default for 32-bit applications is 0x00400000 and there are actual hardware and technical reasons for that.
The first page starting at 0 is off limits in most operating systems to prevent issues with de-referencing a NULL pointer. The first couple of megabytes might have BIOS/firmware or other legacy things mapped there.
By first loaded, it means the loader will map the file into memory starting at that address. First the MZ part (DOS header and stub code) and the PE header. After this comes the various sections listed in the PE header.
Most applications are using ASLR these days so the base address will be random and not the preferred address listed in the PE. ntdll and kernel32 are mapped before the exe so if you choose their base address you will also be relocated.
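The layout described above (MZ header first, then the PE header that carries the preferred base) can be checked with a few lines of Python. This is a minimal sketch: `read_image_base` and `make_fake_pe64` are hypothetical helper names, and the "file" is a synthetic header built in memory rather than a real executable. It relies on the documented PE layout: `e_lfanew` at offset 0x3C of the DOS header, a 4-byte signature, a 20-byte COFF header, then the optional header with ImageBase at offset 24 (PE32+) or 28 (PE32).

```python
import struct


def read_image_base(data: bytes) -> int:
    """Return the preferred ImageBase stored in a PE file's optional header."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ (DOS) header")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)  # offset of PE header
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    opt = e_lfanew + 4 + 20  # skip the signature and the 20-byte COFF header
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x20B:  # PE32+ (64-bit): 8-byte ImageBase at offset 24
        return struct.unpack_from("<Q", data, opt + 24)[0]
    if magic == 0x10B:  # PE32 (32-bit): 4-byte ImageBase at offset 28
        return struct.unpack_from("<I", data, opt + 28)[0]
    raise ValueError("unknown optional header magic")


def make_fake_pe64(image_base: int) -> bytes:
    """Build just enough of a synthetic 64-bit header for the parser above."""
    dos = bytearray(0x40)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE header directly after DOS header
    coff = bytes(20)                         # zeroed COFF header (not inspected)
    opt = bytearray(32)
    struct.pack_into("<H", opt, 0, 0x20B)
    struct.pack_into("<Q", opt, 24, image_base)
    return bytes(dos) + b"PE\0\0" + coff + bytes(opt)


fake = make_fake_pe64(0x140000000)
print(hex(read_image_base(fake)))
```

Pointing `read_image_base` at the bytes of a real exe should give the preferred base from the file; with ASLR the address the module actually ends up at in a running process will usually be different.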
When a system call is invoked from 64-bit userspace to 64-bit kernel, syscall table is accessed from arch/x86/kernel/entry_64.S, from the system_call assembly entry point. How can I get the virtual/physical address of this "system_call()" routine?
In other words, I want to know the address of entry point used by all system calls. I tried looking at kallsyms file but couldn't find it there. Perhaps, it has another name in kallsyms?
Reference: https://lwn.net/Articles/604287/
What do you need this for? Are you sure you were inspecting kallsyms of the same kernel which was used in the article?
Figuring out what the func got renamed to is left as an exercise for the reader.
Low-level details on linking and loading of (PE) programs in Windows.
I'm looking for an answer or tutorial that clarifies how a Windows program is linked and loaded into memory after it has been assembled.
Especially, I'm uncertain about the following points:
After the program is assembled, some instructions may reference memory within the .DATA section. How are these references translated when the program is loaded into memory starting at some arbitrary address? Do RVAs and relative memory references take care of these issues (the BaseOfCode and BaseOfData RVA fields of the PE header)?
Is the program always loaded at the address specified in ImageBase header field? What if a loaded (DLL) module specifies the same base?
First I'm going to answer your second question:
No, a module (be it an exe or a dll) is not always loaded at its base address. This can happen for two reasons: either some other module is already loaded and there is no space to load it at the base address contained in the headers, or because of ASLR (Address Space Layout Randomization), which means modules are loaded at random slots for exploit mitigation purposes.
To address the first question (it is related to the second one):
The way a memory location is referred to can be relative or absolute. Usually jumps and function calls are relative (though they can be absolute), which says: "go this many bytes from the current instruction pointer". Regardless of where the module is loaded, relative jumps and calls will work.
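A small Python sketch of why relative calls survive a move. It decodes the x86 near call encoding (opcode E8 followed by a signed 32-bit displacement, measured from the end of the instruction); the helper name `call_target` and the addresses are made up for the example.

```python
import struct


def call_target(instr_addr: int, instr_bytes: bytes) -> int:
    """Compute the destination of an x86 `call rel32` (opcode E8)."""
    if len(instr_bytes) != 5 or instr_bytes[0] != 0xE8:
        raise ValueError("expected a 5-byte E8 call")
    (rel32,) = struct.unpack_from("<i", instr_bytes, 1)  # signed displacement
    # The displacement is relative to the address of the NEXT instruction.
    return (instr_addr + 5 + rel32) & 0xFFFFFFFF


# The same five bytes, placed at two different (hypothetical) load addresses.
code = b"\xE8" + struct.pack("<i", 0x100)
at_preferred = call_target(0x00401000, code)
at_relocated = call_target(0x00501000, code)

# The target moves by exactly as much as the call itself, so the bytes
# never need patching.
print(hex(at_preferred), hex(at_relocated))
```

Both the caller and the callee shift by the same amount when the module moves, so the distance encoded in the instruction stays valid.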
When it comes to addressing data, references are usually absolute, that is, "access this 4-byte datum at this address". And a full virtual address is specified: not an RVA but a VA.
If a module is not loaded at its base address, all absolute references will be broken: they no longer point to the place the linker assumed they would. Let's say the ImageBase is 0x04000000 and you have a variable at RVA 0x000000F4; the VA will be 0x040000F4. Now imagine the module is loaded not at its ImageBase but at 0x05000000. Everything is moved 0x01000000 bytes forward, so the VA of your variable is actually 0x050000F4, but the machine code that accesses the data still has the old address hardcoded, so the program is corrupted. In order to fix this, linkers store in the executable where these absolute references are, so they can be fixed by adding to them how much the executable has been displaced: the delta offset, the difference between where the image is loaded and the image base contained in the headers of the executable file. In this case it's 0x01000000. This process is called Base Relocation and is performed at load time by the operating system, before the code starts executing.
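The arithmetic above can be sketched in Python. This is a simplified model, not the real loader: `apply_base_relocations` is a hypothetical helper, the "image" is a single instruction, and the list of fixup offsets stands in for what the linker records in the relocation data. It assumes 32-bit absolute addresses.

```python
import struct


def apply_base_relocations(image: bytearray, fixup_offsets, preferred_base, actual_base):
    """Add the load delta to every 32-bit absolute address listed in fixup_offsets."""
    delta = (actual_base - preferred_base) & 0xFFFFFFFF
    for offset in fixup_offsets:
        (value,) = struct.unpack_from("<I", image, offset)
        struct.pack_into("<I", image, offset, (value + delta) & 0xFFFFFFFF)
    return delta


# Toy image: `mov eax, [0x040000F4]` (opcode A1 followed by a 32-bit absolute
# address). The address starts at byte offset 1, so that is the only fixup.
image = bytearray(b"\xA1" + struct.pack("<I", 0x040000F4))

delta = apply_base_relocations(image, [1], 0x04000000, 0x05000000)
patched = struct.unpack_from("<I", image, 1)[0]
print(hex(delta), hex(patched))
```

After the fixup the instruction refers to 0x050000F4, which is where the variable really lives once the image has been moved.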
Sometimes a module has no relocations, so it can't be loaded anywhere else but at its base address. See How do I determine if an EXE (or DLL) participate in ASLR, i.e. is relocatable?
For more information on ASLR: https://insights.sei.cmu.edu/cert/2014/02/differences-between-aslr-on-windows-and-linux.html
There is another way to move the executable in memory and still have it run correctly. There exists something called Position Independent Code. Code crafted in such a way that it will run anywhere in memory without the need for the loader to perform base relocations.
This is very common in Linux shared libraries and it is done addressing data relatively (access this data item at this distance from the instruction pointer).
To do this, the x64 architecture has RIP-relative addressing. In x86 a trick is used to emulate it: get the content of the instruction pointer and then calculate the VA of a variable by adding a constant offset to it.
This is very well explained here:
https://www.technovelty.org/linux/plt-and-got-the-key-to-code-sharing-and-dynamic-libraries.html
I don't think PIC is common in Windows. More often than not, Windows modules contain base relocations to fix absolute addresses when the module is loaded somewhere other than its preferred base address, although I'm not exactly sure of this last paragraph, so take it with a grain of salt.
More info:
http://opensecuritytraining.info/LifeOfBinaries.html
How are windows DLL actually shared? (a bit confusing because I didn't explain myself well when asking the question).
https://www.iecc.com/linker/
I hope I've helped :)
It is said that at offset 0x1C, struct PEB_LDR_DATA stores the head pointer of InInitializationOrderModuleList. Is that right?
Beyond that, the second node of InInitializationOrderModuleList should be kernel32.dll. However, when I locate the second node, it turns out not to hold the base address of kernel32.dll; instead, it is something like kernelbase.dll. How can that be explained?
Thanks!
You're relying on undocumented implementation details, and you ran into a newer implementation.
Implementation details aren't guaranteed to remain unchanged.
This particular detail appears to have been changed to provide defense-in-depth against code injection attacks using buffer overflow bugs.
The comments here are correct, you're running into new-ish (actually pretty old at this point) changes to Windows that dynamically load kernel32.dll. The strategy you're attempting stopped working after Vista.
That doesn't mean you can't, of course. This tactic works for me just fine:
http://blog.harmonysecurity.com/2009_06_01_archive.html
Hey,
Today I tried to do a binary diffing of NDIS.sys, and I noticed something weird.
I took a function and began to diff it. The first 30 bytes were the same on disk (using IDA) and in memory (using WinDbg). Then something changed. I saw something like "jmp _imp_XXXXX". The JMP bytes were the same, but the address was different.
My question is: what makes the difference? I think it has something to do with relocations. Although the jump is to an address in the same module, it's a long jump, which makes it relative to the module base address. If relocation occurred, this address needs to be relocated too, although it's in the same module.
Am I right or totally wrong? :-)
Thanks.
Yes, jump targets get rewritten during relocation when a module is not loaded at its preferred base address in memory. Actually, developers are advised to provide a non-default base address for their modules to avoid the relocation cost, but many never do, so some modules will always get relocated and the loader has to rewrite jump targets.