A few questions about Windows memory segmentation.
Every process in Windows has its own virtual memory. Does that mean that each process has its own task (I mean its own task descriptor or task gate)?
I opened a simple exe with OllyDbg and saw that each CALL instruction to a DLL function takes me to a jump table. The jump table contains jump instructions into the DLLs, like this one:
JMP DWORD PTR DS:[402058]
My question is: why does it use the data segment and not the CS selector for the base address?
If I open the memory map and look at what is stored at 402058, I find that it contains resources.
If I understand correctly, the addresses of the DLL functions are stored in DS?
I noticed that the memory map is organized by owner. Shouldn't it be organized by segment, with all the code in CS, the data in DS, etc.?
Thank you.
1.
A process has its own virtual address space.
I do not understand what you're referring to as "Task descriptor or Task gate", but the Windows operating system holds a descriptor for each process, called the Process Control Block, that contains information about the process (such as identification, access tokens, execution state, virtual memory mapping, etc).
A Task is a logical unit that can be used to manage a single process, or multiple processes.
Job -> Tasks
Task -> Processes
Process -> Threads
2.
In the case you mentioned, which is common for compilers, the program uses the .DATA section to store the jump table after loading the function addresses.
The reason this happens in the first place is that the compiler cannot know the DLL's base address at compile time, so the address has to be fixed up at load time to point to the function. This load-time patching is a kind of relocation (for imported DLL functions, the loader fills in the import address table).
In order to maintain the jump table separately from the code, compilers store it in the .DATA section. This way, we can also give it write permissions (usually the .DATA section has write permissions) and modify it as necessary without sacrificing stability and security.
3.
Each module loaded into the process's virtual address space contains its own sections. That's why you see a different set of .text, .data, .reloc, etc. for each module. The "Owner" column is the module name.
P.S. Please ask one question per post. That way it will be easily accessible to other users after you get an answer, and each question will likely get more accurate answers.
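To make point 2 concrete, here is a minimal toy model of an indirect jump through a table slot. It is only a sketch: the dictionaries stand in for process memory and for a DLL's export list, and the DLL name, function name and target address are invented for illustration. Only the slot address 402058 comes from the question.

```python
# Toy model of "JMP DWORD PTR DS:[402058]".
# The DLL name, function name and target address are hypothetical.

IAT_SLOT = 0x402058  # slot address taken from the JMP in the question


def load_imports(memory, imports, exports):
    """Model the loader: write each resolved function address into its slot."""
    for slot, (dll, name) in imports.items():
        memory[slot] = exports[dll][name]


def jmp_indirect(memory, slot):
    """Model the JMP: read the pointer stored at `slot` and return the target."""
    return memory[slot]


memory = {}  # stand-in for the process's address space
imports = {IAT_SLOT: ("user32.dll", "MessageBoxA")}
exports = {"user32.dll": {"MessageBoxA": 0x77D507EA}}  # made-up load address

load_imports(memory, imports, exports)
target = jmp_indirect(memory, IAT_SLOT)
print(hex(target))
```

The code that performs the jump never changes. Only the data in the slot does, which is why the slot has to live in memory the loader can write to.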
I came across the NtUnmapViewOfSection function (ntdll.dll) and I don't quite understand what it does. Can you explain, please? Thanks in advance.
The obvious answer is that it undoes what NtMapViewOfSection did.
To go into a little more detail: a section (officially known as a file mapping in Win32) is a range of pages in a process's address space that represents the contents of a file (or "memory", in the page-file-backed case). Exactly how these pages map to parts of a file is a cooperation between the operating system's kernel and the CPU. You can read more about virtual memory and the Windows paging/caching model in the Windows Internals books.
Unmapping a section removes this range of pages from the process's address space. Mapping and unmapping portions of a file is necessary if the file is larger than the largest free range in the process's address space. Unmapping also decrements the reference count on the underlying section object (the file mapping in Win32).
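The map/unmap cycle for a file that is processed one window at a time can be sketched with Python's standard mmap module. This is a user-mode illustration, not a call to NtMapViewOfSection or NtUnmapViewOfSection. The function name checksum_in_windows and the sample file are made up for the example. As far as I know, CPython's mmap on Windows sits on top of the Win32 file-mapping functions, so closing a view is the counterpart of the unmap step described above.

```python
import mmap
import os
import tempfile


def checksum_in_windows(path, window_size):
    """Sum all bytes of a file while keeping only one view mapped at a time."""
    gran = mmap.ALLOCATIONGRANULARITY
    # Offsets passed to mmap must be multiples of the allocation granularity.
    window_size = max(gran, (window_size // gran) * gran)
    file_size = os.path.getsize(path)
    total = 0
    with open(path, "rb") as f:
        offset = 0
        while offset < file_size:
            length = min(window_size, file_size - offset)
            view = mmap.mmap(f.fileno(), length,
                             access=mmap.ACCESS_READ, offset=offset)
            try:
                total += sum(view[:])
            finally:
                view.close()  # unmap this view before mapping the next one
            offset += length
    return total


# Small demonstration on a temporary file (hypothetical sample data).
data = bytes(range(256)) * 1024
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    demo_path = tmp.name
try:
    result = checksum_in_windows(demo_path, 64 * 1024)
finally:
    os.remove(demo_path)
print(result == sum(data))
```

Only one window of address space is in use at any moment, however large the file is.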
This is actually a theoretical question about memory management. Since different operating systems implement things differently, I'll have to relieve my thirst for knowledge asking how things work in only one of them :( Preferably the open source and widely used one: Linux.
Here is the list of things I know in the whole puzzle:
malloc() is user space. libc is responsible for the syscall job (calling brk/sbrk/mmap...). It obtains big chunks of memory, described by ranges of virtual addresses. The library slices these chunks up and uses them to serve the user application's requests.
I know what brk/sbrk syscalls do. I know what 'program break' means. These calls basically push the program break offset. And this is how libc gets its virtual memory chunks.
Now that the user application has a new virtual address to manipulate, it simply writes some value to it, like: *allocated_integer = 5;. OK. Now what? If brk/sbrk only updates offsets in the process's entry in the process table, or whatever, how is the physical memory actually allocated?
I know about virtual memory, page tables, page faults, etc. But I wanna know exactly how these things are related to this situation that I depicted. For example: is the process' page table modified? How? When? A page fault occurs? When? Why? With what purpose? When is this 'buddy algorithm' called, and this free_area data structure accessed? (http://www.tldp.org/LDP/tlk/mm/memory.html, section 3.4.1 Page Allocation)
Well, after finally finding an excellent guide (http://duartes.org/gustavo/blog/post/how-the-kernel-manages-your-memory/) and some hours digging the Linux kernel, I found the answers...
Indeed, brk only pushes the virtual memory area.
When the user application hits *allocated_integer = 5;, a page fault occurs.
The page fault routine will search for the virtual memory area responsible for the address and then call the page table handler.
The page table handler goes through each level (2 levels in x86 and 4 levels in x86_64), allocating entries if they're not present (2nd, 3rd and 4th), and then finally calls the real handler.
The real handler actually calls the function responsible for allocating page frames.
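The sequence above can be modelled with a small toy simulation. This is a sketch of the control flow only, not kernel code: the class ToyProcess, its methods, and the nested dictionaries used as page-table levels are all invented for illustration, and the "frames" list stands in for the real page-frame allocator.

```python
PAGE_SIZE = 4096
LEVEL_BITS = 9   # index bits per level, as on x86_64
LEVELS = 4       # 4-level page table, as on x86_64


class ToyProcess:
    def __init__(self, heap_start):
        self.heap_start = heap_start
        self.brk = heap_start   # current program break
        self.page_table = {}    # top level; lower levels are nested dicts
        self.frames = []        # stand-in for the page-frame allocator
        self.faults = 0

    def set_brk(self, new_brk):
        # Like brk(): only the bounds of the heap area change.
        self.brk = new_brk

    def _indices(self, addr):
        vpn = addr // PAGE_SIZE
        mask = (1 << LEVEL_BITS) - 1
        return [(vpn >> (LEVEL_BITS * i)) & mask
                for i in reversed(range(LEVELS))]

    def _lookup(self, addr):
        entry = self.page_table
        for idx in self._indices(addr):
            entry = entry.get(idx)
            if entry is None:
                return None     # not present: this access would fault
        return entry            # frame number stored at the last level

    def _handle_fault(self, addr):
        self.faults += 1
        # 1. Find the memory area responsible for the address.
        if not (self.heap_start <= addr < self.brk):
            raise MemoryError("no area covers %#x" % addr)
        # 2. Walk the levels, allocating missing tables on the way down.
        *upper, last = self._indices(addr)
        table = self.page_table
        for idx in upper:
            table = table.setdefault(idx, {})
        # 3. Allocate a zero-filled frame and install it.
        self.frames.append(bytearray(PAGE_SIZE))
        table[last] = len(self.frames) - 1

    def write(self, addr, value):
        frame = self._lookup(addr)
        if frame is None:
            self._handle_fault(addr)
            frame = self._lookup(addr)
        self.frames[frame][addr % PAGE_SIZE] = value


proc = ToyProcess(heap_start=0x10000000)
proc.set_brk(0x10000000 + 3 * PAGE_SIZE)
frames_after_brk = len(proc.frames)   # brk alone allocated nothing
proc.write(0x10000000 + 8, 5)         # first touch: fault, then write
proc.write(0x10000000 + 16, 7)        # same page: no new fault
print(frames_after_brk, proc.faults, len(proc.frames))
```

In this model, moving the break costs nothing in physical memory. A frame is only handed out when a page is first touched, and later writes to the same page find the mapping already present.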
I have some shared libraries mapped into the virtual address space of my task. What happens when I change some data, for example in the .bss section? I do it using kmap with the physical page address as the argument. I can think of two possibilities: either the data is changed and this affects all tasks that use the library, or the affected page is copied due to COW.
I think it's neither. The .bss area is set up when the executable is loaded. Virtual memory space is allocated for it at that time, and that space won't be shared with any other task. Pages won't be allocated initially (by default, mlock* can change that); they will be faulted in (i.e. demand-zeroed) as referenced.
I think that even if the process forks before touching the memory, the new process would then just get the equivalent (same virtual memory space marked as demand-zero).
So if you already have a physical address for it, I would think that's already happened and you won't be changing anything except the one page belonging to the current process.
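A user-space analogue of the fork case can be sketched as follows, assuming a Unix-like system (os.fork and MAP_PRIVATE are not available on Windows). It does not use kmap or touch a real .bss; a private anonymous mapping is used as a stand-in because it is also zero-filled on first touch. The function name child_write_is_private is made up for the example.

```python
import mmap
import os


def child_write_is_private():
    """Fork before touching a private zero-fill page; the child writes to it."""
    buf = mmap.mmap(-1, mmap.PAGESIZE,
                    flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    pid = os.fork()
    if pid == 0:
        # Child: touch the page, then report what it sees via the exit code.
        try:
            buf[0] = 42
            code = buf[0]
        except Exception:
            code = 255
        os._exit(code)
    _, status = os.waitpid(pid, 0)
    child_saw = os.WEXITSTATUS(status)
    parent_sees = buf[0]  # the parent's page is still demand-zero
    buf.close()
    return child_saw, parent_sees


print(child_write_is_private())
```

The child's write lands in its own page, so the parent still reads zero from the same virtual address.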
So I have been researching the PE format for the last couple of days, and I still have a couple of questions:
Does the data section get mapped into the process's memory, or does the program read it from the disk?
If it does get mapped into its memory, how can the process acquire the offset of the section? (And other sections)
Is there any way to get the entry point of a process that has already been mapped into memory, without touching the file on disk?
Does the data section get mapped into the process' memory
Yes. That mapping is unlikely to survive for very long, though, because the program is apt to write to that section. A write triggers a copy-on-write page copy, after which the page is backed by the paging file instead of the PE file.
how can the process acquire the offset of the section?
The linker already calculated the offsets of variables in the section. The image might be relocated, which is common for DLLs whose preferred base address is already in use when the DLL gets loaded. In that case the loader uses the relocation table in the PE file to patch the addresses in the code. The pages that contain such patched code get the same treatment as the data section: they are no longer backed by the PE file and cannot be shared between processes.
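The patching step can be sketched roughly like this. The layout of a base relocation block (a page RVA, a block size, then 16-bit entries holding a 4-bit type and a 12-bit offset) follows the PE format, but the function apply_relocations, the sample image and all addresses are made up, and only the 32-bit HIGHLOW type is handled.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0  # padding entry, nothing to patch
IMAGE_REL_BASED_HIGHLOW = 3   # patch a full 32-bit address


def apply_relocations(image, reloc_data, preferred_base, actual_base):
    """Patch 32-bit absolute addresses in `image` (a bytearray indexed by RVA)."""
    delta = actual_base - preferred_base
    pos = 0
    while pos + 8 <= len(reloc_data):
        page_rva, block_size = struct.unpack_from("<II", reloc_data, pos)
        if block_size < 8:
            break  # malformed or terminating block
        for entry_pos in range(pos + 8, pos + block_size, 2):
            (entry,) = struct.unpack_from("<H", reloc_data, entry_pos)
            kind, offset = entry >> 12, entry & 0xFFF
            if kind == IMAGE_REL_BASED_HIGHLOW:
                where = page_rva + offset
                (value,) = struct.unpack_from("<I", image, where)
                struct.pack_into("<I", image, where,
                                 (value + delta) & 0xFFFFFFFF)
        pos += block_size


# Hypothetical image: an instruction operand at RVA 0x1010 holds the
# absolute address of a variable at RVA 0x3000 (preferred base 0x400000).
image = bytearray(0x2000)
struct.pack_into("<I", image, 0x1010, 0x00403000)
reloc = struct.pack("<IIHH", 0x1000, 12,
                    (IMAGE_REL_BASED_HIGHLOW << 12) | 0x010,
                    IMAGE_REL_BASED_ABSOLUTE << 12)

apply_relocations(image, reloc, preferred_base=0x00400000,
                  actual_base=0x10000000)
(patched,) = struct.unpack_from("<I", image, 0x1010)
print(hex(patched))
```

Every patched page now differs from the bytes in the file, which is why such pages can no longer be shared between processes.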
Is there any way to get the entry point of a process
The entire PE file gets mapped into memory, including its headers. So you can certainly read IMAGE_OPTIONAL_HEADER.AddressOfEntryPoint from memory without reading the file. Keep in mind that this is painful to do for another process, since you don't have direct access to its virtual address space. You'd have to use ReadProcessMemory(), which is little joy and unlikely to be faster than reading the file, because the file is quite likely to be present in the file system cache. The Address Space Layout Randomization feature is also apt to give you a headache; it is designed to make this kind of thing hard.
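Reading the field from the mapped headers can be sketched as below. The offsets used (e_lfanew at 0x3C, a 4-byte signature, a 20-byte file header, AddressOfEntryPoint at offset 16 of the optional header) follow the PE layout. The helper make_fake_image is hypothetical and only builds a minimal buffer to exercise the parser. In a real case the bytes would come from the module's base address, or from ReadProcessMemory() for another process.

```python
import struct


def entry_point_rva(image):
    """Return AddressOfEntryPoint (an RVA) from the headers of a mapped image."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    optional_header = e_lfanew + 4 + 20  # skip signature and file header
    (rva,) = struct.unpack_from("<I", image, optional_header + 16)
    return rva


def make_fake_image(entry_rva, e_lfanew=0x80):
    """Build a minimal header-only buffer for demonstration (not a valid PE)."""
    image = bytearray(0x200)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, e_lfanew)
    image[e_lfanew:e_lfanew + 4] = b"PE\0\0"
    struct.pack_into("<I", image, e_lfanew + 4 + 20 + 16, entry_rva)
    return bytes(image)


rva = entry_point_rva(make_fake_image(0x1234))
print(hex(rva))
```

The value is relative to the base address, so the actual entry point is the address the module was loaded at plus this RVA.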
Does the data section get mapped into the process' memory, or does the program read it from the disk?
It's mapped into the process's memory.
If it does get mapped into its memory, how can the process acquire the offset of the section? (And other sections)
By means of a relocation table: every reference to a global object (data or function) from the executable code, that uses direct addressing, has an entry in this table so that the loader patches the code, fixing the original offset. Note that you can make a PE file without relocation section, in which case all data and code sections have a fixed offset, and the executable has a fixed entry point.
Is there any way to get the entry point of a process that has already been mapped into memory, without touching the file on disk?
Not sure, but if by "not touching" you mean not even reading the file, then you may figure it out by walking up the stack.
Yes, all sections that are described in the PE header get mapped into memory. The IMAGE_SECTION_HEADER struct tells the loader how to map it (the section can for example be much bigger in memory than on disk).
I'm not quite sure if I understand what you are asking. Do you mean how does code from the code section know where to access data in the data section? If the module loads at the preferred load address then the addresses that are generated statically by the linker are correct, otherwise the loader fixes the addresses with relocation information.
Yes, the Windows loader also loads the PE header into memory at the base address of the module. There you can find all the info that was in the file's PE header, including the entry point.
I can recommend this article for everything about the PE format, especially on relocations.
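For the first point, a minimal sketch of decoding one 40-byte IMAGE_SECTION_HEADER shows where the "bigger in memory than on disk" information lives. The field order follows the PE format. The namedtuple Section, the function parse_section_header and the sample values are made up for illustration.

```python
import struct
from collections import namedtuple

Section = namedtuple(
    "Section",
    "name virtual_size virtual_address raw_size raw_offset characteristics")

# Name[8], VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData,
# PointerToRelocations, PointerToLinenumbers, NumberOfRelocations,
# NumberOfLinenumbers, Characteristics
SECTION_FORMAT = "<8sIIIIIIHHI"  # 40 bytes in total


def parse_section_header(data, offset=0):
    """Decode one section header from raw bytes."""
    (name, vsize, vaddr, raw_size, raw_ptr,
     _prel, _pline, _nrel, _nline, chars) = struct.unpack_from(
        SECTION_FORMAT, data, offset)
    return Section(name.rstrip(b"\0").decode("ascii", "replace"),
                   vsize, vaddr, raw_size, raw_ptr, chars)


# Hypothetical .data section: 0x200 bytes stored in the file,
# 0x3000 bytes reserved in memory (the rest is zero-filled).
raw = struct.pack(SECTION_FORMAT, b".data", 0x3000, 0x4000, 0x200, 0x1800,
                  0, 0, 0, 0, 0xC0000040)
sec = parse_section_header(raw)
print(sec.name, hex(sec.virtual_size), hex(sec.raw_size))
```

Here VirtualSize is larger than SizeOfRawData, so the loader maps the bytes from the file and zero-fills the remainder of the section in memory.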
Does the data section get mapped into the process' memory, or does the
program read it from the disk?
Yes. Before a program can execute, the operating system's loader (on Windows or Linux alike) must map everything into memory.
If it does get mapped into its memory, how can the process acquire the
offset of the section? ( And other sections )
A PE file has a well-defined structure. The loader parses it to obtain the relative virtual address of each section with respect to ImageBase. Also, if ASLR (address space layout randomization) is active on the system, the loader has to use the relocation information to fix up those offsets.
Is there any way to get the entry point of a process that has already
been mapped into the memory, without touching the file on disk?
NOPE. To calculate the OEP, the operating system's loader uses ImageBase + the AddressOfEntryPoint member of the optional header structure, and where address randomization is enabled it uses the relocation table to resolve all addresses. So we can't do anything without parsing the PE file on disk.
How can I locate which areas of memory of a Win32 process contain the global data and the stack data for each thread?
There is no API (that I know of) to do this. But if you have a DLL in the process, you will get DLL_PROCESS_ATTACH/DLL_THREAD_ATTACH notifications in DllMain as each thread is created. Because you are called on the new thread, you can record the thread ID and the address of a stack object for that thread when you get these notifications. Store the thread ID and stack address in a table that you create at that time. Don't try to do a lot of work in DllMain; just record the stack location and return.
You can then use VirtualQuery to turn the address of a variable on each thread's stack into a virtual allocation range, which should give you the base address of the stack (remember that stacks grow from high addresses to low addresses). The default allocation size for a stack is 1 MB; that can be overridden by a linker switch or by the thread creator, but a stack must be contiguous. So what you get back from VirtualQuery will be the full stack at that point in time.
As for the heap location: there can be multiple locations for the heap, but in general, if you want to assume a single contiguous heap location, use HeapAlloc to get the address of a heap object and then VirtualQuery to determine the range of pages for that section of the heap.
Alternatively, you can use VirtualQuery on the hModule for the EXE and for each DLL, and then assume that anything that is read-write and isn't a stack or a module is part of the heap. Note that this will be true in most processes, but may not be true in some, because an application can call VirtualAlloc or CreateFileMapping directly, resulting in valid data pointers that are from neither stack nor heap.
Use EnumProcessModules to get the list of modules loaded into a process.
VirtualQuery basically takes an arbitrary address and returns the base address of the collection of pages that the address belongs to, as well as the page protections. So it's good for going from a specific pointer to the 'type' of allocation it belongs to.
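The "read-write and neither a stack nor a module" heuristic can be sketched as a plain filter over region records. The Region tuple is a hypothetical stand-in for the fields VirtualQuery fills in (AllocationBase, BaseAddress, RegionSize, State, Protect). The function guess_heap_regions and all sample addresses are invented, and no Windows API is called here. The two constants use the documented values of MEM_COMMIT and PAGE_READWRITE.

```python
from collections import namedtuple

MEM_COMMIT = 0x1000
PAGE_READWRITE = 0x04

# Hypothetical record mirroring the relevant MEMORY_BASIC_INFORMATION fields.
Region = namedtuple("Region", "allocation_base base size state protect")


def guess_heap_regions(regions, module_bases, stack_bases):
    """Keep committed read-write regions not owned by a module or a stack."""
    known = set(module_bases) | set(stack_bases)
    return [r for r in regions
            if r.state == MEM_COMMIT
            and r.protect == PAGE_READWRITE
            and r.allocation_base not in known]


# Made-up sample data: one module, one thread stack, one other allocation.
regions = [
    Region(0x00400000, 0x00403000, 0x1000, MEM_COMMIT, PAGE_READWRITE),  # .data
    Region(0x00030000, 0x0012E000, 0x2000, MEM_COMMIT, PAGE_READWRITE),  # stack
    Region(0x00350000, 0x00350000, 0x8000, MEM_COMMIT, PAGE_READWRITE),  # other
]
candidates = guess_heap_regions(regions,
                                module_bases=[0x00400000],
                                stack_bases=[0x00030000])
print([hex(r.base) for r in candidates])
```

As noted above, the result is only a guess: memory obtained directly through VirtualAlloc or CreateFileMapping passes the same filter.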
Take the addresses of variables allocated in the memory regions you are interested in. What you do with the addresses once you have them is another question entirely.
You can also objdump -h (I think it's -h, might be -x) to list the section addresses, including data sections.
Global Data
By "Global" I'm going to assume you mean all the data that is not dynamically allocated using new, malloc, HeapAlloc, VirtualAlloc etc - the data that you may declare in your source code that is outside of functions and outside of class definitions.
You can locate these by loading each DLL as a PE File in a PE file reader and determining the locations of the .data and .bss sections (these may have different names for different compilers). You need to do this for each DLL. That gives you the general locations for this data for each DLL. Then, if you have debugging information, or failing that, a MAP file, you can map the DLL addresses against the debug info/mapfile info to get names and exact locations for each variable.
You may find that the PE Format DLL makes this task much easier than writing the code to query the PE file yourself.
Thread Stacks
Enumerate the threads in the application using ToolHelp32 (or PSAPI library if on Windows NT 4). For each thread, get the thread context and read the ESP register (RSP for x64). Now do a VirtualQuery on the address in the ESP/RSP register read from each context. The 1MB (default value) region around that address (start at mbi.AllocationBase and work upwards 1MB) is the stack location. Note that the stack size may not be 1MB, you can query this from the PE header of the DLL/EXE that started the thread if you wish.
EDIT, Fix typo where I swapped some register names, thanks #interjay