OS and Hardware role during a LD instruction - memory-management

When loading the contents of a virtual address into a particular register, what are some general sequence of events that need to happen in the hardware and operating system as part of the process?
For example,
LD 0xffe4ca32, R1
The address used for this is the virtual address right?
And it would need to go through some address translation first to get a physical address.
My first question is,
When this instruction executes, how is this instruction handled by the Hardware and Operating System?
And my second question is,
Is the "value" of that virtual address, 0xffe4ca32, the contents of its mapped physical address or is it the physical address itself?
Im just not clear what is being loaded into R1

Here:
Let's assume x86. First, the CPU asks the MMU (memory management unit) to to translate the address. First the MMU checks something called the TLB (translation look-aside buffer), where recent translations from virtual to physical are stored. If it is there, the referenced address is returned. Otherwise, the MMU looks up the address in the page table. If the page is either a supervisor only page, or a page marked as not present in memory, the CPU throws a protection fault, or a page fault. For the protection fault, the OS will usually terminate the responsible process however it does that. For a page fault, the OS then checks it's own special paging structures to see if that page has been paged out, or if it just doesn't exist. If it has been paged out, it is read in to some page somewhere in memory, and the virtual address is remapped to that new place. If space cannot be found, another page will be put on disk to make room (a lot of this is called thrashing). If it has not been paged out, the OS will most likely kill the process, as it is trying to reference a non existing page.
Value of mapped physical address. Virtual memory pointers behave just like physical memory pointers in the perspective of user-space. In kernel space, there are some complications as physical memory access is needed (this is usually achieved through something called identity paging, where the first few hundred pages are mapped directly to their corresponding physical memory.

Related

A Process accessing memory outside of allocated region

Assume a process is allocated a certain region of virtual memory.
How will the processor react if the process happens to access a memory region outside this allocation region?
Does the processor kill the process? Or does it raise a Fault?
Thank you in advance.
Processes are not really allocated a certain region of virtual memory. They are allocated physical frames that they can access using virtual memory. Processes have virtual access to all virtual memory available.
When a high level language is compiled, it is placed in an executable. This executable is a file format which specifies several things among which is the virtual memory in use by the program. When the OS launches that executable, it will allocate certain physical pages to the newly created process. These pages contain the actual code. The OS needs to set up the page tables so that the virtual addresses that the process uses are translated to the right position in memory (the right physical addresses).
When a process attempts to jump nowhere at a virtual address it shouldn't jump to, several things can happen. It is undefined behavior.
As stated on osdev.org (https://wiki.osdev.org/Paging):
A page fault exception is caused when a process is seeking to access an area of virtual memory that is not mapped to any physical memory, when a write is attempted on a read-only page, when accessing a PTE or PDE with the reserved bit or when permissions are inadequate.
The CPU pushes an error code on the stack before firing a page fault exception. The error code must be analyzed by the exception handler to determine how to handle the exception. The bottom 3 bits of the exception code are the only ones used, bits 3-31 are reserved.
It really depends on the language you used and several factors come into play. For example, in assembly, if you try to jump in RAM to a random virtual address. Several things can happen.
If you jump into an allocated page, then the page could contain anything. It could as well contain zeroes. If it contain zeroes, then the process will keep executing the instructions until it reaches a page which isn't present in RAM and trigger a page fault. Or it could as well just end up executing a jmp to somewhere else in RAM and in the end trigger page fault.
If you jump into a page which has the present bit not set (unallocated page), then the CPU will trigger a page fault immediately. Since the page is not allocated, it will not magically become allocated. The OS needs to take action. If the page was supposed to be accessed by the process then maybe it was swapped to the hard disk and the OS needs to swap it back in RAM. If it wasn't supposed to be accessed (like in this case), the OS needs to kill the process (and it does). The OS knows the process should not access a page by looking at its memory map for that process. It should not just blindly allocate a page to a process which jumps nowhere. If the process needs more memory during execution it can ask the OS properly using system calls.
If you jump to a virtual address which, once translated by the MMU using the page tables, lands in RAM in kernel mode code (supervisor code), the CPU will trigger a page fault with supervisor and present error codes (1 0 1).
The OS uses 2 levels of permission (0 and 3). Thus all user mode processes run with permission 3. Nothing prevents one user process from accessing the memory and the code of another process except the way the page tables are set up. The page tables are often not filled up completely. If you jump to a random virtual address, anything can happen. The virtual address can be translated to anything.

TLB Hit - Checking if the page is within the process's memory space

I have been reading about the translation of virtual addresses to physical addresses. I understand that the TLB is a hardware cache that resides in the CPU's Memory Management Unit and contains mappings of recent pages accessed.
However, say there is a TLB hit - How does the OS ensure that the page can actually be accessed by the process (is within the process's allocated address space)?
I believe that one way to do that would be to check with the process's page table, but that seems to defeat the whole purpose of using a TLB. Any insights ?
It depends upon the memory management strategy that the OS is using. For examples, in case of the OS using the inverted paging table, each entry in the page table contains the id of the process (PID) that are owning the page.
For the "normal" paging, each paging entry may contain extra bits for memory protection and sharing.
At a basic level the TLB only contains pages that are in ram, and the os clears the TLB whenever a page is removed from ram.

How does the system define the portion of virtual memory a process gets?

If there is a 32 bit system (assume Windows), the virtual address space is 4GB. So CPu can generate any address between this range. Then shoudn't a process also be able to address anywhere in this range?
It is said that each process has its own private virtual address space.Then How does the system facilitate this?
In other words the CPU generates a 32 bit address, and that gets translated into physical address. Now how does CPU know that a specific process has to address only a specific part of the virtual address space(its private virtual address space).
Suppose a process addresses an address out of its private virtual address space, what happens?
A program has to call VirtualAlloc() on Windows to tell the operating system that it wants to use a chunk of virtual memory. Often called indirectly as a result of allocating memory from a heap or loading a DLL.
The operating system, in turn, sets up the page mapping tables that the CPU uses to translate a virtual address as used in the program to a physical RAM address as output on its address bus pins. One of three unusual things can happen whenever the CPU reads or writes data or executes code at a virtual memory address:
if there is no entry in the page mapping tables then the CPU raises a general protection fault trap. The operating system verifies that the address is invalid and terminates the program
if the page is not mapped to RAM yet then the CPU raises a page fault trap. The operating system finds a page of RAM that's unused, swapping out a used page if necessary. And ensures the content is valid, loading it from a file or the paging file if necessary. And updates the table entry so it now has the physical address of the RAM page. Execution resumes as normal
the CPU verifies that access to the page is allowed. A write to a page that is marked as read-only or an execute of a instruction in the page that's marked as no-execute generates a general protection fault trap. The operating system terminates the program.
Every process has its own set of page mapping tables, ensuring that one process cannot access the RAM pages that are used by another. Unless sharing is specifically requested, common for pages of code loaded from an executable file and memory mapped files. A context switch loads the CR2 register, the CPU register that contains the address of the page mapping table.
So there is no scenario where a process can ever address memory outside of its private virtual address space, the lack of a matching paging table entry ensures that this terminates the program.
The whole 4 GB address space is available to the process (although typically the upper half is reserved for kernel data), and the MMU maps parts of it to physical memory. The process cannot go "out" of its address space (all the 4 GB of it are allowed to be used), but if some part of it hasn't been mapped to physical memory a hardware exception is raised.
The address space is said to be private since the operating system changes the settings of the MMU at task switch, so every process sees a different independent memory layout (although parts of the address space can be shared with other processes).

Page table in Linux kernel space during boot

I feel confuse in page table management in Linux kernel ?
In Linux kernel space, before page table is turned on. Kernel will run in virtual memory with 1-1 mapping mechanism. After page table is turned on, then kernel has consult page tables to translate a virtual address into a physical memory address.
Questions are:
At this time, after turning on page table, kernel space is still 1GB (from 0xC0000000 - 0xFFFFFFFF ) ?
And in the page tables of kernel process, only page table entries (PTE) in range from 0xC0000000 - 0xFFFFFFFF are mapped ?. PTEs are out of this range will be not mapped because kernel code never jump there ?
Mapping address before and after turning on page table is same ?
Eg. before turning on page table, the virtual address 0xC00000FF is mapped to physical address 0x000000FF, then after turning on page table, above mapping does not change. virtual address 0xC00000FF is still mapped to physical address 0x000000FF. Different thing is only that after turning on page table, CPU has consult the page table to translate virtual address to physical address which no need to do before.
The page table in kernel space is global and will be shared across all process in the system including user process ?
This mechanism is same in x86 32bit and ARM ?
The following discussion is based on 32-bit ARM Linux, and version of kernel source code is 3.9
All your questions can be addressed if you go through the procedure of setting up the initial page table(which will be overwitten later by function paging_init ) and turning on MMU.
When kernel is first launched by bootloader, Assembly function stext(in arch\arm\kernel\head.s) is the first function to run. Note that MMU has not been turned on yet at this moment.
Among other things, the two import jobs done by this function stext is:
create the initial page tabel(which will be overwitten later by
function paging_init )
turn on MMU
jump to C part of kernel initialization code and carry on
Before delving into the your questions, it is benificial to know:
Before MMU is turned on, every address issued by CPU is physical
address
After MMU is turned on, every address issued by CPU is virtual address
A proper page table should be set up before turning on MMU, otherwise your code will simply "be blown away"
By convention, Linux kernel uses higher 1GB part of virtual address and user land uses the lower 3GB part
Now the tricky part:
First trick: using position-independent code.
Assembly function stext is linked to address "PAGE_OFFSET + TEXT_OFFSET"(0xCxxxxxxx), which is a virtual address, however, since MMU has not been turned on yet, the actual address where assembly function stext is running is "PHYS_OFFSET + TEXT_OFFSET"(the actual value depends on your actual hardware), which is a physical address.
So, here is the thing: the program of function stext "thinks" that it is running in address like 0xCxxxxxxx but it is actually running in address (0x00000000 + some_offeset)(say your hardware configures 0x00000000 as the starting point of RAM). So before turning on MMU, the assembly code need to be very carefully written to make sure that nothing goes wrong during the execution procedure. In fact a techinque called position-independent code(PIC) is used.
To further explain the above, I extract several assembly code snippets:
ldr r13, =__mmap_switched # address to jump to after MMU has been enabled
b __enable_mmu # jump to function "__enable_mmu" to turn on MMU
Note that the above "ldr" instruction is a pseudo instruction which means "get the (virtual) address of function __mmap_switched and put it into r13"
And function __enable_mmu in turn calls function __turn_mmu_on:
(Note that I removed several instructions from function __turn_mmu_on which are essential instructions to the function but not of our interest)
ENTRY(__turn_mmu_on)
mcr p15, 0, r0, c1, c0, 0 # write control reg to enable MMU====> This is where MMU is turned on, after this instruction, every address issued by CPU is "virtual address" which will be translated by MMU
mov r3, r13 # r13 stores the (virtual) address to jump to after MMU has been enabled, which is (0xC0000000 + some_offset)
mov pc, r3 # a long jump
ENDPROC(__turn_mmu_on)
Second trick: identical mapping when setting up initial page table before turning on MMU.
More specifically, the same address range where kernel code is running is mapped twice.
The first mapping, as expected, maps address range 0x00000000(again,
this address depends on hardware config) through (0x00000000 +
offset) to 0xCxxxxxxx through (0xCxxxxxxx + offset)
The second mapping, interestingly, maps address range 0x00000000
through (0x00000000 + offset) to itself(i.e.: 0x00000000 -->
(0x00000000 + offset))
Why doing that?
Remember that before MMU is turned on, every address issued by CPU is physical address(starting at 0x00000000) and after MMU is turned on, every address issued by CPU is virtual address(starting at 0xC0000000).
Because ARM is a pipeline structure, at the moment MMU is turned on, there are still instructions in ARM's pipeine that are using (physical) addresses that are generated by CPU before MMU is turned on! To avoid these instructions to get blown up, an identical mapping has to be set up to cater them.
Now returning to your questions:
At this time, after turning on page table, kernel space is still 1GB (from 0xC0000000 - 0xFFFFFFFF ) ?
A: I guess you mean turning on MMU. The answer is yes, kernel space is 1GB(actually it also occupies several mega bytes below 0xC0000000, but that is not of our interest)
And in the page tables of kernel process, only page table entries (PTE) in range from 0xC0000000 - 0xFFFFFFFF are mapped ?. PTEs are out
of this range will be not mapped because kernel code never jump there
?
A: While the answer to this question is quite complicated because it involves lot of details regarding specific kernel configurations.
To fully answer this question, you need to read the part of kernel source code that set up the initial page table(assembly function __create_page_tables) and the function which sets up the final page table(C function paging_init).
To put it simple, there are two levels of page table in ARM, the first page table is PGD, which occupies 16KB. Kernel first zeros out this PGD during initialization process and does the initial mapping in assembly function __create_page_tables. In function __create_page_tables, only a very small portion of address space is mapped.
After that, the final page table is set up in function paging_init, and in this function, a quite large portion of address space is mapped. Say if you only have 512M RAM, for most common configurations, this 512M-RAM would be mapping by kernel code section by section(1 section is 1MB). If your RAM is quite large(say 2GB), only a portion of your RAM will be directly mapped.
(I will stop here because there are too many details regarding Question 2)
Mapping address before and after turning on page table is same ?
A: I think I've already answered this question in my explanation of "Second trick: identical mapping when setting up initial page table before turning on MMU."
4 . The page table in kernel space is global and will be shared across
all process in the system including user process ?
A: Yes and no. Yes because all processes share the same copy(content) of kernel page table(higher 1GB part). No because each process uses its own 16KB memory to store the kernel page table(although the content of page table for higher 1GB part is identical for every process).
5 . This mechanism is same in x86 32bit and ARM ?
Different Architectures use different mechanism
When Linux enables the MMU, it is only required that the virtual address of the kernel space is mapped. This happens very early in booting. At this point, there is no user space. There is no restrictions that the MMU can map multiple virtual addresses to the same physical address. So, when enabling the MMU, it is simplest to have a virt==phys mapping for the kernel code space and the mapping link==phys or the 0xC0000000 mapping.
Mapping address before and after turning on page table is same ?
If the physical code address is Oxff and the final link address is 0xc00000FF, then we have a duplicate mapping when turning on the MMU. Both 0xff and 0xc00000ff map to the same physical page. A simple jmp (jump) or b (branch) will move from one address space to the other. At this point, the virt==phys mapping can be removed as we are executing at the final destination address.
I think the above should answer points 1 through 3. Basically, the booting page tables are not the final page tables.
4 . The page table in kernel space is global and will be shared across all process in the system including user process?
Yes, this is a big win with a VIVT cache and for many other reasons.
5 . This mechanism is same in x86 32bit and ARM?
Of course the underlying mechanics are different. They are different even for different processors within these families; 486 vs P4 vs Amd-K6; ARM926 vs Cortex-A5 vs Cortex-A8, etc. However, the semantics are very similar.
See: Bootmem#lwn.net - An article on the early Linux memory phase.
Depending on the version, different memory pools and page table mappings are active during boot. The mappings we are all familiar with do not need to be in place until init runs.

How are base registers, limit registers and relocation registers used?

My understanding in address translation process in MMU(memory management unit)
-> logical address : generated by cpu.programmer concern with this address.
-> virtual address : reside in the hard disk , as a pages.
-> physical address : reside in the RAM. It is the actual address.
1: cpu generate the logical address and send it to the MMU.
2: MMU translate the logical address into the virtual address then translate it to the physical address and send the physical address to RAM.
3: when ever the RAM is full , the page which is not used rapidly is returned to the hard disk , to allocate memory to the other pages(processes).
my questions are :
1) where the value of Relocation register is added?
2) who decide the value of Relocation Register?
3) what to do with the Base register and Limit register , how to use it?
4) where the logical address goes off?
If any body can answer it , It would be grateful to me.
It is requested that , let me know it any misunderstanding in this topic.
-thanks
I can tell you how this works on x86.
All programs in non-64-bit modes operate with addresses combined of two items: segment selector (for brevity "selector" is often omitted in text and that may be confusing) and offset. This selector:offset pair is called the logical address.
The selector portion isn't always explicitly specified or manipulated with in code since the CPU has "default" associations of segment registers containing selectors with specific instructions or specific instruction encodings. It's also uncommon to manipulate selectors in 32-bit mode, but is very often necessary in 16-bit code.
The virtual address is formed from the logical address either "directly" (in real or 8086 virtual mode) or "indirectly" (in protected mode).
"Direct" virtual address = selector * 16 + offset.
"Indirect" virtual address = SegmentDescriptorTable[selector].Base + offset.
SegmentDescriptorTable is either the Global Descriptor Table (AKA GDT) or the Local Descriptor Table (AKA LDT). It's set up by the OS and describes the location and size of various segments of memory. selector is used to select a segment in the table. The Base entry of the table tells the segment's beginning (virtual address). The Limit entry tells the segment size (generally; the details are a little more complex).
When a program tries to access memory with an offset resulting access beyond the end of the segment (the CPU compares offset and Limit), the CPU generates an exception and the OS handles it, by usually terminating the program.
Btw, in real/v86 mode, even though the virtual address is formed directly from selector:offset, there's still a 16-bit Limit imposed on offsets, which is why you need to use a different selector to access more than 64KB of memory.
The Base entry in a segment descriptor can be used to either isolate the segment from the rest of the memory (Limit helps here) or to place or move the entire segment to an arbitrary virtual address without having to modify anything (or much) in the program it belongs to (if we're moving a segment, the data has to be moved in the memory, obviously). Basically, it can be used for relocation purposes. In real/v86 mode for relocation purposes the selector is changed.
The virtual address can be further translated to the physical address if the CPU is running in protected mode and has set up page tables. If there're no page tables, the physical address is the same as the virtual address. The translation is done in blocks of physical memory and address ranges that are called pages (often 4KB).
There's no dedicated relocation register on x86 CPUs. Relocation can be achieved by adjusting:
segment selectors in CPU registers or program's code
segment base addresses in GDT/LDT
offsets in program's code
physical addresses in page tables
As for virtual address : reside in the hard disk , as a pages, I'm not sure what exactly you want to say with this, but just because there's virtual to physical address translation, it doesn't mean there's also virtual on-disk memory. There are other uses for the translation besides virtual on-disk memory. And the addresses reside in the CPU and wherever your (and OS's) code writes them to, not necessarily on the disk.
Your description has a number of mistakes, much of which may be the result of imprecise documentation and common usage.
First of all, there really is no such a thing as a virtual address. There are physical and logical addresses. Sadly, the term virtual address is frequently (even in hardware documentation) used when logical address is what is meant..
The CPU instruction stream always operates on logical addresses (values may refer to physical addresses).
When the CPU needs to access a logical address, the MMU attempts to translate it to a physical addresses. It does that by looking up the address in a page table.
Several things can happen at that point:
There may not be a page table entry for the address => Access violation.
The page table entry is marked invalid => Access violation.
The page table entry indicates that no physical memory is mapped to it => Page fault.
(I omit mode access checks).
It is this last step that last step where virtual memory comes into play. At that point the page fault handler of the operating system needs to find where the corresponding page has been stored to disk, load it, update the page table, and restart the instruction.
The operating system manages the available physical memory by paging writeable memory (that has changed) to disk (read only data does not have to be written back) when there is high demand for physical memory.
I have never heard of a "relocation register" before. But doing a GOOGLE search I can see that some academic material uses it as a confusing pedagogical concept (i.e., with no relation to reality).
Some systems define the page table using base and limit registers. The base registers indicate where the page table starts in memory (this can be either a physical or logical addresses) and the limit register indicates the side of the table.
The registers are usually not loaded directly. Their values are usually written to the hardware Process Context Block (PCB). When the process context is loaded, the page table base and limit are loaded automatically.
On some systems there are multiple page tables. If there are system and user page tables, the user page tables can refer to logical addresses in the system space and the system page tables refer to physical addresses.

Resources