Setting up memory pagination - memory-management

Next, the loader creates a basic page table. This page table maps the 64 MB at the base of virtual memory (starting at virtual address
0) directly to the identical physical addresses. It also maps the same
physical memory starting at virtual address LOADER_PHYS_BASE, which
defaults to 0xc0000000 (3 GB). The Pintos kernel only wants the latter
mapping, but there's a chicken-and-egg problem if we don't include the
former: our current virtual address is roughly 0x20000, the location
where the loader put us, and we can't jump to 0xc0020000 until we turn
on the page table, but if we turn on the page table without jumping
there, then we've just pulled the rug out from under ourselves.
A citation comes from: https://web.stanford.edu/~ouster/cgi-bin/cs140-winter16/pintos/pintos_6.html
I think that I don't understand exactly what does it is written above.
My issues:
Let's assume that kernel was loaded at 0x20000 physical address by bootloader. From what I understand two different virtual address are mapped to the same place- to the kernel. The first, direct mapping: ~0x20000 -> ~0x20000. And the second ~0xc0000000 -> ~0x20000. But, why there is need to use two mappings?
I cannot see why the page table ( and page directory) must be mapped by identity.
Please explain.

Related

What happens when you lose the virtual address of the page directory?

I'm writing a memory manager for my kernel (32 bit x86)
In the course of this... I'm facing a bit of a dilemma....
Description of virtual memory map:
Identity map of first 4 Mb
Virtual address 0xC0000000 mapped to physical address 0x100000 (Also a 4 Mb map)
My page directory is at physical address 0x9c000.
My page table 1 is at physical address 0x9d000.
My page table 2 is at physical addres 0x9e000.
(I need only two page tables here :) ... These correspond to the identity map and higher memory map respectively)
Bless the identity mapping.... I can safely access my page directory and page tables as if paging wasn't even enabled. This makes it really easy for me to modify page tables,etc.
Now comes the issue: I may remove this identity mapping... If so, I can already imagine problems creeping up..
Eg. I have physical addresses that I want to access... But I can only access virtual ones. In order to map the virtual address to the required physical address, I need to access the page directory. But I have the physical address of the page directory... *I realize that I'm back where I started.
So, I'm guessing there's a need for some permanent mapping (or some sort of identity mapping for tables and the directory) so that I can forget about all this and get on with my life.
But if I map something permanently, I feel that I'm reducing the flexibilty of the program(kernel) in some sort of way.
What's the way one deals with this issue?
What happens when you lose the virtual address of the page directory? You can always get the physical address from cr3, but you have no idea where it's mapped, how to access it, and whatnot. In this case, I don't think one can even change the page directory location using cr3 because you'd be loading a physical address into it, but all that you can view are virtual addresses... It seems like a really scary situation here
Am I missing something?
Don't lose your virtual Page directory address! You need at least a small portion of the page directory at a know physical and a known virtual address (they do not need identity mapped).

how a virtual address is mapped to address on the swap partition in paging operation

I'm wondering if anyone could help me understand how a virtual address is mapped to its address one the backing store, which is used to hold moved-out pages of all user processes.
Is it a static mapping or a hash algorithm? If it's static, where such mapping is kept? It seems it can't be in the TLB or page table since according to https://en.wikipedia.org/wiki/Page_table, the PTE will be removed from both TLB and page table when a page is moved out. A description of the algorithm and C structs containing such info will be helpful.
Whether it's static mapping or hash algorithm, how to garrantee no 2 process will map its address to the same location on the swap partition, since the virtual address space of each process is so big (2^64) and the swap space is so small?
So:
during page-in, how the OS know where to find the address (corresponding to the virtual address accessed by the user process) on the swap partition to move in?
when a physical page needs to be paged out, how does the OS know where to put on the swap partition?
For the first part of your question : It is actually hardware dependent but the generic way is to keep a reference to the swap block containing the swapped out page (Depending on the implementation of the swap subsystem, it could be a pointer or a block number or an offset into a table) in it's corresponding page table entry.
EDIT:The TLB is a fast associative cache that help to do the virtual to physical page mapping very quickly. When a page is swapped out, it's entry in the TLB could be replaced by a newly active Page. But the entry in the page table cannot be replaced because page tables are not associative memory. A page table remains persistent in memory for all the duration of the process and no entry could be removed or replaced (By another virtual page). Entries in page tables could only be mapped or unmapped. When they are unmapped (Because of Swapping or freeing), the content of the entry could either hold a reference to the swap block or just an invalid value.
For the second part of your question : The system kernel maintains a list of free blocks in the swap partition. Whenever it needs to evict a RAM page, it allocates a free block and then the block reference is returned so that it can be inserted in the PTE. When the page comes back to RAM, the disk block is freed so that it could be used by other pages.
During page-in, how the OS know where to find the page (corresponding to the virtual address accessed by the user process) on the swap device to move in?
That's can actually be a fairly complicated process. The operating system has to maintain a table of where the process's pages are mapped to. This can be complicated because pages can be mapped to multiple devices and even multiple files on the same device. Some systems use the executable file for paging.
when a physical page needs to be paged out, after the virtual address for a physical page is looked up in TLB, how does the OS know where to put on the swap device?
On a rationally designed operating system, the secondary storage is allocated (or determined) when the virtual page is mapped to the process. That location remains fixed for the duration of the program being run.

linux kernel page table update

In linux x86 paging.
each process has it's own page directory.
page table walking starts with page directory which is pointed by CR3.
every process shares the kernel page directory content
assuming three sentences are correct, let's say some process enters kernel
mode and updates his kernel page directory content(address mapping, access
rights, etc...)
Question. since kernel address spaces is globally shared among processes,
this update has to be synchronized with other process's page directory,
right?
how can this be managed?
I don't know about Linux, so I'll answer for Windows. Some of the kernel space is 'global', which is a flag set in the PTE to indicate it is used by more than one process. The INVPCID instruction can be configured in the register operand to include or exclude these entries in a TLB invalidate. These page table entries are shared between the processes and all appear at the same place in the page table for each process. This way, only the single PTE needs to be updated and it doesn't need to synchronise other PTEs of other processes as they all share a single PTE at a physical address.
http://www.cs.miami.edu/home/burt/journal/NT/memory.html
Some kernel memory is not visible to all processes and is private to each process (doesn't change the fact it is still ring 0). This, on a 32 bit Windows system would be 0xC0000000–0xC0200000 which contains all the user space PTEs and PDEs where 0xC0000000 is the PTE_BASE which allows for the equation
#define MiGetPteAddress (x) ((PMMPTE)(((((ULONG)(x)) >> 12) << 2) + (ULONG_PTR)MmPteBase))
#define MiAddressToPte(x) MiGetPteAddress(x)
to work elegantly for converting faulting virtual address in cr2 to the address of the PTE. This is private to each process as each process has the same base PTE allocation base address; if it were visible to all processes it would quickly take up virtual memory as each set of page takes would have to be allocated sequentially. It doesn't need to be visible to all processes because a process has no interest in the page table entries of another process. A page fault is always handled in the context of the current process, and 0xC0000000–0xC0200000 means something different in each process context.
The kernel space 0xC0200000–0xC0400000 for allocation of kernel PTEs (for kernel addresses) would however be global and shared by all processes, except for the section within it representing 0xC0000000–0xC0200000, which by my calculation will be 0xC0300000–0xC0300800, which is the user-mode side of the PDEs as PDE_BASE = 0xC0300000–0xC0300FFF.
It is however impossible to split up the user PDE and kernel PDE section such that the former is private and the latter is global (i.e. make 0xC0300000–0xC0300800 private (point to different physical addresses) and 0xC0300000–0xC0300FFF point to the same physical address for each process) because the whole PDE region (0xC0300000–0xC0300FFF) will lie on the same physical frame and constitutes a single frame pointed to by cr3, and the cr3 is different for each process, which means that the whole PDE region (all PDEs) would have to private per process (duplicated and installed per process). If a kernel page table page (a page containing a kernel page table) were paged out and in to a new physical location then the PDEs would all have to be synchronised because all processes have copies at different cr3 physical addresses and not the same physical PDE. I'm not sure how it does this (efficiently) ATM therefore it would be wise to impose the restriction of not allowing the kernel page tables to be paged out and have them in non-paged pool; this way the kernel PDEs will remain constant across all CR3 pages. On 64 bit, the restriction could be imposed that kernel PDPTs can't be paged out. On 32 bit Windows, a process is started with a physical CR3 page with a PDE at offset 1100000000(base 2)*4 bytes pointing to itself which is hardwritten in, probably by briefly turning off paging in cr0 (because the write won't succeed without the recursive entry that needs to be written being there, creating a paradox). Notice, the PD Entry for itself is the page table that covers the range 0xC0000000–0xC0400000 i.e. it points to 1023 page tables and 1 page directory (itself) (2^10 entries) and hence allows the PTEs to be modified by their virtual address. The reason why the CR3 page is at 0xC0300000 is because the address has the same page directory and page table indexes 1100000000 and 1100000000 so it loops back on itself twice, therefore yielding the CR3 page and you can modify the PDEs by address (there are other addresses that are special like this e.g. 0xE0380000). After it is set up, the appropriate kernel mappings are made. On 64 bit Windows it would be similar where a process is set up with a single PML4 table page which points to itself and this way any PML4E, PDPTE, PDE or PTE can be filled in and accessed due to the variable amount of loopbacks. On 64 bit Windows, when a process is terminated, all the physical pages of the process get moved to the free list which would include all user physical PDPT pages, PD pages, PT pages and the PML4/CR3 page. The kernel ones would not be marked for the free list.
In general, if you know what entry in the PML4 is the recursive entry to the physical PML4 page you, can work out the virtual address of the PTE structure that services (is used to translate) a particular virtual address range and a particular virtual address in that range. You append the offset (10 bits for 32 bit; 9 bits for 64 bit) in the PML4 to the entry to itself, to the start of a virtual address whose servicing PTE virtual address you want to find (which is what the addition of 0xC0000000 is in the 32 bit equation earlier) and remove the last 12 bits and then make up the offset in the PT now at the end of the virtual address to 12 bits by multiplying it by 8 (or 4) (hence the right shift by 12 and the left shift by 3 (or 2 for 32 bit entries)). 1 loopback takes away 1 layer of indirection and you get the virtual address of the PTE. 2 loopbacks will leave you with the virtual address of the PDE that's used to translate that particular virtual address and so on. PTE_BASE on 32 bit windows is the offset 110000000 left shifted to make 32 bits and PDE_BASE is the offset 110000000110000000 left shifted to make 32 bits. It is used in the macro and any virtual address with this prefix will by definition be part of a PTE or a PDE respectively. Windows chooses the offset 1100000000 for the page table hierarchy but it could be any one of the 2^9 combinations.
KAISER, or KPTI, designed to mitigate meltdown, most likely has 2 cr3s for each process. Upon trapping to the kernel, the restricted cr3 for user mode which would contain a single kernel PML4E—enough for a preliminary interrupt dispatch routine function to be accessible, which performs the swap—would be replaced with the full cr3 containing all kernel PML4Es.
As for physical memory on windows, see here: https://superuser.com/a/1549970/933117
Question. since kernel address spaces is globally shared among processes, this update has to be synchronized with other process's page directory, right?
how can this be managed?
First; understand that paging is usually 2 or more levels of tables. For example (for 80x86), for the oldest "plain 32-bit paging" there are page tables and page directories; and for current long mode there's page map level 4, page directory pointer table, page directory and page table. CR3 points to the highest level table and that must be different for each virtual address space ("process"). For the second highest level table, a single second highest level table can be put into all highest level tables, and if you do that any changes to the second highest level table will automatically change every virtual address space.
This means that (for 80x86), for the oldest "plain 32-bit paging" you can put the same "kernel page table" into all virtual address spaces (all page directories) and when you add/remove pages from that page table it will automatically affect all virtual address spaces; and for current long mode you can put the same page directory pointer table in all virtual address spaces (all page map level 4 tables) and when you add/remove page directories, page tables, or pages, it will automatically affect all virtual address spaces.
This means that you only really need some way to change second highest level page tables (or, some way to change all highest level page table entries). There are multiple ways to do this. The easiest is pre-allocation. For example, if you say "kernel space will always be N MiB" you can pre-allocate all the second highest level tables you'd need for "N MiB" during boot and never change them (e.g. for long mode, you could say that kernel space will be 512 GiB, pre-allocate a single "kernel page directory pointer table", and put that into every page map level 4 when a virtual address space is being created, and then rely on all other changes (to page directories, page tables, etc) automatically affecting the kernel space for all virtual address spaces). I believe this is the method Linux uses (partly because Linux uses the silly "map all RAM into kernel space" security disaster at boot).
However, this is just the table changes alone. There are 2 other concerns.
The first "other concern" is the CPU's translation look-aside buffers (TLBs); which need to be flushed when (virtual address to physical address) translation/s change. Most operating systems use a combination of "lazy TLB shootdown" (where a CPU using wrong information from TLB causes a page fault and page fault handler invalidates and returns so the software that caused the page fault can continue with the new/correct translation without knowing anything happened) and "multi-CPU TLB shootdown" (where you send an inter-processor interrupt to other CPUs and that interrupt handler invalidates the TLB entries).
The second "other concern" is making sure CPUs don't try to change the same thing at the same time. This typically ends up being a problem solved at a higher level. For example, if you acquire a lock for a certain data structure (before changing something in that data structure) and realize you need to allocate/free pages for that data structure (while you're trying to make the changes); then the code that modifies paging tables doesn't need to care about different CPUs changing the page tables at the same time because it knows that something at a higher level (the data structure's lock) already ensures that can't happen.
When the kernel changes page table entries, these updates must be made atomically:
In the 64bit kernel this can be conveniently done using 64bit memory operations, while i386 needs to use CMPXCHG8.
(Source)

Page table in Linux kernel space during boot

I feel confuse in page table management in Linux kernel ?
In Linux kernel space, before page table is turned on. Kernel will run in virtual memory with 1-1 mapping mechanism. After page table is turned on, then kernel has consult page tables to translate a virtual address into a physical memory address.
Questions are:
At this time, after turning on page table, kernel space is still 1GB (from 0xC0000000 - 0xFFFFFFFF ) ?
And in the page tables of kernel process, only page table entries (PTE) in range from 0xC0000000 - 0xFFFFFFFF are mapped ?. PTEs are out of this range will be not mapped because kernel code never jump there ?
Mapping address before and after turning on page table is same ?
Eg. before turning on page table, the virtual address 0xC00000FF is mapped to physical address 0x000000FF, then after turning on page table, above mapping does not change. virtual address 0xC00000FF is still mapped to physical address 0x000000FF. Different thing is only that after turning on page table, CPU has consult the page table to translate virtual address to physical address which no need to do before.
The page table in kernel space is global and will be shared across all process in the system including user process ?
This mechanism is same in x86 32bit and ARM ?
The following discussion is based on 32-bit ARM Linux, and version of kernel source code is 3.9
All your questions can be addressed if you go through the procedure of setting up the initial page table(which will be overwitten later by function paging_init ) and turning on MMU.
When kernel is first launched by bootloader, Assembly function stext(in arch\arm\kernel\head.s) is the first function to run. Note that MMU has not been turned on yet at this moment.
Among other things, the two import jobs done by this function stext is:
create the initial page tabel(which will be overwitten later by
function paging_init )
turn on MMU
jump to C part of kernel initialization code and carry on
Before delving into the your questions, it is benificial to know:
Before MMU is turned on, every address issued by CPU is physical
address
After MMU is turned on, every address issued by CPU is virtual address
A proper page table should be set up before turning on MMU, otherwise your code will simply "be blown away"
By convention, Linux kernel uses higher 1GB part of virtual address and user land uses the lower 3GB part
Now the tricky part:
First trick: using position-independent code.
Assembly function stext is linked to address "PAGE_OFFSET + TEXT_OFFSET"(0xCxxxxxxx), which is a virtual address, however, since MMU has not been turned on yet, the actual address where assembly function stext is running is "PHYS_OFFSET + TEXT_OFFSET"(the actual value depends on your actual hardware), which is a physical address.
So, here is the thing: the program of function stext "thinks" that it is running in address like 0xCxxxxxxx but it is actually running in address (0x00000000 + some_offeset)(say your hardware configures 0x00000000 as the starting point of RAM). So before turning on MMU, the assembly code need to be very carefully written to make sure that nothing goes wrong during the execution procedure. In fact a techinque called position-independent code(PIC) is used.
To further explain the above, I extract several assembly code snippets:
ldr r13, =__mmap_switched # address to jump to after MMU has been enabled
b __enable_mmu # jump to function "__enable_mmu" to turn on MMU
Note that the above "ldr" instruction is a pseudo instruction which means "get the (virtual) address of function __mmap_switched and put it into r13"
And function __enable_mmu in turn calls function __turn_mmu_on:
(Note that I removed several instructions from function __turn_mmu_on which are essential instructions to the function but not of our interest)
ENTRY(__turn_mmu_on)
mcr p15, 0, r0, c1, c0, 0 # write control reg to enable MMU====> This is where MMU is turned on, after this instruction, every address issued by CPU is "virtual address" which will be translated by MMU
mov r3, r13 # r13 stores the (virtual) address to jump to after MMU has been enabled, which is (0xC0000000 + some_offset)
mov pc, r3 # a long jump
ENDPROC(__turn_mmu_on)
Second trick: identical mapping when setting up initial page table before turning on MMU.
More specifically, the same address range where kernel code is running is mapped twice.
The first mapping, as expected, maps address range 0x00000000(again,
this address depends on hardware config) through (0x00000000 +
offset) to 0xCxxxxxxx through (0xCxxxxxxx + offset)
The second mapping, interestingly, maps address range 0x00000000
through (0x00000000 + offset) to itself(i.e.: 0x00000000 -->
(0x00000000 + offset))
Why doing that?
Remember that before MMU is turned on, every address issued by CPU is physical address(starting at 0x00000000) and after MMU is turned on, every address issued by CPU is virtual address(starting at 0xC0000000).
Because ARM is a pipeline structure, at the moment MMU is turned on, there are still instructions in ARM's pipeine that are using (physical) addresses that are generated by CPU before MMU is turned on! To avoid these instructions to get blown up, an identical mapping has to be set up to cater them.
Now returning to your questions:
At this time, after turning on page table, kernel space is still 1GB (from 0xC0000000 - 0xFFFFFFFF ) ?
A: I guess you mean turning on MMU. The answer is yes, kernel space is 1GB(actually it also occupies several mega bytes below 0xC0000000, but that is not of our interest)
And in the page tables of kernel process, only page table entries (PTE) in range from 0xC0000000 - 0xFFFFFFFF are mapped ?. PTEs are out
of this range will be not mapped because kernel code never jump there
?
A: While the answer to this question is quite complicated because it involves lot of details regarding specific kernel configurations.
To fully answer this question, you need to read the part of kernel source code that set up the initial page table(assembly function __create_page_tables) and the function which sets up the final page table(C function paging_init).
To put it simple, there are two levels of page table in ARM, the first page table is PGD, which occupies 16KB. Kernel first zeros out this PGD during initialization process and does the initial mapping in assembly function __create_page_tables. In function __create_page_tables, only a very small portion of address space is mapped.
After that, the final page table is set up in function paging_init, and in this function, a quite large portion of address space is mapped. Say if you only have 512M RAM, for most common configurations, this 512M-RAM would be mapping by kernel code section by section(1 section is 1MB). If your RAM is quite large(say 2GB), only a portion of your RAM will be directly mapped.
(I will stop here because there are too many details regarding Question 2)
Mapping address before and after turning on page table is same ?
A: I think I've already answered this question in my explanation of "Second trick: identical mapping when setting up initial page table before turning on MMU."
4 . The page table in kernel space is global and will be shared across
all process in the system including user process ?
A: Yes and no. Yes because all processes share the same copy(content) of kernel page table(higher 1GB part). No because each process uses its own 16KB memory to store the kernel page table(although the content of page table for higher 1GB part is identical for every process).
5 . This mechanism is same in x86 32bit and ARM ?
Different Architectures use different mechanism
When Linux enables the MMU, it is only required that the virtual address of the kernel space is mapped. This happens very early in booting. At this point, there is no user space. There is no restrictions that the MMU can map multiple virtual addresses to the same physical address. So, when enabling the MMU, it is simplest to have a virt==phys mapping for the kernel code space and the mapping link==phys or the 0xC0000000 mapping.
Mapping address before and after turning on page table is same ?
If the physical code address is Oxff and the final link address is 0xc00000FF, then we have a duplicate mapping when turning on the MMU. Both 0xff and 0xc00000ff map to the same physical page. A simple jmp (jump) or b (branch) will move from one address space to the other. At this point, the virt==phys mapping can be removed as we are executing at the final destination address.
I think the above should answer points 1 through 3. Basically, the booting page tables are not the final page tables.
4 . The page table in kernel space is global and will be shared across all process in the system including user process?
Yes, this is a big win with a VIVT cache and for many other reasons.
5 . This mechanism is same in x86 32bit and ARM?
Of course the underlying mechanics are different. They are different even for different processors within these families; 486 vs P4 vs Amd-K6; ARM926 vs Cortex-A5 vs Cortex-A8, etc. However, the semantics are very similar.
See: Bootmem#lwn.net - An article on the early Linux memory phase.
Depending on the version, different memory pools and page table mappings are active during boot. The mappings we are all familiar with do not need to be in place until init runs.

How are base registers, limit registers and relocation registers used?

My understanding in address translation process in MMU(memory management unit)
-> logical address : generated by cpu.programmer concern with this address.
-> virtual address : reside in the hard disk , as a pages.
-> physical address : reside in the RAM. It is the actual address.
1: cpu generate the logical address and send it to the MMU.
2: MMU translate the logical address into the virtual address then translate it to the physical address and send the physical address to RAM.
3: when ever the RAM is full , the page which is not used rapidly is returned to the hard disk , to allocate memory to the other pages(processes).
my questions are :
1) where the value of Relocation register is added?
2) who decide the value of Relocation Register?
3) what to do with the Base register and Limit register , how to use it?
4) where the logical address goes off?
If any body can answer it , It would be grateful to me.
It is requested that , let me know it any misunderstanding in this topic.
-thanks
I can tell you how this works on x86.
All programs in non-64-bit modes operate with addresses combined of two items: segment selector (for brevity "selector" is often omitted in text and that may be confusing) and offset. This selector:offset pair is called the logical address.
The selector portion isn't always explicitly specified or manipulated with in code since the CPU has "default" associations of segment registers containing selectors with specific instructions or specific instruction encodings. It's also uncommon to manipulate selectors in 32-bit mode, but is very often necessary in 16-bit code.
The virtual address is formed from the logical address either "directly" (in real or 8086 virtual mode) or "indirectly" (in protected mode).
"Direct" virtual address = selector * 16 + offset.
"Indirect" virtual address = SegmentDescriptorTable[selector].Base + offset.
SegmentDescriptorTable is either the Global Descriptor Table (AKA GDT) or the Local Descriptor Table (AKA LDT). It's set up by the OS and describes the location and size of various segments of memory. selector is used to select a segment in the table. The Base entry of the table tells the segment's beginning (virtual address). The Limit entry tells the segment size (generally; the details are a little more complex).
When a program tries to access memory with an offset resulting access beyond the end of the segment (the CPU compares offset and Limit), the CPU generates an exception and the OS handles it, by usually terminating the program.
Btw, in real/v86 mode, even though the virtual address is formed directly from selector:offset, there's still a 16-bit Limit imposed on offsets, which is why you need to use a different selector to access more than 64KB of memory.
The Base entry in a segment descriptor can be used to either isolate the segment from the rest of the memory (Limit helps here) or to place or move the entire segment to an arbitrary virtual address without having to modify anything (or much) in the program it belongs to (if we're moving a segment, the data has to be moved in the memory, obviously). Basically, it can be used for relocation purposes. In real/v86 mode for relocation purposes the selector is changed.
The virtual address can be further translated to the physical address if the CPU is running in protected mode and has set up page tables. If there're no page tables, the physical address is the same as the virtual address. The translation is done in blocks of physical memory and address ranges that are called pages (often 4KB).
There's no dedicated relocation register on x86 CPUs. Relocation can be achieved by adjusting:
segment selectors in CPU registers or program's code
segment base addresses in GDT/LDT
offsets in program's code
physical addresses in page tables
As for virtual address : reside in the hard disk , as a pages, I'm not sure what exactly you want to say with this, but just because there's virtual to physical address translation, it doesn't mean there's also virtual on-disk memory. There are other uses for the translation besides virtual on-disk memory. And the addresses reside in the CPU and wherever your (and OS's) code writes them to, not necessarily on the disk.
Your description has a number of mistakes, much of which may be the result of imprecise documentation and common usage.
First of all, there really is no such a thing as a virtual address. There are physical and logical addresses. Sadly, the term virtual address is frequently (even in hardware documentation) used when logical address is what is meant..
The CPU instruction stream always operates on logical addresses (values may refer to physical addresses).
When the CPU needs to access a logical address, the MMU attempts to translate it to a physical addresses. It does that by looking up the address in a page table.
Several things can happen at that point:
There may not be a page table entry for the address => Access violation.
The page table entry is marked invalid => Access violation.
The page table entry indicates that no physical memory is mapped to it => Page fault.
(I omit mode access checks).
It is this last step that last step where virtual memory comes into play. At that point the page fault handler of the operating system needs to find where the corresponding page has been stored to disk, load it, update the page table, and restart the instruction.
The operating system manages the available physical memory by paging writeable memory (that has changed) to disk (read only data does not have to be written back) when there is high demand for physical memory.
I have never heard of a "relocation register" before. But doing a GOOGLE search I can see that some academic material uses it as a confusing pedagogical concept (i.e., with no relation to reality).
Some systems define the page table using base and limit registers. The base registers indicate where the page table starts in memory (this can be either a physical or logical addresses) and the limit register indicates the side of the table.
The registers are usually not loaded directly. Their values are usually written to the hardware Process Context Block (PCB). When the process context is loaded, the page table base and limit are loaded automatically.
On some systems there are multiple page tables. If there are system and user page tables, the user page tables can refer to logical addresses in the system space and the system page tables refer to physical addresses.

Resources