Kernel space and user space layout in the page table

Assume that we have a CPU with an MMU that works as follows:
- memory management uses paging only
- every process has its own page table
- the virtual address space of every process is split into user space and kernel space (as on many CPUs with paging)
- the kernel space (that is, its virtual addresses) is shared between all processes (for example, the higher addresses)
Now imagine that we are running several processes (in non-privileged mode, of course). When we want to allocate additional memory for any process, the following scenario applies, in my opinion: the process executes a system call and the OS serves it (in privileged mode) by updating the process's page table, or returns an error code.
My question: when we want to allocate some memory for the kernel, do we have to update the kernel part of the page tables of all processes? As far as I know, CPUs that work this way have no separate kernel page table, which could in theory solve that problem (while introducing others). So how is this situation handled, if it can occur?

Finally, I found the solution. I think this link says it all:
ARM Solution
The key to understanding it is Table 6.13, "Translation table size". In short, the virtual address space of every process is divided into kernel space and user space, as expected, but each space has its own translation table. On a process switch, the kernel table pointer stays constant while the user table pointer is changed.
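To make that concrete, here is a minimal sketch (not taken from the linked document) of what a context switch could do on ARMv7 with split translation tables: TTBR1 holds the shared kernel table base and is never touched, while TTBR0 is repointed at the next process's user table. The function name and the full-TLB invalidate are simplifications; real kernels use ASIDs to avoid flushing the whole TLB.

```c
/* Sketch: per-process switch with split translation tables on ARMv7.
 * Only TTBR0 (user-space table base) changes; TTBR1, which maps the
 * shared kernel region, stays constant across processes.
 * next_user_table_phys is a hypothetical physical address of the next
 * process's first-level translation table. */
static inline void switch_user_table(unsigned long next_user_table_phys)
{
    /* Write TTBR0: MCR p15, 0, <Rt>, c2, c0, 0 */
    __asm__ volatile("mcr p15, 0, %0, c2, c0, 0"
                     : : "r"(next_user_table_phys) : "memory");

    /* Invalidate stale user-space translations (TLBIALL). A real kernel
     * would rely on ASIDs instead of a full flush. */
    __asm__ volatile("mcr p15, 0, %0, c8, c7, 0"
                     : : "r"(0) : "memory");
}
```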

Related

Transient OS Code in Single Partition Allocation?

While going through the slides of a lecture on memory management, I came across this:
Relocation register value is static during program execution. Hence all of the OS must be present (it might be used). Otherwise, we have to relocate user code/data “on the fly”! In other words, we cannot have transient OS code
I couldn't understand what the above lines mean. I would appreciate it if anyone could explain them.
The relocation-register scheme provides an effective way to allow the
operating system’s size to change dynamically. This flexibility is desirable in
many situations. For example, the operating system contains code and buffer
space for device drivers.
If a device driver (or other operating-system service) is not commonly used, we do not want to keep the code and data in memory, as
we might be able to use that space for other purposes. Such code is sometimes
called transient operating-system code; it comes and goes as needed. Thus,
using this code changes the size of the operating system during program
execution.
All logical memory references in user code are mapped to the physical address space using the relocation register value (physical address = relocation register value + logical address).
The value of the relocation register is set by the OS, so it cannot be affected by user processes.
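As a small illustration of that formula, here is a sketch of the translation an MMU with a relocation register performs, with the usual limit-register bound check added; the structure and function names are made up for the example.

```c
/* Relocation-register translation: physical = relocation + logical,
 * guarded by a limit check. Illustrative model, not a real MMU API. */
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t relocation;  /* base physical address of the partition */
    uint32_t limit;       /* size of the partition in bytes */
} mmu_registers;

/* Returns true and fills *physical if the logical address is in bounds;
 * returns false where real hardware would raise an addressing trap. */
bool translate_address(const mmu_registers *mmu, uint32_t logical, uint32_t *physical)
{
    if (logical >= mmu->limit)
        return false;                       /* trap: addressing error */
    *physical = mmu->relocation + logical;  /* Phy.add = Rel.reg_val + log.add */
    return true;
}
```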
Transient code in the operating system is code that is active for only a short time (and useless the rest of the time, such as a routine that checks whether the processor is deadlocked). The relocation-register scheme would like to give the memory occupied by such transient code to some other process in the job queue, shrinking or expanding the OS boundary towards the neighbouring process (because of single-partition allocation).
Given the above, if transient code were present in the OS, the boundary between the OS and the adjacent user process would keep moving, and we would have to relocate user code and data on the fly. Since the relocation register value is static during program execution, this is not possible, which is why all of the OS must remain resident and transient OS code is ruled out.

Does memory address translation need extra access to memory?

I've got a question about virtual memory management, more specifically, the address translation.
When an application runs, the CPU receives instructions containing virtual memory addresses, and translates them into physical addresses via the page table.
My question is, since the page table itself also resides in memory, does that mean the CPU has to access memory twice for a single memory-access instruction? If the answer is no, then how does this actually work? Which part did I miss?
Could anyone give me some details about this?
As usual, the answer is neither yes nor no.
In the worst case you have to do a walk of the page table, which is indeed stored in (some kind of) memory. This is not necessarily just one lookup; it can be multiple lookups, see for example a two-level table (example from Wikipedia).
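To show what such a walk costs, here is an illustrative two-level lookup using the classic 32-bit 10/10/12-bit split; the phys_mem array and read_physical() helper are stand-ins invented for the sketch, not part of any real API.

```c
#include <stdint.h>

#define PRESENT 0x1u

/* Stand-in for physical memory (1 MiB), indexed by physical address. */
static uint8_t phys_mem[1 << 20];

static uint32_t read_physical(uint32_t addr)          /* one memory access */
{
    return *(uint32_t *)&phys_mem[addr];
}

/* Walk a two-level table using the classic 32-bit 10/10/12-bit split.
 * On a TLB miss this costs two extra memory reads: one for the
 * page-directory entry, one for the page-table entry. */
uint32_t walk(uint32_t page_directory_base, uint32_t vaddr)
{
    uint32_t dir_index   = (vaddr >> 22) & 0x3FF;
    uint32_t table_index = (vaddr >> 12) & 0x3FF;
    uint32_t offset      =  vaddr        & 0xFFF;

    uint32_t pde = read_physical(page_directory_base + dir_index * 4);
    if (!(pde & PRESENT))
        return 0;                                     /* would raise a page fault */

    uint32_t pte = read_physical((pde & ~0xFFFu) + table_index * 4);
    if (!(pte & PRESENT))
        return 0;                                     /* would raise a page fault */

    return (pte & ~0xFFFu) | offset;
}
```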
However, this page table is typically accompanied by a hardware assist called the translation lookaside buffer (TLB); this is essentially a cache for the page table, and the lookup process can be seen in this image. It works just as you would expect a cache to work: if a lookup succeeds, you happily continue with the physical fetch; if it fails, you proceed to the aforementioned page walk and update the cache afterwards.
This hardware assist is usually implemented as a CAM (Content Addressable Memory), something mostly used in network processing but also very useful here. It is a memory component that does the lookup not by address but by 'content', i.e. any generic key (the keys don't have to be contiguous, incrementing numbers). In this case the key is your virtual address, and the result of the lookup is your physical address. As the CAM is a separate component and very fast, you could say that as long as you hit it, you incur no extra memory overhead for the virtual-to-physical address translation.
You could ask why they don't put the whole page table in a CAM. Quite simply, CAMs are both quite expensive and, more importantly, quite power-hungry, so you don't want to make them too big (we wouldn't want a laptop that needs 1 kW to run, would we?).
Sometimes.
The MMU contains a cache of virtual to physical address mapping, called a TLB (Translation Lookaside Buffer).
If the page in question is not in the TLB (a TLB miss), then the relevant piece of the page table must first be loaded from main memory into that cache, which requires additional memory accesses.
Finally, if the page cannot be found at all, a trap is raised (a page fault), and the operating system gets an opportunity to fix it, e.g. allocate memory, or load the page from a file or from swap space.
The details of how this is done vary between architectures: on some, a TLB miss also requires software to fill the TLB, though on most it is automatic. (The TLB does have to be flushed on a context switch, when a new page table is loaded, e.g. for a new process.)
More info is available here, for example: https://www.kernel.org/doc/gorman/html/understand/understand006.html
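As a rough software model of that flow (hit, miss-and-refill, fault), here is a toy TLB placed in front of a page-table walk such as the one sketched earlier; a real TLB is hardware, and the tiny array with round-robin replacement below is only an illustration.

```c
/* Toy TLB in front of a page-table walk: hit -> translate directly,
 * miss -> walk the table and refill one entry. Illustrative only. */
#include <stdbool.h>
#include <stdint.h>

#define TLB_ENTRIES 8
#define PAGE_SHIFT  12

struct tlb_entry { uint32_t vpn; uint32_t pfn; bool valid; };
static struct tlb_entry tlb[TLB_ENTRIES];

/* The two-level walk from the earlier sketch (or any other walker). */
extern uint32_t walk(uint32_t page_directory_base, uint32_t vaddr);

uint32_t translate(uint32_t pgd_base, uint32_t vaddr)
{
    uint32_t vpn = vaddr >> PAGE_SHIFT;

    for (int i = 0; i < TLB_ENTRIES; i++)                /* TLB hit: no walk */
        if (tlb[i].valid && tlb[i].vpn == vpn)
            return (tlb[i].pfn << PAGE_SHIFT) | (vaddr & 0xFFF);

    uint32_t paddr = walk(pgd_base, vaddr);              /* TLB miss: page walk */
    if (paddr == 0)
        return 0;                                        /* page-fault path */

    static unsigned next;                                /* trivial replacement */
    tlb[next++ % TLB_ENTRIES] =
        (struct tlb_entry){ .vpn = vpn, .pfn = paddr >> PAGE_SHIFT, .valid = true };
    return paddr;
}
```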

Linux memory mapping

I've got a few questions about Linux memory management (assume an x86 32-bit platform):
1. By default, for all processes the top 1 GB of the virtual address space is mapped to the kernel area. In theory the kernel can map additional memory from high memory using vmalloc. My question is: what happens to the page tables of all the user processes? I assume they would have to be updated about the kernel's memory allocation (that memory may get used when the kernel is in process context).
2. Can someone explain where the x86 logical address mapping limitation comes from? In "Linux Device Drivers", chapter 15, it is said that there is a limitation on mapping logical addresses, but with no deep explanation:
in many cases, even 32-bit processors can address more than 4 GB of physical memory. The limitation on how much memory can be directly mapped with logical addresses remains, however. Only the lowest portion of memory (up to 1 or 2 GB, depending on the hardware and the kernel configuration) has logical addresses; the rest (high memory) does not.
3. When does the kernel switch to its own page table (not counting boot time)? When it is in process context or interrupt context it uses the user-mode process's page table, and kernel threads use a process page table as well.
1.) There is only one set of 256 kernel page tables that map the kernel's 1 GiB region. The top 256 entries of each user-space page directory point to these same page tables. Thus, if the kernel changes a virtual mapping, all user-space processes get the update as well (a sketch of how that sharing is set up follows at the end of this answer).
2.) I'm not sure which limitations you mean, can you quote some text so I can find the passage in the book.
3.) When a process, like QEMU, starts a virtual CPU with kvm, the kernel swaps out the page table of the process, even though it doesn't yield to a different process. There may be more places like this, but in general, I don't think there is such a thing as a "kernel page table". All process page tables already map kernel memory, and it would thus seem wasteful to switch them out.
"Linux Device Drivers" is a great reference, but I can also recommend "Understanding the Linux Virtual Memory Manager", and of course, the kernel's source code.

Does virtual address matching matter in shared mem IPC?

I'm implementing IPC between two processes on the same machine (Linux x86_64 shmget and friends), and I'm trying to maximize the throughput of the data between the processes: for example I have restricted the two processes to only run on the same CPU, so as to take advantage of hardware caching.
My question is, does it matter where in the virtual address space each process puts the shared object? For example would it be advantageous to map the object to the same location in both processes? Why or why not?
It doesn't matter as far as the OS is concerned. It would only have been advantageous to use the same base address in both processes if the TLB cache weren't flushed between context switches. The Translation Lookaside Buffer (TLB) cache is a small buffer that caches virtual-to-physical address translations for individual pages in order to reduce the number of expensive memory reads from the process page table. Whenever a context switch occurs, the TLB cache is flushed - you don't want a process to be able to read a small portion of another process's memory just because its page table entries are still cached in the TLB.
A context switch does not occur between processes running on different cores, but then each core has its own TLB cache, whose content is completely uncorrelated with that of the other core's TLB. Also, a TLB flush does not occur when switching between threads of the same process, but threads share their whole virtual address space anyway.
It only makes sense to attach the shared memory segment at the same virtual address if you pass around absolute pointers to areas inside it. Imagine, for example, a linked-list structure in shared memory. The usual practice is to use offsets from the beginning of the block instead of absolute pointers, but this is slower as it involves additional pointer arithmetic. That's why you might get better performance with absolute pointers; however, finding a suitable place in the virtual address space of both processes might not be an easy task (at least not in a portable way), even on platforms with vast VA spaces like x86-64.
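For illustration, here is a minimal sketch of the offset-based convention described above (the struct layout and NIL_OFFSET sentinel are invented for the example); the extra arithmetic is the node_at() computation that absolute pointers would avoid.

```c
/* Linked-list node kept in a shared memory segment, linked by offsets
 * from the segment base rather than by raw pointers, so the structure
 * stays valid even if the two processes attach the segment at
 * different virtual addresses. */
#include <stddef.h>
#include <stdint.h>

#define NIL_OFFSET ((size_t)-1)   /* "null" link */

struct shm_node {
    size_t next_offset;   /* offset of the next node within the segment */
    int    value;
};

/* Resolve an offset against this process's own view of the segment. */
static inline struct shm_node *node_at(void *segment_base, size_t offset)
{
    if (offset == NIL_OFFSET)
        return NULL;
    return (struct shm_node *)((char *)segment_base + offset);
}
```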
I'm not an expert here, but seeing as there are no other answers I will give it a go. I don't think it will really make a difference, because the virtual address does not necessarily correspond to the physical address. Said another way, the underlying physical address the OS maps your virtual address to does not depend on the virtual address the OS gives you.
Again, I'm not a memory master. Sorry if I am way off here.

Does the last GB in the address space of a linux process map to the same physical memory?

I read that the first 3 GB are reserved for the process and the last 1 GB is for the kernel. I also read that the kernel is loaded starting from the 2nd MB of the physical address space (depending on the configuration). My question is: is the mapping of that last 1 GB the same for all processes, and does it map to this physical area of memory?
Another question: when a process switches to kernel mode (e.g., when a system call occurs), which page tables are used, the process page tables or the kernel page tables? If kernel page tables are used, then the kernel can't access the memory locations belonging to the process. If that is the case, then there is apparently no use for a separate kernel virtual address space, since all access to kernel code and data would go through the mapping of the last 1 GB of the process address space. Please help me clarify this (any useful links would be much appreciated).
It seems you are talking about 32-bit x86 systems, right?
If I am not mistaken, the kernel can be configured not only for a 3 GB/1 GB memory split; there are other variants (e.g. 2 GB/2 GB). Still, 3 GB/1 GB is probably the most common one on x86-32.
The kernel part of the address space should be inaccessible from user space. From the kernel's point of view, yes, the mapping of the memory occupied by the kernel itself is always the same, no matter in the context of which process (or interrupt handler, or whatever else) the kernel currently operates.
As one of the consequences, if you look at the addresses of kernel symbols in /proc/kallsyms from different processes, you will see the same addresses each time. And these are exactly the addresses of the respective kernel functions, variables and others from the kernel's point of view.
So I suppose the answer to your first question is "yes", but it is probably not very useful for user-space code, as kernel-space memory is not directly accessible from there anyway.
As for the second question: if the kernel currently operates in the context of some process, it can actually access the user-space memory of that process. I can't describe it in detail, but the implementation of the kernel functions copy_from_user and copy_to_user should give you some hints; see arch/x86/lib/usercopy_32.c and arch/x86/include/asm/uaccess.h in the kernel sources. It seems that on x86-32 the user-space memory is accessed in these functions directly, using the default memory mappings for the current process context. The 'magic' there is only related to optimizations and to checking the address of the memory area for validity.
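For a feel of how that looks in practice, here is a generic sketch of a driver write handler using copy_from_user() to pull data out of the calling process's address space while running in its context; the handler itself is a made-up example, only the copy_from_user() call and its return convention are the real kernel API.

```c
/* Sketch of kernel code accessing user-space memory from process
 * context. copy_from_user() returns the number of bytes it could NOT
 * copy, so a non-zero result means the user pointer was bad. */
#include <linux/fs.h>
#include <linux/uaccess.h>

static ssize_t demo_write(struct file *filp, const char __user *buf,
                          size_t count, loff_t *ppos)
{
    char kbuf[64];

    if (count > sizeof(kbuf))
        count = sizeof(kbuf);

    if (copy_from_user(kbuf, buf, count))
        return -EFAULT;

    /* ... use kbuf while still in the calling process's context ... */
    return count;
}
```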
Yes, the mapping of the kernel part of the address space is the same in all processes. Part of it does map that part of the physical memory where the kernel image is loaded, but that's not the bulk of it - the remainder is used to map other physical memory locations for the kernel's runtime working set.
When a process switches to kernel mode, the page tables are not changed. The kernel part of the address space simply becomes accessible because the CPL (Current Privilege Level) is now zero.
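A toy model of the check this answer refers to: kernel mappings keep the user/supervisor bit clear, so they only become accessible once the CPL drops to 0, without any page-table switch. The bit positions are the 32-bit x86 ones; everything else (the function, the simplified rule ignoring write/execute permissions) is illustrative.

```c
/* Simplified x86 access check: a page whose U/S bit is clear can only
 * be touched at CPL 0, which is why the kernel half of the address
 * space "appears" on entry to kernel mode with unchanged page tables. */
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT 0x1u   /* bit 0: mapping is valid */
#define PTE_USER    0x4u   /* bit 2: accessible at CPL 3 (user mode) */

bool access_allowed(uint32_t pte, unsigned cpl)
{
    if (!(pte & PTE_PRESENT))
        return false;                  /* page fault: not present */
    if (cpl == 3 && !(pte & PTE_USER))
        return false;                  /* page fault: supervisor-only page */
    return true;
}
```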
