What is the difference between kernel logical address space, kernel virtual address space and user virtual address space - memory-management

Let me lay out my understanding.
Suppose we have a 32-bit memory address space for a system, so a process can access any memory in the 4 GB range.
If the system has 4 GB of RAM, the kernel divides it 1:3: 1 GB for the kernel and the remaining 3 GB for user space processes.
A user space process gets access to system memory within that 3 GB only, and which address it gets is determined by the page table.
Kernel logical address space is that 1 GB (approximately 896 MB) of memory which is reserved only for the kernel. Is this correct?
Kernel virtual address space is the memory that is left, i.e. 104 MB + 3 GB, which can also be assigned to user space. Is this correct?
A user virtual address is an address generated by a user space process, and its corresponding memory is assigned by the kernel from the 3 GB reserved for user space processes.
Is my understanding above correct? If not, can you please explain in detail the difference between kernel logical address space, kernel virtual address space and user virtual address space?

Your understanding is a mixture of right and wrong. I'll try to point out some of the issues:
On 32-bit machines, we're not always limited to 4 GB of addressable RAM; check this question for more detail: link
Memory is an abstraction for user space programs: they see it as one big contiguous chunk of memory, but the kernel manages this abstraction with hardware support from the MMU, which maps the virtual addresses used by the user space program to actual physical addresses, or even to some block on the hard drive if swapping is enabled.
The kernel can actually access the physical memory in order to manage the abstraction mentioned above; it can also use this abstraction itself, depending on how the kernel is designed.
As for the difference between virtual and logical addressing, check this answer: link

Related

Physical memory mapping and location of page tables

I have a picture of the virtual address space of a process (for x86-64):
However, I am confused about a few things.
What is the "Physical memory map" region for?
I know the 4-page tables are found in the high canonical region but where exactly are they? (data, code, stack, heap or physical memory map?)
What is the "Physical memory map" region for?
Direct-mapping of all physical RAM (usually with hugepages) allows easy access to memory given a physical address. (i.e. add an offset to generate a virtual address you can use to load or store from there.)
Having phys<->virt be cheap makes it easier to manage memory allocations, so you can primarily track what regions of physical memory are in use.
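To make the "add an offset" idea concrete, here is a minimal sketch of the two conversions. The base address used below is an assumption (the commonly documented start of the direct map on x86-64 Linux with 4-level paging and no KASLR); the real value depends on the kernel configuration, and the function names are only illustrative.

```python
# Assumption: start of the direct map of all physical memory on x86-64 Linux
# (4-level paging, KASLR disabled). Illustrative only; the real base is
# configuration dependent.
DIRECT_MAP_BASE = 0xFFFF_8880_0000_0000


def phys_to_virt(phys: int) -> int:
    """Kernel virtual address inside the direct map for a physical address."""
    return DIRECT_MAP_BASE + phys


def virt_to_phys(virt: int) -> int:
    """Inverse conversion; only meaningful for addresses in the direct map."""
    if virt < DIRECT_MAP_BASE:
        raise ValueError("address is not in the direct-mapped region")
    return virt - DIRECT_MAP_BASE


if __name__ == "__main__":
    va = phys_to_virt(0x1234_5000)
    print(hex(va), hex(virt_to_phys(va)))
```

Both directions are a single addition or subtraction, which is why an allocator working in this region only needs to track physical pages.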
This is how kmalloc works: it returns a kernel virtual address that points into the direct-mapped region. This is great: it doesn't have to spend any time finding free virtual address space as well, just bookkeeping for physical memory. And it doesn't have to create or modify any page tables (And freeing doesn't have to tear down page tables and invlpg.)
kmalloc requires the memory to be contiguous in physical memory, rather than stitching together multiple 4k pages into a contiguous virtual allocation (that's what vmalloc does). That's one reason not to use kmalloc for everything, e.g. for larger allocations that might fail or have to stop and defrag or page out memory if the kernel can't find enough contiguous physical pages, which it couldn't do in a context that must run without pre-emption, like an interrupt handler. (Correct me if I'm wrong, I don't regularly look at actual Linux kernel code. Regardless of actual Linux details, the basics of this way of handling allocation are important and relevant to any OS that direct-maps all physical RAM.)
Related:
What is the rationality of Linux kernel's mapping as much RAM as possible in direct-mapping(linear mapping) area?
Confusion about different meanings of "HighMem" in Linux Kernel re: how Linux uses physical RAM that it doesn't have enough virtual address-space to keep mapped all the time. (On architectures where Linux supports the concept of Highmem, e.g. i386 but not x86-64). Still, thinking about that can be a useful thought exercise in how kernels have to deal with memory, and why it's nice that x86-64 kernels generally don't have to deal with that pain.
Linus Torvalds has ranted about 32-bit x86 PAE, which expanded physical address space but not virtual, when 4GiB virtual was already not enough to comfortably deal with 4GiB physical. It's a useful perspective on how this looks from an OS developer's point of view.
I know the 4-page tables are found in the high canonical region but where exactly are they? (data, code, stack, heap or physical memory map?)
Page tables for a user-space task are in physical memory dynamically allocated by the kernel, probably with kmalloc. I haven't looked at the code. Every user-space page table refers to the page directories for the kernel part of virtual address space, which are also stored somewhere.
They're only accessed by the CPU by physical address, so there's no need for there to be a virtual mapping of them other than the direct mapping of all physical RAM.
(The CPU accesses them on TLB miss, to fetch a PTE with the translation for this virtual address. But if they used virtual addresses themselves, you'd have a catch-22 unless there was a way for the OS to prime the TLB with an entry for the virtual address in CR3, and so on. Much better to just have the OS put physical linear addresses into CR3 and the page-directory / page-table entries.)
For Linux on x86-64, each process has its own page tables. The page tables are independent 4KiB physical pages that can be allocated anywhere in physical memory. The page tables are not part of the virtual address space -- they are accessed by the page table walker hardware using their physical addresses with the bit fields of the requested virtual address as indices into the page table hierarchy. The control register CR3 contains the physical address of the 4KiB page that holds the root of the page table tree for the currently running process. The kernel knows the CR3 of each process (since it must be saved and restored on context switches), so the kernel can walk a process's page tables in software (by emulating what the page table walker does in hardware) for any desired virtual address.
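As a rough illustration of the software walk described above, the sketch below splits a virtual address into its four 9-bit indices plus a 12-bit page offset and follows a toy table hierarchy. It assumes plain 4 KiB pages with 4-level paging, ignores huge pages and all flag bits, and models the tables as Python dictionaries keyed by hypothetical physical addresses; none of this is the kernel's actual code.

```python
PAGE_SHIFT = 12   # 4 KiB pages
INDEX_BITS = 9    # 512 entries per table
LEVELS = 4        # PML4, PDPT, PD, PT


def split_virtual_address(va: int):
    """Return ([pml4, pdpt, pd, pt] indices, page offset) for the low 48 bits."""
    offset = va & ((1 << PAGE_SHIFT) - 1)
    indices = []
    for level in range(LEVELS - 1, -1, -1):
        shift = PAGE_SHIFT + level * INDEX_BITS
        indices.append((va >> shift) & ((1 << INDEX_BITS) - 1))
    return indices, offset


def walk(tables: dict, cr3: int, va: int):
    """Toy software page walk.

    `tables` maps the physical address of a table to a dict
    {index: physical address of the next table, or of the page frame at the
    last level}. Returns the physical address, or None if any level is absent.
    """
    indices, offset = split_virtual_address(va)
    current = cr3
    for idx in indices:
        entry = tables.get(current, {}).get(idx)
        if entry is None:
            return None  # not present: real hardware would raise a page fault
        current = entry
    return current + offset


if __name__ == "__main__":
    # Hypothetical hierarchy rooted at physical address 0x1000.
    tables = {0x1000: {1: 0x2000}, 0x2000: {2: 0x3000},
              0x3000: {3: 0x4000}, 0x4000: {4: 0x9000}}
    va = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0x56
    print(hex(walk(tables, 0x1000, va)))
```

Every address stored in the tables is physical, matching the point that the hardware walker never needs a virtual mapping of the tables themselves.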

What is meant by holes in the memory Linux?

I have come across a term - holes in the memory in Linux. I believe this is the memory that is I/O remapped. Is my understanding correct?
Holes in memory can mean different things:
1) It can refer to physical memory addressing: For historical and boot-strapping reasons, in the "standard PC" (x86) architecture, all of system RAM is not contiguous. There are "holes" in the address space where memory-mapped I/O resides. For example, from the earliest days, there has been an area reserved for boot ROM (BIOS) and video memory. Also, there is a large area of the address space which is reserved for dynamic assignment to PCI (and PCI-X or PCI-Express) peripherals. These areas are often mapped as needed into kernel virtual address space by device drivers (which may be referred to as "I/O remapping").
Memory controllers built-in to the motherboard allow the physical address of the RAM to be configured (this is typically handled by the BIOS in the standard PC architecture). Other [non-x86] architectures often have similar holes in the physical address space.
2) The term can also refer to unassigned regions in the virtual address space. Both kernel virtual address space and user processes' virtual address space typically have "holes" in them. For example, Linux doesn't map any physical memory corresponding to virtual address 0 (i.e. the first page of the address space never has valid memory) -- this allows null pointer references to be trapped.
In some kinds of memory allocations, the Linux kernel maintains unmapped areas between properly allocated virtual memory regions in order to trap faulty memory references (i.e. those that stray beyond the end of the allocated space).
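For the second meaning, a small sketch can show what "holes" look like once you have a list of mapped regions. The region list and the function name below are made up for illustration; on a real system the regions would come from the kernel's own bookkeeping.

```python
def find_holes(regions, space_start=0, space_end=1 << 32):
    """Return the unmapped gaps between half-open (start, end) regions."""
    holes = []
    cursor = space_start
    for start, end in sorted(regions):
        if start > cursor:
            holes.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < space_end:
        holes.append((cursor, space_end))
    return holes


if __name__ == "__main__":
    # Hypothetical layout: page 0 is left unmapped so that null pointer
    # dereferences fault, and there is an unmapped guard gap between regions.
    mapped = [(0x1000, 0x3000), (0x5000, 0x6000)]
    for start, end in find_holes(mapped, 0, 0x8000):
        print(hex(start), hex(end))
```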

Does virtual address space reside in virtual memory?

Does virtual address space reside in virtual memory? I am confused: each process has its own virtual memory and page table, and conversion from virtual address to physical address takes place while loading it into physical memory, but where does the virtual address space come into the picture? I have gone through many operating systems books, but they all just explain each term on its own, not where it resides, how the terms relate to each other, or how it all operates.
Please just explain it to me theoretically; an example is not needed.
Thanks in advance.
(Virtual) Address space is the set of allowable addresses for a given address width (that is 2^32 bytes on x86, 2^64 on x64). Virtual memory usually means almost the same. It is the set of allowable addresses for a certain process or application or also for the whole system. The virtual memory for a single application can be at most as big as the virtual address space of the system. Each application can "see" only the virtual address space that is allocated to it by the OS (and due to some trickery, it is possible that each application can have a virtual address space of the same size and the sum might be larger than the address space of the system).
Physical memory (more correctly: physical RAM) is the amount of RAM actually installed. It is usually smaller than the virtual address space. The OS does swapping to bring the required memory pages from the hard disk into physical memory when needed. A memory page in physical memory has a physical address and a virtual one. Normal applications only see the virtual address, and they don't (and must not) care about where the memory page is physically loaded. Therefore an address seen in an application or a debugger is really a virtual address in the virtual address space of that application. The physical address is only ever needed when directly interfacing with hardware. It can even change at any time if the OS decides to move the page.
Hope this makes it a bit clearer.
I'm not a specialist, but I think virtual addressing and paging are part of the CPU's protected mode (paging arrived with the 80386) and not part of the operating system. The operating system controls the page tables. The virtual addresses are just numbers in your executable file; for example, objdump -d will display them.

Difference between Kernel Virtual Address and Kernel Logical Address?

I am not able to tell exactly what the difference is between a kernel logical address and a kernel virtual address. The Linux device driver book says that all logical addresses are kernel virtual addresses, and that virtual addresses don't have any linear mapping. But when do we call an address logical and when do we call it virtual, and in which situations do we use each of the two?
The Linux kernel maps most of the virtual address space that belongs to the kernel as a 1:1 mapping, with an offset, of the first part of physical memory (slightly less than 1 GB for 32-bit x86; this can be different for other processors or configurations). For example, for kernel code on x86, address 0xc0000001 is mapped to physical address 0x1.
This is called logical mapping - a 1:1 mapping (with an offset) that allows the kernel to access most of the physical memory of the machine.
But this is not enough: sometimes we have more than 1 GB of physical memory on a 32-bit machine, sometimes we want to reference non-contiguous physical memory blocks as contiguous to make things simple, and sometimes we want to map memory-mapped I/O regions which are not RAM.
For this, the kernel keeps a region at the top of its virtual address space where it does a "random" page-to-page mapping. The mappings there do not follow the 1:1 pattern of the logical mapping area. This is what we call the virtual mapping.
It is important to add that on many platforms (x86 is an example), both the logical and virtual mappings are done using the same hardware mechanism (the MMU and its TLB, which implement virtual memory). In many cases, the "logical mapping" is actually done using the virtual memory facility of the processor, so this can be a little confusing. The difference therefore is the pattern according to which the mapping is done: 1:1 for logical, something random for virtual.
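The difference in pattern can be sketched in a few lines. The constants below are the usual values for 32-bit x86 with the 3G/1G split (PAGE_OFFSET at 0xC0000000, at most 896 MB mapped linearly); the function names are illustrative stand-ins for what the kernel's `__pa()` and `__va()` macros do on logical addresses, not the kernel's code.

```python
PAGE_OFFSET = 0xC000_0000          # start of kernel space with the 3G/1G split
LOWMEM_LIMIT = 896 * 1024 * 1024   # at most this much RAM is mapped linearly


def is_logical(kaddr: int) -> bool:
    """True if a kernel virtual address lies in the linear ("logical") area."""
    return PAGE_OFFSET <= kaddr < PAGE_OFFSET + LOWMEM_LIMIT


def kernel_pa(kaddr: int) -> int:
    """Logical address -> physical address: subtract the fixed offset."""
    if not is_logical(kaddr):
        raise ValueError("not a logical address; a page-table lookup is needed")
    return kaddr - PAGE_OFFSET


def kernel_va(paddr: int) -> int:
    """Physical address in low memory -> logical address: add the offset."""
    if not 0 <= paddr < LOWMEM_LIMIT:
        raise ValueError("physical address has no permanent linear mapping")
    return paddr + PAGE_OFFSET


if __name__ == "__main__":
    print(hex(kernel_pa(0xC000_0001)))   # the example address from the answer
    print(is_logical(0xF800_0000))       # first address past the linear area
```

For addresses above the linear area (the "virtual mapping" region) no such arithmetic exists, and the only way to find the physical page is to consult the page tables.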
Basically there are 3 kinds of addressing, namely
Logical Addressing : Address is formed by base and offset. This is nothing but segmented addressing, where the address (or offset) in the program is always used with the base value in the segment descriptor
Linear Addressing : Also called virtual addressing. Here addresses are contiguous, but the physical addresses are not. Paging is used to implement this.
Physical addressing : The actual address on the Main Memory!
Now, in Linux, kernel memory (in the address space) lies beyond 3 GB (3 GB to 4 GB), i.e. from 0xC0000000. The addresses used by the kernel are not physical addresses. To map these virtual addresses it uses PAGE_OFFSET. Note that no per-page lookup is needed to work out this translation: it is a fixed offset, i.e. these addresses are contiguous in nature. However, there is a limit to this, i.e. 896 MB on x86. Beyond that, paging is used for translation. When you use vmalloc, these paged addresses are returned to access the allocated memory.
In short, when someone refers to Virtual Memory in context of User Space, then it is through Paging. If Kernel Virtual Memory is mentioned then it is either PAGE_OFFSETed or vmalloced address.
(Reference - Understanding Linux Kernel - 2.6 based )
Shash
Kernel logical addresses are mappings accessible to kernel code through normal CPU memory access functions. On 32-bit systems, only 4GB of kernel logical address space exists, even if more physical memory than that is in use. Logical address space backed by physical memory can be allocated with kmalloc.
Virtual addresses do not necessarily have corresponding logical addresses. You can allocate physical memory with vmalloc and get back a virtual address that has no corresponding logical address (on 32-bit systems with PAE, for example). You can then use kmap to assign a logical address to that virtual address.
Simply speaking, virtual addresses include "high memory", which does not have a 1:1 mapping to physical addresses when your RAM size is larger than the kernel's address range (typically, with the 1G/3G split on x86, your RAM is 3 GB but the kernel's addressing range is 1 GB), and also the addresses returned from kmap() and vmalloc(), which require the kernel to establish page tables for the mapping. Since a logical address is always mapped by the kernel (1:1 mapping), you don't need to explicitly call a kernel API such as set_pte to set up the page table entry for that particular page.
So a virtual address is not always a logical address.

How does the linux kernel manage less than 1GB physical memory?

I'm learning the linux kernel internals and while reading "Understanding Linux Kernel", quite a few memory related questions struck me. One of them is, how the Linux kernel handles the memory mapping if the physical memory of say only 512 MB is installed on my system.
As I read, kernel maps 0(or 16) MB-896MB physical RAM into 0xC0000000 linear address and can directly address it. So, in the above described case where I only have 512 MB:
How can the kernel map 896 MB from only 512 MB ? In the scheme described, the kernel set things up so that every process's page tables mapped virtual addresses from 0xC0000000 to 0xFFFFFFFF (1GB) directly to physical addresses from 0x00000000 to 0x3FFFFFFF (1GB). But when I have only 512 MB physical RAM, how can I map, virtual addresses from 0xC0000000-0xFFFFFFFF to physical 0x00000000-0x3FFFFFFF ? Point is I have a physical range of only 0x00000000-0x20000000.
What about user mode processes in this situation?
Every article explains only the situation, when you've installed 4 GB of memory and the kernel maps the 1 GB into kernel space and user processes uses the remaining amount of RAM.
I would appreciate any help in improving my understanding.
Thanks..!
Not all virtual (linear) addresses must be mapped to anything. If the code accesses an unmapped page, a page fault is raised.
The physical page can be mapped to several virtual addresses simultaneously.
In the 4 GB virtual memory there are 2 sections: 0x0... 0xbfffffff - is process virtual memory and 0xc0000000 .. 0xffffffff is a kernel virtual memory.
How can the kernel map 896 MB from only 512 MB ?
It maps up to 896 MB. So, if you have only 512, there will be only 512 MB mapped.
If your physical memory is in 0x00000000 to 0x20000000, it will be mapped for direct kernel access to virtual addresses 0xC0000000 to 0xE0000000 (linear mapping).
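The arithmetic behind "it maps up to 896 MB" is just a minimum. Here is a small sketch, assuming the 3G/1G split and RAM that starts at physical address 0 with no holes; `lowmem_window` is a made-up helper name.

```python
MB = 1024 * 1024
PAGE_OFFSET = 0xC000_0000   # kernel space start with the 3G/1G split
LOWMEM_LIMIT = 896 * MB     # upper bound on linearly mapped RAM


def lowmem_window(ram_bytes: int):
    """Return the (start, end) kernel virtual range of the linear mapping."""
    mapped = min(ram_bytes, LOWMEM_LIMIT)
    return PAGE_OFFSET, PAGE_OFFSET + mapped


if __name__ == "__main__":
    for ram in (512 * MB, 2048 * MB):
        start, end = lowmem_window(ram)
        print(ram // MB, "MB RAM ->", hex(start), "to", hex(end))
```

With 512 MB the window ends at 0xE0000000, and the rest of the kernel's 1 GB of virtual addresses is simply not backed by the linear mapping.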
What about user mode processes in this situation?
Phys memory for user processes will be mapped (not sequentially but rather random page-to-page mapping) to virtual addresses 0x0 .... 0xc0000000. This mapping will be the second mapping for pages from 0..896MB. The pages will be taken from free page lists.
Where are user mode processes in phys RAM?
Anywhere.
Every article explains only the situation, when you've installed 4 GB of memory and the
No. Every article explains how 4 GB of virtual address space is mapped. The size of virtual memory is always 4 GB (for a 32-bit machine without memory extensions like PAE/PSE/etc. on x86).
As stated in 8.1.3. Memory Zones of the book Linux Kernel Development by Robert Love (I use third edition), there are several zones of physical memory:
ZONE_DMA - Contains page frames of memory below 16 MB
ZONE_NORMAL - Contains page frames of memory at and above 16 MB and below 896 MB
ZONE_HIGHMEM - Contains page frames of memory at and above 896 MB
So, if you have 512 MB, your ZONE_HIGHMEM will be empty, and ZONE_NORMAL will have 496 MB of physical memory mapped.
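The zone boundaries quoted above translate into a short calculation. This sketch assumes 32-bit x86, RAM that is contiguous from physical address 0, and ignores any memory holes; `zone_sizes` is a hypothetical helper, not a kernel function.

```python
MB = 1024 * 1024
DMA_LIMIT = 16 * MB       # ZONE_DMA covers physical memory below 16 MB
NORMAL_LIMIT = 896 * MB   # ZONE_NORMAL covers 16 MB up to 896 MB


def zone_sizes(ram_bytes: int) -> dict:
    """Split an amount of RAM into the three zones, in bytes."""
    dma = min(ram_bytes, DMA_LIMIT)
    normal = max(0, min(ram_bytes, NORMAL_LIMIT) - DMA_LIMIT)
    highmem = max(0, ram_bytes - NORMAL_LIMIT)
    return {"ZONE_DMA": dma, "ZONE_NORMAL": normal, "ZONE_HIGHMEM": highmem}


if __name__ == "__main__":
    for ram_mb in (512, 4096):
        sizes = zone_sizes(ram_mb * MB)
        print(ram_mb, {name: size // MB for name, size in sizes.items()})
```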
Also, take a look at section 2.5.5.2, "Final kernel Page Table when RAM size is less than 896 MB", of the book. It covers the case where you have less than 896 MB of memory.
Also, for ARM there is some description of virtual memory layout: http://www.mjmwired.net/kernel/Documentation/arm/memory.txt
The line 63 PAGE_OFFSET high_memory-1 is the direct mapped part of memory
The hardware provides a Memory Management Unit. It is a piece of circuitry which is able to intercept and alter any memory access. Whenever the processor accesses the RAM, e.g. to read the next instruction to execute, or as a data access triggered by an instruction, it does so at some address which is, roughly speaking, a 32-bit value. A 32-bit word can have a bit more than 4 billion distinct values, so there is an address space of 4 GB: that's the number of bytes which could have a unique address.
So the processor sends out the request to its memory subsystem, as "fetch the byte at address x and give it back to me". The request goes through the MMU, which decides what to do with the request. The MMU virtually splits the 4 GB space into pages; page size depends on the hardware you use, but typical sizes are 4 and 8 kB. The MMU uses tables which tell it what to do with accesses for each page: either the access is granted with a rewritten address (the page entry says: "yes, the page containing address x exists, it is in physical RAM at address y") or rejected, at which point the kernel is invoked to handle things further. The kernel may decide to kill the offending process, or to do some work and alter the MMU tables so that the access may be tried again, this time successfully.
This is the basis for virtual memory: from the process's point of view, it has some RAM, but the kernel has moved it to the hard disk, in "swap space". The corresponding page is marked as "absent" in the MMU tables. When the process accesses its data, the MMU invokes the kernel, which fetches the data from the swap, puts it back at some free space in physical RAM, and alters the MMU tables to point at that space. The kernel then jumps back to the process code, right at the instruction which triggered the whole thing. The process code sees nothing of the whole business, except that the memory access took quite some time.
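The fault-and-retry cycle can be modelled with a toy MMU and a toy kernel. Everything here is a simplification made up for illustration: the page table is a dictionary, "swap" is just a set of page numbers, and eviction when no free frame is left is not modelled.

```python
PAGE_SIZE = 4096


class PageFault(Exception):
    """Raised by the toy MMU when a page has no entry in the table."""
    def __init__(self, vpn):
        super().__init__(f"page fault on virtual page {vpn}")
        self.vpn = vpn


class SegmentationFault(Exception):
    """The access was invalid; a real kernel would signal the process."""


class ToyMMU:
    def __init__(self):
        self.table = {}  # virtual page number -> physical frame number

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.table:
            raise PageFault(vpn)
        return self.table[vpn] * PAGE_SIZE + offset


class ToyKernel:
    def __init__(self, mmu, free_frames, swapped_pages):
        self.mmu = mmu
        self.free_frames = list(free_frames)
        self.swapped = set(swapped_pages)  # pages owned by the process, on disk

    def access(self, vaddr):
        try:
            return self.mmu.translate(vaddr)
        except PageFault as fault:
            if fault.vpn not in self.swapped:
                raise SegmentationFault(hex(vaddr)) from None
            frame = self.free_frames.pop(0)    # pick a free physical frame
            self.swapped.discard(fault.vpn)    # "read the page back from swap"
            self.mmu.table[fault.vpn] = frame  # update the MMU tables
            return self.mmu.translate(vaddr)   # retry the faulting access


if __name__ == "__main__":
    kernel = ToyKernel(ToyMMU(), free_frames=[7], swapped_pages={2})
    print(hex(kernel.access(2 * PAGE_SIZE + 5)))
```

The caller of `access` never sees the fault for a swapped page; it only gets the translated address, which mirrors how the process code "sees nothing of the whole business".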
The MMU also handles access rights, which prevents a process from reading or writing data which belongs to other processes, or to the kernel. Each process has its own set of MMU tables, and the kernel manages those tables. Thus, each process has its own address space, as if it were alone on a machine with 4 GB of RAM -- except that the process had better not access memory that it did not allocate rightfully from the kernel, because the corresponding pages are marked as absent or forbidden.
When the kernel is invoked through a system call from some process, the kernel code must run within the address space of the process; so the kernel code must be somewhere in the address space of each process (but protected: the MMU tables prevent access to the kernel memory from unprivileged user code). Since code can contain hardcoded addresses, the kernel had better be at the same address for all processes; conventionally, in Linux, that address is 0xC0000000. The MMU tables for each process map that part of the address space to whatever physical RAM blocks the kernel was actually loaded into at boot. Note that the kernel memory is never swapped out (if the code which can read back data from swap space was itself swapped out, things would turn sour quite fast).
On a PC, things can be a bit more complicated, because there are 32-bit and 64-bit modes, and segment registers, and PAE (which acts as a kind of second-level MMU with huge pages). The basic concept remains the same: each process gets its own view of a virtual 4 GB address space, and the kernel uses the MMU to map each virtual page to an appropriate physical position in RAM, or nowhere at all.
osgx has an excellent answer, but I see a comment where someone still doesn't understand.
Every article explains only the situation, when you've installed 4 GB
of memory and the kernel maps the 1 GB into kernel space and user
processes uses the remaining amount of RAM.
Here is much of the confusion. There is virtual memory and there is physical memory. Every 32bit CPU has 4GB of virtual memory. The Linux kernel's traditional split was 3G/1G for user memory and kernel memory, but newer options allow different partitioning.
Why distinguish between the kernel and user space? - my own question
When a task swaps, the MMU must be updated. The kernel MMU space should remain the same for all processes. The kernel must handle interrupts and fault requests at any time.
How does virtual to physical mapping work? - my own question.
There are many permutations of virtual memory.
a single private mapping to a physical RAM page.
a duplicate virtual mapping to a single physical page.
a mapping that throws a SIGBUS or other error.
a mapping backed by disk/swap.
From the above list, it is easy to see why you may have more virtual address space than physical memory. In fact, the fault handler will typically inspect process memory information to see if a page is mapped (I mean allocated for the process), but not in memory. In this case the fault handler will call the I/O sub-system to read in the page. When the page has been read and the MMU tables updated to point the virtual address to a new physical address, the process that caused the fault resumes.
If you understand the above, it becomes clear why you would like to have a larger virtual mapping than physical memory. It is how memory swapping is supported.
There are other uses. For instance two processes may use the same code library. It is possible that they are at different virtual addresses in the process space due to linking. You may map the different virtual addresses to the same physical page in this case in order to save physical memory. This is quite common for new allocations; they all point to a physical 'zero page'. When you touch/write the memory the zero page is copied and a new physical page allocated (COW or copy on write).
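The zero-page behaviour can be sketched with a toy model. The class, the frame numbers and the allocator below are all hypothetical; the point is only that several virtual pages share one physical frame until the first write.

```python
import itertools

ZERO_FRAME = 0  # hypothetical frame number of the shared all-zero page


class ToyProcess:
    def __init__(self, frame_allocator):
        self.alloc = frame_allocator
        self.page_table = {}  # virtual page number -> (frame, writable)

    def map_anonymous(self, vpn):
        """New allocation: point at the shared zero page, read-only."""
        self.page_table[vpn] = (ZERO_FRAME, False)

    def read(self, vpn):
        frame, _ = self.page_table[vpn]
        return frame

    def write(self, vpn):
        frame, writable = self.page_table[vpn]
        if not writable:
            # Write fault: allocate a private frame (the copy of the zero
            # page's contents is implied) and remap the page as writable.
            frame = next(self.alloc)
            self.page_table[vpn] = (frame, True)
        return frame


if __name__ == "__main__":
    frames = itertools.count(1)  # frames 1, 2, 3, ... are free
    a, b = ToyProcess(frames), ToyProcess(frames)
    a.map_anonymous(10)
    b.map_anonymous(10)
    print(a.read(10), b.read(10))   # both backed by the zero frame
    print(a.write(10), b.read(10))  # a now has a private frame, b does not
```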
It is also sometimes useful to have the virtual pages aliased with one as cached and another as non-cached. The two pages can be examined to see what data is cached and what is not.
Mainly virtual and physical are not the same! Easily stated, but often confusing when looking at the Linux VMM code.
Hi, actually, I don't work on x86 hardware platform, so there may exist some technical errors in my post.
To my knowledge, the range between 0 (or 16) MB and 896 MB only gets special treatment when you have more RAM than that, say 1 GB of physical RAM on your board; that range is called "low memory". If you have more than 896 MB of physical RAM on your board, the rest of the physical RAM is called highmem.
As for your question, there are 512 MiB of physical RAM on your board, so there is actually no 896 MB boundary and no highmem.
The total RAM the kernel can see and map is 512 MB.
Because there is a 1-to-1 mapping between physical memory and kernel virtual addresses, there are 512 MiB of virtual address space for the kernel. I'm really not sure whether the previous sentence is right, but it's what I have in mind.
What I mean is that if there are 512 MB, then the amount of physical RAM the kernel can manage is also 512 MiB; further, the kernel cannot create an address space larger than 512 MB.
As for user space, there is one difference: pages of a user application can be swapped out to the hard disk, but pages of the kernel cannot.
So, for user space, with the help of page tables and other related modules, it appears there is still a 4 GB address space.
Of course, this is virtual address space, not physical RAM space.
This is what I understand.
Thanks.
If the physical memory is less than 896 MB, then the Linux kernel maps up to that physical address linearly.
For details see this.. http://learnlinuxconcepts.blogspot.in/2014/02/linux-addressing.html

Resources