I am very new to kernel or system programming,
I have couple of questions related to virtual memory. Mostly related to static vs run time, [i.e. ELF and loading/Linking etc], Specific to linux-x86.
My understandings might be completely wrong...
I am aware of virtual memory and it's split 1G/3G. where process can not access address above PAGE_OFFSET in user mode - PAGE_OFFSET is virtual address.
At Static time ELF defines process Virtual space?
If ELF defines virtual address space then does ELF also defines kernel virtual address space? How?
[ I assume kernel virtual address space is dynamically mapped at run time?]
If kernel address space is mapped to process address space then why doesn't process size(virtual) includes kernel size also?
When and How this kernel address space is mapped/linked?
Like , In case of shared library the particular file is pointed by vm struct etc..
Is it when code flow hits system call? For example.
Does executable size determines process size (virtual) completely? in what context sizes differ or they are completely different.
Any article that explains overall flow?
Compile --> link/load --> virtual memory structure (kernel address space/shared objects etc)
I know its very vast but explanation on overall flow will work.
Most systems define logical address ranges for kernel and user address spaces. On some systems the range is entirely up to the operating system (how it sets up the page tables) on others it is done in hardware.
For the former, page tables are usually nested. In which case multiple page tables share identical entires.
For the latter, there are usually separate page tables for the user and kernel address spaces.
If ELF defines virtual address space then does ELF also defines kernel virtual address space? How? [ I assume kernel virtual address space is dynamically mapped at run time?]
The executable file only defines the initial layout of the user address space.
If kernel address space is mapped to process address space then why doesn't process size(virtual) includes kernel size also?
That would depend upon the system and how it does the counting.
When and How this kernel address space is mapped/linked? Like , In case of shared library the particular file is pointed by vm struct etc.
The kernel address space exists independent of any processes. As mentioned above, it is mapped to a process either by having a system page table shared by all processes or a set of nested page table entries shared by all processes.
Does executable size determines process size (virtual) completely? in what context sizes differ or they are completely different.'
Not really. Large executables are indicative of larger ranges of logical addresses required. However, a small EXE can easily describe a large number of demand zero pages. In addition, the application can map logical pages as it executes. The EXE only defines the initial state of the logical address space.
Related
in linux kernel, is it possible for the same physical page frame to be mapped more than once at the same time to the virtual memory space of one process?
In other words, there are multiple page table entries in this process that point to the same physical page.
Yes. In particular a zeroed page is mapped with copy-on-write semantics everywhere zero initialized memory is allocated, like the .bss section of executables.
Whether that can be done for writable mappings without problems might depend on the CPU architecture (in particular the cache design). I can't say right now whether Linux allows that.
I am a little confused about the terms physical/logical/virtual addresses in an Operating System(I use Linux- open SUSE)
Here is what I understand:
Physical Address- When the processor is in system mode, the address used by the processor is physical address.
Logical Address- When the processor is in user mode, the address used is the logical address. these are anyways mapped to some physical address by adding a base register with the offset value.It in a way provides a sort of memory protection.
I have come across discussion that virtual and logical addresses/address space are the same. Is it true?
Any help is deeply appreciated.
My answer is true for Intel CPUs running on a modern Linux system, and I am speaking about user-level processes, not kernel code. Still, I think it'll give you some insight enough to think about the other possibilities
Address Types
Regarding question 3:
I have come across discussion that virtual and logical
addresses/address space are the same. Is it true?
As far as I know they are the same, at least in modern OS's running on top of Intel processors.
Let me try to define two notions before I explain more:
Physical Address: The address of where something is physically located in the RAM chip.
Logical/Virtual Address: The address that your program uses to reach its things. It's typically converted to a physical address later by a hardware chip (mostly, not even the CPU is aware really of this conversion).
Virtual/Logical Address
The virtual address is well, a virtual address, the OS along with a hardware circuit called the MMU (Memory Management Unit) delude your program that it's running alone in the system, it's got the whole address space(having 32-bits system means your program will think it has 4 GBs of RAM; roughly speaking).
Obviously, if you have more than one program running at the time (you always do, GUI, Init process, Shell, clock app, calendar, whatever), this won't work.
What will happen is that the OS will put most of your program memory in the hard disk, the parts it uses the most will be present in the RAM, but hey, that doesn't mean they'll have the address you and your program know.
Example: Your process might have a variable named (counter) that's given the virtual address 0xff (imaginably...) and another variable named (oftenNotUsed) that's given the virtual address (0xaa).
If you read the assembly of your compiled code after all linking's happened, you'll be accessing them using those addresses but well, the (oftenNotUsed) variable won't be really there in RAM at 0xaa, it'll be in the hard disk because the process is not using it.
Moreover, the variable (counter) probably won't be physically at (0xff), it'll be somewhere else in RAM, when your CPU tries to fetch what's in 0xff, the MMU and a part of the OS, will do a mapping and get that variable from where it's really available in the RAM, the CPU won't even notice it wasn't in 0xff.
Now what happens if your program asks for the (oftenNotUsed) variable? The MMU+OS will notice this 'miss' and will fetch it for the CPU from the Harddisk into RAM then hand it over to the CPU as if it were in the address (0xaa); this fetching means some data that was present in RAM will be sent back to the Harddisk.
Now imagine this running for every process in your system. Every process thinks they have 4GB of RAMs, no one actually have that but everything works because everyone has some parts of their program available physically in the RAM but most of the program resides in the HardDisk. Don't confuse this part of the program memory being put in HD with the program data you can access through file operations.
Summary
Virtual address: The address you use in your programs, the address that your CPU use to fetch data, is not real and gets translated via MMU to some physical address; everyone has one and its size depends on your system(Linux running 32-bit has 4GB address space)
Physical address: The address you'll never reach if you're running on top of an OS. It's where your data, regardless of its virtual address, resides in RAM. This will change if your data is sent back and forth to the hard disk to accommodate more space for other processes.
All of what I have mentioned above, although it's a simplified version of the whole concept, is what's called the memory management part of the the computer system.
Consequences of this system
Processes cannot access each other memory, everyone has their separate virtual addresses and every process gets a different translation to different areas even though sometimes you may look and find that two processes try to access the same virtual address.
This system works well as a caching system, you typically don't use the whole 4GB you have available, so why waste that? let others share it and let them use it too; when your process needs more, the OS will fetch your data from the HD and replace other process' data, at an expense of course.
Physical Address- When the processor is in system mode, the address used by the processor is physical address.
Not necessarily true. It depends on the particular CPU. On x86 CPUs, once you've enabled page translation, all code ceases to operate with physical addresses or addresses trivially convertible into physical addresses (except, SMM, AFAIK, but that's not important here).
Logical Address- When the processor is in user mode, the address used is the logical address. these are anyways mapped to some physical address by adding a base register with the offset value.
Logical addresses do not necessarily apply to the user mode exclusively. On x86 CPUs they exist in the kernel mode as well.
I have come across discussion that virtual and logical addresses/address space are the same. Is it true?
It depends on the particular CPU. x86 CPUs can be configured in such a way that segments aren't used explicitly. They are used implicitly and their bases are always 0 (except for thread-local-storage segments). What remains when you drop the segment selector from a logical address is a 32-bit (or 64-bit) offset whose value coincides with the 32-bit (or 64-bit) virtual address. In this simplified set-up you may consider the two to be the same or that logical addresses don't exist. It's not true, but for most practical purposes, good enough of an approximation.
I am referring to below answer base on intel x86 CPU
Difference Between Logical to Virtual Address
Whenever your program is under execution CPU generates logical address for instructions which contains (16 bit Segment Selector and 32 bit offset ).Basically Virtual(Linear address) is generated using logical address fields.
Segment selector is 16 bit field out of which first 13bit is index (Which is a pointer to the segment descriptor resides in GDT,described below) , 1 bit TI field ( TI = 1, Refer LDT , TI=0 Refer GDT )
Now Segment Selector OR say segment identifier refers to Code Segment OR Data Segment OR Stack Segment etc. Linux contains one GDT/LDT (Global/Local Descriptor Table) Which contains 8 byte descriptor of each segments and holds the base (virtual) address of the segment.
So for for each logical address, virtual address is calculated using below steps.
1) Examines the TI field of the Segment Selector to determine which Descriptor
Table stores the Segment Descriptor. This field indicates that the Descriptor is
either in the GDT (in which case the segmentation unit gets the base linear
address of the GDT from the gdtr register) or in the active LDT (in which case the
segmentation unit gets the base linear address of that LDT from the ldtr register).
2) Computes the address of the Segment Descriptor from the index field of the Segment
Selector. The index field is multiplied by 8 (the size of a Segment Descriptor),
and the result is added to the content of the gdtr or ldtr register.
3) Adds the offset of the logical address to the Base field of the Segment Descriptor,
thus obtaining the linear(Virtual) address.
Now it is the job of Pagging unit to translate physical address from virtual address.
Refer : Understanding the linux Kernel , Chapter 2 Memory Addressing
User virtual addresses
These are the regular addresses seen by user-space programs. User addresses are either 32 or 64 bits in length, depending on the underlying hardware architecture, and each process has its own virtual address space.
Physical addresses
The addresses used between the processor and the system's memory. Physical addresses are 32- or 64-bit quantities; even 32-bit systems can use 64-bit physical addresses in some situations.
Bus addresses
The addresses used between peripheral buses and memory. Often they are the same as the physical addresses used by the processor, but that is not necessarily the case. Bus addresses are highly architecture dependent, of course.
Kernel logical addresses
These make up the normal address space of the kernel. These addresses map most or all of main memory, and are often treated as if they were physical addresses. On most architectures, logical addresses and their associated physical addresses differ only by a constant offset. Logical addresses use the hardware's native pointer size, and thus may be unable to address all of physical memory on heavily equipped 32-bit systems. Logical addresses are usually stored in variables of type unsigned long or void *. Memory returned from kmalloc has a logical address.
Kernel virtual addresses
These differ from logical addresses in that they do not necessarily have a direct mapping to physical addresses. All logical addresses are kernel virtual addresses; memory allocated by vmalloc also has a virtual address (but no direct physical mapping). The function kmap returns virtual addresses. Virtual addresses are usually stored in pointer variables.
If you have a logical address, the macro __pa() (defined in ) will return its associated physical address. Physical addresses can be mapped back to logical addresses with __va(), but only for low-memory pages.
Reference.
Normally every address issued (for x86 architecture) is a logical address which is translated to a linear address via the segment tables. After the translation into linear address, it is then translated to physical address via page table.
A nice article explaining the same in depth:
http://duartes.org/gustavo/blog/post/memory-translation-and-segmentation/
Physical Address is the address that is seen by the memory unit, i.e., one loaded into memory address register.
Logical Address is the address that is generated by the CPU.
The user program can never see the real physical address.Memory mapping unit converts the logical address to physical address.
Logical address generated by user process must be mapped to physical memory before they are used.
Logical memory is relative to the respective program i.e (Start point of program + offset)
Virtual memory uses a page table that maps to ram and disk. In this way each process can promise more memory for each individual process.
In the Usermode or UserSpace all the addresses seen by program are Virtual addresses.
When in kernel mode addresses seen by kernel are still virtual but termed as logical as they are equal to physical + pageoffset .
Physical addresses are the ones which are seen by RAM .
With Virtual memory every address in program goes through page tables.
when u write a small program eg:
int a=10;
int main()
{
printf("%d",a);
}
compile: >gcc -c fname.c
>ls
fname.o //fname.o is generated
>readelf -a fname.o >readelf_obj.txt
/readelf is a command to understand the object files and executabe file which will be in 0s and 1s. output is written in readelf_onj.txt file/
`>vim readelf_obj.txt`
/* under "section header" you will see .data .text .rodata sections of your object file. every starting or the base address is started from 0000 and grows to the respective size till it reach the size under the heading "size"----> these are the logical addresses.*/
>gcc fname.c
>ls
a.out //your executabe
>readelf -a a.out>readelf_exe.txt
>vim readelf_exe.txt
/* here the base address of all the sections are not zero. it will start from particular address and end up to the particular address. The linker will give the continuous adresses to all the sections (observe in the readelf_exe.txt file. observe base address and size of each section. They start continuously) so only the base addresses are different.---> this is called the virtual address space.*/
Physical address-> the memory ll have the physical address. when your executable file is loaded into memory it ll have physical address. Actually the virtual adresses are mapped to physical addresses for the execution.
In almost all books and articles I have read about about HIGHMEM in Linux kernel, they say while using 3:1 split, not all of the 1GB is available to the kernel for mapping. And normally its 896MB or so, with the rest being used for kernel data structures, memory maps, page tables and such.
My question is, what exactly are these data structures? Page tables are normally accessed via a page table address register, right? And the base address of page table is normally stored as a physical address. Now why do one need to reserve a virtual address space for the entire table?
Similarly, I read about kernel code itself occupying space. What does that have to do with virtual address space? Is it not the physical memory that would be consumed for storing the code?
And finally, these data structures why do they have to reserve the 128MB space? Why can't they be used out of the entire 1GB address space, as required, like any other normal data structure in kernel would do?
I have gone through LDD3, Professional Linux Kernel Architecture and several posts here at stack-overflow (like: Why Linux Kernel ZONE_NORMAL is limited to 896 MB?) and an older LWN article, but found no specific information on the same.
With regards to page tables, it's true that the MMU wouldn't care if the page tables themselves weren't mapped in the virtual address space - for the purposes of address translations, that would be OK. But when the kernel needs to modify the page tables, they do need to be mapped in the virtual address space - and the kernel can't just map them in "just in time", because it needs to modify the page tables themselves to do that. It's a chicken-and-egg problem, which means that the page tables need to remain mapped at all times.
A similar problem exists with the kernel code. For the code to execute, it must be mapped in the virtual address space - and if the code that does the page table modification were itself not present, we'd have a similar chicken-and-egg problem. Given this, it's easier to leave the entireity of the kernel code mapped all the time, along with the kernel-mode stacks and any kernel data structures access by code where you wouldn't want to potentially take a page fault. One large example of such data structures is the array of struct page structures, representing each physical memory page.
The 128MB reserve is not for a specific data structure, that always use it.
It's virtual memory, reserved for various users, which might use it. Normally, it's not all used.
About physical and virtual memory: every allocation needs three things - a physical page, a virutal page, and mapping connecting the two. Linux almost never uses physical addresses directly, it always passes virtual address translation.
For most kernel memory allocation (called lowmem), this translation is quite simple - subtract some constant from the virtual address to get the physical. But still, a virtual address is used.
Linux's memory management was written when the virtual memory space (4GB) was much larger
than the physical memory, even on the largest machines. In such cases, wasting virtual addresses is not a problem. Today, when physical memory is large, this leads to inefficiencies and problems.
The vmalloc virtual address range is used by any caller of vmalloc. For example:
1. Loading kernel drivers (using modprobe or insmod).
2. Kernel modules often allocate with vmalloc. The alternative function kmalloc used to be limited to 128K, and it rounds the size up to a power of 2, so vmalloc is often preferred for large allocations.
I read that the first 3 GBs are reserved for the process and the last GB is for the Kernel. I also read that the kernel is loaded starting from the 2nd MB of the physical address space (depending on the configuration). My question is that is the mapping of that last 1 GB is same for all processes and maps to this physical area of memory?
Another question is, when a process switches to kernel mode (eg, when a sys call occurs), then what page tables are used, the process page tables or the kernel page tables? If kernel page tables are used, then they can't access the memory locations belonging to the process. If that is the case, then there is apparently no use for the kernel virtual memory since all access to kernel code and data will be through the mapping of the last 1 GB of process address space. Please help me clarify this (any useful links will be much appreciated)
It seems, you are talking about 32-bit x86 systems, right?
If I am not mistaken, the kernel can be configured not only for 3Gb/1Gb memory distrubution, there could be other variants (e.g. 2Gb/2Gb). Still, 3Gb/1Gb is probably the most common one on x86-32.
The kernel part of the address space should be inaccessible from the user space. From the kernel's point of view, yes, the mapping of the memory occupied by the kernel itself is always the same. No matter, in the context of which process (or interrupt handler, or whatever else) the kernel currently operates.
As one of the consequences, if you look at the addresses of kernel symbols in /proc/kallsyms from different processes, you will see the same addresses each time. And these are exactly the addresses of the respective kernel functions, variables and others from the kernel's point of view.
So I suppose, the answer to your first question is "yes" but it is probably not very useful for the user-space code as the kernel space memory is not directly accessible from there anyway.
As for the second question, well, if the kernel currently operates in the context of some process, it can actually access the user-space memory of that process. I can't describe it in detail but probably the implementation of kernel functions copy_from_user and copy_to_user could give you some hints. See arch/x86/lib/usercopy_32.c and arch/x86/include/asm/uaccess.h in the kernel sources. It seems, on x86-32, the user-space memory is accessed in these functions directly, using the default memory mappings for the current process context. The 'magic' stuff there is only related to the optimizations and checking the address of the memory area for correctness.
Yes, the mapping of the kernel part of the address space is the same in all processes. Part of it does map that part of the physical memory where the kernel image is loaded, but that's not the bulk of it - the remainder is used to map other physical memory locations for the kernel's runtime working set.
When a process switches to kernel mode, the page tables are not changed. The kernel part of the address space simply becomes accessible because the CPL (Current Privilege Level) is now zero.
I have taken a course about Operating System design and concept and now I am trying to study Linux kernel thoroughly. I have a question that I cannot get rid of. In modern operating systems each process has own virtual address space(VAS) (eg, 0 to 2^32-1 in 32-bit systems). This provides many advantages. But in the implementation I am confused at some points. Let me explain it by giving an example:
Let's say we have two processes p1, p2;
p1 and p2 have their own VASes. An address 0x023f4a54 is mapped to different physical addresses(PA), how can it be? How is done this translation in this manner. I mean I know translation mechanism but I cannot understand that same address is mapped to different physical address when it comes different processes' address space.
0x023f4a54 in p1's VAS => PA 0x12321321
0x023f4a54 in p2's VAS => PA 0x23af2341 # (random addresses)
A CPU that provides virtual memory lets you set up a mapping of the memory addresses as the CPU sees it to physical memory addresses , typically this is done by a harware unit called the MMU.
The OS kernel can program that MMU, typically not down to the individual addresses, but rather in units of pages (4096 bytes is common). This means the MMU can be programmed to translate e.g. virtual addresses 0x1000-0x2000 to be translated to physical address 0x20000-0x21000.
The OS keeps one set of these mapping per process, and before it schedules a process to run, it loads that mapping into the MMU before it switches control back to the process. This enables different mappings for different processes, and nothing stops those mappings from mapping the same virtual address to a different physical address.
All this is transparent as far as the program is concerned, it just executes instructions on the CPU, and as the CPU has been set to virtual memory mode (paged mode), every memory access is translated by the MMU before it goes out on the physical bus to the memory.
The actual implementation details are complicated, but here's some references that might provide more insight;
http://wiki.osdev.org/Paging
http://www.usenix.org/event/usenix99/full_papers/cranor/cranor.pdf
http://ptgmedia.pearsoncmg.com/images/0131453483/downloads/gorman_book.pdf
Your question confuses a virtual address with using an address as a way of identification, so the first step to understanding is to separate the concepts.
A working example is the C runtime library function sprintf(). When properly declared and called, it is incorporated into a program as a shared object module, along with all the subfunctions it needs. The address of sprintf varies from program to program because the library is loaded in an available free address. For a simple hello world program, sprintf might be loaded at address 0x101000. For a complex program which calculates taxes, it might be loaded at 0x763f8000 (because of all the yucky logic the main program contains goes before the libraries it references). From a system perspective, the shared library is loaded into memory in one place only, but the address window (range of addresses) that each process sees that memory is unique to that executable.
Of course, this is complicated further by some of the features of Security Enhanced Linux (SELinux) which randomizes the addresses at which different program sections are loaded into memory, including shared library mapping.
--- clarification ---
As someone correctly points out, the virtual address mapping of each process is specific to each process, not unlike its set of file descriptors, socket connections, process parent and children, etc. That is, p1 might map address 0x1000 to physical 0x710000 while p2 maps address 0x1000 to a page fault, and p3 is mapped to some shared library at physical 0x9f32a000. The virtual address mapping is carefully supervised by the operating system, arguably for providing features such as swapping and paging, but also to provide features like shared code and data, and interprocess shared data.
There are two important data structures dealing with paging: the page table and the TLB. The OS maintains different page tables per process. The TLB is just a cache of the page table.
Now, different CPUs are, well, different. x86 accesses page tables directly, using a special register called CR3 which points to the page table in use. MIPS processors don't know anything about the page table, so the OS must work directly with the TLB.
Some CPUs (e.g: MIPS) keep an identifier in the TLB to separate different processes apart, so the OS can just change a control register when doing a context switch (unless it needs to reuse an identifier). Other CPUs require a full TLB flush in every context switch. So, basically, the OS needs to change some control registers and possibly needs to clear the TLB (do a TLB flush) to allow virtual addresses from different processes map to whatever physical addresses they should.
Thanks for all answers. The actual point that i dont know is that how same virtual address of different processes does not clash with each other's physical correspondent. I found the answer in the link below, each process has its own page table.
http://tldp.org/LDP/tlk/mm/memory.html
This mapping (virtual address to physical address) is handled by the OS and the MMU (see #nos' answer); the point of this abstraction is so p1 "thinks" it's accessing 0x023f4a54 when in reality it's accessing 0x12321321.
If you go back to your class on how programs work on the machine code level, p1 will expect some variable/function/whatever to be at the same place (eg 0x023f4a54) every time it's loaded. The OS mapping physical to virtual address provides this abstraction. In reality, it won't always be loaded to the same physical address, but your program doesn't care as long as it's in the same virtual address.
I think it is important to keep in mind that each process has its own set of page tables. I had hard times understanding this as well when I was thinking that there is a single page table for the whole system.
When specific process refers to its page table and tries to access the page that has not yet been mapped to a page frame, OS allocates a different piece of physical memory for that specific process and maps it to the virtual address.