Windows - How does this memory addressing work? - windows

Assuming x86, I'm starting to learn that addresses 0x0 thru 0x7FFFFFFF are for the process; whereas anything higher is reserved for the kernel.
I have three curiosities:
1) Does a process EVER call an address higher than 0x7FFFFFFF? I assume it will always result in some sort of access denied? How is that access denied enforced?
2) Does "shared memory" IPC work by mapping two processes virtual addresses to the same physical address range?
3) The amount of RAM in your machine can vary. You may have 2GB, or much more like 16GB. How does this affect the addressing of RAM? Does the kernel ever leave a bunch of RAM unused because it was reserved for itself, but doesn't need it? How can I see this?

I am not very sure but you will find the maximum in this MSDN doc about how it works:-
The range of virtual addresses that is available to a process is
called the virtual address space for the process. Each user-mode
process has its own private virtual address space. For a 32-bit
process, the virtual address space is usually the 2-gigabyte range
0x00000000 through 0x7FFFFFFF. For a 64-bit process, the virtual
address space is the 8-terabyte range 0x000'00000000 through
0x7FF'FFFFFFFF. A range of virtual addresses is sometimes called a
range of virtual memory.
The diagram shows the virtual address spaces for two 64-bit processes:
Notepad.exe and MyApp.exe. Each process has its own virtual address
space that goes from 0x000'0000000 through 0x7FF'FFFFFFFF. Each shaded
block represents one page (4 kilobytes in size) of virtual or physical
memory. Notice that the Notepad process uses three contiguous pages of
virtual addresses, starting at 0x7F7'93950000. But those three
contiguous pages of virtual addresses are mapped to noncontiguous
pages in physical memory. Also notice that both processes use a page
of virtual memory beginning at 0x7F7'93950000, but those virtual pages
are mapped to different pages of physical memory.

Related

How come the allocation of virtual address spaces doesn't rob you of all virtual memory?

On a 32-bit computer, a virtual memory address is represented as an integer between 0 and 2^32. By virtue of being a 32-bit system, no address can be represented that's lower than 0 or higher than 2^32, and we therefore have a total of 4 GiB (2^32 bytes) of virtual memory to use up. We also know that address spaces are memory protected; they cannot touch, because otherwise one process would be able to “step on the toes” of another. So, if all that I've said is correct, let me now ask this: if we grant that, by Microsoft’s own documentation, 2 GiB of virtual address space are used to operate the system and 2GiB of virtual address space are provided to a single user-mode process, have we not exhausted every possible virtual memory address on a 32-bit system? Would this not mean that we have to resort to disk-swapping just to run 2 processes? Surely, this is too ridiculous to be true, and I just want someone to clarify where my thinking has gone astray...
I have looked at the following questions but none of them seem to give satisfying/consistent/not hand-wavy answers. Or maybe I just don't understand them:
What is the maximum addressable space of virtual memory? - Stack Overflow
Virtual address space in windows - Stack Overflow
What happens when the number of possible virtual addresses are exceeded - Stack Overflow
Thanks! :)
TL;DR: Virtual memory is per-process and the address space changes when the OS switches execution from one process to another.
Remember that we are talking about virtual memory here and virtual memory is a trick that works because of the cooperation between the OS and the CPU hardware.
The split is not always at 2GB (there is a boot switch for 3GB etc.) but for this discussion lets assume it is always 2GB and that we are on a x86 machine.
When the CPU needs to access a an address in virtual memory it needs to translate the memory page from virtual to physical. The exact mechanics of how this works is too big of a topic to cover here but suffice to say that the translation involves a page directory that has information about present/swapped, modified, COW etc and a way to map the address to physical RAM (and if not present, the CPU will ask the OS to swap it in from the page-file).
The upper 2GB is where the kernel and drivers live and the mapping is always the same in all processes (but it can only be accessed in kernel mode (CPU ring 0)). The lower 2GB however are per-process. Each process have their own set of mappings. When the OS switches executing from one process to another (context switch) the page directory for the CPU the thread is about to run on is changed. This way each process has its own virtual address space.

What is the difference between kernel logicla address space , kernel virtual address space and user virtual address space

Let me put my understanding.
Suppose we have a 32-bit memory address space for a system. So a process can access any memory in the 4GB range
If the RAM in the system we have of 4GB, kernel divides it into 1:3 . 1GB for kernel , and rest 3GB for the user space process.
A user space process will get the system memory access within that 3GB memory only and which address it gets is determined by the page table.
Kernel logical address is that 1GB ( approx ~896MB) memory which is being reserved only for the kernel. Is this correct?
kernel virtual address is the memory left i.e. 104MB + 3GB that also can be assigned to userspace. Is this correct?
user virtual address is the address generated by the user space process and its corresponding memory would be assigned from the 3GB reserved for the user space process by the kernel.
Let me know if my above understanding is correct? If not can you please explain in detail the difference between kernel logical address space , kernel virtual address space and user virtual address space.
your understanding is a mixture of right and wrong, I'll try to point to some of them:
in 32 bit machines, we're not always limited by 4GB addressable RAM, check this question for more detail: link
the memory is an abstraction for the user space programs, they see it a a continuous big chunk of memory, but the kernel manages this abstraction with some hardware support named MMU, to map the used virtual space in the user space program into an actual physical address or even some bloc in hard drive if swapping is activated.
the kernel can actually access to the physical memory, in order to manage the abstraction mentioned above, it can also use this abstraction, this depends on the designer of the kernel.
as for the difference between the virtual and logical addressing, check this answer: link

Does virtual address space resides in virtual memory?

Does virtual address space resides in virtual memory ? I have a confusion like , Each process
has its own virtual memory and page table and conversion to physical address from virtual address takes place while loading it into physical memory , but where does the virtual address space comes into picture ? I have gone through many operating systems books but everywhere it gives just explaination about particular word not where it resides and what is the relationship between them and how it operates.
please just explain me theoretically , example not needed.
Thanks in advance.
(Virtual) Address space is the set of allowable addresses for a given address width (that is 2^32 bytes on x86, 2^64 on x64). Virtual memory usually means almost the same. It is the set of allowable addresses for a certain process or application or also for the whole system. The virtual memory for a single application can be at most as big as the virtual address space of the system. Each application can "see" only the virtual address space that is allocated to it by the OS (and due to some trickery, it is possible that each application can have a virtual address space of the same size and the sum might be larger than the address space of the system).
Physical memory (more correct: physical RAM), is the amount of effectively installed RAM modules. It is usually smaller than the virtual address space. The OS does swapping to bring the requires memory pages from the hard disk into the physical memory if needed. A memory page in physical memory has a physical address and a virtual one. Normal applications only see the virtual address, and they don't (and must not) care about where the memory page is physically loaded. Therefore an address seen in an application or a debugger is really a virtual address in the virtual address space of that application. The physical address is only ever needed when directly interfacing to hardware. It can even constantly change if the OS decides to do so.
Hope this makes it a bit clearer.
I'm not a specialist but i think virtual addressing and paging is a part of the cpu protected mode introduced after 80386 and it is not part of the operating system. The operating system controls the page tables. For the virtual addresses they are just numbers in your executable file for example
objdump -d will display them

Difference between physical/logical/virtual memory address

I am a little confused about the terms physical/logical/virtual addresses in an Operating System(I use Linux- open SUSE)
Here is what I understand:
Physical Address- When the processor is in system mode, the address used by the processor is physical address.
Logical Address- When the processor is in user mode, the address used is the logical address. these are anyways mapped to some physical address by adding a base register with the offset value.It in a way provides a sort of memory protection.
I have come across discussion that virtual and logical addresses/address space are the same. Is it true?
Any help is deeply appreciated.
My answer is true for Intel CPUs running on a modern Linux system, and I am speaking about user-level processes, not kernel code. Still, I think it'll give you some insight enough to think about the other possibilities
Address Types
Regarding question 3:
I have come across discussion that virtual and logical
addresses/address space are the same. Is it true?
As far as I know they are the same, at least in modern OS's running on top of Intel processors.
Let me try to define two notions before I explain more:
Physical Address: The address of where something is physically located in the RAM chip.
Logical/Virtual Address: The address that your program uses to reach its things. It's typically converted to a physical address later by a hardware chip (mostly, not even the CPU is aware really of this conversion).
Virtual/Logical Address
The virtual address is well, a virtual address, the OS along with a hardware circuit called the MMU (Memory Management Unit) delude your program that it's running alone in the system, it's got the whole address space(having 32-bits system means your program will think it has 4 GBs of RAM; roughly speaking).
Obviously, if you have more than one program running at the time (you always do, GUI, Init process, Shell, clock app, calendar, whatever), this won't work.
What will happen is that the OS will put most of your program memory in the hard disk, the parts it uses the most will be present in the RAM, but hey, that doesn't mean they'll have the address you and your program know.
Example: Your process might have a variable named (counter) that's given the virtual address 0xff (imaginably...) and another variable named (oftenNotUsed) that's given the virtual address (0xaa).
If you read the assembly of your compiled code after all linking's happened, you'll be accessing them using those addresses but well, the (oftenNotUsed) variable won't be really there in RAM at 0xaa, it'll be in the hard disk because the process is not using it.
Moreover, the variable (counter) probably won't be physically at (0xff), it'll be somewhere else in RAM, when your CPU tries to fetch what's in 0xff, the MMU and a part of the OS, will do a mapping and get that variable from where it's really available in the RAM, the CPU won't even notice it wasn't in 0xff.
Now what happens if your program asks for the (oftenNotUsed) variable? The MMU+OS will notice this 'miss' and will fetch it for the CPU from the Harddisk into RAM then hand it over to the CPU as if it were in the address (0xaa); this fetching means some data that was present in RAM will be sent back to the Harddisk.
Now imagine this running for every process in your system. Every process thinks they have 4GB of RAMs, no one actually have that but everything works because everyone has some parts of their program available physically in the RAM but most of the program resides in the HardDisk. Don't confuse this part of the program memory being put in HD with the program data you can access through file operations.
Summary
Virtual address: The address you use in your programs, the address that your CPU use to fetch data, is not real and gets translated via MMU to some physical address; everyone has one and its size depends on your system(Linux running 32-bit has 4GB address space)
Physical address: The address you'll never reach if you're running on top of an OS. It's where your data, regardless of its virtual address, resides in RAM. This will change if your data is sent back and forth to the hard disk to accommodate more space for other processes.
All of what I have mentioned above, although it's a simplified version of the whole concept, is what's called the memory management part of the the computer system.
Consequences of this system
Processes cannot access each other memory, everyone has their separate virtual addresses and every process gets a different translation to different areas even though sometimes you may look and find that two processes try to access the same virtual address.
This system works well as a caching system, you typically don't use the whole 4GB you have available, so why waste that? let others share it and let them use it too; when your process needs more, the OS will fetch your data from the HD and replace other process' data, at an expense of course.
Physical Address- When the processor is in system mode, the address used by the processor is physical address.
Not necessarily true. It depends on the particular CPU. On x86 CPUs, once you've enabled page translation, all code ceases to operate with physical addresses or addresses trivially convertible into physical addresses (except, SMM, AFAIK, but that's not important here).
Logical Address- When the processor is in user mode, the address used is the logical address. these are anyways mapped to some physical address by adding a base register with the offset value.
Logical addresses do not necessarily apply to the user mode exclusively. On x86 CPUs they exist in the kernel mode as well.
I have come across discussion that virtual and logical addresses/address space are the same. Is it true?
It depends on the particular CPU. x86 CPUs can be configured in such a way that segments aren't used explicitly. They are used implicitly and their bases are always 0 (except for thread-local-storage segments). What remains when you drop the segment selector from a logical address is a 32-bit (or 64-bit) offset whose value coincides with the 32-bit (or 64-bit) virtual address. In this simplified set-up you may consider the two to be the same or that logical addresses don't exist. It's not true, but for most practical purposes, good enough of an approximation.
I am referring to below answer base on intel x86 CPU
Difference Between Logical to Virtual Address
Whenever your program is under execution CPU generates logical address for instructions which contains (16 bit Segment Selector and 32 bit offset ).Basically Virtual(Linear address) is generated using logical address fields.
Segment selector is 16 bit field out of which first 13bit is index (Which is a pointer to the segment descriptor resides in GDT,described below) , 1 bit TI field ( TI = 1, Refer LDT , TI=0 Refer GDT )
Now Segment Selector OR say segment identifier refers to Code Segment OR Data Segment OR Stack Segment etc. Linux contains one GDT/LDT (Global/Local Descriptor Table) Which contains 8 byte descriptor of each segments and holds the base (virtual) address of the segment.
So for for each logical address, virtual address is calculated using below steps.
1) Examines the TI field of the Segment Selector to determine which Descriptor
Table stores the Segment Descriptor. This field indicates that the Descriptor is
either in the GDT (in which case the segmentation unit gets the base linear
address of the GDT from the gdtr register) or in the active LDT (in which case the
segmentation unit gets the base linear address of that LDT from the ldtr register).
2) Computes the address of the Segment Descriptor from the index field of the Segment
Selector. The index field is multiplied by 8 (the size of a Segment Descriptor),
and the result is added to the content of the gdtr or ldtr register.
3) Adds the offset of the logical address to the Base field of the Segment Descriptor,
thus obtaining the linear(Virtual) address.
Now it is the job of Pagging unit to translate physical address from virtual address.
Refer : Understanding the linux Kernel , Chapter 2 Memory Addressing
User virtual addresses
These are the regular addresses seen by user-space programs. User addresses are either 32 or 64 bits in length, depending on the underlying hardware architecture, and each process has its own virtual address space.
Physical addresses
The addresses used between the processor and the system's memory. Physical addresses are 32- or 64-bit quantities; even 32-bit systems can use 64-bit physical addresses in some situations.
Bus addresses
The addresses used between peripheral buses and memory. Often they are the same as the physical addresses used by the processor, but that is not necessarily the case. Bus addresses are highly architecture dependent, of course.
Kernel logical addresses
These make up the normal address space of the kernel. These addresses map most or all of main memory, and are often treated as if they were physical addresses. On most architectures, logical addresses and their associated physical addresses differ only by a constant offset. Logical addresses use the hardware's native pointer size, and thus may be unable to address all of physical memory on heavily equipped 32-bit systems. Logical addresses are usually stored in variables of type unsigned long or void *. Memory returned from kmalloc has a logical address.
Kernel virtual addresses
These differ from logical addresses in that they do not necessarily have a direct mapping to physical addresses. All logical addresses are kernel virtual addresses; memory allocated by vmalloc also has a virtual address (but no direct physical mapping). The function kmap returns virtual addresses. Virtual addresses are usually stored in pointer variables.
If you have a logical address, the macro __pa() (defined in ) will return its associated physical address. Physical addresses can be mapped back to logical addresses with __va(), but only for low-memory pages.
Reference.
Normally every address issued (for x86 architecture) is a logical address which is translated to a linear address via the segment tables. After the translation into linear address, it is then translated to physical address via page table.
A nice article explaining the same in depth:
http://duartes.org/gustavo/blog/post/memory-translation-and-segmentation/
Physical Address is the address that is seen by the memory unit, i.e., one loaded into memory address register.
Logical Address is the address that is generated by the CPU.
The user program can never see the real physical address.Memory mapping unit converts the logical address to physical address.
Logical address generated by user process must be mapped to physical memory before they are used.
Logical memory is relative to the respective program i.e (Start point of program + offset)
Virtual memory uses a page table that maps to ram and disk. In this way each process can promise more memory for each individual process.
In the Usermode or UserSpace all the addresses seen by program are Virtual addresses.
When in kernel mode addresses seen by kernel are still virtual but termed as logical as they are equal to physical + pageoffset .
Physical addresses are the ones which are seen by RAM .
With Virtual memory every address in program goes through page tables.
when u write a small program eg:
int a=10;
int main()
{
printf("%d",a);
}
compile: >gcc -c fname.c
>ls
fname.o //fname.o is generated
>readelf -a fname.o >readelf_obj.txt
/readelf is a command to understand the object files and executabe file which will be in 0s and 1s. output is written in readelf_onj.txt file/
`>vim readelf_obj.txt`
/* under "section header" you will see .data .text .rodata sections of your object file. every starting or the base address is started from 0000 and grows to the respective size till it reach the size under the heading "size"----> these are the logical addresses.*/
>gcc fname.c
>ls
a.out //your executabe
>readelf -a a.out>readelf_exe.txt
>vim readelf_exe.txt
/* here the base address of all the sections are not zero. it will start from particular address and end up to the particular address. The linker will give the continuous adresses to all the sections (observe in the readelf_exe.txt file. observe base address and size of each section. They start continuously) so only the base addresses are different.---> this is called the virtual address space.*/
Physical address-> the memory ll have the physical address. when your executable file is loaded into memory it ll have physical address. Actually the virtual adresses are mapped to physical addresses for the execution.

Difference between Kernel Virtual Address and Kernel Logical Address?

I am not able to exactly difference between kernel logical address and virtual address. In Linux device driver book it says that all logical address are kernel virtual address, and virtual address doesn't have any linear mapping. But logically wise when we say it is logical and when we say virtual and in which situation we use these two ?
The Linux kernel maps most of the virtual address space that belongs to the kernel to perform 1:1 mapping with an offset of the first part of physical memory. (slightly less then for 1Gb for 32bit x86, can be different for other processors or configurations). For example, for kernel code on x86 address 0xc00000001 is mapped to physical address 0x1.
This is called logical mapping - a 1:1 mapping (with an offset) that allows the kernel to access most of the physical memory of the machine.
But this is not enough - sometime we have more then 1Gb physical memory on a 32bit machine, sometime we want to reference non contiguous physical memory blocks as contiguous to make thing simple, sometime we want to map memory mapped IO regions which are not RAM.
For this, the kernel keeps a region at the top of its virtual address space where it does a "random" page to page mapping. The mapping there do not follow the 1:1 pattern of the logical mapping area. This is what we call the virtual mapping.
It is important to add that on many platforms (x86 is an example), both the logical and virtual mapping are done using the same hardware mechanism (TLB controlling virtual memory). In many cases, the "logical mapping" is actually done using virtual memory facility of the processor, so this can be a little confusing. The difference therefore is the pattern according to which the mapping is done: 1:1 for logical, something random for virtual.
Basically there are 3 kinds of addressing, namely
Logical Addressing : Address is formed by base and offset. This is nothing but segmented addressing, where the address (or offset) in the program is always used with the base value in the segment descriptor
Linear Addressing : Also called virtual address. Here adresses are contigous, but the physical address are not. Paging is used to implement this.
Physical addressing : The actual address on the Main Memory!
Now, in linux, Kernel memory (in address space) is beyond 3 GB ( 3GB to 4GB), i.e. 0xc000000..The addresses used by Kernel are not Physical addresses. To map the virtual address it uses PAGE_OFFSET. Care must be taken that no page translation is involved. i.e. these addresses are contiguous in nature. However there is a limit to this, i.e. 896 MB on x86. Beyond which paging is used for translation. When you use vmalloc, these addresses are returned to access the allocated memory.
In short, when someone refers to Virtual Memory in context of User Space, then it is through Paging. If Kernel Virtual Memory is mentioned then it is either PAGE_OFFSETed or vmalloced address.
(Reference - Understanding Linux Kernel - 2.6 based )
Shash
Kernel logical addresses are mappings accessible to kernel code through normal CPU memory access functions. On 32-bit systems, only 4GB of kernel logical address space exists, even if more physical memory than that is in use. Logical address space backed by physical memory can be allocated with kmalloc.
Virtual addresses do not necessarily have corresponding logical addresses. You can allocate physical memory with vmalloc and get back a virtual address that has no corresponding logical address (on 32-bit systems with PAE, for example). You can then use kmap to assign a logical address to that virtual address.
Simply speaking, virtual address would include "high memory", which doesn't do the 1:1 mapping for the physical address,if your RAM size is more than the address range of kernel(typically,For 1G/3G in X86,your RAM is 3G but your kernel addressing range is 1G) and also the address return from kmap() and vmalloc(), which requires the kernel to establish page table for the memory mapping. since logic address is always memory mapped by the kernel(1:1 mapping), you don't need to explicitly call kernel API,like set_pte to set up the page table entry for the particular page.
so virtual address can't be logic address all the time.

Resources