Pointer to a memory location of another process - virtual-memory

I have come across this question:
If process A contains a pointer to a variable in process B, is it
possible for A to access and modify that variable?
My intuition is that, since processes A and B are different, they should not be allowed to access each other's address space since it will violate the protection.
But after some thinking, the following questions popped in my mind and want to get clarified.
(i). When we say A has a pointer to a variable V in B, does A holds the virtual address (of process B) corresponding to V or the physical address?
I believe when we talk about address in virtual memory systems, we always talk about virtual address. Please clarify.
(ii). If A contains the virtual address, since it is possible that both A and B can have the same virtual address, it is possible that A's pagetable contains a mapping for the virtual address that A holds (which is actually the virtual address of variable V in process B).
Then when A tries to access and modify that virtual address, it modifies something in its own address space (this access will be allowed since A accesses its own address).
I think the above reasoning applies when we try to access some random virtual address from a process i.e., accidentally the address that we try to access has a valid mapping.
Please throw your thoughts.

The whole point of processes and memory management in the form they appear in modern OS's is that you cannot have a pointer from one process to another. Their memory is separated and one process cannot usually see the memory of another memory. To each process it looks like it has almost all the memory of the system available to it, as if there were only this one process (and the kernel, which might map stuff into the process' memory region).
The exception is shared memory: if both processes share a shared memory region and both processes have the access rights to modify the region, then yes, one process can modify the memory of the other process (but only within the bounds of that shared memory region).
IIRC, it works like this on the lowest level: the kernel manages a list of memory regions for each process. These regions might map to physical memory. If a region isn't mapped to physical memory and the process tries to access the region, the CPU signals the kernel to make it available (for example by loading its content from a swap file/partition). If two processes use shared memory, for both processes these regions would map to the same physical memory location (or swap file location). You might want to read about MMU and virtual memory.

You're exactly right. This is the way that virtual memory works. However, memory can be shared between processes.
For example, mmap in Linux can be used to create a shared mapping that will allow 2 seperate processes to access the same memory. I'm not sure if this works by mapping the virtual addresses for the 2 processes to the same piece of physical memory of by the same technique that memory mapped I/O works (pages are marked "dirty", then the operating system is responsible for doing the actual I/O), but from the point of view of the programmer, it's exactly as if 2 threads were accessing that memory.

In addition to what everyone said: all modern OSes have mechanisms for debugging, and that requires one process (the debugger) to access the memory of another (the debuggee). In Windows for example, there are APIs ReadProcessMemory()/WriteProcessMemory(), but there's a privilege barrier to their use.
Yes, there's some abuse potential. But how would you debug otherwise?

Related

How does shared memory work behind the scene in Linux?

Process A created a shared memory '1234' using shmget . After this , Process A attaches the memory to itself using shmat.
Process B also attaches shared memory corresponding to '1234' to itself using shmat.
Now what does "attach" mean exactly ? Are there two copies of the same memory existing ? If not , then where exactly is this memory existing ?
Every process has its own virtual memory space. To simplify things a bit, you can imagine that a process has all possible memory addresses 0x00000000..0xffffffff available for itself. One consequence of this is that a process can not use memory allocated to any other process – this is absolutely essential for both stability and security.
Behind the scenes, kernel manages allocations of all processes and maps them to physical memory, making sure they don't overlap. Of course, not all addresses are in fact mapped, only those that are being used. This is done by means of pages, and with the help of memory-mapping unit in the CPU hardware.
Creating shared memory (shmget) allocates a chunk of memory that does not belong to any particular process. It just sits there. From the kernel's point of view, it doesn't matter who uses it. So a process has to request access to it – that's the role of shmat. By doing that, kernel will map the shared memory into process' virtual memory space. This way, the process can read and write it. Because it's the same memory, all processes who have "attached" to it see the same contents. Any change a process makes is visible to other processes as well.

What happens to the physical memory contents of a process during context switch

Let's consider that process A's virtual address V1->P1 (virtual address(V1) maps to physical address(P1) ).
During a context switch, page table of process A is swapped out with process B's page table.
Let's consider that process B's virtual address V2->P1 ((virtual address(V2) maps to physical address(P1)) with its own contents in that memory area.
Now what has happened to the physical memory contents that V1 was pointing to ?
Is it saved in memory when the context switch took place ? If so, what if the process A had written contents worth or close to the size of physical memory or RAM available ? Where would it save the contents then ?
There are many ways that an OS can handle the scenario described in the question, which is how to effectively deal with running out of free RAM. Depending on the CPU architecture and the goals of the OS here are some ways of handling this issue.
One solution is to simply kill processes when they attempt to malloc(or some other similar mechanism) and there is no free pages available. This effectively avoids the problem posed in the original question. On the surface this seems like a bad idea, but it has the advantages of simplifying kernel code and potentially speeding up context switches. In fact, in some applications if the code running had to use swap space to accommodate pages that can not fit into RAM by using non-volatile storage, the application would take such a performance hit that the system effectively failed anyways. Alternatively, not all computers even have non-volatile storage to use for swap space!
As already alluded to, the alternative is to use non-volatile storage to hold pages that can not fit into RAM. Actual specific implementations can vary depending on the specific needs of the system. Here are some possible ways to directly answer how the mappings of V1->P1 and V2->P1 can exist.
1 -There is often no strict requirement for the OS needs to maintain a V1->P1 and V2->P1 mapping. So long as the contents of the virtual space stays the same, the physical address backing it is transparent to the program running. If both programs needed to run concurrently the OS can stop the program running in V2 and move the memory of P1 to a new region, say P2. Then remap V2 to P2 and resume running the program in V2. This assumes free RAM exists to map of course.
2 - The OS can simply choose not to map the full virtual address space of a program into a RAM backed physical address space. Suppose not all of V1 address space was directly mapped into physical memory. When the program in V1 hits an unpagged section the OS can catch the exception triggered by this. If available RAM is running low, the OS can then use the swap space in non-volatile storage. The OS can free up some RAM by pushing some physical addresses of a a region not currently in use to swap space in non-volatile storage (such as the P1 space). Next the OS can load the requested page into the freed up RAM, setup a virtual to physical mapping and then return execution to the program in V1.
The advantage of this approach is that the OS can allocate more memory then it has RAM. Additionally, in many situations programs tend to repeatedly access a small area of memory. As a result, not having the entire virtual address region page into RAM may not incur that big of a performance penalty. The main downside to this approach is that it is more complex to code, can make context switches slower and accessing non-volatile storage is extremely slow compared to RAM.

Does virtual address matching matter in shared mem IPC?

I'm implementing IPC between two processes on the same machine (Linux x86_64 shmget and friends), and I'm trying to maximize the throughput of the data between the processes: for example I have restricted the two processes to only run on the same CPU, so as to take advantage of hardware caching.
My question is, does it matter where in the virtual address space each process puts the shared object? For example would it be advantageous to map the object to the same location in both processes? Why or why not?
It doesn't matter as long as the OS is concerned. It would have been advantageous to use the same base address in both processes if the TLB cache wasn't flushed between context switches. The Translation Lookaside Buffer (TLB) cache is a small buffer that caches virtual to physical address translations for individual pages in order to reduce the number of expensive memory reads from the process page table. Whenever a context switch occurs, the TLB cache is flushed - you don't want processes to be able to read a small portion of the memory of other processes, just because its page table entries are still cached in the TLB.
Context switch does not occur between processes running on different cores. But then each core has its own TLB cache and its content is completely uncorrelated with the content of the TLB cache of the other core. TLB flush does not occur when switching between threads from the same process. But threads share their whole virtual address space nevertheless.
It only makes sense to attach the shared memory segment at the same virtual address if you pass around absolute pointers to areas inside it. Imagine, for example, a linked list structure in shared memory. The usual practice is to use offsets from the beginning of the block instead of aboslute pointers. But this is slower as it involves additional pointer arithmetic. That's why you might get better performance with absolute pointers, but finding a suitable place in the virtual address space of both processes might not be an easy task (at least not doing it in a portable way), even on platforms with vast VA spaces like x86-64.
I'm not an expert here, but seeing as there are no other answers I will give it a go. I don't think it will really make a difference, because the virutal address does not necessarily correspond to the physical address. Said another way, the underlying physical address the OS maps your virtual address to is not dependent on the virtual address the OS gives you.
Again, I'm not a memory master. Sorry if I am way off here.

Does the last GB in the address space of a linux process map to the same physical memory?

I read that the first 3 GBs are reserved for the process and the last GB is for the Kernel. I also read that the kernel is loaded starting from the 2nd MB of the physical address space (depending on the configuration). My question is that is the mapping of that last 1 GB is same for all processes and maps to this physical area of memory?
Another question is, when a process switches to kernel mode (eg, when a sys call occurs), then what page tables are used, the process page tables or the kernel page tables? If kernel page tables are used, then they can't access the memory locations belonging to the process. If that is the case, then there is apparently no use for the kernel virtual memory since all access to kernel code and data will be through the mapping of the last 1 GB of process address space. Please help me clarify this (any useful links will be much appreciated)
It seems, you are talking about 32-bit x86 systems, right?
If I am not mistaken, the kernel can be configured not only for 3Gb/1Gb memory distrubution, there could be other variants (e.g. 2Gb/2Gb). Still, 3Gb/1Gb is probably the most common one on x86-32.
The kernel part of the address space should be inaccessible from the user space. From the kernel's point of view, yes, the mapping of the memory occupied by the kernel itself is always the same. No matter, in the context of which process (or interrupt handler, or whatever else) the kernel currently operates.
As one of the consequences, if you look at the addresses of kernel symbols in /proc/kallsyms from different processes, you will see the same addresses each time. And these are exactly the addresses of the respective kernel functions, variables and others from the kernel's point of view.
So I suppose, the answer to your first question is "yes" but it is probably not very useful for the user-space code as the kernel space memory is not directly accessible from there anyway.
As for the second question, well, if the kernel currently operates in the context of some process, it can actually access the user-space memory of that process. I can't describe it in detail but probably the implementation of kernel functions copy_from_user and copy_to_user could give you some hints. See arch/x86/lib/usercopy_32.c and arch/x86/include/asm/uaccess.h in the kernel sources. It seems, on x86-32, the user-space memory is accessed in these functions directly, using the default memory mappings for the current process context. The 'magic' stuff there is only related to the optimizations and checking the address of the memory area for correctness.
Yes, the mapping of the kernel part of the address space is the same in all processes. Part of it does map that part of the physical memory where the kernel image is loaded, but that's not the bulk of it - the remainder is used to map other physical memory locations for the kernel's runtime working set.
When a process switches to kernel mode, the page tables are not changed. The kernel part of the address space simply becomes accessible because the CPL (Current Privilege Level) is now zero.

How are same virtual address for different processes mapped to different physical addresses

I have taken a course about Operating System design and concept and now I am trying to study Linux kernel thoroughly. I have a question that I cannot get rid of. In modern operating systems each process has own virtual address space(VAS) (eg, 0 to 2^32-1 in 32-bit systems). This provides many advantages. But in the implementation I am confused at some points. Let me explain it by giving an example:
Let's say we have two processes p1, p2;
p1 and p2 have their own VASes. An address 0x023f4a54 is mapped to different physical addresses(PA), how can it be? How is done this translation in this manner. I mean I know translation mechanism but I cannot understand that same address is mapped to different physical address when it comes different processes' address space.
0x023f4a54 in p1's VAS => PA 0x12321321
0x023f4a54 in p2's VAS => PA 0x23af2341 # (random addresses)
A CPU that provides virtual memory lets you set up a mapping of the memory addresses as the CPU sees it to physical memory addresses , typically this is done by a harware unit called the MMU.
The OS kernel can program that MMU, typically not down to the individual addresses, but rather in units of pages (4096 bytes is common). This means the MMU can be programmed to translate e.g. virtual addresses 0x1000-0x2000 to be translated to physical address 0x20000-0x21000.
The OS keeps one set of these mapping per process, and before it schedules a process to run, it loads that mapping into the MMU before it switches control back to the process. This enables different mappings for different processes, and nothing stops those mappings from mapping the same virtual address to a different physical address.
All this is transparent as far as the program is concerned, it just executes instructions on the CPU, and as the CPU has been set to virtual memory mode (paged mode), every memory access is translated by the MMU before it goes out on the physical bus to the memory.
The actual implementation details are complicated, but here's some references that might provide more insight;
http://wiki.osdev.org/Paging
http://www.usenix.org/event/usenix99/full_papers/cranor/cranor.pdf
http://ptgmedia.pearsoncmg.com/images/0131453483/downloads/gorman_book.pdf
Your question confuses a virtual address with using an address as a way of identification, so the first step to understanding is to separate the concepts.
A working example is the C runtime library function sprintf(). When properly declared and called, it is incorporated into a program as a shared object module, along with all the subfunctions it needs. The address of sprintf varies from program to program because the library is loaded in an available free address. For a simple hello world program, sprintf might be loaded at address 0x101000. For a complex program which calculates taxes, it might be loaded at 0x763f8000 (because of all the yucky logic the main program contains goes before the libraries it references). From a system perspective, the shared library is loaded into memory in one place only, but the address window (range of addresses) that each process sees that memory is unique to that executable.
Of course, this is complicated further by some of the features of Security Enhanced Linux (SELinux) which randomizes the addresses at which different program sections are loaded into memory, including shared library mapping.
--- clarification ---
As someone correctly points out, the virtual address mapping of each process is specific to each process, not unlike its set of file descriptors, socket connections, process parent and children, etc. That is, p1 might map address 0x1000 to physical 0x710000 while p2 maps address 0x1000 to a page fault, and p3 is mapped to some shared library at physical 0x9f32a000. The virtual address mapping is carefully supervised by the operating system, arguably for providing features such as swapping and paging, but also to provide features like shared code and data, and interprocess shared data.
There are two important data structures dealing with paging: the page table and the TLB. The OS maintains different page tables per process. The TLB is just a cache of the page table.
Now, different CPUs are, well, different. x86 accesses page tables directly, using a special register called CR3 which points to the page table in use. MIPS processors don't know anything about the page table, so the OS must work directly with the TLB.
Some CPUs (e.g: MIPS) keep an identifier in the TLB to separate different processes apart, so the OS can just change a control register when doing a context switch (unless it needs to reuse an identifier). Other CPUs require a full TLB flush in every context switch. So, basically, the OS needs to change some control registers and possibly needs to clear the TLB (do a TLB flush) to allow virtual addresses from different processes map to whatever physical addresses they should.
Thanks for all answers. The actual point that i dont know is that how same virtual address of different processes does not clash with each other's physical correspondent. I found the answer in the link below, each process has its own page table.
http://tldp.org/LDP/tlk/mm/memory.html
This mapping (virtual address to physical address) is handled by the OS and the MMU (see #nos' answer); the point of this abstraction is so p1 "thinks" it's accessing 0x023f4a54 when in reality it's accessing 0x12321321.
If you go back to your class on how programs work on the machine code level, p1 will expect some variable/function/whatever to be at the same place (eg 0x023f4a54) every time it's loaded. The OS mapping physical to virtual address provides this abstraction. In reality, it won't always be loaded to the same physical address, but your program doesn't care as long as it's in the same virtual address.
I think it is important to keep in mind that each process has its own set of page tables. I had hard times understanding this as well when I was thinking that there is a single page table for the whole system.
When specific process refers to its page table and tries to access the page that has not yet been mapped to a page frame, OS allocates a different piece of physical memory for that specific process and maps it to the virtual address.

Resources