Why is the code segment common to different instances of the same program? - memory-management

I wanted to know why the code segment is common to different instances of the same program.
For example: consider program P1.exe running; if another copy of P1.exe is started, the code segment will be shared by both running instances. Why is that?

If the code segment in question is loaded from a DLL, it might be the operating system being clever and re-using the already loaded library. This is one of the core points of using dynamically loaded library code, it allows the code to be shared across multiple processes.
Not sure if Windows is clever enough to do this with the code sections of regular EXE files, but it would make sense if possible.
It could also be virtual memory fooling you: two processes can appear to have the same thing at the same address, but that address is virtual, so each process is really just seeing its own mapping onto physical memory.

Code is typically read-only, so it would be wasteful to make multiple copies of it.
Also, Windows (at least; I can't speak for other OSes at this level) uses the paging infrastructure to page code in and out directly from the executable file, as if it were a paging file. Since you are dealing with the same executable, it is paging from the same location to the same location.
Self-modifying code is effectively no longer supported by modern operating systems. Generating new code is possible (by setting the correct flags when allocating memory) but this is separate from the original code segment.

The code segment is (supposed to be) static (does not change) so there is no reason not to use it for several instances.

Just to start at a basic level: segmentation is just one way to implement memory isolation and partitioning; paging is another. For the most part, anything you can achieve via segmentation you can achieve via paging. As such, most modern operating systems on x86 forgo segmentation entirely, relying completely on the paging facilities.
Because of this, all processes will usually be running under the trivial segment of (Base = 0, Limit = 4GB, Privilege level = 3), which means the code/data segment registers play no real part in determining the physical address, and are just used to set the privilege level of the process. All processes will usually be run at the same privilege, so they should all have the same value in the segment register.
Edit
Maybe I misinterpreted the question. I thought the question author was asking why both processes have the same value in the code segment register.

Related

How does the Memory Management Unit decide to map pages into physical memory frames?

In operating systems, a process is divided into chunks called pages, and only the necessary pages are loaded into physical memory frames. My question: those pages are in binary, right? (The process's instructions are in binary format at that stage?) Assume a compiled C or C++ program; each page then contains part of that whole binary, right? (Please correct me if I'm wrong.) If the pages are just binary parts of the whole process, how does the memory management unit (MMU) know which page to select next, given that it knows nothing about the process or its control flow, just binary?
Please correct me if I'm wrong. These are several questions I have been struggling with.
Thank you!
The MMU doesn't do that; the operating system does. The MMU only translates addresses whenever you access memory, in the way the operating system told it to.

Transient OS Code in Single Partition Allocation?

While going through the slides of a lecture on memory management, I came across this:
Relocation register value is static during program execution. Hence all of the OS must be present (it might be used). Otherwise, we have to relocate user code/data “on the fly”! In other words, we cannot have transient OS code
I couldn't understand what the above lines meant. I would appreciate if anyone could explain this.
The relocation-register scheme provides an effective way to allow the operating system's size to change dynamically. This flexibility is desirable in many situations. For example, the operating system contains code and buffer space for device drivers. If a device driver (or other operating-system service) is not commonly used, we do not want to keep the code and data in memory, as we might be able to use that space for other purposes. Such code is sometimes called transient operating-system code; it comes and goes as needed. Thus, using this code changes the size of the operating system during program execution.
All logical memory references in user code are mapped to physical addresses using the relocation register value (phys_addr = reloc_reg_val + logical_addr). The value of the relocation register is set by the OS, so it cannot be affected by user processes.
Transient operating-system code is code that is active only a small fraction of the time (and idle for long stretches, like a routine that checks whether the processor is deadlocked). Under single-partition allocation, reclaiming the memory occupied by such transient code means giving it to some other process in the job queue, i.e., shrinking or expanding the OS boundary with the neighboring processes.
Given the above, if transient OS code were allowed, the boundary between the OS and user memory would keep moving, so user code/data would have to be relocated "on the fly". But the relocation register value is static during program execution, so that is impossible; hence we cannot have transient OS code.

Reading huge files using Memory Mapped Files

I see many articles suggesting not to memory-map huge files, so that the virtual address space isn't consumed entirely by the mapping.
How does that change with 64 bit process where the address space dramatically increases?
If I need to randomly access a file, is there a reason not to map the whole file at once? (dozens of GBs file)
On 64bit, go ahead and map the file.
One thing to consider, based on Linux experience: if the access is truly random and the file is much bigger than you can expect to cache in RAM (so the chances of hitting a page again are slim), it can be worth passing MADV_RANDOM to madvise to stop touched file pages from steadily accumulating and pointlessly swapping out other, actually useful, data. No idea what the Windows equivalent API is, though.
There's a reason to think carefully about using memory-mapped files, even on a 64-bit platform (where virtual address space size is not an issue). It's related to (potential) error handling.
When reading the file "conventionally" - any I/O error is reported by the appropriate function return value. The rest of error handling is up to you.
OTOH if the error arises during the implicit I/O (resulting from the page fault and attempt to load the needed file portion into the appropriate memory page) - the error handling mechanism depends on the OS.
In Windows the error handling is performed via SEH - so-called "structured exception handling". The exception propagates to the user mode (application's code) where you have a chance to handle it properly. The proper handling requires you to compile with the appropriate exception handling settings in the compiler (to guarantee the invocation of the destructors, if applicable).
I don't know how the error handling is performed in unix/linux though.
P.S. I'm not saying don't use memory-mapped files; I'm saying do it carefully.
One thing to be aware of is that memory mapping requires a big contiguous chunk of virtual address space when the mapping is created; on a 32-bit system this particularly sucks because, on a loaded system, long contiguous runs of address space are hard to find and the mapping will fail. On a 64-bit system this is much easier, as the 64-bit address space is... huge.
If you are running code in controlled environments (e.g. 64-bit server environments you are building yourself and know to run this code just fine) go ahead and map the entire file and just deal with it.
If you are trying to write general purpose code that will be in software that could run on any number of types of configurations, you'll want to stick to a smaller chunked mapping strategy. For example, mapping large files to collections of 1GB chunks and having an abstraction layer that takes operations like read(offset) and converts them to the offset in the right chunk before performing the op.
Hope that helps.

virtual addressing vs. physical addressing

I do not quite understand the benefit of "multiple independent virtual addresses which point to the same physical address", even though I have read many books and posts.
E.g.,in a similar question Difference between physical addressing and virtual addressing concept,
The post claims that programs will not crash each other, and that
"in general, a particular physical page only maps to one application's
virtual space"
Well, in http://tldp.org/LDP/tlk/mm/memory.html, in section "shared virtual memory", it says
"For example there could be several processes in the system running
the bash command shell. Rather than have several copies of bash, one
in each processes virtual address space, it is better to have only one
copy in physical memory and all of the processes running bash share
it."
If one physical address (e.g., the shell program) is mapped to two independent virtual addresses, how can this not crash? Wouldn't it be the same as using physical addressing?
What does virtual addressing provide that is not possible, or not convenient, with physical addressing? If there were no virtual memory, i.e., both processes pointed directly at the same physical memory, I think it could still work with some coordinating mechanism. So why bother with all this virtual addressing, MMU, and virtual memory machinery?
There are two main uses of this feature.
First, you can share memory between processes, which can then communicate via the shared pages. In fact, shared memory is one of the simplest forms of IPC.
But shared read-only pages can also be used to avoid useless duplication: most of the time, the code of a program does not change after it has been loaded into memory, so its memory pages can be shared among all the processes running that program. Obviously only the code is shared; the memory pages containing the stack, the heap and in general the data (or, if you prefer, the state) of the program are not.
This trick is improved with "copy on write". The code of executables usually doesn't change when running, but there are programs that are actually self-modifying (they were quite common in the past, when most of the development was still done in assembly); to support this stuff, the operating system does read-only sharing as explained before, but, if it detects a write on one of the shared pages, it disables the sharing for such page, creating an independent copy of it and letting the program write there.
This trick is particularly useful in situations in which there's a good chance that the data won't change, but it may happen.
Another case in which this technique is used is when a process forks: instead of copying every memory page (which is completely useless if the child process immediately does an exec), the new process shares all its memory pages with the parent in copy-on-write mode, allowing quick process creation while still "faking" the classic fork behavior.
If one physical address (e.g., shell program) mapped to two independent virtual addresses
Multiple processes can be built to share a piece of memory; e.g. with one acting as a server that writes to the memory, the other as a client reading from it, or with both reading and writing. This is a very fast way of doing inter-process communication (IPC). (Other solutions, such as pipes and sockets, require copying data to the kernel and then to the other process, which shared memory skips.) But, as with any IPC solution, the programs must coordinate their reads and writes to the shared memory region by some messaging protocol.
Also, the "several processes in the system running the bash command shell" from the example will be sharing the read-only part of their address spaces, which includes the code. They can execute the same in-memory code concurrently, and won't kill each other since they can't modify it.
In the quote
in general, a particular physical page only maps to one application's virtual space
the "in general" part should really be "typically": memory pages are not shared unless you set them up to be, or unless they are read-only.

Simple toy OS memory management

I'm developing a simple little toy OS in C and assembly as an experiment, but I'm starting to worry myself with my lack of knowledge on system memory.
I've been able to compile the kernel, run it in Bochs (loaded by GRUB), and have it print "Hello, world!" Now I'm off trying to make a simple memory manager so I can start experimenting with other things.
I found some resources on memory management, but they didn't really have enough code to go off of (as in I understood the concept, but I was at a loss for actually knowing how to implement it).
I tried a few more or less complicated strategies, then settled with a ridiculously simplistic one (just keep an offset in memory and increase it by the size of the allocated object) until the need arises to change. No fragmentation control, protection, or anything, yet.
So I would like to know where I can find more information when I do need a more robust manager. And I'd also like to learn more about paging, segmentation, and other relevant things. So far I haven't dealt with paging at all, but I've seen it mentioned often in OS development sites, so I'm guessing I'll have to deal with it sooner or later.
I've also read about some form of indirect pointers, where an application holds a pointer that is redirected by the memory manager to its real location. That's quite a ways off for me, I'm sure, but it seems important if I ever want to try virtual memory or defragmentation.
And also, where am I supposed to put my memory offset? I had no idea what the best spot was, so I just randomly picked 0x1000, and I'm sure it's going to come back to me later when I overwrite my kernel or something.
I'd also like to know what I should expect performance-wise (e.g. a big-O value for allocation and release) and what a reasonable ratio of memory management structures to actual managed memory would be.
Of course, feel free to answer just a subset of these questions. Any feedback is greatly appreciated!
If you don't know about it already, http://wiki.osdev.org/ is a good resource in general, and has multiple articles on memory management. If you're looking for a particular memory allocation algorithm, I'd suggest reading up on the "buddy system" method (http://en.wikipedia.org/wiki/Buddy_memory_allocation). I think you can probably find an example implementation on the Internet. If you can find a copy in a library, it's also probably worth reading the section of The Art Of Computer Programming dedicated to memory management (Volume 1, Section 2.5).
I don't know where you should put the memory offset (to be honest I've never written a kernel), but one thing that occurred to me which might work is to place a static variable at the end of the kernel, and start allocations after that address. Something like:
(In the memory manager:)

    extern char endOfKernel;
    ...
    myOffset = &endOfKernel;

(At the end of the file that gets placed last in the binary:)

    char endOfKernel;
I guess it goes without saying, but depending on how serious you get about the operating system, you'll probably want some books on operating system design, and if you're in school it wouldn't hurt to take an OS class.
If you're using GCC with LD, you can create a linker script that defines a symbol at the end of the .BSS section (which would give you the complete size of the kernel's memory footprint). Many kernels in fact use this value as a parameter for GRUB's AOUT_KLUDGE header.
See http://wiki.osdev.org/Bare_bones#linker.ld for more details, note the declaration of the ebss symbol in the linker script.
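A minimal sketch of such a linker script (the section list and the 1 MiB load address are illustrative; see the linked page for the authoritative version):

```ld
ENTRY(_start)
SECTIONS
{
    . = 1M;                      /* load the kernel at 1 MiB */
    .text   : { *(.text) }
    .rodata : { *(.rodata) }
    .data   : { *(.data) }
    .bss    : { *(.bss) }
    ebss = .;                    /* end of the kernel's footprint */
}
```

The `ebss` symbol can then be referenced from C (`extern char ebss;`) and used as the starting point for allocations, much like the `endOfKernel` trick above.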

Resources