What is thrashing? Why does it occur? - memory-management

In an operating system, thrashing is something related to memory management. Why does thrashing occur?
How can we prevent it?
I checked Wikipedia (but I need some simple understanding).

In operating systems that implement a virtual memory space the programs allocate memory from an address space that may be much larger than the actual amount of RAM the system possesses. The OS is responsible for deciding which programs "memory" is in actual RAM. It needs a place to keep things while they are "out". This is what is called "swap space", as the OS is swapping things in and out as needed. When this swapping activity is occurring such that it is the major consumer of the CPU time, then you are effectively thrashing. You prevent it by running fewer programs, writing programs that use memory more efficiently, adding RAM to the system, or maybe even by increasing the swap size.
A page fault occurs when the memory access requested (from the virtual address space) does not map to something that is in RAM. A page must then be sent from RAM to swap, so that the requested new page can be brought from swap to RAM. As you might imagine, 2 disk I/Os for a RAM read tends to be pretty poor performance.

Thrashing
It is a state in which our CPU perform 'productive' work less and 'swapping' more.
CPU is busy in swapping pages, so much that it can not respond to user program as much as required.
Why it occurs
In our system Thrashing occurs when there are too much pages in our memory, and each page referes t an other page. The real memory shortens in capacity to have all the pages in it, so it uses 'virtual memory'. When each page in execution demands that page that is not currently in real memory (RAM) it place some pages on virtual memory and adjust the required page on RAM. If CPU is so much busy in doing this task, thrashing occurs.

I know this question has been asked long ago, but I just wanted to share information with others.
The term thrashing is actually related to the virtual memory, that an operating system uses in order to provide extra amount of memory or space for the processes. What dose it actually mean by the term thrashing is that, when the process is ready to be loaded in the memory, only a few or some pages(parts) of the process are loaded in the actual physical memory, and rest are in the swap-space(the virtual memory or the disk).
Now if the page that the process needs to execute, is not loaded in the memory, it generates a page-fault and asks the OS to replace the page. Here the process resumes its execution.
Some times, the page replaced by the OS is again required by the process therefore it again asks the OS to load it in the memory, replacing some other page and so on. since the process is not executing, therefore the CPU utilization is 0, However the disk read and write are at the peak.
Our OSs are designed in such a way that when the CPU utilization decreases it initiates another process in the memory. The next process have to wait now because the first process is busy. Again since the CPU is not being utilized or it is 0(in our example), the OS initiates another process, and the same thing happen.
Therefore, the CPU utilization decreases to an extreme minimum level, while the processes are busy reading and writing(swapping the pages). This is called thrashing!

Logical addresses are generated by CPU which are in fact not real memory location but a process thinks that it is the actual memory location.
A complete process is divided into different parts which are stored in pages in logical memory but only sub parts or only some pages are allocated actual memory for execution which are required at that point of time, whereas other pages reside in logical memory and are not allocated a physical address. Now if any other pages are required to load and there is not enough frame to allocate it to assign, an interrupt occurs which causes a page fault and a replacement algorithm is needed to remove some pages from frames and load the current page required.
Now suppose there are not enough frames to meet requirements for all the pages needed to be loaded. Then the frame removed will again demand to be loaded and again page fault will occur. This goes in loop. Meanwhile, since the process is not able to execute, CPU utilization will be low, and the dispatcher will load more processes in the CPU which degrades the scenario even more.
This process is known as thrashing. Thrashing generally occurs when enough frames are not allocated to a process than needed.
Ways to prevent thrashing:
instruct mid-term scheduler to swap out some of the process too recover from thrashing
instructing the dispatcher not to load more processes after a threshold

In a virtual memory system, thrashing is the excessive swapping of pages of data between memory and the hard disk, causing the application to respond more slowly. The virtual memory function tracks page usage and keeps often-used pages in memory as much as possible.

Memory thrashing is a problem which arises when the memory is allocated more than the physical memory and it is not available in the system.
To know what is thrashing, you must first be aware of swapping and page fault. So lets start with those concepts:
Page Fault and Swapping :- A page fault occurs when the memory access requested (from the virtual address space) does not map to something that is in RAM. A page must then be sent from RAM to swap, so that the requested new page can be brought from swap to RAM. This results in 2 disk I/Os. Now you might know that disk I/Os are very slow as compared to memory access.
To know what is memory thrashing please refer to the following link :
http://www.firmcodes.com/memory-thrashing-in-operating-system/

The operating system uses the concept of virtual memory to provide memory to process at times when main memory get full and don't have space to provide space to incoming processes.This mechanism of using virtual memory as substitute memory is abstract and the user doesn't get to know what is going behind.It appears to the user that the new process which he/she executed just now got space in the main memory.
So,in order to accomodate incoming processes in main memory ,the idle processes residing in main memory need to be moved to virtual memory.This movement from main memory to virtual memory takes place when page fault occurs.
Now coming to thrashing.
If the operating system uses such a page replacement algo in which there is high probability for page fault to occur,then much of the CPU cycle will be wasted in swapping pages to and fro between main memory and virtual memory.This suppresses CPU performance.This process of degradation in CPU performance due to occurence of large number of page faults is called Thrashing.

Thrashing is a state in which our CPU perform 'productive' work less and 'swapping' more. CPU is busy in swapping pages, so much that it can not respond to user program as much as required.
Why it occurs : In our system, Thrashing occurs when there are too much pages in our memory, and each page refers to another page. The real memory shortens in capacity to have all the pages in it, so it uses 'virtual memory'. When each page in execution demands that page that is not currently in real memory (RAM) it place some pages on virtual memory and adjust the required page on RAM. If CPU is so much busy in doing this task, thrashing occurs.
To resolve thrashing you can do any of the suggestions below :
*Increase the amount of RAM in the computer.
*Decrease the number of programs being run on the computer.
*Adjust the size of the swap file.

If you want exactly what is thrashing right..
If C.P.U usage is 0(idle) then this situation called as Thrashing.
It will Occur At the time of Deadlock situation.so at that time no resources(like printers etc)will utilise C.P.U so it will be idle so we can call this situation as Thrashing.
thats it about Thrashing :)

Related

Will Windows be still able to allocate memory when free space in physical memory is very low?

On Windows 32-bit system the application is being developed using Visual Studio:
Lets say lots of other application running on my machine and they have occupied almost all of physical memory and only 1 MB memory is left free. If my application (which has not yet allocated any memory) tries to allocate, say 2 MB, will the call be successful?
My guess: In theory, each Windows application has 2GB of virtual memory available.
So I believe this call should be successful (regardless how much physical memory is available). But I am not sure on this. That's why asking here.
Windows gives a rock-hard guarantee that this will always work. A process can only allocate virtual memory when Windows can commit space in the paging file for the allocation. If necessary, it will grow the paging file to make the space available. If that fails, for example when the paging file grows beyond the preset limit, then the allocation fails as well. Windows doesn't have the equivalent of the Linux "OOM killer", it does not support over-committing that may require an operating system to start randomly killing processes to find RAM.
Do note that the "always works" clause does have a sting. There is no guarantee on how long this will take. In very extreme circumstances the machine can start thrashing where just about every memory access in the running processes causes a page fault. Code execution slows down to a crawl, you can lose control with the mouse pointer frozen when Explorer or the mouse or video driver start thrashing as well. You are well past the point of shopping for RAM when that happens. Windows applies quotas to processes to prevent them from hogging the machine, but if you have enough processes running then that doesn't necessarily avoid the problem.
Of course. It would be lousy design if memory had to be wasted now in order to be used later. Operating systems constantly re-purpose memory to its most advantageous use at any moment. They don't have to waste memory by keeping it free just so that it can be used later.
This is one of the benefits of virtual memory with a page file. Because the memory is virtual, the system can allocate more virtual memory than physical memory. Virtual memory that cannot fit in physical memory, is pushed out to the page file.
So the fact that your system may be using all of the physical memory does not mean that your program will not be able to allocate memory. In the scenario that you describe, your 2MB memory allocation will succeed. If you then access that memory, the virtual memory will be paged in to physical memory and very likely some other pages (maybe in your process, maybe in another process) will be pushed out to the page file.
Well, it will succeed as long as there's some memory for it - apart from physical memory, there's also the page file.
However, once you reach the limit of both RAM and the page file, you're done for and that's when the out of memory situation really starts being fun.
Now, systems like Windows Vista will try to use all of your available RAM, pretty much for caching. That's a good thing, and when there's a request for memory from an application, the cache will be thrown away as needed.
As for virtual memory, you can request much more than you have available, regardless of your RAM or page file size. Only when you commit the memory does it actually need some backing - either RAM or the page file. On 64-bit, you can easily request terabytes of virtual memory - that doesn't mean you'll get it when you try to commit it, though :P
If your application is unable to allocate a physical memory (RAM) block to store information, the operating system takes over and 'pages' or stores sections that are in RAM on disk to free up physical memory so that your program is able to perform the allocation. This is done automatically and is completely invisible to your applications.
So, in your example, on a system that has 1MB RAM free, if your application tries to allocate memory, the operating system will page certain contents of physical memory to disk and free up RAM for your application. Your application will not crash in this case.
This, obviously is much more complicated than that.
There are several ways to configure a page file on Windows (fixed size, variable size and on which disk). If you run out of physical memory, and out of hard drive space (because your page file has grown very large due to excessive 'paging') or reach the limit of your paging file (if it is a static limit) then your applications will fail due out an out-of-memory exception. With today's systems with large local storage however, this is a rare event.
Be sure to read about paging for the full picture. Check out:
http://en.wikipedia.org/wiki/Paging
In certain cases, you will notice that you have sufficient free physical memory. Say 100MB and your program tries to allocate a 10MB block to store a large object but fails. This is caused by physical memory fragmentation. Although the total free memory is 100MB, there is no single contiguous block of 10MB that can be used to store your object. This will result in an exception that needs to be handled in your code. If you allocate large objects in your code you may want to separate the allocation into smaller blocks to facilitate allocation, and then aggregate them back in your code logic. For example, instead of having a single 10m vector, you can declare 10 x 1m vectors in an array and allocate memory for each individual one.

What are the differences between virtual memory and physical memory?

I am often confused with the concept of virtualization in operating systems. Considering RAM as the physical memory, why do we need the virtual memory for executing a process?
Where does this virtual memory stand when the process (program) from the external hard drive is brought to the main memory (physical memory) for the execution.
Who takes care of the virtual memory and what is the size of the virtual memory?
Suppose if the size of the RAM is 4GB (i.e. 2^32-1 address spaces) what is the size of the virtual memory?
Softwares run on the OS on a very simple premise - they require memory. The device OS provides it in the form of RAM. The amount of memory required may vary - some softwares need huge memory, some require paltry memory. Most (if not all) users run multiple applications on the OS simultaneously, and given that memory is expensive (and device size is finite), the amount of memory available is always limited. So given that all softwares require a certain amount of RAM, and all of them can be made to run at the same time, OS has to take care of two things:
That the software always runs until user aborts it, i.e. it should not auto-abort because OS has run out of memory.
The above activity, while maintaining a respectable performance for the softwares running.
Now the main question boils down to how the memory is being managed. What exactly governs where in the memory will the data belonging to a given software reside?
Possible solution 1: Let individual softwares specify explicitly the memory address they will use in the device. Suppose Photoshop declares that it will always use memory addresses ranging from 0 to 1023 (imagine the memory as a linear array of bytes, so first byte is at location 0, 1024th byte is at location 1023) - i.e. occupying 1 GB memory. Similarly, VLC declares that it will occupy memory range 1244 to 1876, etc.
Advantages:
Every application is pre-assigned a memory slot, so when it is installed and executed, it just stores its data in that memory area, and everything works fine.
Disadvantages:
This does not scale. Theoretically, an app may require a huge amount of memory when it is doing something really heavy-duty. So to ensure that it never runs out of memory, the memory area allocated to it must always be more than or equal to that amount of memory. What if a software, whose maximal theoretical memory usage is 2 GB (hence requiring 2 GB memory allocation from RAM), is installed in a machine with only 1 GB memory? Should the software just abort on startup, saying that the available RAM is less than 2 GB? Or should it continue, and the moment the memory required exceeds 2 GB, just abort and bail out with the message that not enough memory is available?
It is not possible to prevent memory mangling. There are millions of softwares out there, even if each of them was allotted just 1 kB memory, the total memory required would exceed 16 GB, which is more than most devices offer. How can, then, different softwares be allotted memory slots that do not encroach upon each other's areas? Firstly, there is no centralized software market which can regulate that when a new software is being released, it must assign itself this much memory from this yet unoccupied area, and secondly, even if there were, it is not possible to do it because the no. of softwares is practically infinite (thus requiring infinite memory to accommodate all of them), and the total RAM available on any device is not sufficient to accommodate even a fraction of what is required, thus making inevitable the encroaching of the memory bounds of one software upon that of another. So what happens when Photoshop is assigned memory locations 1 to 1023 and VLC is assigned 1000 to 1676? What if Photoshop stores some data at location 1008, then VLC overwrites that with its own data, and later Photoshop accesses it thinking that it is the same data is had stored there previously? As you can imagine, bad things will happen.
So clearly, as you can see, this idea is rather naive.
Possible solution 2: Let's try another scheme - where OS will do majority of the memory management. Softwares, whenever they require any memory, will just request the OS, and the OS will accommodate accordingly. Say OS ensures that whenever a new process is requesting for memory, it will allocate the memory from the lowest byte address possible (as said earlier, RAM can be imagined as a linear array of bytes, so for a 4 GB RAM, the addresses range for a byte from 0 to 2^32-1) if the process is starting, else if it is a running process requesting the memory, it will allocate from the last memory location where that process still resides. Since the softwares will be emitting addresses without considering what the actual memory address is going to be where that data is stored, OS will have to maintain a mapping, per software, of the address emitted by the software to the actual physical address (Note: that is one of the two reasons we call this concept Virtual Memory. Softwares are not caring about the real memory address where their data are getting stored, they just spit out addresses on the fly, and the OS finds the right place to fit it and find it later if required).
Say the device has just been turned on, OS has just launched, right now there is no other process running (ignoring the OS, which is also a process!), and you decide to launch VLC. So VLC is allocated a part of the RAM from the lowest byte addresses. Good. Now while the video is running, you need to start your browser to view some webpage. Then you need to launch Notepad to scribble some text. And then Eclipse to do some coding.. Pretty soon your memory of 4 GB is all used up, and the RAM looks like this:
Problem 1: Now you cannot start any other process, for all RAM is used up. Thus programs have to be written keeping the maximum memory available in mind (practically even less will be available, as other softwares will be running parallelly as well!). In other words, you cannot run a high-memory consuming app in your ramshackle 1 GB PC.
Okay, so now you decide that you no longer need to keep Eclipse and Chrome open, you close them to free up some memory. The space occupied in RAM by those processes is reclaimed by OS, and it looks like this now:
Suppose that closing these two frees up 700 MB space - (400 + 300) MB. Now you need to launch Opera, which will take up 450 MB space. Well, you do have more than 450 MB space available in total, but...it is not contiguous, it is divided into individual chunks, none of which is big enough to fit 450 MB. So you hit upon a brilliant idea, let's move all the processes below to as much above as possible, which will leave the 700 MB empty space in one chunk at the bottom. This is called compaction. Great, except that...all the processes which are there are running. Moving them will mean moving the address of all their contents (remember, OS maintains a mapping of the memory spat out by the software to the actual memory address. Imagine software had spat out an address of 45 with data 123, and OS had stored it in location 2012 and created an entry in the map, mapping 45 to 2012. If the software is now moved in memory, what used to be at location 2012 will no longer be at 2012, but in a new location, and OS has to update the map accordingly to map 45 to the new address, so that the software can get the expected data (123) when it queries for memory location 45. As far as the software is concerned, all it knows is that address 45 contains the data 123!)! Imagine a process that is referencing a local variable i. By the time it is accessed again, its address has changed, and it won't be able to find it any more. The same will hold for all functions, objects, variables, basically everything has an address, and moving a process will mean changing the address of all of them. Which leads us to:
Problem 2: You cannot move a process. The values of all variables, functions and objects within that process have hardcoded values as
spat out by the compiler during compilation, the process depends on
them being at the same location during its lifetime, and changing them is expensive. As a result,
processes leave behind big "holes" when they exit. This is called
External Fragmentation.
Fine. Suppose somehow, by some miraculous manner, you do manage to move the processes up. Now there is 700 MB of free space at the bottom:
Opera smoothly fits in at the bottom. Now your RAM looks like this:
Good. Everything is looking fine. However, there is not much space left, and now you need to launch Chrome again, a known memory-hog! It needs lots of memory to start, and you have hardly any left...Except.. you now notice that some of the processes, which were initially occupying large space, now is not needing much space. May be you have stopped your video in VLC, hence it is still occupying some space, but not as much as it required while running a high resolution video. Similarly for Notepad and Photos. Your RAM now looks like this:
Holes, once again! Back to square one! Except, previously, the holes occurred due to processes terminating, now it is due to processes requiring less space than before! And you again have the same problem, the holes combined yield more space than required, but they are scattered around, not much of use in isolation. So you have to move those processes again, an expensive operation, and a very frequent one at that, since processes will frequently reduce in size over their lifetime.
Problem 3: Processes, over their lifetime, may reduce in size, leaving behind unused space, which if needed to be used, will require
the expensive operation of moving many processes. This is called
Internal Fragmentation.
Fine, so now, your OS does the required thing, moves processes around and start Chrome and after some time, your RAM looks like this:
Cool. Now suppose you again resume watching Avatar in VLC. Its memory requirement will shoot up! But...there is no space left for it to grow, as Notepad is snuggled at its bottom. So, again, all processes has to move below until VLC has found sufficient space!
Problem 4: If processes needs to grow, it will be a very expensive operation
Fine. Now suppose, Photos is being used to load some photos from an external hard disk. Accessing hard-disk takes you from the realm of caches and RAM to that of disk, which is slower by orders of magnitudes. Painfully, irrevocably, transcendentally slower. It is an I/O operation, which means it is not CPU bound (it is rather the exact opposite), which means it does not need to occupy RAM right now. However, it still occupies RAM stubbornly. If you want to launch Firefox in the meantime, you can't, because there is not much memory available, whereas if Photos was taken out of memory for the duration of its I/O bound activity, it would have freed lot of memory, followed by (expensive) compaction, followed by Firefox fitting in.
Problem 5: I/O bound jobs keep on occupying RAM, leading to under-utilization of RAM, which could have been used by CPU bound jobs in the meantime.
So, as we can see, we have so many problems even with the approach of virtual memory.
There are two approaches to tackle these problems - paging and segmentation. Let us discuss paging. In this approach, the virtual address space of a process is mapped to the physical memory in chunks - called pages. A typical page size is 4 kB. The mapping is maintained by something called a page table, given a virtual address, all now we have to do is find out which page the address belong to, then from the page table, find the corresponding location for that page in actual physical memory (known as frame), and given that the offset of the virtual address within the page is same for the page as well as the frame, find out the actual address by adding that offset to the address returned by the page table. For example:
On the left is the virtual address space of a process. Say the virtual address space requires 40 units of memory. If the physical address space (on the right) had 40 units of memory as well, it would have been possible to map all location from the left to a location on the right, and we would have been so happy. But as ill luck would have it, not only does the physical memory have less (24 here) memory units available, it has to be shared between multiple processes as well! Fine, let's see how we make do with it.
When the process starts, say a memory access request for location 35 is made. Here the page size is 8 (each page contains 8 locations, the entire virtual address space of 40 locations thus contains 5 pages). So this location belongs to page no. 4 (35/8). Within this page, this location has an offset of 3 (35%8). So this location can be specified by the tuple (pageIndex, offset) = (4,3). This is just the starting, so no part of the process is stored in the actual physical memory yet. So the page table, which maintains a mapping of the pages on the left to the actual pages on the right (where they are called frames) is currently empty. So OS relinquishes the CPU, lets a device driver access the disk and fetch the page no. 4 for this process (basically a memory chunk from the program on the disk whose addresses range from 32 to 39). When it arrives, OS allocates the page somewhere in the RAM, say first frame itself, and the page table for this process takes note that page 4 maps to frame 0 in the RAM. Now the data is finally there in the physical memory. OS again queries the page table for the tuple (4,3), and this time, page table says that page 4 is already mapped to frame 0 in the RAM. So OS simply goes to the 0th frame in RAM, accesses the data at offset 3 in that frame (Take a moment to understand this. The entire page, which was fetched from disk, is moved to frame. So whatever the offset of an individual memory location in a page was, it will be the same in the frame as well, since within the page/frame, the memory unit still resides at the same place relatively!), and returns the data! Because the data was not found in memory at first query itself, but rather had to be fetched from disk to be loaded into memory, it constitutes a miss.
Fine. Now suppose, a memory access for location 28 is made. It boils down to (3,4). Page table right now has only one entry, mapping page 4 to frame 0. So this is again a miss, the process relinquishes the CPU, device driver fetches the page from disk, process regains control of CPU again, and its page table is updated. Say now the page 3 is mapped to frame 1 in the RAM. So (3,4) becomes (1,4), and the data at that location in RAM is returned. Good. In this way, suppose the next memory access is for location 8, which translates to (1,0). Page 1 is not in memory yet, the same procedure is repeated, and the page is allocated at frame 2 in RAM. Now the RAM-process mapping looks like the picture above. At this point in time, the RAM, which had only 24 units of memory available, is filled up. Suppose the next memory access request for this process is from address 30. It maps to (3,6), and page table says that page 3 is in RAM, and it maps to frame 1. Yay! So the data is fetched from RAM location (1,6), and returned. This constitutes a hit, as data required can be obtained directly from RAM, thus being very fast. Similarly, the next few access requests, say for locations 11, 32, 26, 27 all are hits, i.e. data requested by the process is found directly in the RAM without needing to look elsewhere.
Now suppose a memory access request for location 3 comes. It translates to (0,3), and page table for this process, which currently has 3 entries, for pages 1, 3 and 4 says that this page is not in memory. Like previous cases, it is fetched from disk, however, unlike previous cases, RAM is filled up! So what to do now? Here lies the beauty of virtual memory, a frame from the RAM is evicted! (Various factors govern which frame is to be evicted. It may be LRU based, where the frame which was least recently accessed for a process is to be evicted. It may be first-come-first-evicted basis, where the frame which allocated longest time ago, is evicted, etc.) So some frame is evicted. Say frame 1 (just randomly choosing it). However, that frame is mapped to some page! (Currently, it is mapped by the page table to page 3 of our one and only one process). So that process has to be told this tragic news, that one frame, which unfortunate belongs to you, is to be evicted from RAM to make room for another pages. The process has to ensure that it updates its page table with this information, that is, removing the entry for that page-frame duo, so that the next time a request is made for that page, it right tells the process that this page is no longer in memory, and has to be fetched from disk. Good. So frame 1 is evicted, page 0 is brought in and placed there in the RAM, and the entry for page 3 is removed, and replaced by page 0 mapping to the same frame 1. So now our mapping looks like this (note the colour change in the second frame on the right side):
Saw what just happened? The process had to grow, it needed more space than the available RAM, but unlike our earlier scenario where every process in the RAM had to move to accommodate a growing process, here it happened by just one page replacement! This was made possible by the fact that the memory for a process no longer needs to be contiguous, it can reside at different places in chunks, OS maintains the information as to where they are, and when required, they are appropriately queried. Note: you might be thinking, huh, what if most of the times it is a miss, and the data has to be constantly loaded from disk into memory? Yes, theoretically, it is possible, but most compilers are designed in such a manner that follows locality of reference, i.e. if data from some memory location is used, the next data needed will be located somewhere very close, perhaps from the same page, the page which was just loaded into memory. As a result, the next miss will happen after quite some time, most of the upcoming memory requirements will be met by the page just brought in, or the pages already in memory which were recently used. The exact same principle allows us to evict the least recently used page as well, with the logic that what has not been used in a while, is not likely to be used in a while as well. However, it is not always so, and in exceptional cases, yes, performance may suffer. More about it later.
Solution to Problem 4: Processes can now grow easily, if space problem is faced, all it requires is to do a simple page replacement, without moving any other process.
Solution to Problem 1: A process can access unlimited memory. When more memory than available is needed, the disk is used as backup, the new data required is loaded into memory from the disk, and the least recently used data frame (or page) is moved to disk. This can go on infinitely, and since disk space is cheap and virtually unlimited, it gives an illusion of unlimited memory. Another reason for the name Virtual Memory, it gives you illusion of memory which is not really available!
Cool. Earlier we were facing a problem where even though a process reduces in size, the empty space is difficult to be reclaimed by other processes (because it would require costly compaction). Now it is easy, when a process becomes smaller in size, many of its pages are no longer used, so when other processes need more memory, a simple LRU based eviction automatically evicts those less-used pages from RAM, and replaces them with the new pages from the other processes (and of course updating the page tables of all those processes as well as the original process which now requires less space), all these without any costly compaction operation!
Solution to Problem 3: Whenever processes reduce in size, its frames in RAM will be less used, so a simple LRU based eviction can evict those pages out and replace them with pages required by new processes, thus avoiding Internal Fragmentation without need for compaction.
As for problem 2, take a moment to understand this, the scenario itself is completely removed! There is no need to move a process to accommodate a new process, because now the entire process never needs to fit at once, only certain pages of it need to fit ad hoc, that happens by evicting frames from RAM. Everything happens in units of pages, thus there is no concept of hole now, and hence no question of anything moving! May be 10 pages had to be moved because of this new requirement, there are thousands of pages which are left untouched. Whereas, earlier, all processes (every bit of them) had to be moved!
Solution to Problem 2: To accommodate a new process, data from only less recently used parts of other processes have to be evicted as required, and this happens in fixed size units called pages. Thus there is no possibility of hole or External Fragmentation with this system.
Now when the process needs to do some I/O operation, it can relinquish CPU easily! OS simply evicts all its pages from the RAM (perhaps store it in some cache) while new processes occupy the RAM in the meantime. When the I/O operation is done, OS simply restores those pages to the RAM (of course by replacing the pages from some other processes, may be from the ones which replaced the original process, or may be from some which themselves need to do I/O now, and hence can relinquish the memory!)
Solution to Problem 5: When a process is doing I/O operations, it can easily give up RAM usage, which can be utilized by other processes. This leads to proper utilization of RAM.
And of course, now no process is accessing the RAM directly. Each process is accessing a virtual memory location, which is mapped to a physical RAM address and maintained by the page-table of that process. The mapping is OS-backed, OS lets the process know which frame is empty so that a new page for a process can be fitted there. Since this memory allocation is overseen by the OS itself, it can easily ensure that no process encroaches upon the contents of another process by allocating only empty frames from RAM, or upon encroaching upon the contents of another process in the RAM, communicate to the process to update it page-table.
Solution to Original Problem: There is no possibility of a process accessing the contents of another process, since the entire allocation is managed by the OS itself, and every process runs in its own sandboxed virtual address space.
So paging (among other techniques), in conjunction with virtual memory, is what powers today's softwares running on OS-es! This frees the software developer from worrying about how much memory is available on the user's device, where to store the data, how to prevent other processes from corrupting their software's data, etc. However, it is of course, not full-proof. There are flaws:
Paging is, ultimately, giving user the illusion of infinite memory by using disk as secondary backup. Retrieving data from secondary storage to fit into memory (called page swap, and the event of not finding the desired page in RAM is called page fault) is expensive as it is an IO operation. This slows down the process. Several such page swaps happen in succession, and the process becomes painfully slow. Ever seen your software running fine and dandy, and suddenly it becomes so slow that it nearly hangs, or leaves you with no option that to restart it? Possibly too many page swaps were happening, making it slow (called thrashing).
So coming back to OP,
Why do we need the virtual memory for executing a process? - As the answer explains at length, to give softwares the illusion of the device/OS having infinite memory, so that any software, big or small, can be run, without worrying about memory allocation, or other processes corrupting its data, even when running in parallel. It is a concept, implemented in practice through various techniques, one of which, as described here, is Paging. It may also be Segmentation.
Where does this virtual memory stand when the process (program) from the external hard drive is brought to the main memory (physical memory) for the execution? - Virtual memory doesn't stand anywhere per se, it is an abstraction, always present, when the software/process/program is booted, a new page table is created for it, and it contains the mapping from the addresses spat out by that process to the actual physical address in RAM. Since the addresses spat out by the process are not real addresses, in one sense, they are, actually, what you can say, the virtual memory.
Who takes care of the virtual memory and what is the size of the virtual memory? - It is taken care of by, in tandem, the OS and the software. Imagine a function in your code (which eventually compiled and made into the executable that spawned the process) which contains a local variable - an int i. When the code executes, i gets a memory address within the stack of the function. That function is itself stored as an object somewhere else. These addresses are compiler generated (the compiler which compiled your code into the executable) - virtual addresses. When executed, i has to reside somewhere in actual physical address for duration of that function at least (unless it is a static variable!), so OS maps the compiler generated virtual address of i into an actual physical address, so that whenever, within that function, some code requires the value of i, that process can query the OS for that virtual address, and OS in turn can query the physical address for the value stored, and return it.
Suppose if the size of the RAM is 4GB (i.e. 2^32-1 address spaces) what is the size of the virtual memory? - The size of the RAM is not related to the size of virtual memory, it depends upon the OS. For example, on 32 bit Windows, it is 16 TB, on 64 bit Windows, it is 256 TB. Of course, it is also limited by the disk size, since that is where the memory is backed up.
Virtual memory is, among other things, an abstraction to give the programmer the illusion of having infinite memory available on their system.
Virtual memory mappings are made to correspond to actual physical addresses. The operating system creates and deals with these mappings - utilizing the page table, among other data structures to maintain the mappings. Virtual memory mappings are always found in the page table or some similar data structure (in case of other implementations of virtual memory, we maybe shouldn't call it the "page table"). The page table is in physical memory as well - often in kernel-reserved spaces that user programs cannot write over.
Virtual memory is typically larger than physical memory - there wouldn't be much reason for virtual memory mappings if virtual memory and physical memory were the same size.
Only the needed part of a program is resident in memory, typically - this is a topic called "paging". Virtual memory and paging are tightly related, but not the same topic. There are other implementations of virtual memory, such as segmentation.
I could be assuming wrong here, but I'd bet the things you are finding hard to wrap your head around have to do with specific implementations of virtual memory, most likely paging. There is no one way to do paging - there are many implementations and the one your textbook describes is likely not the same as the one that appears in real OSes like Linux/Windows - there are probably subtle differences.
I could blab a thousand paragraphs about paging... but I think that is better left to a different question targeting specifically that topic.
I am shamelessly copying the excerpts from man page of top
VIRT -- Virtual Image (kb)
The total amount of virtual memory used by the task. It includes all code, data and shared libraries plus pages that have been
swapped out and pages that have been mapped but not used.
SWAP -- Swapped size (kb)
Memory that is not resident but is present in a task. This is memory that has been swapped out but could include additional non-
resident memory. This column is calculated by subtracting physical memory from virtual memory
See here: Physical Vs Virtual Memory
Virtual memory is stored on the hard drive and is used when the RAM is filled. Physical memory is limited to the size of the RAM chips installed in the computer. Virtual memory is limited by the size of the hard drive, so virtual memory has the capability for more storage.
Physical memory: Physical memory refers to the RAM or the primary memory in the computer. Physical memory is a volatile memory. Therefore, it requires a continuous flow of power to retain data.
Virtual memory is a logical memory. In other words, it is a memory management technique performed by the operating system. Virtual memory allows the programmer to use more memory for the programs than the available physical memory. If the physical memory is 4GB and the virtual memory is 16GB, the programmer can use the 16GB virtual memory to execute the program. Using virtual memory, he can execute complex programs that require more memory than the physical memory.
The main difference between physical and virtual memory is that the physical memory refers to the actual RAM of the system attached to the motherboard, but the virtual memory is a memory management technique that allows the users to execute programs larger than the actual physical memory.
So, any program typically needs to store/read data from memory (physical address). CPU generates address space that is referred as Logical Address Apace, which is a set of addresses that can be used to access the program in memory, since logical address space corresponds to actual addresses there.
Addresses in logical address space are referred to as Logical or Virtual Addresses, whereas the set of addresses that correspond to them in physical memory are referred to as Physical Address Space, each of them called Physical Address, which is actual address in memory.
Note: MMU or Memory Management Unit is responsible for the address binding that takes place.
Source: "Operating System Concepts" by Abraham Silberschatz (Chapter8 Memory Management Strategies)

caches vs paging

So I'm in a computer architecture class, and I guess I'm having a hard time differentiating between caching and pages.
The only explanation I can come up with is that pages are the OS's way of tricking a program that it's doing all it's work in a specified region of memory, vs a cache memory is the hardware's way of tricking the OS that it's reading from one specified region of memory, when it's really not.
Does the os direct the hardware that it needs a "new page" or is that taken care of by the os trying to read the address that is "out of range" of the current cache "page" (for lack of a better term).
Am I on the right track or am I completely crazy?
Caching and pages are orthogonal concepts.
A cache is a high-speed "memory" that acts to minimise the number of accesses to a large low-speed "memory". In the most general sense, the high-speed "memory" could be your hard disk acting to cache web pages fetched from the web (low-speed "memory"). Of course, in the context of computer architecture, the term "cache" is more likely to refer to physical RAM used to speed up access to slower RAM or disk.
Pages, OTOH, are simply a unit of management for the contents of RAM or disk.
These two concepts come together in implementing virtual memory systems. A process may allocate 500 MB of memory. This may be more that the physical RAM available to give to the process, so the operating system allocates blocks on disk called pages, which will hold the contents of certain logical pages in the process's address-space.
When the process accesses a location in its address-space, and the associated page isn't currently mapped into physical memory, the CPU signals a page fault, and the OS responds by fetching the page from disk while the process is in a suspended state. Once the page is mapped, the process resumes and is able to access that memory location as if it was there all along.
The common view that virtual memory is a way of tricking the process into thinking it has tons of RAM isn't the only way to think about this. You could also think of a process's address-space as being logically stored on disk pages, with the OS-assisted mapping into RAM being just a way to cache the contents of those pages such that the process isn't continually accessing the hard drive. In this sense, caching and paged virtual memory are logically the same thing. Just keep in mind that, while this viewpoint may help to understand the relationship between the two concepts, it isn't entirely accurate, since it is possible to run without virtual memory at all, just physical memory (in fact, most embedded systems run this way).
I think Paging is bring instruction data from disk or secondary memory to the main memory while caching is bring instruction data from main memory to CPU

Real Memory VS. Private Memory

I have an Objective-C application that is designed to run for an extended period of time using WebKit views. After time the application builds up a fair amount of memory so I would like to accurately reveal the memory usage to the end user upon request. In Activity Monitor on OS X I see two memory values for my application Real Memory and Private Memory. From what I have read Private is everything that the process has and Real is an estimate. What should I trust? Is there a specific formula I can use to calculate the exact usage rather than what OS X reports?
Not my area of expertise.
If you are worried about memory leaks, you should be using the various tools provided by Apple to debug it.
Real memory is the actual physical memory in use by the process.
Private memory is the physical memory that is used by just that process.
Virtual memory is the size of the entire virtual memory of the process including those pages that are not currently resident in physical RAM.
It's actually quite difficult to tell by looking at those numbers if you have a leak. For example, a block that is malloced and then leaked will never be referenced again, so it'll eventually get swapped out. It'll be part of the virtual memory but not part of the resident memory. So if you have a leak, the virtual memory will gradually increase over time.
On the other hand, virtual memory will increase if malloc can't find an unused block of memory to allocate, but it won't decrease when free gives memory back. So if you malloc a huge amount of RAM the VM will increase but even if you then free it correctly, it'll never decrease again. If you also have a leak, it will take a long time for malloc to run out of the recycled VM which means that you might not notice it.
So, use the purpose built leak detection tools.

Memory mapped files causes low physical memory

I have a 2GB RAM and running a memory intensive application and going to low available physical memory state and system is not responding to user actions, like opening any application or menu invocation etc.
How do I trigger or tell the system to swap the memory to pagefile and free physical memory?
I'm using Windows XP.
If I run the same application on 4GB RAM machine it is not the case, system response is good. After getting choked of available physical memory system automatically swaps to pagefile and free physical memory, not that bad as 2GB system.
To overcome this problem (on 2GB machine) attempted to use memory mapped files for large dataset which are allocated by application. In this case virtual memory of the application(process) is fine but system cache is high and same problem as above that physical memory is less.
Even though memory mapped file is not mapped to process virtual memory system cache is high. why???!!! :(
Any help is appreciated.
Thanks.
If your data access pattern for using the memory mapped file is sequential, you might get slightly better page recycling by specifying the FILE_FLAG_SEQUENTIAL_SCAN flag when opening the underlying file. If your data pattern accesses the mapped file in random order, this won't help.
You should consider decreasing the size of your map view. That's where all the memory is actually consumed and cached. Since it appears that you need to handle files that are larger than available contiguous free physical memory, you can probably do a better job of memory management than the virtual memory page swapper since you know more about how you're using the memory than the virtual memory manager does. If at all possible, try to adjust your design so that you can operate on portions of the large file using a smaller view.
Even if you can't get rid of the need for full random access across the entire range of the underlying file, it might still be beneficial to tear down and recreate the view as needed to move the view to the section of the file that the next operation needs to access. If your data access patterns tend to cluster around parts of the file before moving on, then you won't need to move the view as often. You'll take a hit to tear down and recreate the view object, but since tearing down the view also releases all the cached pages associated with the view, it seems likely you'd see a net gain in performance because the smaller view significantly reduces memory pressure and page swapping system wide. Try setting the size of the view based on a portion of the installed system RAM and move the view around as needed by your file processing. The larger the view, the less you'll need to move it around, but the more RAM it will consume potentially impacting system responsiveness.
As I think you are hinting in your post, the slow response time is probably at least partially due to delays in the system while the OS writes the contents of memory to the pagefile to make room for other processes in physical memory.
The obvious solution (and possibly not practical) is to use less memory in your application. I'll assume that is not an option or at least not a simple option. The alternative is to try to proactively flush data to disk to continually keep available physical memory for other applications to run. You can find the total memory on the machine with GlobalMemoryStatusEx. And GetProcessMemoryInfo will return current information about your own application's memory usage. Since you say you are using a memory mapped file, you may need to account for that in addition. For example, I believe the PageFileUsage information returned from that API will not include information about your own memory mapped file.
If your application is monitoring the usage, you may be able to use FlushViewOfFile to proactively force data to disk from memory. There is also an API (EmptyWorkingSet) that I think attempts to write as many dirty pages to disk as possible, but that seems like it would very likely hurt performance of your own application significantly. Although, it could be useful in a situation where you know your application is going into some kind of idle state.
And, finally, one other API that might be useful is SetProcessWorkingSetSizeEx. You might consider using this API to give a hint on an upper limit for your application's working set size. This might help preserve more memory for other applications.
Edit: This is another obvious statement, but I forgot to mention it earlier. It also may not be practical for you, but it sounds like one of the best things you might do considering that you are running into 32-bit limitations is to build your application as 64-bit and run it on a 64-bit OS (and throw a little bit more memory at the machine).
Well, it sounds like your program needs more than 2GB of working set.
Modern operating systems are designed to use most of the RAM for something at all times, only keeping a fairly small amount free so that it can be immediately handed out to processes that need more. The rest is used to hold memory pages and cached disk blocks that have been used recently; whatever hasn't been used recently is flushed back to disk to replenish the pool of free pages. In short, there isn't supposed to be much free physical memory.
The principle difference between using a normal memory allocation and memory mapped a files is where the data gets stored when it must be paged out of memory. It doesn't necessarily have any effect on when the memory will be paged out, and will have little effect on the time it takes to page it out.
The real problem you are seeing is probably not that you have too little free physical memory, but that the paging rate is too high.
My suggestion would be to attempt to reduce the amount of storage needed by your program, and see if you can increase the locality of reference to reduce the amount of paging needed.

Resources