Memory Allocation Algorithm (non-contiguous)

Just a curious question though!
While going through various techniques of memory allocation and management, especially paging, in which fixed-size blocks are assigned to each request, everything goes fine until a process exits, leaving behind holes that force non-contiguous allocation for other processes. For that case, the page table data structure keeps track of the mapping from page numbers to frame numbers.
Can an algorithm be designed so that pages are always allocated after the last allocated page in memory, with the allocator periodically shifting pages down to cover the empty page space left by freed processes (from somewhere in the middle), thus maintaining one contiguous run of allocated memory at any given time? This would preserve contiguous memory allocation for processes and might enable quicker memory access.
For e.g:
---------- ----------
| Page 1 | After Page 2 is deallocated | Page 1 |
---------- rather than assigning ----------
| Page 2 | the space to some other process | Page 3 |
---------- in a non-contiguous fashion, ----------
| Page 3 | there can be something | Page 4 |
---------- like this --> ----------
| Page 4 | | |
---------- ----------
The point is that memory allocation can be continuous and always after the last allocated page.
I would appreciate being told about the design flaws, or the parameters one has to take care of, when thinking about any such algorithm.

This is called 'compacting' garbage collection, usually part of a mark-compact garbage collection algorithm: https://en.wikipedia.org/wiki/Mark-compact_algorithm
As with any garbage collector, the collection/compaction is the easy part. The hard part is letting the program that you're collecting for continue to work as you're moving its memory around.
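The sliding part of the idea can be sketched in a few lines. This is a toy simulation, not real OS code: frames are a Python list, freed slots are `None`, and the function name and layout are made up for illustration.

```python
def compact(frames):
    """Slide every allocated page toward address 0, preserving order.
    Returns a relocation map {old_frame: new_frame} so that page
    tables can be updated to point at the pages' new locations."""
    relocation = {}
    write = 0
    for read, page in enumerate(frames):
        if page is not None:
            relocation[read] = write   # remember where the page moved
            frames[write] = page       # write index never passes read index
            write += 1
    for i in range(write, len(frames)):
        frames[i] = None               # everything above 'write' is now free
    return relocation

frames = ["P1", None, "P3", "P4", None, None]   # P2 was freed earlier
moves = compact(frames)
print(frames)   # ['P1', 'P3', 'P4', None, None, None]
print(moves)    # {0: 0, 2: 1, 3: 2}
```

The relocation map is exactly the hard part mentioned above: every page-table entry (and, in a garbage collector, every pointer) referring to a moved frame must be rewritten, typically while the process is paused.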

Related

How to compute how many page faults can occur at most?

I got this question during an exam last week and I still cannot figure out how to get to the correct answer.
I think that, first, the 8 MiB block of data can in the worst case be spread across 2049 pages (8 MiB / 4 KiB + 1, because the data need not start at a page boundary). But that is basically all I have, as I get confused and tangled up in the question every time.
How many page faults may occur at the most when copying a contiguous 8
MiB block of memory (i.e., from memory to memory) under the following
assumptions:
32-bit virtual address space, 4 KiB pages, three-level page tables
the 1st page table level has only 4 8-byte entries; the 2nd and 3rd levels have 512 8-byte entries per table
there are enough free frames, so there will be no page replacement
the first-level page table is always in memory
the OS implements a very simple algorithm that allocates only one frame per page fault
we do not take into account potential page faults caused by instruction fetching (we assume the code is already in memory)
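One way to reason about the worst case, as a back-of-the-envelope sketch rather than an authoritative exam solution: count data-page faults plus faults for the missing 3rd- and 2nd-level table pages, assuming the source and destination blocks share no paging structures and each missing page or table page costs exactly one fault.

```python
PAGE = 4096
BLOCK = 8 * 1024 * 1024
ENTRIES = 512  # entries per 2nd- and 3rd-level table

def max_tables_touched(n_pages, pages_per_table):
    # n consecutive pages starting just before a table boundary
    # touch the largest possible number of tables
    worst_start = pages_per_table - 1
    return (worst_start + n_pages - 1) // pages_per_table + 1

# the 8 MiB block need not be page aligned: 2048 + 1 pages at worst
data_pages = BLOCK // PAGE + 1

# each 3rd-level table maps 512 pages; each 2nd-level table maps 512*512
l3_tables = max_tables_touched(data_pages, ENTRIES)
l2_tables = max_tables_touched(data_pages, ENTRIES * ENTRIES)

# the 1st-level table is always resident, so it never faults
per_region = data_pages + l3_tables + l2_tables
total = 2 * per_region  # source block + destination block
print(data_pages, l3_tables, l2_tables, total)  # 2049 5 2 4112
```

Under those assumptions the bound comes out to 2 × (2049 + 5 + 2) = 4112 faults; verify the assumptions against your course's model before trusting the number.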

Optimum memory consumption program

Write a program to find the processes which utilize the memory optimally, given the list of the processes with their memory usage and the total memory available.
Example:
Total memory: 10
The first column is the process ID and the second column is the memory consumption of that process.
1 2
2 3
3 4
4 5
Answer should be processes {1,2,4} with memory consumption {2,3,5} as 2+3+5=10
This is the knapsack problem (here, with value equal to weight, the subset-sum variant).
You can find many sample implementations online.
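A short dynamic-programming sketch of the example above; the function name and representation are my own, not from any particular library.

```python
def best_fit(processes, capacity):
    """0/1 knapsack where each process's value equals its memory use:
    pick the subset whose total memory is as close to capacity as
    possible without exceeding it. Returns (used_memory, chosen_ids)."""
    # best[m] = some list of process IDs whose memory sums to exactly m
    best = [None] * (capacity + 1)
    best[0] = []
    for pid, mem in processes:
        # iterate downward so each process is used at most once
        for m in range(capacity, mem - 1, -1):
            if best[m - mem] is not None and best[m] is None:
                best[m] = best[m - mem] + [pid]
    for m in range(capacity, -1, -1):
        if best[m] is not None:
            return m, best[m]

procs = [(1, 2), (2, 3), (3, 4), (4, 5)]
used, chosen = best_fit(procs, 10)
print(used, chosen)   # 10 [1, 2, 4]
```

This runs in O(n × capacity) time, which is fine for small memory totals; the general knapsack problem is NP-hard, so very large capacities need different techniques.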

Independent memory channels: What does it mean for a programmer

I read the Datasheet for an Intel Xeon Processor and saw the following:
The Integrated Memory Controller (IMC) supports DDR3 protocols with four
independent 64-bit memory channels with 8 bits of ECC for each channel (total of
72-bits) and supports 1 to 3 DIMMs per channel depending on the type of memory
installed.
I need to know what exactly this means from a programmer's point of view.
The documentation on this seems rather sparse, and I don't have someone from Intel at hand to ask ;)
Can this memory controller execute 4 loads of data simultaneously from non-adjacent memory regions (and request each datum from up to 3 memory DIMMs)? I.e. 4x64 bits, striped across up to 3 DIMMs, e.g.:
| X | _ | X | _ | X | _ | X |
(X is loaded data, _ an arbitrarily large region of unloaded data)
Or can this IMC execute 1 load which will fetch up to 1x256 bits from a contiguous memory region?
| X | X | X | X | _ | _ | _ | _ |
This seems to be implementation specific, depending on the compiler, OS, and memory controller. The standard is available at: http://www.jedec.org/standards-documents/docs/jesd-79-3d . It seems that if your controller is fully compliant, there are specific bits that can be set to indicate interleaved or non-interleaved mode. See pages 24, 25, and 143 of the DDR3 spec, but even in the spec the details are light.
For the i7/i5/i3 series specifically, and likely all newer Intel chips, the memory is interleaved as in your first example. For these newer chips, and presumably a compiler that supports it, yes: one Asm/C/C++-level load of something large enough to be interleaved/striped would initiate the required number of independent hardware-level loads, one per memory channel.
The Triple-channel section of the Multi-channel memory architecture page on Wikipedia has a small (likely incomplete) list of CPUs that do this: http://en.wikipedia.org/wiki/Multi-channel_memory_architecture

Differences or similarities between Segmented paging and Paged segmentation?

I was studying combined paging/segmentation systems, and my book gives two approaches:
1. Paged segmentation
2. Segmented paging
I could not make out the difference between the two. I think that in paged segmentation the segment is divided into pages, and in segmented paging the pages are divided into segments, though I don't know if I am right or wrong. Meanwhile, on the internet, combined paging/segmentation is described using one scheme only. I can't figure out why my coursebook has two schemes for this. Any help would be deeply appreciated.
So, after vigorously searching the net for the difference or similarity between these two terms, I have come up with a final answer. First of all, I will write down the similarities:
Both segmented paging and paged segmentation are combined paging/segmentation systems (paging and segmentation can be combined by dividing each segment into pages).
In both systems the segments are divided into pages.
Now, to describe the differences, I will have to define and describe each term separately:
Segmented paging - Segments are divided into pages. Implementation requires an STR (segment table register) and a PMT (page map table). In this scheme, each virtual address consists of a segment number, a page number within that segment, and an offset within that page. The segment number indexes into the segment table, which yields the base address of the page table for that segment. The page number indexes into the page table, each entry of which is a page frame number. Adding the PFN (page frame number) and the offset gives the physical address. Hence, addressing can be described by the following function:
va = (s,p,w) where, va is the virtual address, |s| determines number of
segments (size of ST), |p| determines number of pages per segment (size of
PT), |w| determines page size.
address_map(s, p, w)
{
pa = *(*(STR+s)+p)+w;
return pa;
}
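The address_map pseudocode above can be simulated with ordinary dictionaries. The table contents and page size below are made up purely for illustration.

```python
PAGE_SIZE = 256  # |w| = 8 bits in this made-up example

# Segment table: segment number -> page table for that segment.
# Page table: page number -> page frame number.
segment_table = {
    0: {0: 7, 1: 3},   # segment 0 has two pages, in frames 7 and 3
    1: {0: 9},         # segment 1 has one page, in frame 9
}

def address_map(s, p, w):
    """Segmented paging: va = (s, p, w) -> physical address."""
    page_table = segment_table[s]    # *(STR + s): find the segment's page table
    frame = page_table[p]            # index the page table by page number
    return frame * PAGE_SIZE + w     # frame base + offset within the page

print(address_map(0, 1, 5))   # frame 3 -> 3*256 + 5 = 773
```

Note the two table lookups per translation: one into the segment table and one into that segment's page table, exactly the two dereferences in the pseudocode.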
Paged segmentation - Sometimes the segment table or page table may be too large to keep in physical memory (it can even reach megabytes). Therefore, the segment table is divided into pages too, and thus a page table of ST (segment table) pages is created. The segment number is broken into a page number (s1) and a page offset (s2) within the page table of ST pages. So the virtual address can be described as:
va = (s1,s2,p,w)
address_map(s1, s2, p, w)
{
pa = *(*(*(STR+s1)+s2)+p)+w;
return pa;
}
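This three-lookup version can also be simulated in the same illustrative style, with one extra level of indirection for the paged segment table (again, all table contents are invented for the example).

```python
PAGE_SIZE = 256  # made-up page size, as before

# The segment table itself is paged: s1 selects a page of the
# segment table, s2 selects an entry within that page, and that
# entry is the page table for one segment.
st_pages = {
    0: {                   # page 0 of the segment table
        0: {0: 7, 1: 3},   # segment (s1=0, s2=0): pages in frames 7 and 3
        1: {0: 9},         # segment (s1=0, s2=1): page in frame 9
    },
}

def address_map(s1, s2, p, w):
    """Paged segmentation: va = (s1, s2, p, w) -> physical address."""
    st_page = st_pages[s1]       # *(STR + s1): one page of the segment table
    page_table = st_page[s2]     # the extra indirection vs. segmented paging
    frame = page_table[p]        # page number -> frame number
    return frame * PAGE_SIZE + w

print(address_map(0, 1, 0, 8))   # frame 9 -> 9*256 + 8 = 2312
```

Comparing the two sketches makes the difference concrete: paged segmentation costs one more memory reference per translation in exchange for never needing the whole segment table resident at once.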
Best characteristics of paging
The fact is, paging has the following goodies:
Fast Allocation (At least faster than segmentation)
No external fragmentation (though the last page of each process suffers from internal fragmentation)
Best characteristics of segmentation
But segmentation also shows some great behavior:
Sharing
Protection
The given terms can be combined to create the following schemes:
Segmented Paging: The virtual address space is divided into segments. The physical address space is divided into page frames.
Paged Segmentation: With the plain segmentation technique, the per-process segment table can grow too large for main memory to hold. Therefore, the segment table (and correspondingly the segment number) is divided into pages.
Requirements for the Segmented Paging
Multiple steps need to be taken to achieve segmented paging:
Each segment table entry represents a page table base address.
STR(Segment Table Register) and PMT(Page Map Table) are filled with desired values.
Each virtual address consists of a Segment Number, Page Number and the Offset within that page.
The segment number indexes into the segment table, which gives us the base address of the page table for that segment.
The page number indexes into the page table.
Each page table entry holds a page frame number.
Final result which is a Physical Address is found by adding the page frame number and the offset.
Requirements for the Paged segmentation
The following steps take place in this scheme:
The segment table itself is divided into pages.
For each page of the segment table, which gathers a group of segment table entries, a page table is created.
Segmentation leads to slower page translations and swapping
For those reasons, segmentation was largely dropped on x86-64.
The main difference between them is that:
paging splits memory into fixed sized chunks
segmentation allows different widths for each chunk
While it might appear smarter to have configurable segment widths, fragmentation is inevitable as a process's memory grows, e.g.:
| | process 1 | | process 2 | |
----------- -----------
0 max
will eventually become as process 1 grows:
| | process 1 || process 2 | |
------------------ -------------
0 max
until a split is inevitable:
| | process 1 part 1 || process 2 | | process 1 part 2 | |
------------------ ----------- ------------------
0 max
At this point:
the only way to translate an address is to binary search over all the parts of process 1's memory, which takes an unacceptable O(log n)
a swap out of process 1 part 1 could be huge since that segment could be huge
With fixed sized pages however:
every 32-bit translation needs only 2 memory reads: one page directory entry and one page table entry
every swap is an acceptable 4KiB
Fixed sized chunks of memory are simply more manageable, and have dominated current OS design.
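The "2 memory reads" follow directly from how a classic (non-PAE) 32-bit x86 virtual address is split into 10 + 10 + 12 bits; the sample address below is arbitrary.

```python
def split_x86_va(va):
    """Split a 32-bit virtual address into (page directory index,
    page table index, page offset), per classic 32-bit x86 paging:
    10 + 10 + 12 bits -> 1024-entry directory, 1024-entry page
    tables, 4 KiB pages."""
    offset = va & 0xFFF            # low 12 bits: byte within the page
    table = (va >> 12) & 0x3FF     # next 10 bits: entry in the page table
    directory = (va >> 22) & 0x3FF # top 10 bits: entry in the directory
    return directory, table, offset

# 0x00403004 -> directory 1, table 3, offset 4
print(split_x86_va(0x00403004))
```

Each of the two indices costs one memory read during a table walk (the directory entry, then the page table entry); the offset is just added to the frame base, with no further read.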
See also: How does x86 paging work?

Why does the stack address grow towards decreasing memory addresses?

I read in textbooks that the stack grows by decreasing memory address; that is, from higher addresses to lower addresses. It may be a bad question, but I still haven't got the concept right. Can you explain?
First, it's platform dependent. In some architectures, stack is allocated from the bottom of the address space and grows upwards.
Assuming an architecture like x86, where the stack grows downwards from the top of the address space, the idea is pretty simple:
=============== Highest Address (e.g. 0xFFFF)
| |
| STACK |
| |
|-------------| <- Stack Pointer (e.g. 0xEEEE)
| |
. ... .
| |
|-------------| <- Heap Pointer (e.g. 0x2222)
| |
| HEAP |
| |
=============== Lowest Address (e.g. 0x0000)
To grow stack, you'd decrease the stack pointer:
=============== Highest Address (e.g. 0xFFFF)
| |
| STACK |
| |
|.............| <- Old Stack Pointer (e.g. 0xEEEE)
| |
| Newly |
| allocated |
|-------------| <- New Stack Pointer (e.g. 0xAAAA)
. ... .
| |
|-------------| <- Heap Pointer (e.g. 0x2222)
| |
| HEAP |
| |
=============== Lowest Address (e.g. 0x0000)
As you can see, to grow stack, we have decreased the stack pointer from 0xEEEE to 0xAAAA, whereas to grow heap, you have to increase the heap pointer.
Obviously, this is a simplification of memory layout. The actual executable, data section, ... is also loaded in memory. Besides, threads have their own stack space.
You may ask: why should the stack grow downwards? Well, as I said before, some architectures do the reverse, making the heap grow downwards and the stack grow upwards. It makes sense to put the stack and heap at opposite ends, as this prevents overlap and allows both areas to grow freely as long as you have enough address space available.
Another valid question could be: isn't the program supposed to decrease/increase the stack pointer itself? How can an architecture impose one direction over the other on the programmer? Why is it architecture dependent rather than program dependent?
While you can pretty much fight the architecture and somehow get away with growing your stack in the opposite direction, some instructions that modify the stack pointer directly, notably call and ret, are going to assume the standard direction, making a mess.
Nowadays it's largely because it's been done that way for a long time and lots of programs assume it's done that way, and there's no real reason to change it.
Back when dinosaurs roamed the earth and computers had 8kB of memory if you were lucky, though, it was an important space optimization. You put the bottom of the stack at the very top of memory, growing down, and you put the program and its data at the very bottom, with the malloc area growing up. That way, the only limit on the size of the stack was the size of the program + heap, and vice versa. If the stack instead started at 4kB (for instance) and grew up, the heap could never get bigger than 4kB (minus the size of the program) even if the program only needed a few hundred bytes of stack.
From man clone: The child_stack argument specifies the location of the stack used by the child process. Since the child and calling process may share memory, it is not possible for the child process to execute in the same stack as the calling process. The calling process must therefore set up memory space for the child stack and pass a pointer to this space to clone(). Stacks grow downward on all processors that run Linux (except the HP PA processors), so child_stack usually points to the topmost address of the memory space set up for the child stack.
On x86, the primary reason the stack grows toward decreasing memory addresses is that the PUSH instruction decrements the stack pointer:
Decrements the stack pointer and then stores the source operand on the top of the stack.
See p. 4-511 of the Intel® 64 and IA-32 Architectures Software Developer's Manual.
