So this is not a homework question. It's a question from a previous exam my professor posted as resource to help us study for our midterm. However, there are two answers that (to me) seem like they could be the correct answer.
A.) A page fault means the physical page to be replaced must be saved to the hard disk.
B.) The requested virtual is not in the physical memory.
Now, it is my understanding that a page fault is when the data stored in the physical memory page is not the data you need, therefore, you need to access the hard drive and load the correct data. Also, if the dirty flag is 1, then that means the previous data in physical memory has been modified, therefore you need to re-save that to the disk.
Therefore, it seems to me that both A and B are right, but I was wondering if anyone could tell me what they think the better option is.
If I were forced to choose I would say A.
SIDE NOTE
I have emailed the professor about the answer but he's really bad about responding and hasn't emailed me back yet.
Neither of these is correct.
A.) A page fault means the physical page to be replaced must be saved to the hard disk.
This is not correct because it could also mean the page needs to be read from the hard disk.
B.) The requested virtual is not in the physical memory.
This is not correct because in a soft page fault, the page is resident in physical memory. For example, the operation may just be the first write to a resident, unshared page, so the page has to be marked dirty. Or the page may be shared and need to be unshared. In these cases, the requested virtual page is resident in physical memory, it just needs some massaging by the memory management system.
A page fault means some help from the kernel is needed in order to permit the access to that page of virtual memory. The help needed could vary from reading the page in to disk to just marking the page accessed so the kernel knows not to evict it.
Of those two, B is probably closer to correct because A is almost never right. The "classic" page fault would be if the page had to be read in from hard disk, which B would apply to but not A.
Now, it is my understanding that a page fault is when the data stored in the physical memory page is not the data you need, therefore, you need to access the hard drive and load the correct data. Also, if the dirty flag is 1, then that means the previous data in physical memory has been modified, therefore you need to re-save that to the disk.
How could the page both be dirty and not hold the data you need? If it's dirty, that means you dirtied it. Which means it's holding the data you're working with.
Related
To ask the question another way, can you confirm that when you mmap() a file that you do in fact access the exact physical pages that are already in the page cache?
I ask because I’m doing testing on a 192 core machine with 1TB of RAM, on a 400GB data file that is pre-cached into the page cache prior to the test (by just dropping the cache, then doing md5sum on the file).
Initially, I had all 192 threads each mmap the file separately, on the assumption that they would all get (basically) the same memory region back (or perhaps the same memory region but somehow mapped multiple times). Accordingly, I assumed two threads using two different mappings to the same file would both have direct access to the same pages. (Let’s ignore NUMA for this example, though obviously it’s significant at higher thread counts.)
However, in practice I found performance would get terrible at higher thread counts when each thread separately mmapped the file. When we removed that and instead just did a single mmap that was passed into the thread (such that all threads just directly access the same memory region), then performance improved dramatically.
That’s all great, but I’m trying to figure out why. If in fact mmapping a file just grants direct access to the existing page cache, then I would think that it shouldn’t matter how many times you map it — it should all go to the exact same place.
But given that there was such a performance cost, it seemed to me that in fact each mmap was being independently and redundantly populated (perhaps by copying from the page cache, or perhaps by reading again from disk).
Can you comment on why I was seeing such different performance between shared access to the same memory, versus mmapping the same file?
Thanks, I appreciate your help!
I think I found my answer, and it deals with the page directory. The answer is yes, two mmapped regions of the same file will access the same underlying page cache data. However, each mapping needs to independently map each of the virtual pages to the physical pages -- meaning 2x as many entries in the page directory to access the same RAM.
Basically, each mmap() creates a new range in virtual memory. Every page of that range corresponds to a page of physical memory, and that mapping is stored in a hierarchical page directory -- with one entry per 4KB page. So every mmap() of a large region generates a huge number of entries in the page directory.
My guess is it doesn't actually define them all up front, which is why mmap() is instant to call even for a giant file. But over time it probably has to establish those entries as there are faults on the mmapped range, meaning over the course of time it gets filled out. This extra work to populate the page directory is probably why threads using different mmaps are slower than threads sharing the same mmap. And I bet the kernel needs to erase all those entries when unmapping the range -- which is why unmmap() is so slow.
(There's also the translation lookaside buffer, but that's per-CPU, and so small I don't think that matters much here.)
Anyway, it sounds like re-mapping the same region just adds extra overhead, for what seems to me like no gain.
Under Windows, the kernel can swap a physical memory page to a page in the paging file.
For simplicity, we assume there is only one paging file.
As far as I understand, the paging file consists of pages which have the same size of a physical memory page. i.e. 4K.
I just wonder:
How does the kernel know which page in the paging file is free to store?
(Free here means the page in the paging file doesn't previously store another physical memory page.)
At the risk of gross oversimplification . . . the usual approach in implementing virtual memory is that disk is the primary storage. Unless there is a mapping to a file, a virtual page does not exist. That mapping remains fixed for the life of the process.
The virtual memory on disk gets mapped to physical memory when available.
The kernel maintains some data structure (e.g. a bitmap) to indicate the free areas of the page file and other structures to maintain the mapping of logical addresses to the files.
I believe you are asking about page replacement algorithms within memory management.
When the operating system needs to save a new page in memory and keep track of its information on the paging file (also known as the page table), there's no guarantee that there will be a free spot --meaning that other pages' information might have taken up all of it. In that case, the OS will have to evict an existing page. The OS doesn't need free space since, if there isn't any, it will make it.
If you're interested in learning more (this is a pretty large topic), you may find the lecture notes from NYU's "Operating Systems" class helpful. This is the demand paging unit, and further below you can read about a few page replacement algorithms ("WS Clock" and "Aging" are probably the most important).
Hopefully this is helpful!
I'm really stuck on this question for my OS class, I don't want someone to just give me the answer though, instead if someone could tell me how to work it out.
Example Question:
This system uses simple paging and TLB
Each memory access requires 80ns
TLB access requires 10ns
TLB hit rate is 80%.
Work out the actual speedup because of the TLB?
NOTE: I changed the memory accessed required and the TLB access requires part of the question because as I said I don't want the answer, just a way to work it out.
In case the virtual address translation is cached in the TLB, all we need is one lookup in the TLB that will give us a physical address, and we are done. The interesting part is if we need to do the page table walk. Think carefully about what the system has to do in case it did not find an address in the TLB (well it already had to do a TLB look-up). Memory access takes 80ns, but how many of them do you need to actually get the physical address? Pretty much every paging architecture follows the schema that page-tables are stored in memory and only the entry point, the address that points to the base of the first page table (the root) is in a register.
If you have the amount of time you can calculate the speed-up by comparing it to the TLB access time.
On TLB Hit 80% your required to access time 2ns and to access that page in main memory required 20ns therefore one part is
0.8×(2+20)
On TLB miss i.e. (1-0.8) 20% for that you are checking TLB again so required 2ns when it is TLB miss it will check into Page Table but base Address of Page Table is into Main Memory so it requires 20ns and when it searches into PT it will getting desired Frame and again required memory access time to access data from main memory so miss calculation is
0.2×(2+20+20)
From above 2 :
Effective access time=0.8×(2+20)+0.2×(2+20+20)
= 26ns
I've got a question about virtual memory management, more specifically, the address translation.
When an application runs, the CPU receives instructions containing virtual memory addresses, and translates them into physical addresses via the page table.
My question is, since the page table also aside at a memory block, does that means the CPU has to access the memory twice in a single memory-access instruction? If the answer is no, then how does this actually work? Which part did I miss?
Could anyone give me some details about this?
As usual the answer is neither yes or no.
Worst case you have to do a walk of the page table, which is indeed stored in (some kind of) memory, this is not necessarily only one lookup, it can be multiple lookups, see for example a two-level table (example from wikipedia).
However, typically this page table is accompanied by a hardware assist called the translation lookaside buffer, this is essentially a cache for the page table, the lookup process can be seen in this image. It works just as you would expect a cache too work, if a lookup succeeds you happily continue with the physical fetch, if it fails you proceed to the aforementioned page walk and you update the cache afterwards.
This hardware assist is usually implemented as a CAM (Content Addressable Memory), something that's most used in network processing but is also very useful here. It is a memory-component that does not do the lookup based upon an address but based upon 'content', or any generic key (the keys dont' have to be contiguous, incrementing numbers). In this case the key would be your virtual address, and the resulting memory lookup would be your physical address. As this CAM is a separate component and as it is very fast you could state that as long as you hit it you don't incur any extra memory overhead for virtual -> physical address translation.
You could ask why they don't put the whole page table in a CAM? Quite simply, CAM's are both quite expensive and more importantly quite energy-hungry, so you don't want to make them too big (we wouldn't want a laptop that requires 1KW to run do we?).
Sometimes.
The MMU contains a cache of virtual to physical address mapping, called a TLB (Translation Lookaside Buffer).
If the page in question is not in the TLB (a TLB miss), then it needs to load the relevant piece of page table from main memory into that cache first, which will need additional memory access.
Finally, if the page cannot be found at all, a trap is issued to the CPU (a page fault), and the CPU have an opportunity to fix this - e.g. allocate memory, load the piece from a file, swap space and similar.
The details on how this is done varies between architectures, on some, the TLB miss also involves the CPU to configure the TLB, though on most this is automatic. (but the CPU would have to flush the TLB when doing a context switch, and load a new pagetable for e.g. a new process)
More info e.g. here https://www.kernel.org/doc/gorman/html/understand/understand006.html
When I do VirtualAlloc with MEM_COMMIT this "Allocates physical storage in memory or in the paging file on disk for the specified reserved memory pages" (quote from MSDN article http://msdn.microsoft.com/en-us/library/aa366887%28VS.85%29.aspx).
All is fine up until now BUT:
the description of Committed Bytes Counter says that "Committed memory is the physical memory which has space reserved on the disk paging file(s)."
I also read "Windows via C/C++ 5th edition" and this book says that commiting memory means reserving space in the page file....
The last two cases don't make sense to me... If you commit memory, doesn't that mean that you commit to physical storage (RAM)? The page file being there for swaping out currently unused pages of memory in case memory gets low.
The book says that when you commit memory you actually reserve space in the paging file. If this were true than that would mean that for a committed page there is space reserved in the paging file and a page frame in physical in memory... So twice as much space is needed ?! Isn't the page file's purpose to make the total physical memory larger than it actually is? If I have a 1G of RAM with a 1G page file => 2G of usable "physical memory"(the book also states this but right after that it says what I discribed at point 2).
What am I missing? Thanks.
EDIT: The way I see it is perfectly described here: http://support.microsoft.com/kb/555223
"It shows how many bytes have been allocated by processes and to which the operating system has committed a RAM page frame or a page slot in the pagefile (perhaps both)"
but I have read too many things that contradict my belief like those two points above and others like this one for instance: http://blogs.msdn.com/ricom/archive/2005/08/01/446329.aspx
You're misunderstanding the way windows' memory model works. The terminology and the documentation confuse things a bit which doesn't help.
When you commit memory, the OS is providing you with a "commitment" to providing a page to back that memory. It is not actually allocating one, either from physical memory or from the page file, it is simply checking that the "uncommited pages" counter is larger than zero and then decrementing it. If this succeeds the page is marked as commited in your page table.
What happens next depends on whether you access the memory. If you don't, all you did was stop someone else using a page - it was never actually allocated though so it is impossible to say which page you didn't use. When you touch the memory though a page fault is generated. At this point the page fault handler sees that the page is commited and starts to look for a page that can be used on a number of lists of pages the memory manager keeps. If it can't find one then it will force something else out to the page file and give you that page.
So really a page is never actually allocated until you need it, when it is allocated by the page fault handler. The reason the documentation is confusing is that the above description is quite complicated. The documentation has to describe how it works without going into the gory details of how a paging memory manager works and the description is good enough.
Each page of physical memory is backed by either a pagefile (for volatile pages of memory, such as data) or a disk file (for read-only memory pages, such as code and resources). When I say backed, I mean the behavior/process that when the page is needed, the operating system reads it from the disk file, when it is no longer needed, the operating system frees it and flush the content to the disk file.
When you commit a block of virtual memory, the Virtual Memory Manager(VMM) will allocate you an entry into the process's virtual address descriptor (VAD) tree, and reserve the space in the paging file. The physical memory will only be allocated until this block memory is accessed. In the case that page file is disabled, the space will be reserved in the physical memory directly.
Please refer to these links: Managing Virtual Memory and Managing Memory-Mapped Files