Can we access memory through a struct page structure?
Note: The page belongs to high memory and has not been mapped to kernel logical address space.
Yes, a highmem page can be accessed through the virtual field of its struct page, but only once the page has been mapped. In your case the page is not mapped into kernel virtual memory, so there is no address to use yet.
To access it you need to create a mapping, either permanent or temporary.
To create a permanent mapping, map the page with kmap:
void *kmap(struct page *page)
This function works on either high or low memory. If the page structure belongs to a page in low memory, the page's virtual address is simply returned. If the page resides in high memory, a permanent mapping is created and the address is returned. The function may sleep, so kmap() works only in process context. Because the number of permanent mappings is limited (if not, we would not be in this mess and could just permanently map all memory), high memory should be unmapped when no longer needed. This is done via the following function, which unmaps the given page:
void kunmap(struct page *page)
The temporary mapping can be created via:
void *kmap_atomic(struct page *page, enum km_type type)
This function is atomic: it does not sleep, so it can be called in interrupt context, and you must not sleep while holding the mapping. It is called temporary because the next call to kmap_atomic will overwrite the previous mapping.
If the virtual field has no value, you cannot access that physical frame through it. The reason is simple: struct page records the kernel virtual address of a frame only if the frame has one, and a system with a large amount of memory cannot map all of it into kernel space at once, so high memory is mapped dynamically. To access such a page it must first be mapped, i.e. void *virtual must not be NULL.
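The behaviour described above can be modelled in ordinary userspace C. This is only a sketch under stated assumptions: every toy_ name is hypothetical, the slot pool is deliberately tiny, and a full pool returns NULL where the real kmap() would sleep. The real implementation also reference-counts mappings, which this model leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_KMAP_SLOTS 2          /* deliberately tiny pool of mapping slots */

struct toy_page {
    bool  highmem;                /* true: not covered by the direct mapping */
    void *virtual;                /* NULL while the page has no kernel mapping */
    unsigned char *frame;         /* stands in for the physical frame */
};

static struct toy_page *toy_slots[TOY_KMAP_SLOTS];   /* NULL = free slot */

/* Return an address for the page, or NULL if every slot is taken. */
void *toy_kmap(struct toy_page *page)
{
    assert(page != NULL);
    if (page->virtual)            /* low memory, or already mapped */
        return page->virtual;
    for (size_t i = 0; i < TOY_KMAP_SLOTS; i++) {
        if (!toy_slots[i]) {      /* claim the first free slot */
            toy_slots[i] = page;
            page->virtual = page->frame;
            return page->virtual;
        }
    }
    return NULL;
}

/* Release the slot held by a high-memory page; a no-op for low memory. */
void toy_kunmap(struct toy_page *page)
{
    assert(page != NULL);
    if (!page->highmem)
        return;
    for (size_t i = 0; i < TOY_KMAP_SLOTS; i++) {
        if (toy_slots[i] == page) {
            toy_slots[i] = NULL;
            page->virtual = NULL; /* the page is unreachable again */
            return;
        }
    }
}
```

With two slots, a third high-memory page cannot be mapped until one of the first two is unmapped, which is why mappings should be released as soon as they are no longer needed.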
Related
When a program calls mmap to allocate an anonymous page, also known as a demand-zero page, what appears in the address field of the corresponding page table entry (PTE)? I am assuming that the kernel does not create a zero-initialized page in physical memory (and enter that physical page's page number into the PTE) until the requesting process actually touches the page — hence the term demand-zero. Since it would not be a disk address, and would not be 0 (which is for unallocated pages), what value would appear there? As a different but related question, how does the kernel "know" that this page is to be handled as a demand-zero page, i.e., that the fault handler should find a physical page and initialize it with 0 rather than copy a page from disk?
I am assuming that the kernel does not create a zero-initialized page in physical memory
Indeed, this is usually the case. The exceptions are special cases, for example when MAP_POPULATE is specified to explicitly request that the pages be set up in advance (also called "pre-faulting").
what appears in the address field of the corresponding page table entry (PTE)?
Right after mmap you don't even have a PTE allocated for the page (or in general, you don't have any entry at any page table level). As far as the CPU is concerned, the page doesn't even exist. If you were to walk the page table you would just get to a point (at an arbitrary level) where the corresponding entry is marked as "not present".
Since it would not be a disk address, and would not be 0 (which is for unallocated pages), what value would appear there?
As far as the CPU is concerned, the page is unallocated. At the first page fault, two things can happen:
For a read page fault, the PTE is updated to point to the zero page: this is a special page that is always entirely zeroed-out and is pointed to by the PTEs of any anonymous (demand-zero) page in the system that has not been modified yet.
For a write page fault, an actual physical page will be allocated and the corresponding PTE updated to point to its physical address.
Quoting directly from the documentation:
The anonymous memory or anonymous mappings represent memory that is not backed by a filesystem. Such mappings are implicitly created for program’s stack and heap or by explicit calls to mmap(2) system call. Usually, the anonymous mappings only define virtual memory areas that the program is allowed to access. The read accesses will result in creation of a page table entry that references a special physical page filled with zeroes. When the program performs a write, a regular physical page will be allocated to hold the written data. The page will be marked dirty and if the kernel decides to repurpose it, the dirty page will be swapped out.
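The effect is visible from userspace. The following is a minimal sketch, assuming a Linux system where MAP_ANONYMOUS is available; the function name anon_mapping_reads_zero is made up for illustration. It only observes the result (zeroes on read, a private page after a write), not the page table entries themselves.

```c
#define _DEFAULT_SOURCE           /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Map `len` bytes of anonymous memory, read all of it, then write one byte.
 * Returns 1 if every byte read as zero and the written byte reads back,
 * 0 if either check fails, and -1 if mmap() itself failed. */
int anon_mapping_reads_zero(size_t len)
{
    assert(len > 0);
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int all_zero = 1;
    for (size_t i = 0; i < len; i++)   /* first touch of each page is a read */
        if (p[i] != 0)
            all_zero = 0;

    p[0] = 0x2a;                       /* first write to the first page */
    int wrote = (p[0] == 0x2a);

    munmap(p, len);
    return all_zero && wrote;
}
```

Calling anon_mapping_reads_zero(4 * 4096) exercises several pages; the read loop never needs more than the shared zero page, while the single write forces one real page to be allocated.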
how does the kernel "know" that this page is to be handled as a demand-zero page, i.e., that the fault handler should find a physical page and initialize it with 0 rather than copy a page from disk?
When a page fault occurs, the architecture-dependent page fault handler determines which VMA the faulting address belongs to and retrieves the corresponding struct vm_area_struct (which was created earlier either by the kernel itself or by an mmap syscall). This structure is then passed on to architecture-independent code (handle_mm_fault(), which eventually reaches do_anonymous_page() or do_fault()) along with the needed fault information (struct vm_fault).
The vm_area_struct then contains all the remaining information needed to handle the fault (for example the ->vm_file field, which is != NULL for a file-backed mapping). The field ->vm_ops points to a struct vm_operations_struct, which defines a set of function pointers to call on different occasions. In particular, anonymous VMAs have ->vm_ops == NULL.
For other kinds of mappings, ->vm_ops->fault() is the function used when handling a page fault. This function knows what to check and how to actually handle the fault.
B & O also describe the VMA, but do not explain how the kernel could use the VMA to distinguish between, say, an unallocated page and an allocated page to be created and zero-initialized.
Simple, just check vma->vm_ops == NULL and in such case you know that the page is a demand-zero anon page. Then on a page fault act as needed (read fault -> update PTE to point to global zero page, write fault -> allocate a page and update PTE).
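That decision can be sketched as a small userspace model. All toy_ types and names below are hypothetical stand-ins for the kernel structures, and calloc() stands in for allocating and zeroing a physical page; it is not the kernel's code, only the same branching.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_PAGE_SIZE 4096

struct toy_vm_ops { int unused; };          /* placeholder for file-backed handlers */

struct toy_vma {
    const struct toy_vm_ops *vm_ops;        /* NULL marks an anonymous mapping */
};

struct toy_pte {
    bool present;
    bool writable;
    unsigned char *frame;                   /* stands in for the physical address */
};

static unsigned char toy_zero_page[TOY_PAGE_SIZE];   /* shared, never written */

enum toy_fault_result { TOY_MAPPED_ZERO, TOY_MAPPED_PRIVATE, TOY_NOT_ANON, TOY_OOM };

enum toy_fault_result toy_handle_fault(const struct toy_vma *vma,
                                       struct toy_pte *pte, bool is_write)
{
    assert(vma != NULL && pte != NULL);
    if (vma->vm_ops != NULL)
        return TOY_NOT_ANON;                /* a real kernel would call ->fault() */

    if (!is_write) {                        /* read fault: share the zero page */
        pte->frame = toy_zero_page;
        pte->present = true;
        pte->writable = false;              /* a later write will fault again */
        return TOY_MAPPED_ZERO;
    }

    unsigned char *frame = calloc(1, TOY_PAGE_SIZE);  /* write fault: private zeroed page */
    if (!frame)
        return TOY_OOM;
    pte->frame = frame;
    pte->present = true;
    pte->writable = true;
    return TOY_MAPPED_PRIVATE;
}
```

In this model the caller owns any frame returned by a write fault and is responsible for freeing it.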
I'm wondering if anyone could help me understand how a virtual address is mapped to its address on the backing store, which is used to hold the paged-out pages of all user processes.
Is it a static mapping or a hash algorithm? If it's static, where is such a mapping kept? It seems it can't be in the TLB or page table since according to https://en.wikipedia.org/wiki/Page_table, the PTE will be removed from both TLB and page table when a page is moved out. A description of the algorithm and C structs containing such info will be helpful.
Whether it's a static mapping or a hash algorithm, how is it guaranteed that no two processes will map their addresses to the same location on the swap partition, since the virtual address space of each process is so big (2^64) and the swap space is so small?
So:
during page-in, how does the OS know where to find the page (corresponding to the virtual address accessed by the user process) on the swap partition to move in?
when a physical page needs to be paged out, how does the OS know where to put it on the swap partition?
For the first part of your question: it is actually hardware dependent, but the generic way is to keep a reference to the swap block containing the swapped-out page in its corresponding page table entry. Depending on the implementation of the swap subsystem, that reference could be a pointer, a block number, or an offset into a table.
EDIT: The TLB is a fast associative cache that helps to do the virtual-to-physical page mapping very quickly. When a page is swapped out, its entry in the TLB can be replaced by that of a newly active page. But the entry in the page table cannot be replaced, because page tables are not associative memory. A page table remains in memory for the whole lifetime of the process, and no entry can be removed or replaced by another virtual page. Entries in page tables can only be mapped or unmapped. When they are unmapped (because of swapping or freeing), the content of the entry can either hold a reference to the swap block or just an invalid value.
For the second part of your question : The system kernel maintains a list of free blocks in the swap partition. Whenever it needs to evict a RAM page, it allocates a free block and then the block reference is returned so that it can be inserted in the PTE. When the page comes back to RAM, the disk block is freed so that it could be used by other pages.
During page-in, how the OS know where to find the page (corresponding to the virtual address accessed by the user process) on the swap device to move in?
That can actually be a fairly complicated process. The operating system has to maintain a table of where the process's pages are mapped to. This can be complicated because pages can be mapped to multiple devices and even multiple files on the same device. Some systems use the executable file for paging.
when a physical page needs to be paged out, after the virtual address for a physical page is looked up in TLB, how does the OS know where to put on the swap device?
On a rationally designed operating system, the secondary storage is allocated (or determined) when the virtual page is mapped to the process. That location remains fixed for the duration of the program being run.
I use kmap to get the first virtual address of a low-memory page, inside a Linux Kernel module.
What happens if I call kunmap after that mapping? Is the persistent page mapping totally deleted or just some mapping counter is decreased? (should be 2 before the unmapping)
No mapping is done by kmap if the page belongs to low memory, and hence kunmap does nothing either. Calling them is harmless, because these checks are handled in their implementation.
First about kmap
kmap checks if page is below highmem_start_page(ie lowmemory page) as
pages from lowmem are already visible and do not need to be mapped. If
the page is already in low memory kmap simply returns the address of it.
Now about kunmap
kunmap checks if page is below highmem_start_page. If it is, the
page already exists in low memory and needs no further handling, hence nop.
I'm studying the MM in Linux and I got very confused when I couldn't find where the raw data is stored. I thought it was stored in some field of struct page but I couldn't find it there.
Where is the actual data represented by a page stored? And how to get a pointer to it?
struct page is just a helper that stores metadata. It doesn't store any of the page's data, only the information needed to locate and manage that data in memory, such as the address space mapping the page belongs to. The actual data is still stored in physical memory.
Where is the actual data represented by a page stored?
The actual data is in a physical page addressed by at least one virtual address, and/or it is on disk in an inode and has never been mapped. In the inode case, accessing the virtual address will trigger a page fault; the fault handler will read the data into a physical page and the faulting code will resume.
And how to get a pointer to it?
I believe that each struct page is contained in an array, like mem_map. For instance the function mem_map_next is used to iterate through an array of struct page. Perhaps the structure you are interested in is struct vm_area_struct? This is a virtual address tracking structure. There may be multiple virtual addresses mapping to the same physical page.
You need to know the context of a composing struct to know the address a struct page represents. Then it is simply a base address plus the index multiplied by the page size.
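As a rough illustration of "base address plus index times page size", here is a userspace sketch. It assumes a flat layout in which a single array describes physical memory starting at address 0 and low memory is mapped at a fixed offset; toy_mem_map, the toy_ helpers, and the direct_map_base parameter are hypothetical names for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12           /* 4 KiB pages */
#define TOY_NR_PAGES   16

struct toy_page { unsigned long flags; };   /* metadata only, no page data */

static struct toy_page toy_mem_map[TOY_NR_PAGES];

/* The index of a descriptor in the array is its page frame number. */
size_t toy_page_to_pfn(const struct toy_page *page)
{
    assert(page >= toy_mem_map && page < toy_mem_map + TOY_NR_PAGES);
    return (size_t)(page - toy_mem_map);
}

/* Physical address = frame number * page size. */
uint64_t toy_page_to_phys(const struct toy_page *page)
{
    return (uint64_t)toy_page_to_pfn(page) << TOY_PAGE_SHIFT;
}

/* For directly mapped memory, the virtual address is a fixed offset
 * added to the physical address. */
uint64_t toy_page_to_virt(const struct toy_page *page, uint64_t direct_map_base)
{
    return direct_map_base + toy_page_to_phys(page);
}
```

For example, the descriptor at index 3 describes the frame at physical address 0x3000, and with a direct-map base of 0xC0000000 its data would be reachable at 0xC0003000.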
You could use page_address() to get virtual address of a page.
But the return address might be NULL due to the fact that not all pages have mapped virtual addresses.
void *page_address(const struct page *page);
You could use kmap to map a highmem page to a virtual address.
Also, remember to use kunmap to unmap this page when you don't need to access it.
struct page *page = alloc_pages(GFP_KERNEL | __GFP_HIGHMEM, 0);
if (page) {
    void *addr = kmap(page);        /* map the page into kernel address space */
    if (addr) {
        memset(addr, 0, PAGE_SIZE);
        kunmap(page);               /* kunmap() takes the struct page, not the address */
    }
}
This is an assignment problem which asks for partial implementation of process checkpointing:
The test program allocates an array, does a system call and passes the start and end address of the array to the call. In the system call function I have to save the contents of the given range to a file.
From my understanding, I could simply use the copy_from_user function to save the contents of the given range. However, since the assignment is based on the topic "Process address space", I probably need to walk through the page tables. Say I manage to get the struct pages that correspond to the given range. How do I get the data corresponding to the pages?
Can I just use page_to_virt function and access data directly?
Since the array is contiguous in virtual space, I guess I will just need to translate the starting address to page and then back to virtual address and then just copy the range size of data to file. Is that right?
I think copy_from_user() is enough; nothing else is needed. While the system call executes, the CPU has trapped into kernel space, but the context is still that of the process making the call, and the kernel is still using that process's page tables. So just use copy_from_user().
Okay, if you want to do this as an experiment, you can take the void __user *vaddr and walk the process's page table starting from mm->pgd, using pgd_offset/pud_offset/pmd_offset/pte_offset to reach the PTE and, from it, the page's physical address (page-aligned). Then create a kernel mapping for that page. For ordinary RAM, use kmap() on the struct page obtained with pte_page() rather than ioremap(), which is intended for device memory. The kernel virtual address of the page plus the offset inside the page gives you the start of the array, and the kernel can then read the array through that address.
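One detail matters if you go the page-walking route: an array that is contiguous in virtual space need not be contiguous in physical memory, and a mapping of a single page covers only that page, so the copy has to proceed one page at a time. The sketch below shows just that splitting step in plain C; toy_split_range and struct toy_chunk are hypothetical names, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u

struct toy_chunk {
    uint64_t page_base;   /* page-aligned virtual address of the page */
    size_t   offset;      /* where the data starts inside that page */
    size_t   len;         /* how many bytes to copy from that page */
};

/* Split [start, end) into per-page chunks. Writes at most `max` chunks to
 * out[] and returns the total number of chunks the range needs. */
size_t toy_split_range(uint64_t start, uint64_t end,
                       struct toy_chunk *out, size_t max)
{
    assert(start <= end);
    size_t n = 0;
    while (start < end) {
        uint64_t base = start & ~(uint64_t)(TOY_PAGE_SIZE - 1);
        size_t offset = (size_t)(start - base);
        size_t len = TOY_PAGE_SIZE - offset;      /* up to the end of this page */
        if (len > end - start)
            len = (size_t)(end - start);          /* last, partial page */
        if (n < max) {
            out[n].page_base = base;
            out[n].offset = offset;
            out[n].len = len;
        }
        n++;
        start += len;
    }
    return n;
}
```

Each chunk would then be handled separately: look up the page for page_base, map it, copy len bytes starting at offset, and unmap it before moving on.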