So I saw the virt_to_phy function in asm/io.h but this only works for kmalloc allocated memory, is there an equivalent for memory allocated in the data segment? For example, global variables (which includes a page table walk right?).
Such a function would make no sense. In fact, it only makes sense in a kernel that does not page, as in Linux.
In the user space, the mapping changes. By the time you got a return value from such a function, the mapping could already be changed.
Related
What is the difference between copying from user space buffer to kernel space buffer and, mapping user space buffer to kernel space buffer and then copying kernel space buffer to another kernel data structure?
What I meant to say is:
The first method is copy_from_user() function.
The second method is say, a user space buffer is mapped to kernel space and the kernel is passed with physical address(say using /proc/self/pagemap), then kernel space calls phys_to_virt() on the passed physical address to get it's corresponding kernel virtual address. Then kernel copies the data from one of its data structures say skb_buff to the kernel virtual address it got from the call to phys_to_virt() call.
Note: phys_to_virt() adds an offset of 0xc0000000 to the passed physical address to get kernel virtual address, right?
The second method describes the functionality in DPDK for KNI module and they say in documentation that it eliminates the overhead of copying from user space to kernel space. Please explain me how.
It really depends on what you're trying to accomplish, but still some differences I can think about?
To begin with, copy_from_user has some built-in security checks that should be considered.
While mapping your data "manually" to kernel space enables you to read from it continuously, and maybe monitor something that the user process is doing to the data in that page, while using the copy_to_user method will require constantly calling it to be aware of changes.
Can you elaborate on what you are trying to do?
This is actually a theoretical question about memory management. Since different operating systems implement things differently, I'll have to relieve my thirst for knowledge asking how things work in only one of them :( Preferably the open source and widely used one: Linux.
Here is the list of things I know in the whole puzzle:
malloc() is user space. libc is responsible for the syscall job (calling brk/sbrk/mmap...). It manages to get big chunks of memory, described by ranges of virtual addresses. The library slices these chunks and manages to respond the user application requests.
I know what brk/sbrk syscalls do. I know what 'program break' means. These calls basically push the program break offset. And this is how libc gets its virtual memory chunks.
Now that user application has a new virtual address to manipulate, it simply writes some value to it. Like: *allocated_integer = 5;. Ok. Now, what? If brk/sbrk only updates offsets in the process' entry in the process table, or whatever, how the physical memory is actually allocated?
I know about virtual memory, page tables, page faults, etc. But I wanna know exactly how these things are related to this situation that I depicted. For example: is the process' page table modified? How? When? A page fault occurs? When? Why? With what purpose? When is this 'buddy algorithm' called, and this free_area data structure accessed? (http://www.tldp.org/LDP/tlk/mm/memory.html, section 3.4.1 Page Allocation)
Well, after finally finding an excellent guide (http://duartes.org/gustavo/blog/post/how-the-kernel-manages-your-memory/) and some hours digging the Linux kernel, I found the answers...
Indeed, brk only pushes the virtual memory area.
When the user application hits *allocated_integer = 5;, a page fault occurs.
The page fault routine will search for the virtual memory area responsible for the address and then call the page table handler.
The page table handler goes through each level (2 levels in x86 and 4 levels in x86_64), allocating entries if they're not present (2nd, 3rd and 4th), and then finally calls the real handler.
The real handler actually calls the function responsible for allocating page frames.
I have some shared libraries mapped into virtual address space of my task. What happens when I change some data for example in .bss section? I do it using kmap with physical page address as argument. I can suggest 2 ways. Data is changed and it influences at all tasks which use the library or the certain page is copied due to COW.
I think it's neither. The .bss area is set up when the executable is loaded. Virtual memory space is allocated for it at that time, and that space won't be shared with any other task. Pages won't be allocated initially (by default, mlock* can change that); they will be faulted in (i.e. demand-zeroed) as referenced.
I think that even if the process forks before touching the memory, the new process would then just get the equivalent (same virtual memory space marked as demand-zero).
So if you already have a physical address for it, I would think that's already happened and you won't be changing anything except the one page belonging to the current process.
This is an assignment problem which asks for partial implementation of process checkpointing:
The test program allocates an array, does a system call and passes the start and end address of array to the call. In the system call function I have to save the contents in the give range to a file.
From my understanding, I could simply use copy_from_usr function to save the contents from the give range. However since the assignment is based on topic "Process address space", I probably need to walk through page tables. Say I manage to get the struct pages that correspond to given range. How do I get the data corresponding to the pages?
Can I just use page_to_virt function and access data directly?
Since the array is contiguous in virtual space, I guess I will just need to translate the starting address to page and then back to virtual address and then just copy the range size of data to file. Is that right?
I think copy_from_user() is ok, nothing else needed. When executing the system call, although it trap to kernel space, the context is still the process context which doing the system call. The kernel still use the process's page table. So just to use copy_from_user(), and nothing else needed.
Okey, if you want to do this experiment, I think you can use the void __user *vaddr to traverse the mm->pgd(page table), using pgd_offset/pud_offset/pmd_offset/pte_offset to get the page physical address(page size alignment). Then in kernel space, using ioremap() to create a kernel space mapping, then using the kernel virtual address(page size) + offset(inside the page), you get the start virtual address of the array. Now in kernel, you can using the virtual address to access the array.
I have some code (which I cannot change) that I need to get working in a native Win32 environment. This code calls mmap() and munmap(), so I have created those functions using CreateFileMapping(), MapViewOfFile(), etc., to accomplish the same thing. Initially this works fine, and the code is able to access files as expected. Unfortunately the code goes on to munmap() selected parts of the file that it no longer needs.
x = mmap(0, size, PROT_READ, MAP_SHARED, fd, 0);
...
munmap(x, hdr_size);
munmap(x + foo, bar);
...
Unfortunately, when you pass a pointer into the middle of the mapped range to UnmapViewOfFile() it destroys the entire mapping. Even worse, I can't see how I would be able to detect that this is a partial un-map request and just ignore it.
I have tried calling VirtualFree() on the range but, unsurprisingly, this produces ERROR_INVALID_PARAMETER.
I'm beginning to think that I will have to use static/global variables to track all the open memory mappings so that I can detect and ignore partial unmappings, but I hope you have a better idea...
edit:
Since I wasn't explicit enough above: the docs for UnMapViewOfFile do not accurately reflect the behavior of that function.
Un-mapping the whole view and remapping pieces is not a good solution because you can only suggest a base address for a new mapping, you can't really control it. The semantics of munmap() don't allow for a change to the base address of the still-mapped portion.
What I really need is a way to find the base address and size of a already-mapped memory area.
edit2: Now that I restate the problem that way, it looks like the VirtualQuery() function will suffice.
It is quite explicit in the MSDN Library docs for UnmapViewOfFile:
lpBaseAddress A pointer to the
base address of the mapped view of a
file that is to be unmapped. This
value must be identical to the value
returned by a previous call to the
MapViewOfFile or MapViewOfFileEx
function.
You changing the mapping by unmapping the old one and creating a new one. Unmapping bits and pieces isn't well supported, nor would it have any useful side-effects from a memory management point of view. You don't want to risk getting the address space fragmented.
You'll have to do this differently.
You could keep track each mapping and how many pages of it are still allocated by the client and only free the mapping when that counter reaches zero. The middle sections would still be mapped, but it wouldn't matter since the client wouldn't be accessing that memory anyway.
Create a global dictionary of memory mappings through this interface. When a mapping request comes through, record the address, size and number of pages that are in the range. When a unmap request is made, find out which mapping owns that address and decrease the page count by the number of pages that are being freed. When that count reaches zero, really unmap the view.