How to resolve Persistent-memory address? - linux-kernel

When developing a file system for Optane DC PM, how do I differentiate between DRAM addresses and PMEM addresses?
For example, a file system is kernel code. When that code runs, how can it tell a byte-addressable PMEM address apart from a DRAM address?
Are the two logically adjacent?
Something like:
|      DRAM      |   NVM (PMEM)   |
0x0000          500G            1.25T
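As a rough illustration of where the distinction lives on Linux: DRAM and PMEM are told apart by physical address range, as reported by firmware, and the kernel lists those ranges in /proc/iomem with labels such as "System RAM" and "Persistent Memory". The sketch below classifies one line in that format. The helper name classify_iomem_line and the enum are hypothetical, and whether the two ranges are adjacent depends on the platform, so the layout in the question is only one possibility.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum mem_kind { MEM_OTHER = 0, MEM_DRAM, MEM_PMEM };

/* Parse one /proc/iomem-style line, e.g.
 * "7d00000000-13fffffffff : Persistent Memory",
 * store the physical range in *start / *end and classify it.
 * (Hypothetical helper, for illustration only.) */
enum mem_kind classify_iomem_line(const char *line,
                                  unsigned long long *start,
                                  unsigned long long *end)
{
    char name[128];

    assert(line != NULL && start != NULL && end != NULL);
    if (sscanf(line, " %llx-%llx : %127[^\n]", start, end, name) != 3)
        return MEM_OTHER;
    if (strcmp(name, "System RAM") == 0)
        return MEM_DRAM;
    /* Prefix match also covers "Persistent Memory (legacy)". */
    if (strncmp(name, "Persistent Memory", 17) == 0)
        return MEM_PMEM;
    return MEM_OTHER;
}
```

Feeding each line of /proc/iomem through this function yields the physical ranges of each kind. Inside the kernel, a file system normally reaches PMEM through a mapping set up by the PMEM driver rather than by inspecting raw addresses.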

Related

Windows Virtual Address Space

As I read here, the virtual address space of a 32-bit Windows application has 2 GB of storage (0x00000000-0x7FFFFFFF). The other 2 GB are reserved for the system address space.
However, I found a pointer in a 32-bit program (using Cheat Engine) that points to an address outside that range of the virtual address space. The addresses in my last exploration were 0x301DDC3C -> 0x87F56190, as you can see in the picture:
(The expansion in the first line is a dereference of the pointer 0x301DDC3C; the next line shows what is at the dereferenced location 0x87F56190 in RAM.)
After dereferencing the pointer, there are pointers back into the process virtual address space.
How is it possible that a user-mode application has a valid pointer into system address space?
Does this mean the pointer at location 0x301DDC3C is pointing to a location in the system address space, and so the process I'm examining is using kernel-mode stuff?
From Memory and Address Space Limits:
Limits on memory and address space vary by platform, operating system, and by whether the IMAGE_FILE_LARGE_ADDRESS_AWARE flag is set in IMAGE_FILE_HEADER.Characteristics. IMAGE_FILE_LARGE_ADDRESS_AWARE (the application can handle addresses larger than 2 GB) is set or cleared by using the /LARGEADDRESSAWARE linker option.
By default IMAGE_FILE_LARGE_ADDRESS_AWARE is cleared for 32-bit PE files and set for 64-bit PE files, but the default can be overridden.
So for a 32-bit process with the IMAGE_FILE_LARGE_ADDRESS_AWARE flag set, up to 4 GB of address space is available.
In reality, of course, the range [0, 0x800000000000) (Windows 8.1 and later) or [0, 0x80000000000) (before Windows 8.1) is available to user mode on x64 Windows, but the system artificially restricts it by reserving a large range of memory (this allocation is protected and cannot be freed).
For a 32-bit process this reservation begins at 7FFF0000 or FFFE0000 and extends up to the 64-bit ntdll.dll. Interestingly, a 64-bit process with IMAGE_FILE_LARGE_ADDRESS_AWARE cleared also has such a reserved range, beginning at 0x80000000. In that case kernel32.dll is also loaded at a different address than in a usual 64-bit process, so the base of kernel32.dll is not the same in all 64-bit processes in general. ntdll.dll, however, is loaded at the same address in all processes.
Usual memory allocations on x64 Windows:
32 bit process, IMAGE_FILE_LARGE_ADDRESS_AWARE cleared (default)
32 bit process, IMAGE_FILE_LARGE_ADDRESS_AWARE set
64 bit process, IMAGE_FILE_LARGE_ADDRESS_AWARE cleared
64 bit process, IMAGE_FILE_LARGE_ADDRESS_AWARE set (default)
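To check which of these four cases applies to a given executable, the flag can be read straight from the PE header. The following is a minimal sketch, assuming the standard PE layout (e_lfanew at offset 0x3C of the DOS header, a 4-byte "PE\0\0" signature, then the 20-byte IMAGE_FILE_HEADER whose last 2 bytes are Characteristics). The function name pe_is_large_address_aware is hypothetical; the flag value 0x0020 is the documented value of IMAGE_FILE_LARGE_ADDRESS_AWARE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IMAGE_FILE_LARGE_ADDRESS_AWARE 0x0020

/* Inspect an in-memory copy of a PE file.
 * Returns 1 if the flag is set, 0 if it is cleared,
 * and -1 if the buffer does not look like a PE image. */
int pe_is_large_address_aware(const uint8_t *img, size_t len)
{
    uint32_t e_lfanew;
    uint16_t characteristics;

    assert(img != NULL);
    if (len < 0x40 || img[0] != 'M' || img[1] != 'Z')
        return -1;

    /* e_lfanew: little-endian file offset of the PE signature. */
    e_lfanew = (uint32_t)img[0x3C]
             | (uint32_t)img[0x3D] << 8
             | (uint32_t)img[0x3E] << 16
             | (uint32_t)img[0x3F] << 24;
    if (e_lfanew > len || len - e_lfanew < 4 + 20)
        return -1;
    if (memcmp(img + e_lfanew, "PE\0\0", 4) != 0)
        return -1;

    /* Characteristics sits at offset 18 of IMAGE_FILE_HEADER. */
    characteristics = (uint16_t)(img[e_lfanew + 4 + 18]
                    | img[e_lfanew + 4 + 19] << 8);
    return (characteristics & IMAGE_FILE_LARGE_ADDRESS_AWARE) ? 1 : 0;
}
```

Reading the first few kilobytes of an .exe into a buffer and passing it to this function should be enough, since the headers are at the start of the file.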
ALL of the addresses you see are virtual addresses of the process (not "physical" addresses). A user-space process may hold pointer values that happen to fall in "system space", but that does NOT mean the process can freely access kernel resources, nor does it mean that these pointers necessarily map to physical addresses.
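One thing worth checking for the pointer in the question is whether the examined program was linked large-address-aware, in which case 0x87F56190 would be an ordinary user-mode address. A small sketch of that range check, assuming a 32-bit process on x64 Windows and using the reservation start addresses 7FFF0000 and FFFE0000 quoted in the answer above (the function name is hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Start of the reserved range for a 32-bit process on x64 Windows,
 * taken from the answer above (assumed, not verified here). */
#define RESERVE_START_DEFAULT 0x7FFF0000u  /* flag cleared */
#define RESERVE_START_LAA     0xFFFE0000u  /* flag set     */

/* Returns 1 if addr lies below the reserved range, else 0. */
int in_user_range_32(uint32_t addr, int large_address_aware)
{
    uint32_t limit;

    assert(large_address_aware == 0 || large_address_aware == 1);
    limit = large_address_aware ? RESERVE_START_LAA : RESERVE_START_DEFAULT;
    return addr < limit;
}
```

With the flag cleared, in_user_range_32(0x87F56190, 0) is 0, so the value cannot be a usable user-mode pointer. With the flag set, the same address falls inside user space.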
Here is another Microsoft link, that might help clarify:
Virtual Address Space
When a processor reads or writes to a memory location, it uses a
virtual address. As part of the read or write operation, the processor
translates the virtual address to a physical address.
...
The range of
virtual addresses that is available to a process is called the virtual
address space for the process. Each user-mode process has its own
private virtual address space. For a 32-bit process, the virtual
address space is usually the 2-gigabyte range 0x00000000 through
0x7FFFFFFF.
...
Processes like Notepad.exe and MyApp.exe run in user
mode. Core operating system components and many drivers run in the
more privileged kernel mode. For more information about processor
modes, see User mode and kernel mode. Each user-mode process has its
own private virtual address space, but all code that runs in kernel
mode shares a single virtual address space called system space. The
virtual address space for a user-mode process is called user space.
...
In 32-bit Windows, the total available virtual address space is
2^32 bytes (4 gigabytes). Usually the lower 2 gigabytes are used for
user space, and the upper 2 gigabytes are used for system space.
...
Code running in user mode has access to user space but does not have
access to system space. This restriction prevents user-mode code from
reading or altering protected operating system data structures. Code
running in kernel mode has access to both user space and system space.
That is, code running in kernel mode has access to system space and
the virtual address space of the current user-mode process.
...
It's also worthwhile to note the difference between kernel mode and user mode:
User mode and kernel mode
When you start a user-mode application, Windows creates a process for
the application. The process provides the application with a private
virtual address space and a private handle table. Because an
application's virtual address space is private, one application cannot
alter data that belongs to another application. Each application runs
in isolation, and if an application crashes, the crash is limited to
that one application. Other applications and the operating system are
not affected by the crash.
...
In addition to being private, the virtual address space of a user-mode application is limited. A processor running in user mode
cannot access virtual addresses that are reserved for the operating
system. Limiting the virtual address space of a user-mode application
prevents the application from altering, and possibly damaging,
critical operating system data.
...

Memory mapping in Virtual Address Space(VAS)

This [wiki article] about Virtual memory says:
The process then starts executing bytes in the exe file. However, the
only way the process can use or set '-' values in its VAS is to ask
the OS to map them to bytes from a file. A common way to use VAS
memory in this way is to map it to the page file.
A diagram follows :
0 4GB
VAS |---vvvvvvv----vvvvvv---vvvv----vv---v----vvv--|
mapping ||||||| |||||| |||| || | |||
file bytes app.exe kernel user system_page_file
I didn't understand the part "the only way the process can use or set '-' values in its VAS is to ask the OS to map them to bytes from a file".
What is the system page file here?
First off, I can hardly believe such a badly written article exists on Wikipedia. One has to be an expert already familiar with the topic to understand what it describes.
Assuming you understand the rest of the article, the '-' parts represent unallocated virtual addresses within the 4 GB address space available to a process. So the sentence "the only way the process can use or set '-' values in its VAS is to ask the OS to map them to bytes from a file" refers to allocating virtual memory, e.g. a native Windows program calling VirtualAlloc(), or a C program calling malloc(), to obtain memory for program data at addresses that did not previously exist in the process's virtual address space.
When Windows allocates memory in a process address space, it normally associates that memory with the paging file on the hard disk. c:\pagefile.sys is this paging file, the system_page_file mentioned in the article. Memory pages are swapped out to that file when there are not enough physical pages to meet demand.
Hope that clarifies things.

Is a device address a virtual address? What is the function of mmap in this case?

Is a device address a virtual address? What is the function of mmap in this case? Or is the device address mapped to a physical address?
Usually, device addresses are allocated by the specific system/host bus, and they identify devices on that bus.
Virtual addresses and physical addresses are used by the memory system.
With mmap, the system allocates an I/O address range for the specific device in the physical address space, so an application can access the device through ordinary memory accesses.
Usually devices come with resources such as registers and internal memory that can be accessed from the CPU.
To access a specific device register from the CPU, for example, you need to know the physical address of that register and then map this physical address into either kernel or user space, depending on your use case.
mmap maps resources to be accessed from user space. The result of mmap is a user space cpu address that is mapped to this resource.
This resource can be anything. It can be:
a file
anonymous memory
some external device resource ( memory, registers, etc )
mmap can't directly map device registers, for example, simply because it doesn't know how to do that. In this case you will probably need to add some kernel-space support for your mmap operation.
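As a small illustration of the first resource type in the list, the sketch below maps a regular file and reads its bytes through the returned user-space pointer. It uses a temporary file created with mkstemp(); the function name is hypothetical. For a device, the call from user space looks the same, but the driver has to implement the mmap file operation itself, which is the kernel-side support mentioned above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Write msg to a temporary file, map the file into memory and copy
 * the mapped bytes into out. Returns 0 on success, -1 on error. */
int map_file_roundtrip(const char *msg, char *out, size_t outlen)
{
    char path[] = "/tmp/mmap_demo_XXXXXX";
    size_t len = strlen(msg);
    char *p;
    int fd;

    assert(len > 0 && len < outlen);
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                 /* file disappears once closed */
    if (write(fd, msg, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    /* The result is a user-space address backed by the file. */
    p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                    /* the mapping stays valid */
    if (p == MAP_FAILED)
        return -1;

    memcpy(out, p, len);          /* plain memory access, no read() */
    out[len] = '\0';
    return munmap(p, len);
}
```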

How does the system define the portion of virtual memory a process gets?

On a 32-bit system (assume Windows), the virtual address space is 4 GB, so the CPU can generate any address in this range. Shouldn't a process then also be able to address anywhere in this range?
It is said that each process has its own private virtual address space. How does the system make this possible?
In other words, the CPU generates a 32-bit address, and that gets translated into a physical address. How does the CPU know that a specific process may address only a specific part of the virtual address space (its private virtual address space)?
Suppose a process accesses an address outside its private virtual address space. What happens?
A program has to call VirtualAlloc() on Windows to tell the operating system that it wants to use a chunk of virtual memory. This often happens indirectly, as a result of allocating memory from a heap or loading a DLL.
The operating system, in turn, sets up the page mapping tables that the CPU uses to translate a virtual address as used in the program to a physical RAM address as output on its address bus pins. One of three unusual things can happen whenever the CPU reads or writes data or executes code at a virtual memory address:
if there is no valid entry in the page mapping tables then the CPU raises a page fault trap. The operating system determines that the address is invalid and terminates the program
if the page is not mapped to RAM yet then the CPU raises a page fault trap. The operating system finds a page of RAM that's unused, swapping out a used page if necessary. And ensures the content is valid, loading it from a file or the paging file if necessary. And updates the table entry so it now has the physical address of the RAM page. Execution resumes as normal
the CPU verifies that access to the page is allowed. A write to a page that is marked read-only, or execution of an instruction in a page that is marked no-execute, generates a fault (on x86 this is reported as a page fault with a protection error code, not as a general protection fault). The operating system terminates the program.
Every process has its own set of page mapping tables, ensuring that one process cannot access the RAM pages that are used by another, unless sharing is specifically requested, which is common for pages of code loaded from an executable file and for memory-mapped files. A context switch loads the CR3 register, the CPU register that contains the address of the page mapping table.
So there is no scenario where a process can ever address memory outside of its private virtual address space: the lack of a matching page table entry ensures that the attempt terminates the program.
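The three cases listed above can be sketched with a toy model of the classic two-level 32-bit x86 layout (10-bit directory index, 10-bit table index, 12-bit page offset). This is a simplified simulation, not how an OS stores its tables: the struct names, the flag subset and the result codes are assumptions made for illustration, and on real hardware the first two outcomes are both reported as "not present", with the OS deciding which one applies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT  0x1u
#define PTE_WRITABLE 0x2u

enum xlate {
    XLATE_OK,           /* translation succeeded                   */
    XLATE_NOT_MAPPED,   /* case 1: no entry, address is invalid    */
    XLATE_NOT_PRESENT,  /* case 2: entry exists, page not in RAM   */
    XLATE_PROTECTION    /* case 3: access type not allowed         */
};

struct page_table { uint32_t pte[1024]; };           /* frame | flags  */
struct page_dir   { struct page_table *pt[1024]; };  /* NULL = absent  */

/* Translate vaddr using the tables of one process. */
enum xlate translate(const struct page_dir *pd, uint32_t vaddr,
                     int is_write, uint32_t *paddr)
{
    uint32_t dir = vaddr >> 22;
    uint32_t tbl = (vaddr >> 12) & 0x3FFu;
    uint32_t off = vaddr & 0xFFFu;
    const struct page_table *pt;
    uint32_t pte;

    assert(pd != NULL && paddr != NULL);
    pt = pd->pt[dir];
    if (pt == NULL)
        return XLATE_NOT_MAPPED;
    pte = pt->pte[tbl];
    if (!(pte & PTE_PRESENT))
        return XLATE_NOT_PRESENT;
    if (is_write && !(pte & PTE_WRITABLE))
        return XLATE_PROTECTION;
    *paddr = (pte & 0xFFFFF000u) | off;
    return XLATE_OK;
}
```

Because every process gets its own struct page_dir in this model, the same vaddr can translate to different physical addresses, or to none at all, depending on which directory is active.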
The whole 4 GB address space is available to the process (although typically the upper half is reserved for kernel data), and the MMU maps parts of it to physical memory. The process cannot go "out" of its address space (all the 4 GB of it are allowed to be used), but if some part of it hasn't been mapped to physical memory a hardware exception is raised.
The address space is said to be private since the operating system changes the settings of the MMU at task switch, so every process sees a different independent memory layout (although parts of the address space can be shared with other processes).
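The privacy of the address space can be observed from user space. The sketch below is a POSIX example rather than a Windows one, since fork() makes the effect easy to show: after fork() the parent and child use the same virtual address for the variable, yet a write in the child is invisible to the parent because each process has its own mapping (the kernel copies the page on write). The names are hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile int value_at_same_address = 1;

/* Returns the value the parent sees after the child has written 2
 * to the same virtual address, or -1 on error. */
int private_address_space_demo(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: modify its own copy, then exit immediately. */
        value_at_same_address = 2;
        _exit(value_at_same_address == 2 ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    assert(WIFEXITED(status) && WEXITSTATUS(status) == 0);
    return value_at_same_address;   /* parent's copy is untouched */
}
```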

User to kernel mode big picture?

I have to implement a char device as an LKM.
I know some basics about operating systems, but I feel I don't have the big picture.
In a C program, when I call a syscall, what I think happens is that the CPU switches to ring 0, goes to the syscall vector, and jumps to a function in kernel memory space that handles it. (I think it does int 0x80 with the offset into the syscall vector in eax, but I'm not sure.)
Then I'm in the syscall itself, but I guess that for the kernel it is still the same process as before, only in kernel mode; I mean the current PCB is that of the process that called the syscall.
So far, so good? Correct me if something is wrong.
Another question: how can I read and write process memory?
If in the syscall handler I refer to an address, say 0xbfffffff, what does that address mean? Is it a physical one? Some virtual kernel one?
To read/write memory from the kernel, you need to use function calls such as get_user or __copy_to_user.
See the User Space Memory Access API of the Linux Kernel.
You can never get to ring0 from a regular process.
You'll have to write a kernel module to get to ring0.
And you never have to deal with any physical addresses, 0xbfffffff represents an address in a virtual address space of your process.
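Both points can be observed from user space on Linux. The sketch below first invokes a system call by number through the libc syscall() function, which places the number and arguments in registers and executes the architecture's trap instruction (on x86-64 that is the syscall instruction rather than int 0x80). It then passes a pointer to the kernel and lets the kernel's user-memory access routines decide whether the address is valid: a bad pointer produces EFAULT instead of a crash. The function names are hypothetical, and the pipe is just a convenient file descriptor whose write path copies from the user buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Call getpid by syscall number, bypassing the libc wrapper. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}

/* Ask the kernel to read 1 byte from the user-space address p.
 * Returns 0 if the kernel could read it, otherwise the errno
 * value reported by the failing call. */
int probe_user_pointer(const void *p)
{
    int fds[2];
    long n;
    int err;

    if (pipe(fds) != 0)
        return errno;
    assert(fds[0] >= 0 && fds[1] >= 0);
    n = syscall(SYS_write, fds[1], p, (size_t)1);
    err = (n == 1) ? 0 : errno;   /* save before close() */
    close(fds[0]);
    close(fds[1]);
    return err;
}
```

In a char device driver the same check happens inside your read and write handlers when they call the user-memory access functions, which is why the buffer pointer handed to the driver should never be dereferenced directly.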
Big picture:
Everything happens in assembly. In Intel assembly there is a set of privileged instructions that can only be executed in Ring0 mode (http://en.wikipedia.org/wiki/Privilege_level). To make the transition into Ring0 mode, you can use the "int" or "sysenter" instruction:
what all happens in sysenter instruction is used in linux?
Then, inside Ring0 mode (which is your kernel mode), accessing memory requires the privilege levels to match, via the DPL/CPL/RPL attribute bits carried in the segment selectors and descriptors:
http://duartes.org/gustavo/blog/post/cpu-rings-privilege-and-protection/
You may ask how the CPU initializes memory and registers in the first place: at boot, an x86 CPU runs in real mode, unprotected (no ring concept), so everything is possible and a lot of setup work is done there.
As for virtual vs. non-virtual (physical) memory addresses: just remember that anything in a register used for memory addressing is always a virtual address (if the MMU is set up and protected mode with paging is enabled). Look at the picture here (notice that anything coming from the CPU is a virtual address; only the memory bus sees physical addresses):
http://en.wikipedia.org/wiki/Memory_management_unit
As for memory separation between userspace and kernel, you can read here:
http://www.inf.fu-berlin.de/lehre/SS01/OS/Lectures/Lecture14.pdf

Resources