Two page entries referencing the same physical page - Windows

For Linux and Windows, within the same process, how can I make two page table entries reference the same physical page?
For Windows, from reading MSDN, it looks like I can call CreateFileMapping with INVALID_HANDLE_VALUE to create a file mapping that is not backed by a file. Then I can call MapViewOfFileEx twice with different lpBaseAddress values, which essentially makes two different virtual addresses reference the same physical memory.
My question is: how do I do this under Linux? I read the manual for mmap and didn't see a way to do it unless the region is backed by a file (with the MAP_SHARED flag), but then modifications to this region will be written to the file, which is not what I want. Is anyone aware of a way to do this? I am not against backing the region with a file, as long as writes to the region don't actually go to disk. Using tmpfs is not an option because I can't guarantee the user has a tmpfs mounted.
By the way, the code should be user mode code, not kernel mode.

Use shm_open() to create a shared memory object and pass its descriptor to mmap().
"I want to write some emulator" was the same purpose I had when I used this trick.
I did use SysV IPC shared memory, but I forget the details. It was very probably shmget() + shmat().
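Here is a minimal sketch of the shm_open() approach (the object name and size are arbitrary, chosen only for illustration). Both mmap() calls map the same shared-memory object, so the two virtual addresses refer to the same physical pages and nothing is written to disk; link with -lrt on older glibc:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    const char *name = "/two_views_example";          /* illustrative name */
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    shm_unlink(name);                  /* object stays alive through fd */
    if (ftruncate(fd, 4096) < 0) { perror("ftruncate"); return 1; }

    /* Map the same object twice, at two different virtual addresses. */
    char *a = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *b = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (a == MAP_FAILED || b == MAP_FAILED) { perror("mmap"); return 1; }

    strcpy(a, "hello");                /* write through the first mapping   */
    printf("%s\n", b);                 /* visible through the second: hello */

    close(fd);
    return 0;
}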

Related

Can I create a file in Windows that only exists in memory - and if so, how?

This question is not a duplicate of any of these existing questions:
How can I store an object File that only exists in memory as a file inside of my storage system? - This question is not about Java's File API.
Temp file that exists only in RAM? - This is close to what I'm asking, except the OP isn't asking how to create files from memory for the purpose of passing them to child processes.
I'm not asking about Win32's memory-mapped files either - they're essentially the opposite of what I'm after: a memory-mapped file is a file on disk that's mapped into a process's virtual memory space, whereas what I want is a file that exists in the OS's filesystem (but not on the disk's physical filesystem), like a mount-point, whose data is mapped to an existing buffer in memory.
I.e., with memory-mapped files, reading/writing a byte at a particular buffer address and offset in memory reads/modifies the byte at the same offset from the start of the file - but the file physically exists on disk, which isn't what I want.
To elaborate and to provide context:
I have an ASP.NET Core server-side application that receives request streams sized between 1 and 10MB on a regular basis. This program will run only on Windows / Windows Server, so using Windows-specific functionality is fine.
75% of the time my application just reads through these streams by itself and that's it.
But a minority of the time it needs separate applications to read the data; it starts them with Process.Start and passes a file name as a command-line argument.
It passes the data to these separate applications by saving the stream to a temporary file on disk and passing the name of that file.
Unfortunately it can't write the content to the child process's stdin, because some of those programs expect a file on disk rather than reading from stdin.
Additionally, while the machine it's running on has lots of RAM (so keeping the streams buffered in-memory is fine) it has slow spinning-rust HDDs, which is further reason to avoid temporary files on-disk.
I'd like to avoid unnecessary buffering and copies - ideally I'd like to stream the entire 1-10MB request into a single in-memory buffer, and then expose that same buffer to other processes and use that same buffer as the backing for a temporary file.
If I were on Linux, I could use tmpfs - it isn't perfect:
To my knowledge, an existing process can't instruct the OS to take an existing region of its virtual memory and map a tmpfs file onto that region; instead, tmpfs still requires that the file be populated by writing (i.e. copying) all of the data through its file descriptor - which is counter to the aim of having a zero-copy system.
Windows' built-in RAM-disk functionality is limited to providing the basis for a RAM-disk implementation via a third-party device-driver - I'm surprised that Microsoft never shipped Windows with a built-in RAM-disk GUI or API, especially given their relative simplicity.
The ImDisk program is an implementation of a RAM-disk using Microsoft's RAM-disk driver platform, but as far as I can tell while it's more like tmpfs in that it can create a file that exists only in-memory, it doesn't allow the file's data to be backed by a buffer directly accessible to a running process (or a shared-memory buffer).
CreateFileMapping with hFile = INVALID_HANDLE_VALUE "creates a file mapping object of a specified size that is backed by the system paging file instead of by a file in the file system".
From Raymond Chen's The source of much confusion: “backed by the system paging file”:
In other words, “backed by the system paging file” just means “handled like regular virtual memory.”
If the memory is freed before it ever gets paged out, then it will never get written to the system paging file.
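As a rough sketch of that approach (the section name and size below are made up for illustration; a cooperating child process would open the same name with OpenFileMapping rather than being handed a file path):

#include <windows.h>
#include <stdio.h>

int main(void)
{
    const DWORD size = 10 * 1024 * 1024;              /* ~10 MB request buffer */
    HANDLE hMap = CreateFileMappingW(
        INVALID_HANDLE_VALUE,                         /* backed by the paging file */
        NULL, PAGE_READWRITE,
        0, size,
        L"Local\\RequestBufferExample");              /* illustrative section name */
    if (hMap == NULL) { printf("CreateFileMapping: %lu\n", GetLastError()); return 1; }

    BYTE *view = (BYTE *)MapViewOfFile(hMap, FILE_MAP_ALL_ACCESS, 0, 0, size);
    if (view == NULL) { printf("MapViewOfFile: %lu\n", GetLastError()); return 1; }

    /* ... stream the request into 'view'; a child process can map the same
       section via OpenFileMappingW(FILE_MAP_READ, FALSE,
       L"Local\\RequestBufferExample") and MapViewOfFile ... */

    UnmapViewOfFile(view);
    CloseHandle(hMap);
    return 0;
}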

how to use get_user to copy data from user space to kernel space

I want to copy an integer variable from user space to kernel space.
Can anyone give me a simple example how to do this?
I came to know that we can use get_user, but I can't figure out how to use it.
Check the man pages for copy_to_user and copy_from_user.
Write a simple kernel module with read/write operations and register a char device for them, something like /dev/sample.
From an application, open that device and do a write/read on the resulting fd.
Now you need to implement the mechanism for transferring this data to kernel space and reading back whatever is returned:
- In write, do a copy_from_user; before that, check that the passed buffer is valid.
- In read, do a copy_to_user.
Make sure error conditions are taken care of. The open implementation should keep count of how many opens there are if you want to support multiple opens, and that count should be decremented when the application calls close on the opened fd.
Do you follow?
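Here is a minimal sketch of such a module, assuming a misc character device that shows up as /dev/sample (the name is illustrative). get_user() covers the single-integer case asked about; copy_from_user()/copy_to_user() handle arbitrary buffers the same way:

#include <linux/module.h>
#include <linux/fs.h>
#include <linux/miscdevice.h>
#include <linux/uaccess.h>

static int stored_value;             /* the integer kept in kernel space */

static ssize_t sample_write(struct file *f, const char __user *buf,
                            size_t len, loff_t *off)
{
    int val;
    if (len < sizeof(val))
        return -EINVAL;
    /* get_user() copies one value from user space; it returns -EFAULT
       if the user pointer is not a valid, readable address. */
    if (get_user(val, (const int __user *)buf))
        return -EFAULT;
    stored_value = val;
    return sizeof(val);
}

static ssize_t sample_read(struct file *f, char __user *buf,
                           size_t len, loff_t *off)
{
    if (len < sizeof(stored_value))
        return -EINVAL;
    /* copy_to_user() returns the number of bytes it could NOT copy. */
    if (copy_to_user(buf, &stored_value, sizeof(stored_value)))
        return -EFAULT;
    return sizeof(stored_value);
}

static const struct file_operations sample_fops = {
    .owner = THIS_MODULE,
    .read  = sample_read,
    .write = sample_write,
};

static struct miscdevice sample_dev = {
    .minor = MISC_DYNAMIC_MINOR,
    .name  = "sample",               /* registers /dev/sample */
    .fops  = &sample_fops,
};

module_misc_device(sample_dev);
MODULE_LICENSE("GPL");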

How can I find the physical address of a file?

I'm using the GoAsm assembler on Windows 7 64-bit, and I'll be asking you a few (not so dumb) questions.
First question :
How can I find the physical address of a file?
Let's suppose file "Text.txt" is at the root of my C:\ partition.
Is there a way to get the exact memory address where this file is ?
Second question :
Is it possible to call a routine which behaves just as if I had invoked a C function?
(i.e.: consider a C function "WriteToScreen"; is it possible to have the same function, but in assembler, without needing high-level invokes to do that work?)
Third question :
Are there include files for GoAsm somewhere on the net containing useful routines like (move, copy, edit, erase) commands? I first thought of MS-DOS interrupts, but I can't manage to get them to work without crashing the program. I guess they're just not compatible with Windows, even though the command prompt acts like MS-DOS...?
Fourth question :
I've heard from various sources, and found myself, that NASM works pretty badly on Win7 x64 - is that true, or am I doing it wrong?
1
A hard drive, from a logical point of view, can be seen as a sequence of "blocks" (the more common name is sectors). How these blocks are organized physically on the disk can be disregarded, but the driver must of course know how to get at the data; you send a modern hard-drive driver "high level" commands that, as far as you can tell, are not strongly related to where the data physically lives (you can say "read block 123", but there's no external evidence of where that block actually is).
However, this way you can "name" a block with a number and say, e.g., that block 0 is the MBR. Each block contains several bytes (512, 1024...). Not all used blocks contain actual file data; there is also metainformation of all sorts, depending on the filesystem and even related to the "structure" of the drive (i.e. partitions).
A file located on a hard drive is not automatically loaded into memory, so it has no memory address. Once you read it, pieces of it (if not all of it) are of course copied into the memory you provide, which is not an intrinsic property of the file. (Filesystems retrieve the blocks belonging to the file and "show" them as we are used to seeing them: as a single "unit", the file.)
Summarizing: files have no memory address. The physical address could be the set of blocks holding the file's data (and metadata, like inodes), or just the first block (but if a block of data is N, block N+1 may not belong to the same file - the blocks need not be contiguous). To find them, you have to analyse the structure of the filesystem you use. I don't know if there's an API to retrieve them easily, but in the worst case you can analyse the source code of the filesystem... good luck!
2
C functions are translated into assembly. If you respect the C calling convention, you can write a "C function" directly in assembly. Try reading this and this for x86.
3
You can call the Windows API from asm. Forget MS-DOS: MS-DOS is dead, MS-DOS is not Windows, and cmd is a sort of "emulation"... indeed no, not an emulation, but just a command-line interface that resembles the one MS-DOS users were used to. It is not exactly the same; i.e., there are no MS-DOS system interrupts you can use. Iczelion's assembly tutorials, though old, could be an interesting resource. (If the links expire, try the Wayback Machine.)
4
I do not own Win7 and have never installed NASM on Windows, so I can't say anything about it.
For the first question, just drag the file into the address bar in the browser.

Memory mapped files optional write possible?

When using memory-mapped files it seems it is either read-only, or write-only. By this I mean you can't:
have one open for writing, and later decide not to save it
have one open for reading, and later decide to save it
Our application uses a writeable memory-mapped file to save data files, but since the user might want to exit without saving changes, we have to use a temporary file which the user actually edits. When the user opts to save the changes, the original file is overwritten with the temporary file so it has the latest changes. This is cumbersome because the files can be very large (>1GB) and it takes a long time to copy them.
I've tried many combinations of the flags used to create the file mapping but none seem to allow the flexibility of saving on demand. Can anyone confirm this is the case? Our application is written in Delphi, but it uses the standard Windows API to create the mapping, in our case
FMapHandle := CreateFileMapping(FFileHandle, nil, PAGE_READWRITE, 0, 2 * 65536, nil);
FBasePointer := MapViewOfFile(FMapHandle, FILE_MAP_WRITE, FileOffsetHigh, FileOffsetLow, NumBytes);
I don't think you can. By that I mean you may be able to, but it doesn't make any sense to me :-)
The whole point of a memory-mapped file is that it's a window onto the actual file. If you don't want changes reflected in the file, you'll probably have to do something like batch up the changes in a data structure (e.g., an array of base address, size, and data) and apply them when saving.
In which case, you wouldn't actually need the memory mapped file, just read in and maintain the chunks you want to change (lock the file first if there's a chance of multi-user access).
Update:
Have you thought of the possibility of, when doing a save, deleting the original file and just renaming the temporary file to the original file name? That's likely to be much faster than copying 1G of data from temporary to original. That way, if you don't want it saved, just delete the temporary file and keep the original.
You'll still have to copy the original data to the temporary file when loading but you won't have to copy the temporary data back (whether you save it or not) - that would halve the time taken.
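A sketch of that rename-on-save idea using the Win32 API (the paths below are illustrative): MOVEFILE_REPLACE_EXISTING replaces the original with the already-written temporary file, so no second copy of the data is made, and discarding edits is just deleting the temp file.

#include <windows.h>

/* Commit the user's edits: replace the original with the temp file. */
BOOL CommitSave(LPCWSTR originalPath, LPCWSTR tempPath)
{
    return MoveFileExW(tempPath, originalPath,
                       MOVEFILE_REPLACE_EXISTING | MOVEFILE_WRITE_THROUGH);
}

/* Discard the user's edits: just delete the temp file. */
BOOL DiscardEdits(LPCWSTR tempPath)
{
    return DeleteFileW(tempPath);
}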
Possible, but non-trivial.
You have to understand memory-mapping basics and the difference between the three modes of memory-mapped files. All of them set aside a part of your virtual address space and create a mapping entry in an internal table. No physical RAM is initially allocated. Hence, when you DO try to access the memory, the CPU faults and the OS has to fix things up. It does so by copying the file contents to RAM and mapping that RAM into your process at the faulting address.
Now, the difference between the three modes is how the descriptors are set on the mapped pages. In all cases you get read access to the pages (the first mode). However, if you ask for write access and subsequently write to a page, on your first write that page is marked as writeable and dirty. It can then be written back to the original file at the discretion of the OS (the second mode). Finally, it's possible to get copy-on-write semantics. You still start out with only read access to the pages in memory. When you write to one, the CPU still faults and the OS needs to fix it up. With copy-on-write, that fix-up is done by setting the backing store of the changed page to the page file instead of the original mapped file.
So, in your case you want to use copy-on-write mode. If the user decides to discard the modifications, no problem. You simply discard the memory mapping. All pages that were modified in memory, and were backed by the page file are also discarded.
If the user does decide to save, you've got a slightly harder task. You now need to figure out which parts of the file have changed. Those changes are in memory, and you need to reapply those to the source file. You can do this with Page Guards. So, when the user decides to save, copy all modified pages to a separate memory block, remap the (unchanged) file for write, and apply the changes.
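A minimal sketch of opening a view in that copy-on-write mode (the handle and size parameters are illustrative): writes through the returned pointer are backed by the paging file and never reach the original file, so discarding the view discards the edits.

#include <windows.h>

/* Map an already-open file copy-on-write: PAGE_WRITECOPY on the section,
   FILE_MAP_COPY on the view. */
void *MapFileCopyOnWrite(HANDLE hFile, SIZE_T viewSize, HANDLE *phMapping)
{
    HANDLE hMap = CreateFileMappingW(hFile, NULL, PAGE_WRITECOPY, 0, 0, NULL);
    if (hMap == NULL)
        return NULL;

    void *view = MapViewOfFile(hMap, FILE_MAP_COPY, 0, 0, viewSize);
    if (view == NULL) {
        CloseHandle(hMap);
        return NULL;
    }

    *phMapping = hMap;
    return view;    /* modified pages go to the paging file, not to hFile */
}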

Get sector location of a file

Based on a file name, or a file handle, is there a Win-API method of determining what physical sector the file starts on?
You can get a file's cluster allocation by sending FSCTL_GET_RETRIEVAL_POINTERS with DeviceIoControl.
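For example, a sketch along those lines (the path is illustrative; translating the returned LCNs into physical sectors additionally requires the volume's cluster size and the partition's offset on the disk):

#include <windows.h>
#include <winioctl.h>
#include <stdio.h>

int main(void)
{
    HANDLE h = CreateFileW(L"C:\\Text.txt", GENERIC_READ,
                           FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
                           OPEN_EXISTING, 0, NULL);
    if (h == INVALID_HANDLE_VALUE) { printf("open failed: %lu\n", GetLastError()); return 1; }

    STARTING_VCN_INPUT_BUFFER in = { 0 };             /* start from VCN 0 */
    BYTE out[4096];                                   /* room for a few extents */
    DWORD got = 0;
    if (DeviceIoControl(h, FSCTL_GET_RETRIEVAL_POINTERS,
                        &in, sizeof(in), out, sizeof(out), &got, NULL)) {
        RETRIEVAL_POINTERS_BUFFER *rp = (RETRIEVAL_POINTERS_BUFFER *)out;
        if (rp->ExtentCount > 0)
            printf("first extent starts at LCN %lld\n",
                   rp->Extents[0].Lcn.QuadPart);
    } else {
        printf("DeviceIoControl failed: %lu\n", GetLastError());
    }

    CloseHandle(h);
    return 0;
}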
You'd have to read the allocation table directly.
I suspect that there is no such function.
Even if you know where the file starts, what good would it do? The rest of the file could be anywhere as soon as the file is larger than a single sector due to fragmentation.
You would probably need to develop deeper understanding of the file system involved and read the necessary information from the file allocation table or such mechanism.
No. Why? Because a file system is an abstraction of physical hardware. You don't need to know if you're on a RAM disk, hard drive, CD, or network drive, or if your data is compressed or encrypted -- Windows takes care of these little details for you.
You can always open the physical disk, but you'd need knowledge of the file system used.
What are you trying to accomplish with this?
