How to overwrite portions of a DriverKit OSData internal buffer? - macos

The documentation of OSData says that "...You can add bytes to them and overwrite portions of the byte array.". I can see a method to append bytes, but I don't understand how I am able to overwrite a portion of the buffer.
Another option would be to use IONewZero to allocate a number of elements of the type I need. In my case I just need this for ints.
Example:
T* dataBuffer = IONewZero(T, SIZE);
And then deallocate with:
IOSafeDeleteNULL(dataBuffer, T, SIZE);
What are the advantages of using an OSData object compared to the solution with IONewZero / IOSafeDeleteNULL?

I think the documentation might just be copy-pasted from the kernel variant of OSData. I've seen that in a bunch of places, especially USBDriverKit.
OSData is mostly useful for dealing with plist-like data structures (i.e. setting and getting properties on service objects) in conjunction with the other OSTypes: OSArray, OSDictionary, OSNumber, etc. It's also used for in-band (<= 4096 byte) "struct" arguments of user client external methods.
The only use I can see outside of those scenarios is when you absolutely have to reference-count a blob of data. But it's certainly not a particularly convenient or efficient container for data-in-progress. If you subsequently need to send the data to a device or map it to user space, IOBufferMemoryDescriptor is probably a better choice (and also reference counted) though it's even more heavyweight.

Related

How do I create an instance of a class, variably sized, at a specific memory location?

I'm working on a project involving writing packets to a memory-mapped file. Our current strategy is to create a packet class containing the following members
uint32_t packetHeader;
uint8_t packetPayload[];
uint32_t packetChecksum;
When we create a packet, first we'd like to have its address in memory be a specified offset within the memory mapped file, which I think can be done with placement-new(). However, we'd also like for the packetPayload not to be a pointer to some memory from the heap, but contiguous with the rest of the class (so we can avoid memcpying from heap to our eventual output file)
i.e.
Memory
Beginning of class | BOC + 4 | (length of Payload) |
Header Payload Checksum
Would this be achievable using a length argument for the Packet class constructor? Or would we have to template this class for variably sized payloads?
Forget about trying to make that the layout of your class. You'll be fighting against the C++ language all day long. Instead, write a class that provides access to the binary layout (in shared memory). The class instance itself will not be in shared memory, and the byte range in shared/mapped memory will not be a C++ object at all; it just exists within the file mapping address range.
Presumably the length is fixed from the moment of creation? If so, then you can safely cache the length, pointer to the checksum, etc in your accessor object. Since this cache isn't inside the file, you can store whatever you want however you want without concern for its layout. You can even use virtual member functions, because the v-table is going in the class instance, not the range of the binary file.
Also, given that this lives in shared memory, if there are multiple writers you'll have to be very careful to synchronize between them. If you're just prepositioning a buffer in shared/mapped memory to avoid a copy later, but totally handing off ownership between processes so that the data is never shared by simultaneous accesses, it will be easier. You also probably want to calculate the checksum once after all the data is written, instead of trying to keep it up-to-date (and risking data races in the process) for every single write into the buffer.
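A minimal sketch of such an accessor, assuming a layout of a 4-byte header, a payload whose length is fixed at creation, and a 4-byte checksum. The name PacketView and the byte-sum checksum are placeholders of mine, not anything from the question; std::memcpy is used for the 32-bit fields because the checksum can land on an unaligned address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical accessor over bytes laid out as
// [uint32 header][payload bytes][uint32 checksum].
// The view object lives on the stack or heap, never inside the mapped file.
class PacketView {
public:
    // `base` points at the packet's first byte inside the mapping;
    // `payloadLen` is fixed for the lifetime of the packet.
    PacketView(std::uint8_t* base, std::size_t payloadLen)
        : base_(base), payloadLen_(payloadLen) {
        assert(base != nullptr);
    }

    // Bytes the packet occupies in the file for a given payload length.
    static std::size_t totalSize(std::size_t payloadLen) {
        return 2 * sizeof(std::uint32_t) + payloadLen;
    }

    void setHeader(std::uint32_t h) { std::memcpy(base_, &h, sizeof h); }
    std::uint32_t header() const { return load(base_); }

    // Write the payload in place; no intermediate heap buffer is involved.
    std::uint8_t* payload() { return base_ + sizeof(std::uint32_t); }
    std::size_t payloadSize() const { return payloadLen_; }

    // Compute the checksum once, after the payload is complete.
    // A plain byte sum stands in for whatever checksum the format uses.
    void finalize() {
        std::uint32_t sum = 0;
        const std::uint8_t* p = base_ + sizeof(std::uint32_t);
        for (std::size_t i = 0; i < payloadLen_; ++i) sum += p[i];
        std::memcpy(checksumPtr(), &sum, sizeof sum);
    }
    std::uint32_t checksum() const { return load(checksumPtr()); }

private:
    static std::uint32_t load(const std::uint8_t* p) {
        std::uint32_t v;
        std::memcpy(&v, p, sizeof v);
        return v;
    }
    std::uint8_t* checksumPtr() const {
        return base_ + sizeof(std::uint32_t) + payloadLen_;
    }

    std::uint8_t* base_;      // cached outside the file, so layout is free
    std::size_t payloadLen_;
};
```

You would construct one with the mapping's base address plus the packet's offset, fill payload(), and call finalize() before handing the region off.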
First remember, that you need to know what your payload length is, somehow. Either you specify it in your instance somewhere, or you template your class over the payload length.
Having said that - you will need one of:
packetOffset being a pointer
A payload length member
A checksum offset member
and you'll want to use a named constructor idiom which takes the allocation length, and performs both the allocation and the setup of the offset/length/pointer member to a value corresponding to the length.

How can I bind a buffer resource that resides on the GPU to the input assembler (IA)?

I use compute shaders to compute a triangle list and to store it in a RWStructuredBuffer. For testing I read this buffer and pass it to the IA via context.InputAssembler.SetVertexBuffers (…). This approach works, but is valid only for testing the data for correctness.
Now I want to bind the (already existing) buffer to the IA stage using a resource view (aka without passing a pointer to the vertex buffer).
I am reading some good books (Frank D. Luna, Jason Zink), but they never mention this case.
===============
EDIT:
The syntax I am using here is imposed by the SharpDX wrapper.
I can bind the buffer to the vertex shader via context.VertexShader.SetShaderResource(...), binding a ShaderResourceView. In the VS I use SV_VertexID to access the buffer. So I HAVE a working solution for the moment, but there might be cases in the future where I must bind the buffer to the input assembler.
Simply put, you can't bind a structured buffer to the IA stage, at least not directly; the runtime will not allow it.
If you set ResourceOptionFlags.BufferStructured as OptionFlags, you are not allowed to use VertexBuffer/IndexBuffer/StreamOutput/ConstantBuffer/RenderTarget/Depth as bind flags; resource creation will fail.
One option, which costs you a GPU copy, is to create a second buffer with VertexBuffer BindFlags and Default usage (same size as your structured buffer).
Once you are done processing your structured buffer, call:
DeviceContext.CopyResource
And you'll have a standard vertex buffer ready to use.

Implement Virtual Memory with Memory Mapped Files

Is it possible to wrap up memory mapped files something like this?
TVirtualMemoryManager = class
public
function AllocMem (Size : Integer) : Pointer;
procedure FreeMem (Ptr : Pointer);
end;
Since the memory mapped file API functions all take offsets, I don't know how to manage the free areas in the memory mapped files. My only idea is to implement some kind of basic memory management (maintaining free lists for different block sizes), but I don't know how efficient this will be.
EDIT: What I really want (as David made clear to me) is this:
IVirtualMemory = interface
function ReadMem (Addr : Int64) : TBytes;
function AllocateMem (Data : TBytes) : Int64;
procedure FreeMem (Addr : Int64);
end;
I need to store contiguous blocks of bytes (each relatively small) in virtual memory and be able to read them back into memory using a 64-bit address. Most of the time access is read-only. If a write is necessary I would just use FreeMem followed by AllocMem since the size will be different anyway.
I want a wrapper for a memory mapped file with this interface. Internally it has a handle to a memory mapped file and uses MapViewOfFile on each ReadMem request. The Addr 64-bit integers are just offsets into the memory mapped file. The open question is how to assign those addresses - I currently keep a list of free blocks that I maintain.
Your proposal that "Internally it has a handle to a memory mapped files and uses MapViewOfFile on each ReadMem request" will be just a waste of CPU resource, IMHO.
It is worth saying that your GetMem / FreeMem requirement won't be able to break the 3/4 GB barrier. Since all allocated memory will be mapped into memory until a call to FreeMem, you'll be short of memory space, just as with the regular Delphi memory manager. The best you can do is to rely on FastMM4, and change your program to reduce its memory use.
IMHO you'll have to change/update your specification. For instance, your "updated" question sounds just like a regular storage problem.
What you want is to be able to allocate more than 3/4 GB of data for your application. You have a working implementation of such a feature in our SynBigTable open source unit. This is a fast and light NoSQL solution in pure Delphi.
It is able to create a file of any size (only 64 bit limited), then will map the content of each record into memory, on request. It will use a memory mapping of the file, if possible. You can implement your interface very directly with TSynBigTable methods: ReadMem=Get, AllocMem=Add, FreeMem=Delete. The IDs will be your pointer-like values, and RawByteString will be used instead of TBytes.
You can access any block of data using an integer ID, or a string ID, or even use a sophisticated field layout (inside the record, or as in-memory metadata - including indexes and fast search).
Or rely on a regular embedded SQL database. For instance, SQLite3 is very good at handling BLOB fields, and is able to store huge amount of data. With a simple in-memory caching mechanism for most used records, it could be a powerful solution.
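If you do end up keeping your own list of free blocks, as the question describes, the bookkeeping can stay small. Below is a rough sketch in C++ rather than Delphi, assuming a first-fit policy with merging of adjacent free ranges; OffsetAllocator and its methods are made-up names, and it only tracks offsets, leaving the actual file reads and writes to the caller.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical first-fit allocator that hands out offsets into a file.
// It only tracks ranges; reading and writing the bytes is left to the caller.
class OffsetAllocator {
public:
    explicit OffsetAllocator(std::int64_t capacity) { free_[0] = capacity; }

    // Returns the offset of a block of `size` bytes, or -1 if none fits.
    std::int64_t allocate(std::int64_t size) {
        assert(size > 0);
        for (auto it = free_.begin(); it != free_.end(); ++it) {
            if (it->second < size) continue;      // block too small, keep looking
            std::int64_t offset = it->first;
            std::int64_t rest = it->second - size;
            free_.erase(it);
            if (rest > 0) free_[offset + size] = rest;  // keep the unused tail
            used_[offset] = size;
            return offset;
        }
        return -1;
    }

    // Releases a block and merges it with adjacent free blocks.
    void release(std::int64_t offset) {
        auto u = used_.find(offset);
        assert(u != used_.end());
        std::int64_t size = u->second;
        used_.erase(u);

        // Merge with the free block that starts right after this one.
        auto next = free_.lower_bound(offset);
        if (next != free_.end() && offset + size == next->first) {
            size += next->second;
            next = free_.erase(next);
        }
        // Merge with the free block that ends right before this one.
        if (next != free_.begin()) {
            auto prev = std::prev(next);
            if (prev->first + prev->second == offset) {
                prev->second += size;
                return;
            }
        }
        free_[offset] = size;
    }

private:
    std::map<std::int64_t, std::int64_t> free_;  // offset -> length of free block
    std::map<std::int64_t, std::int64_t> used_;  // offset -> length of live block
};
```

First fit is linear in the number of free blocks, so whether this is efficient enough depends on how fragmented the file gets; that is an open trade-off, not something this sketch settles.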

Partial unmap of Win32 memory-mapped file

I have some code (which I cannot change) that I need to get working in a native Win32 environment. This code calls mmap() and munmap(), so I have created those functions using CreateFileMapping(), MapViewOfFile(), etc., to accomplish the same thing. Initially this works fine, and the code is able to access files as expected. Unfortunately the code goes on to munmap() selected parts of the file that it no longer needs.
x = mmap(0, size, PROT_READ, MAP_SHARED, fd, 0);
...
munmap(x, hdr_size);
munmap(x + foo, bar);
...
Unfortunately, when you pass a pointer into the middle of the mapped range to UnmapViewOfFile() it destroys the entire mapping. Even worse, I can't see how I would be able to detect that this is a partial un-map request and just ignore it.
I have tried calling VirtualFree() on the range but, unsurprisingly, this produces ERROR_INVALID_PARAMETER.
I'm beginning to think that I will have to use static/global variables to track all the open memory mappings so that I can detect and ignore partial unmappings, but I hope you have a better idea...
edit:
Since I wasn't explicit enough above: the docs for UnmapViewOfFile do not accurately reflect the behavior of that function.
Un-mapping the whole view and remapping pieces is not a good solution because you can only suggest a base address for a new mapping, you can't really control it. The semantics of munmap() don't allow for a change to the base address of the still-mapped portion.
What I really need is a way to find the base address and size of an already-mapped memory area.
edit2: Now that I restate the problem that way, it looks like the VirtualQuery() function will suffice.
It is quite explicit in the MSDN Library docs for UnmapViewOfFile:
lpBaseAddress: A pointer to the base address of the mapped view of a file that is to be unmapped. This value must be identical to the value returned by a previous call to the MapViewOfFile or MapViewOfFileEx function.
You change the mapping by unmapping the old one and creating a new one. Unmapping bits and pieces isn't well supported, nor would it have any useful side-effects from a memory management point of view. You don't want to risk getting the address space fragmented.
You'll have to do this differently.
You could keep track of each mapping and how many pages of it are still allocated by the client, and only free the mapping when that counter reaches zero. The middle sections would still be mapped, but it wouldn't matter since the client wouldn't be accessing that memory anyway.
Create a global dictionary of memory mappings through this interface. When a mapping request comes through, record the address, size and number of pages that are in the range. When an unmap request is made, find out which mapping owns that address and decrease the page count by the number of pages that are being freed. When that count reaches zero, really unmap the view.
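Here is roughly what that dictionary could look like. This is only the bookkeeping, kept free of Win32 calls so it can be tested on its own; MappingRegistry and ReleaseResult are names I made up, and it assumes the client never unmaps the same pages twice. Your munmap() shim would call UnmapViewOfFile on the returned base only when unmapNow is set.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// What the munmap() shim should do after a release request.
struct ReleaseResult {
    bool known;           // false if the address is in no tracked view
    bool unmapNow;        // true once every page of the view was released
    std::uintptr_t base;  // base address to pass to the real unmap call
};

// Hypothetical registry: one entry per view, keyed by its base address.
class MappingRegistry {
public:
    explicit MappingRegistry(std::size_t pageSize) : pageSize_(pageSize) {
        assert(pageSize > 0);
    }

    // Record a view returned by the real mapping call.
    void add(std::uintptr_t base, std::size_t size) {
        views_[base] = View{size, pagesIn(size)};
    }

    // Handle munmap(addr, len): find the owning view and lower its page count.
    ReleaseResult release(std::uintptr_t addr, std::size_t len) {
        auto it = views_.upper_bound(addr);       // first view starting after addr
        if (it == views_.begin()) return {false, false, 0};
        --it;                                     // candidate owner of addr
        if (addr >= it->first + it->second.size) return {false, false, 0};

        View& v = it->second;
        std::size_t pages = pagesIn(len);
        v.pagesLeft -= (pages < v.pagesLeft) ? pages : v.pagesLeft;
        if (v.pagesLeft > 0) return {true, false, it->first};

        std::uintptr_t base = it->first;
        views_.erase(it);                         // nothing left, forget the view
        return {true, true, base};
    }

private:
    struct View {
        std::size_t size;       // size of the view in bytes
        std::size_t pagesLeft;  // pages the client has not released yet
    };
    std::size_t pagesIn(std::size_t bytes) const {
        return (bytes + pageSize_ - 1) / pageSize_;
    }

    std::size_t pageSize_;
    std::map<std::uintptr_t, View> views_;
};
```

A real shim would also need a lock around the registry if mmap()/munmap() can be called from several threads.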

Can address space be recycled for multiple calls to MapViewOfFileEx without chance of failure?

Consider a complex, memory-hungry, multi-threaded application running within a 32-bit address space on Windows XP.
Certain operations require n large buffers of fixed size, where only one buffer needs to be accessed at a time.
The application uses a pattern where some address space the size of one buffer is reserved early and is used to contain the currently needed buffer.
This follows the sequence:
(initial run) VirtualAlloc -> VirtualFree -> MapViewOfFileEx
(buffer changes) UnmapViewOfFile -> MapViewOfFileEx
Here the pointer to the buffer location is provided by the call to VirtualAlloc and then that same location is used on each call to MapViewOfFileEx.
The problem is that windows does not (as far as I know) provide any handshake type operation for passing the memory space between the different users.
Therefore there is a small opportunity (at each -> in my above sequence) where the memory is not locked and another thread can jump in and perform an allocation within the buffer.
The next call to MapViewOfFileEx is broken and the system can no longer guarantee that there will be a big enough space in the address space for a buffer.
Obviously refactoring to use smaller buffers reduces the rate of failures to reallocate space.
Some use of HeapLock has had some success but this still has issues - something still manages to steal some memory from within the address space.
(We tried Calling GetProcessHeaps then using HeapLock to lock all of the heaps)
What I'd like to know is: is there any way to lock a specific block of address space that is compatible with MapViewOfFileEx?
Edit: I should add that ultimately this code lives in a library that gets called by an application outside of my control
You could brute force it; suspend every thread in the process that isn't the one performing the mapping, Unmap/Remap, unsuspend the suspended threads. It ain't elegant, but it's the only way I can think of off-hand to provide the kind of mutual exclusion you need.
Have you looked at creating your own private heap via HeapCreate? You could set the heap to your desired buffer size. The only remaining problem is then how to get MapViewOfFile to use your private heap instead of the default heap.
I'd assume that MapViewOfFile internally calls GetProcessHeap to get the default heap and then it requests a contiguous block of memory. You can surround the call to MapViewOfFile with a detour, i.e., you rewire the GetProcessHeap call by overwriting the method in memory effectively inserting a jump to your own code which can return your private heap.
Microsoft has published the Detours library, though I'm not directly familiar with it. I know that detouring is surprisingly common. Security software, virus scanners etc. all use such frameworks. It's not pretty, but may work:
HANDLE g_hndPrivateHeap;

HANDLE WINAPI GetProcessHeapImpl() {
    return g_hndPrivateHeap;
}

struct SDetourGetProcessHeap { // object for exception safety
    SDetourGetProcessHeap() {
        // put detour in place
    }
    ~SDetourGetProcessHeap() {
        // remove detour again
    }
};

void MapFile() {
    g_hndPrivateHeap = HeapCreate( ... );
    {
        SDetourGetProcessHeap d;
        MapViewOfFile(...);
    }
}
These may also help:
How to replace WinAPI functions calls in the MS VC++ project with my own implementation (name and parameters set are the same)?
How can I hook Windows functions in C/C++?
http://research.microsoft.com/pubs/68568/huntusenixnt99.pdf
Imagine if I came to you with a piece of code like this:
void *foo;
foo = malloc(n);
if (foo)
    free(foo);
foo = malloc(n);
Then I came to you and said, help! foo does not have the same address on the second allocation!
I'd be crazy, right?
It seems to me like you've already demonstrated clear knowledge of why this doesn't work. There's a reason that the documentation for any API that takes an explicit address to map into lets you know that the address is just a suggestion, and it can't be guaranteed. This also goes for mmap() on POSIX.
I would suggest you write the program in such a way that a change in address doesn't matter. That is, don't store too many pointers to quantities inside the buffer, or if you do, patch them up after reallocation. Similar to the way you'd treat a buffer that you were going to pass into realloc().
Even the documentation for MapViewOfFileEx() explicitly suggests this:
While it is possible to specify an address that is safe now (not used by the operating system), there is no guarantee that the address will remain safe over time. Therefore, it is better to let the operating system choose the address. In this case, you would not store pointers in the memory mapped file, you would store offsets from the base of the file mapping so that the mapping can be used at any address.
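To make the "offsets, not pointers" advice concrete, here is a small sketch. The Node layout and the helper names are invented for illustration, and plain byte vectors stand in for two views of the same file mapped at different addresses; the same offsets resolve correctly against either base.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical record stored inside the mapped region. `next` is a byte
// offset from the base of the mapping (0 means "no next node"), not a pointer.
struct Node {
    std::uint32_t value;
    std::uint32_t next;
};

// Copy a node into the region at `offset` from the current base address.
inline void storeNode(std::uint8_t* base, std::uint32_t offset, const Node& n) {
    assert(offset != 0);  // offset 0 is reserved as the "null" link
    std::memcpy(base + offset, &n, sizeof n);
}

// Read a node back, resolving the offset against whatever base the view
// happens to have right now.
inline Node loadNode(const std::uint8_t* base, std::uint32_t offset) {
    Node n;
    std::memcpy(&n, base + offset, sizeof n);
    return n;
}

// Walk the list starting at offset `first` and add up the values.
inline std::uint32_t sumList(const std::uint8_t* base, std::uint32_t first) {
    std::uint32_t total = 0;
    for (std::uint32_t off = first; off != 0;) {
        Node n = loadNode(base, off);
        total += n.value;
        off = n.next;
    }
    return total;
}
```

Because nothing in the region holds an absolute address, remapping at a different base needs no fix-up pass at all.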
Update from your comments
In that case, I suppose you could:
Not map into contiguous blocks. Perhaps you could map in chunks and write some intermediate function to decide which to read from/write to?
Try porting to 64 bit.
As the earlier post suggests, you can suspend every thread in the process while you change the memory mappings. You can use SuspendThread()/ResumeThread() for that. This has the disadvantage that your code has to know about all the other threads and hold thread handles for them.
An alternative is to use the Windows debug API to suspend all threads. If a process has a debugger attached, then every time the process faults, Windows will suspend all of the process's threads until the debugger handles the fault and resumes the process.
Also see this question which is very similar, but phrased differently:
Replacing memory mappings atomically on Windows

Resources