Copying the NetBufferListInfo array instead of calling NdisCopySendNetBufferListInfo? - ndis

I have a kernel driver that separates every net buffer in every receive/send NBL, allocates a local structure for every NB (containing the packet content and some other state), sends it to a user buffer, and completes/returns the original NBL. The user service then sends the packets that are OK back to the driver, and I finally send/indicate them by creating an NBL for every packet and chaining them together.
This causes the loss of some per-packet metadata that seems to live in NetBufferListInfo (especially WFP metadata).
My question is: can I just save the contents of NetBufferListInfo from the original NBL of the corresponding NB into my local structure, and then recopy it when I create an NBL containing that NB?
I thought I should use NdisAllocateCloneNetBufferList plus NetBufferListInfo and store a pointer to the cloned NBL in my local structure as a new member. The problem is that freeing these clones with NdisFreeCloneNetBufferList becomes very complicated: I create a local structure for every net buffer and insert them into a list that is sent to the user service, and each of them could point to an arbitrary NBL (many of them could point to the same NBL, since I clone the NBL once for multiple NBs inside it). When the user sends back the ones that are OK, I need to be careful not to free a cloned NBL twice after I indicate/send the NBLs (I assume that would cause a double-free BSOD). So I would need to keep a reference count for every cloned NBL and free it only when the count hits zero, plus some other complications.
So what is the easiest solution for me? Can I just save the contents of the NetBufferListInfo array of the corresponding NBL in my local structure, and recopy it when I create the final NBL for a NB?

You're allowed to copy the NetBufferListInfo fields that are fully documented and that you understand semantically. For example, you can copy the TcpIpChecksumNetBufferListInfo, because it's fully documented, and semantically it makes sense to propagate it to another NBL that has the same payload.
But there are some fields that are private to the OS, and at least one of those is a refcounted pointer. So if you were to copy that field, then return the original NBL, the pointer would dangle. The only way to influence the refcount is to call NdisCopySendNetBufferListInfo or NdisCopyReceiveNetBufferListInfo, which handle this for you.
TL;DR: You can write code like this:
new_nbl->NetBufferListInfo[TcpIpChecksumNetBufferListInfo] = old_nbl->NetBufferListInfo[TcpIpChecksumNetBufferListInfo];
new_nbl->NetBufferListInfo[NetBufferListHashValue] = old_nbl->NetBufferListInfo[NetBufferListHashValue];
. . .
but you can't write code like this:
RtlCopyMemory(
new_nbl->NetBufferListInfo,
old_nbl->NetBufferListInfo,
sizeof(old_nbl->NetBufferListInfo));
since not all the slots are blittable.

Related

Proper way to absorb and reinject a MAC_FRAME_ETHERNET in WFP?

I want to write a WFP driver that works at the inbound/outbound MAC_FRAME_ETHERNET WFP layer in order to capture the entire packet (that is why I chose MAC_FRAME instead of IPPACKET). I have a thread that receives these absorbed MAC frames and reinjects the ones that are OK and not malicious.
My question is: what are the proper steps to do this?
Currently I'm doing it this way:
ClassifyFn:
// I DO NOT use FwpsReferenceNetBufferList( nbl )
FwpsAllocateCloneNetBufferList( nbl, cloneNbl ) // For inbound I retreat by the size of the ETH header before the clone and advance back afterward; otherwise no retreat.
// Absorb
classifyOut->actionType = FWP_ACTION_BLOCK
classifyOut->flags |= FWPS_CLASSIFY_OUT_FLAG_ABSORB
classifyOut->rights &= ~FWPS_RIGHT_ACTION_WRITE
Thread:
FwpsInjectMacSendAsync( cloneNbl ) // if the packet was OK
FwpsInjectMacReceiveAsync( cloneNbl ) // if the packet was OK
InjectCompletion:
FwpsFreeCloneNetBufferList
So these are my questions:
Am I doing it correctly? Is there anything I can do to improve it and make it more stable?
Do I need to reference the original NBL and dereference it in the injection completion?
What is the difference between using FwpsAllocateNetBufferAndNetBufferList0 vs FwpsAllocateCloneNetBufferList in this scenario?
Can I safely access the cloned NBL indefinitely without referencing the original NBL?
Note that I do not modify the packets at all; I either drop or allow them.
I'm asking because there seems to be some pool corruption somewhere causing random BSODs, and I'm not sure whether it's related to me doing something wrong.

How do I create an instance of a class, variably sized, at a specific memory location?

I'm working on a project involving writing packets to a memory-mapped file. Our current strategy is to create a packet class containing the following members
uint32_t packetHeader;
uint8_t packetPayload[];
uint32_t packetChecksum;
When we create a packet, we'd first like its address in memory to be a specified offset within the memory-mapped file, which I think can be done with placement new. However, we'd also like packetPayload to be contiguous with the rest of the class rather than a pointer to heap memory (so we can avoid memcpying from the heap to our eventual output file).
i.e.
Memory layout:
| Header (BOC) | Payload (BOC + 4, payload-length bytes) | Checksum |
where BOC is the beginning of the class.
Would this be achievable using a length argument for the Packet class constructor? Or would we have to template this class for variably sized payloads?
Forget about trying to make that the layout of your class; you'll be fighting the C++ language all day long. Instead, write a class that provides access to the binary layout (in shared memory). The class instance itself will not be in shared memory, and the byte range in shared/mapped memory will not be a C++ object at all; it just exists within the file-mapping address range.
Presumably the length is fixed from the moment of creation? If so, you can safely cache the length, a pointer to the checksum, etc. in your accessor object. Since this cache isn't inside the file, you can store whatever you want, however you want, without concern for its layout. You can even use virtual member functions, because the v-table pointer goes in the accessor instance, not in the byte range of the binary file.
Also, given that this lives in shared memory, if there are multiple writers you'll have to be very careful to synchronize between them. If you're just prepositioning a buffer in shared/mapped memory to avoid a copy later, but totally handing off ownership between processes so that the data is never shared by simultaneous accesses, it will be easier. You also probably want to calculate the checksum once after all the data is written, instead of trying to keep it up-to-date (and risking data races in the process) for every single write into the buffer.
First, remember that you need to know your payload length somehow: either you store it in your instance somewhere, or you template your class over the payload length.
Having said that - you will need one of:
packetOffset being a pointer
A payload length member
A checksum offset member
and you'll want to use the named-constructor idiom, taking the allocation length and performing both the allocation and the setup of the offset/length/pointer member to a value corresponding to that length.

Initializing map elements, where value is a struct with mutex lock golang

I have a map where every value is a pointer to another struct that itself has a lock.
type StatMap map[string]*Stats
type Stats struct {
sync.RWMutex
someStats, someMoreStats float64
}
I have implemented a method where I pack the StatMap into another struct with a mutex lock for the entire map, but I expect to modify every entry in the map simultaneously from hundreds of goroutines, so it would be more efficient to lock every element individually so that two or more goroutines can read and modify entries in parallel.
What I am wondering is how I can initialize a new entry in the map whenever a new key appears. I cannot lock the entry if it isn't in the map yet, and (as far as I know) I cannot check whether it is in the map while another goroutine might be modifying that entry.
I do not know what keys will be in the map before runtime.
My current implementation (that causes data races):
initializeStatMap("key")
statMap["key"].Lock()
// . . .
func initializeStatMap(key string) {
if statMap[key] != nil {
return
}
statMap[key] = &Stats{someStats: 0, someMoreStats: 0}
}
Go's map semantics are as follows: a map stores values (not variables), so those values are not addressable, and that's why you can't do something like
type T struct {
X int
}
m := make(map[int]T)
m[0] = T{}
m[0].X = 42 // won't compile
This requirement mostly comes from the fact that a map, being an intricate dynamic data structure, must be free to physically move the values it contains around in memory (when rebalancing, etc.).
That's why the only three operations a map supports are adding (or replacing) elements, getting them back, and deleting them.
A map is not safe for concurrent use, so in order to do any of those three
operations on the same map concurrently, you need to protect it in
one way or another.
Consequently, once you have read a value from a map, orchestrating concurrent access to it is a completely different story, and here we face another fact of map semantics: since a map keeps values and is free to copy them around in memory, you must not keep in a map anything that needs reference semantics. For instance, it would be incorrect to keep values of your Stats type in the map directly, because they embed sync.RWMutex, and copying a mutex is prohibited after first use.
Here you're already doing the right thing by storing pointers to your
variables.
Now you can see that it's pretty OK to roll like this:
Access the map itself to get a value bound to a key in a concurrent-safe
way (say, by holding a lock).
Lock the mutex on that variable and operate on it. That does not
involve the map at all.
The only remaining possible problem is as follows.
Suppose you're protecting the access to your map with a lock.
So you grab the lock, obtain the value bound to a key, by copying
it to a variable, release the lock and work with the copy of the
value.
Now while you're working with the copy of that value another
goroutine is free to update the map by deleting the value or replacing it.
While in your case it's technically fine (your map stores pointers, and copying pointers is fine), this might be inappropriate from the standpoint of your program's semantics, and it is something you have to think through.
To make it clearer: once you've got a pointer to some instance of Stats and locked it, that pointer may be removed from the map, or its map entry may be overwritten with a pointer to another instance of Stats, so by the time you're done with the instance it may have become unreachable via the map.

Transfer a pointer through boost::interprocess::message_queue

What I am trying to do is have application A send application B a pointer to an object which A has allocated in shared memory (using boost::interprocess). For that pointer transfer I intend to use boost::interprocess::message_queue. Obviously a raw pointer from A is not valid in B, so I try to transfer an offset_ptr allocated in the shared memory. However, that also does not seem to work.
Process A does this:
typedef offset_ptr<MyVector> MyVectorPtr;
MyVectorPtr * myvector;
myvector = segment->construct<MyVectorPtr>( boost::interprocess::anonymous_instance )();
*myvector = segment->construct<MyVector>( boost::interprocess::anonymous_instance )(*alloc_inst_vec);
// myvector gets filled with data here
//Send on the message queue
mq->send(myvector, sizeof(MyVectorPtr), 0);
Process B does this:
// Create a "buffer" on this side of the queue
MyVectorPtr * myvector;
myvector = segment->construct<MyVectorPtr>( boost::interprocess::anonymous_instance )();
mq->receive( myvector, sizeof(MyVectorPtr), recvd_size, priority);
As I see it, this way I do a bitwise copy of the offset pointer, which invalidates it in process B. How do I do this right?
It seems you can address it as described in this post on the boost mailing list.
I agree there is some awkwardness here, and offset_ptr doesn't really work for what you are trying to do. offset_ptr is useful when the pointer itself is stored inside another class/struct that is also allocated in your shared memory segment, but here you have some top-level item which is not a member of any object in shared memory.
You'll notice the offset_ptr example kind of glosses over this: it just has a comment, "Communicate list to other processes", with no details. In some cases you may have a single named top-level object, and that name can be how you communicate it; but if you have an arbitrary number of top-level objects to communicate, it seems that just sending the offset from the shared memory's base address is the best you can do.
You calculate the offset on the sending end, send it, and then add it to the base address on the receiving end. If you want to be able to send nullptr as well, you could do as offset_ptr does and agree that 1 is an offset sufficiently unlikely to be used, or pick another unlikely sentinel value.

Partial unmap of Win32 memory-mapped file

I have some code (which I cannot change) that I need to get working in a native Win32 environment. This code calls mmap() and munmap(), so I have created those functions using CreateFileMapping(), MapViewOfFile(), etc., to accomplish the same thing. Initially this works fine, and the code is able to access files as expected. Unfortunately the code goes on to munmap() selected parts of the file that it no longer needs.
x = mmap(0, size, PROT_READ, MAP_SHARED, fd, 0);
...
munmap(x, hdr_size);
munmap(x + foo, bar);
...
Unfortunately, when you pass a pointer into the middle of the mapped range to UnmapViewOfFile() it destroys the entire mapping. Even worse, I can't see how I would be able to detect that this is a partial un-map request and just ignore it.
I have tried calling VirtualFree() on the range but, unsurprisingly, this produces ERROR_INVALID_PARAMETER.
I'm beginning to think that I will have to use static/global variables to track all the open memory mappings so that I can detect and ignore partial unmappings, but I hope you have a better idea...
edit:
Since I wasn't explicit enough above: the docs for UnmapViewOfFile do not accurately reflect the behavior of that function.
Unmapping the whole view and remapping pieces is not a good solution, because you can only suggest a base address for a new mapping; you can't really control it. The semantics of munmap() don't allow the base address of the still-mapped portion to change.
What I really need is a way to find the base address and size of an already-mapped memory area.
edit2: Now that I restate the problem that way, it looks like the VirtualQuery() function will suffice.
It is quite explicit in the MSDN Library docs for UnmapViewOfFile:
lpBaseAddress: A pointer to the base address of the mapped view of a file that is to be unmapped. This value must be identical to the value returned by a previous call to the MapViewOfFile or MapViewOfFileEx function.
You change the mapping by unmapping the old one and creating a new one. Unmapping bits and pieces isn't well supported, nor would it have any useful side effects from a memory-management point of view; you don't want to risk fragmenting the address space.
You'll have to do this differently.
You could keep track of each mapping and how many of its pages are still allocated by the client, and only free the mapping when that counter reaches zero. The middle sections would still be mapped, but that wouldn't matter, since the client wouldn't be accessing that memory anyway.
Create a global dictionary of memory mappings behind this interface. When a mapping request comes through, record the address, the size, and the number of pages in the range. When an unmap request is made, find out which mapping owns that address and decrease the page count by the number of pages being freed. When the count reaches zero, really unmap the view.
