Update vertex buffer in non UI thread using DirectX11 - directx-11

I want to render an object with a dynamic vertex buffer, and I do the rendering in the UI thread. Is it possible to change the contents of this vertex buffer from a non-UI thread using Map and Unmap?
Thanks.
YL

The Direct3D 11 multi-threading model is fairly simple:
Calls to the ID3D11Device are thread-safe (unless you used the D3D11_CREATE_DEVICE_SINGLETHREADED flag when you created the device). You can call the methods on this interface from any thread.
Calls to the ID3D11DeviceContext are not thread-safe, and you should only call methods on this interface for a given context from a single thread at a time.
This is why Map and Unmap are methods of ID3D11DeviceContext rather than of ID3D11Device or of the ID3D11Resource itself, as they were in Direct3D 10. The operation is inherently serial with other operations.
This means you should have a single thread using the immediate device context (and DXGI), and this should probably be the same thread as your main Windows message pump (for the reasons covered in DirectX Graphics Infrastructure (DXGI): Best Practices).
You could Map on the thread that uses the immediate context, marshal the pointer to another thread, and then Unmap from the original thread once the other thread has finished writing, but this is highly unlikely to improve performance.
See Introduction to Multithreading in Direct3D 11
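One way to respect the single-thread rule while still doing the heavy work elsewhere is to let a worker thread build the vertex data in ordinary CPU memory and have the render thread alone do the Map, copy, and Unmap. The following is a minimal portable sketch of that hand-off, not Direct3D code. `Vertex`, `VertexMailbox`, and `upload_latest` are hypothetical names, and the `mapped` pointer stands in for the `pData` pointer that Map on the immediate context would return for a dynamic buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

struct Vertex { float x, y, z; };

// Hypothetical mailbox: a worker thread publishes finished vertex data,
// the render thread picks up the newest batch when it is ready to upload.
class VertexMailbox {
public:
    // Worker thread: publish a finished batch (replaces any unread batch).
    void publish(std::vector<Vertex> batch) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_ = std::move(batch);
        has_pending_ = true;
    }

    // Render thread: take the newest batch if one is waiting.
    bool try_take(std::vector<Vertex>& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!has_pending_) return false;
        out = std::move(pending_);
        pending_.clear();
        has_pending_ = false;
        return true;
    }

private:
    std::mutex mutex_;
    std::vector<Vertex> pending_;
    bool has_pending_ = false;
};

// Render thread only. In real Direct3D 11 code this is where Map would be
// called on the immediate context, followed by the copy and then Unmap.
// `mapped` stands in for the mapped pointer, `capacity` is in vertices.
// Returns the number of vertices copied (0 if nothing new was published).
std::size_t upload_latest(VertexMailbox& mailbox, Vertex* mapped, std::size_t capacity) {
    assert(mapped != nullptr);
    std::vector<Vertex> batch;
    if (!mailbox.try_take(batch) || batch.empty()) return 0;
    const std::size_t count = std::min(batch.size(), capacity);
    std::memcpy(mapped, batch.data(), count * sizeof(Vertex));
    return count;
}
```

The worker only ever touches the mailbox, so every device-context call stays on the render thread; the lock is held just long enough to move a vector.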

Related

What happens if you call glBufferData on a mapped buffer?

What happens if you call glBufferData on a buffer currently mapped with glMapBufferRange? I suspect it would be illegal, but I cannot find anything in the spec:
https://www.khronos.org/registry/OpenGL-Refpages/es3.0/html/glBufferData.xhtml
It is illegal in glDrawArrays spec.
Ok, additional challenge:
What if we have context resource sharing and the buffer is currently mapped in thread A with context A, then thread B on context B calls glBufferData on it?
For a single context scenario when glBufferData() is called then the existing buffer object is deleted, any active bindings of that resource in that context will be unbound, and any active mappings will be removed. If you call glUnmapBuffer() after glBufferData() from within the same context then you'll get a GL_INVALID_OPERATION error, because the state of the new version of the buffer is not initially mapped.
For a multi-context scenario it gets more complicated. OpenGL ES defines a weakly coherent state management model (to avoid expensive locking requirements on performance critical call paths).
Render state that belongs to the context (e.g. binding information, enable bits) is never modifiable by another context (can be implemented lockless).
Resource state that belongs to an object (e.g. buffers, samplers, textures) is weakly coherent. A context will see its own changes immediately, but only pick up changes written by another context when it binds the resource (only needs locking on bind changes).
Resource data payload is not coherent at all. If you want to ensure that data is available from one context in another then you must include manual synchronization between the threads.
Thread A calling glMapBuffer() will create a copy of the master state of Buffer "version 1", including a local state setting that the buffer is "mapped".
Thread B calling glBufferData() will create a new version of the buffer resource "version 2", but this will not impact the state held by Thread A which will continue to reflect the state at the time that the buffer was bound in thread A (version 1).
Thread A calling glUnmapBuffer() will work fine, because it will unmap buffer "version 1" (the mapped state is local to the Thread A context, and that still says the buffer is "mapped").
Note that the data contents of the buffer that Thread A sees after Thread B calls glBufferData() is unpredictable (it could be the old data, it could be the new data), in accordance with the design that data is not coherent at all. If there were no pending draw operations then it is valid for the driver to simply reuse the memory that "version 1" of the buffer was backed by to contain the content that was uploaded for "version 2". If you want guarantees about data consistency across contexts then you need manual synchronization (it's conceptually just like having two threads call glBufferData() on the same buffer at the same time).
I'd recommend reading Chapter 5 of the OpenGL ES 3.2 spec.

Map buffer with glMapBuffer, then use pointer in different thread

I'm trying to optimize a program that issues all OpenGL ES calls in the main thread. The main performance issue seems to be frequent buffer uploads via glBufferData, more specifically a memcpy inside this function that is done synchronously with the main thread (the buffers are pretty large).
My current plan would be to instead map the buffer in the main thread using glMapBuffer, then send the pointer to a different thread which performs the memcpy, once this thread is finished call glUnmapBuffer again in the main thread. After that, the buffer is used for rendering.
Would this approach work or is it dangerous to use glMapBuffer pointers in a thread that doesn't have the gl context? Or is there a way to ensure no memcpy is performed on the main thread and everything is done on the pipeline thread?
Regards
Once you've mapped the buffer then the pointer is a "normal" CPU pointer, so can be used just like any other CPU pointer including cross-thread access.
Just make sure that you've completed all writes and synchronized the threads before calling glUnmapBuffer().
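The ordering described above (map on the context thread, copy on a worker, synchronize, then unmap on the context thread) can be sketched in portable C++. This is a minimal sketch, not OpenGL ES code: `FakeMappedBuffer` and `upload_on_worker` are hypothetical stand-ins, with `map()` playing the role of the GL map call and `unmap()` the role of glUnmapBuffer().

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <future>
#include <vector>

// Hypothetical stand-in for a GL buffer object. Real code would hold a
// buffer name and call the GL map/unmap functions on the context thread.
class FakeMappedBuffer {
public:
    explicit FakeMappedBuffer(std::size_t bytes) : storage_(bytes) {}

    // Stands in for the map call; must run on the context thread.
    void* map() {
        assert(!mapped_);
        mapped_ = true;
        return storage_.data();
    }

    // Stands in for glUnmapBuffer(); must run on the context thread.
    void unmap() {
        assert(mapped_);
        mapped_ = false;
    }

    bool mapped() const { return mapped_; }
    const std::vector<unsigned char>& contents() const { return storage_; }

private:
    std::vector<unsigned char> storage_;
    bool mapped_ = false;
};

// Runs on the context thread. Returns false if the source does not fit.
bool upload_on_worker(FakeMappedBuffer& buffer, const std::vector<unsigned char>& source) {
    if (source.size() > buffer.contents().size()) return false;

    void* dst = buffer.map();  // context thread: obtain the CPU pointer

    // Worker thread: the mapped pointer is used like any other CPU pointer.
    std::future<void> copy = std::async(std::launch::async, [dst, &source] {
        if (!source.empty()) std::memcpy(dst, source.data(), source.size());
    });

    // The context thread is free to do other work here.

    copy.get();      // synchronize: every write has completed
    buffer.unmap();  // context thread: unmap only after the sync
    return true;
}
```

The important detail is the order of the last two calls: the wait on the future must come before the unmap, because the pointer is no longer valid afterwards.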

Windows: how to spawn threads from (NDIS) kernel driver?

Which function is recommended to spawn a new thread within NDIS5/6 context? Looking for something that is guaranteed to work at IRQL=PASSIVE (e.g. no bsods out of nothing); by a quick examination of ndis.h contents, found nothing.
Also, the plan is to use the newly spawned thread to call the NdisFreeMemory* family. Will it cause any problems to free allocated but unused memory from a different thread?
Threading is outside the scope of NDIS. If you need to start a new thread, use the standard kernel routines (like PsCreateSystemThread). Note that timers and work items are usually sufficient for most miniport needs. It is unusual for an NDIS miniport to create its own thread, although I suppose there are valid cases where it might be a fair design.
It is ok to allocate memory on one thread and free it on another.

Can you re-use buffers with Windows wave audio input?

I'm using the Windows multimedia APIs to record and process wave audio (waveInOpen and friends). I'd like to use a small number of buffers in a round robin fashion.
I know that you're supposed to use waveInPrepareHeader before adding a buffer to the device, and that you're supposed to call waveInUnprepareHeader after the wave device has "returned the buffer to the application" and before you deallocate it.
My question is, do I have to unprepare and re-prepare in order to re-use a buffer? Or can I just add a previously used buffer back to the device?
Also, does it matter what thread I do this on? I'm using the callback function, which seems to be called on a worker thread that belongs to the audio system. Can I call waveInUnprepareHeader, waveInPrepareHeader, and waveInAddBuffer on that thread, during the callback?
Yes, my experience has been you need to call prepare and unprepare every time. From memory, it returns an error if you try to reuse the same one.
And you typically call the prepare and unprepare on whatever thread you are handling the callbacks on.
When you create the buffers, call waveInPrepareHeader. Then you can simply set the prepared flag before you call waveInAddBuffer on a buffer that was returned from the device.
pHdr->dwFlags = WHDR_PREPARED;
You can do this on the callback thread (or in the message handler).

How best to synchronize memory access shared between kernel and user space, in Windows

I can't find any function to acquire spinlock in Win32 Apis.
Is there a reason?
When I need to use spinlock, what do I do?
I know there is an InitializeCriticalSectionAndSpinCount function.
But that's not what I want.
Edit:
I want to synchronize access to memory that will be shared between kernel space and user space. The memory will be mapped.
I need to lock it while I access the data structure, and the lock will be held only for a very short time.
The data structure (suppose it is a queue) manages event handles that the two sides use to interact with each other.
What synchronization mechanism should I use?
A spinlock is clearly not appropriate for user-level synchronization. From http://www.microsoft.com/whdc/driver/kernel/locks.mspx:
All types of spin locks raise the IRQL to DISPATCH_LEVEL or higher. Spin locks are the only synchronization mechanism that can be used at IRQL >= DISPATCH_LEVEL. Code that holds a spin lock runs at IRQL >= DISPATCH_LEVEL, which means that the system’s thread switching code (the dispatcher) cannot run and, therefore, the current thread cannot be pre-empted.
Imagine if it were possible to take a spin lock in user mode: Suddenly the thread would not be able to be pre-empted. So on a single-cpu machine, this is now an exclusive and real-time thread. The user-mode code would now be responsible for handling interrupts and other kernel-level tasks. The code could no longer access any paged memory, which means that the user-mode code would need to know what memory is currently paged and act accordingly. Cats and dogs living together, mass hysteria!
Perhaps a better question would be to tell us what you are trying to accomplish, and ask what synchronization method would be most appropriate.
There is a managed user-mode SpinLock (System.Threading.SpinLock in .NET). Handle with care, as advised in the docs - it's easy to go badly wrong with these locks.
The only way to get this behavior in native code is via the Win32 API you named already - InitializeCriticalSectionAndSpinCount and its siblings.
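For readers unfamiliar with what the spin count buys you: a thread that finds such a critical section busy retries for up to the spin count before it falls back to a blocking wait. The following is a rough portable model of that spin-then-block idea, not the Win32 implementation and not a solution for memory shared with the kernel. `SpinThenBlockLock` and `count_with_lock` are hypothetical names, and std::mutex stands in for the critical section.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical model of a lock that spins briefly before blocking.
class SpinThenBlockLock {
public:
    explicit SpinThenBlockLock(unsigned spin_count) : spin_count_(spin_count) {}

    void lock() {
        // Spin phase: retry without blocking, cheap if the lock is held briefly.
        for (unsigned i = 0; i < spin_count_; ++i) {
            if (mutex_.try_lock()) return;
        }
        // Blocking phase: let the OS put this thread to sleep until the lock is free.
        mutex_.lock();
    }

    void unlock() { mutex_.unlock(); }

private:
    std::mutex mutex_;
    unsigned spin_count_;
};

// Small demonstration: several threads increment one counter under the lock.
long count_with_lock(unsigned threads, unsigned iterations, unsigned spin_count) {
    SpinThenBlockLock lock(spin_count);
    long counter = 0;
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&lock, &counter, iterations] {
            for (unsigned i = 0; i < iterations; ++i) {
                std::lock_guard<SpinThenBlockLock> guard(lock);
                ++counter;
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Unlike a kernel spin lock, the thread holding this lock can still be pre-empted, which is why the blocking fallback is needed.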

Resources