Create a vertex buffer with a variable size - directx-11

I want to create a vertex buffer whose size (desc.ByteWidth) can vary, as shown below. How can I do this?
Thanks a lot.
D3D11_BUFFER_DESC desc;
ZeroMemory( &desc, sizeof( desc ) );
desc.Usage = D3D11_USAGE_DYNAMIC;
desc.ByteWidth = size; // make this variable
desc.BindFlags = D3D11_BIND_VERTEX_BUFFER;
desc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
d3dDevice->CreateBuffer( &desc, initialVertexData, &vertexBuffer );

Buffer size and description in DirectX 11 are fixed at creation time, which means that if you need to change any of those parameters (size, usage), you need to release the buffer and create a new one.
However, if you want to upload a varying amount of data, you can create a buffer that is "big enough" and only upload part of it (since your buffer is dynamic, you do this via the Map function).
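As a minimal sketch of that partial update, assuming the dynamic vertexBuffer created above and hypothetical deviceContext, vertices, vertexCount and Vertex names for whatever subset of data you have this frame:
// Map the whole buffer, but copy only the data you actually have this frame.
// Assumes vertexCount * sizeof(Vertex) never exceeds the ByteWidth used at creation.
D3D11_MAPPED_SUBRESOURCE mapped;
HRESULT hr = deviceContext->Map(vertexBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped);
if (SUCCEEDED(hr))
{
    memcpy(mapped.pData, vertices, vertexCount * sizeof(Vertex));
    deviceContext->Unmap(vertexBuffer, 0);
}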
It is perfectly fine to fill only a small part of the buffer and then use a draw call that uses only a subset of the data.
For example, if your buffer has room for 32 vertices, you can still use:
deviceContext->Draw(5, 0);
This will only draw the first 5 vertices, regardless of the total buffer size.

Related

Improving the performance of a WebGL2 texSubImage2D call with a large texture

Using WebGL2 I stream a 4K by 2K stereoscopic video as a texture onto the inside of a sphere in order to provide 360° VR video playback capability. I've optimized as much of the codebase as is feasible given the returns on time and the application runs flawlessly when using an .H264 video source.
However, when using 8-bit VP8 or VP9 (which offer superior fidelity and file size; AV1 isn't available to me) I encounter FPS drops on weaker systems due to the extra CPU requirements of decoding VP8/VP9 video.
When profiling the app, I've identified that the per-frame call of texSubImage2D that updates the texture from the video consumes the large majority of each frame (texImage2D was even worse due to its allocations), but I am unsure how to further optimize its use. Below are the things I'm already doing to minimize its impact:
I cache the texture's memory space at initial load using texStorage2D to keep it as contiguous as possible.
let glTexture = gl.createTexture();
let pixelData = new Uint8Array(4096*2048*3);
pixelData.fill(255);
gl.bindTexture(GL.TEXTURE_2D, glTexture);
gl.texStorage2D(GL.TEXTURE_2D, 1, GL.RGB8, 4096, 2048);
gl.texSubImage2D(GL.TEXTURE_2D, 0, 0, 0, 4096, 2048, GL.RGB, GL.UNSIGNED_BYTE, pixelData);
gl.generateMipmap(GL.TEXTURE_2D);
Then, during my render loop, both left and right eye poses are processed for each object before moving on to the next object. This allows me to call gl.bindTexture and gl.texSubImage2D only once per object per frame. Additionally, I also skip populating shader program defines if the material for this entity is the same as the one for the previous entity, or if the video is paused or still loading.
/* Main Render Loop Extract */
//Called each frame after pre-sorting entities
function DrawScene(glLayer, pose, scene){
    //Entities are pre-sorted for transparency blending, rendering opaque first and transparent second.
    for (let ii = 0; ii < _opaqueEntities.length; ii++){
        //Only render if the entity and its parent chain are active
        if(_opaqueEntities[ii] && _opaqueEntities[ii].isActiveHeirachy){
            for (let i = 0; i < pose.views.length; i++) {
                _RenderEntityView(pose, i, _opaqueEntities[ii]);
            }
        }
    }
    for (let ii = 0; ii < _transparentEntities.length; ii++) {
        //Only render if the entity and its parent chain are active
        if(_transparentEntities[ii] && _transparentEntities[ii].isActiveHeirachy){
            for (let i = 0; i < pose.views.length; i++) {
                _RenderEntityView(pose, i, _transparentEntities[ii]);
            }
        }
    }
}
let _programData;
function _RenderEntityView(pose, viewIdx, entity){
    //Calculates/manipulates the view matrix for the entity for this view. (<0.1ms)
    //...
    //Store reference to make stack overflow lines shorter :-)
    _programData = entity.material.shaderProgram;
    _BindEntityBuffers(entity, _programData);//The buffers Thomas, mind the BUFFERS!!!
    gl.uniformMatrix4fv(
        _programData.uniformData.uProjectionMatrix,
        false,
        _view.projectionMatrix
    );
    gl.uniformMatrix4fv(
        _programData.uniformData.uModelViewMatrix,
        false,
        _modelViewMatrix
    );
    //Render all triangles that make up the object.
    gl.drawElements(GL.TRIANGLES, entity.tris.length, GL.UNSIGNED_SHORT, 0);
}
let _attrName;
let _attrLoc;
let textureData;
function _BindEntityBuffers(entity, programData){
    gl.useProgram(programData.program);
    //Binds pre-defined shader attributes on an as-needed basis
    for(_attrName in programData.attributeData){
        _attrLoc = programData.attributeData[_attrName];
        //Bind only if the attribute exists in the shader
        if(_attrLoc.key >= 0){
            _BindShaderAttributes(_attrLoc.key, entity.attrBufferData[_attrName].buffer,
                entity.attrBufferData[_attrName].compCount);
        }
    }
    //Bind triangle index buffer
    gl.bindBuffer(GL.ELEMENT_ARRAY_BUFFER, entity.triBuffer);
    //If already in use, this is an instanced material, so skip configuration.
    if(_materialInUse == entity.material){return;}
    _materialInUse = entity.material;
    //Use the material by applying its specific uniforms
    //Apply base color
    gl.uniform4fv(programData.uniformData.uColor, entity.material.color);
    //If the shader uses a diffuse texture
    if(programData.uniformData.uDiffuseSampler){
        //Store reference to make stack overflow lines shorter :-)
        textureData = entity.material.diffuseTexture;
        gl.activeTexture(gl.TEXTURE0);
        //Use the assigned texture
        gl.bindTexture(gl.TEXTURE_2D, textureData);
        //If this is a video, update the texture buffer using the current video's playback frame data
        if(textureData.type == TEXTURE_TYPE.VIDEO &&
           textureData.isLoaded &&
           !textureData.paused){
            //This accounts for 42% of all script execution time!!!
            gl.texSubImage2D(gl.TEXTURE_2D, textureData.level, 0, 0,
                textureData.width, textureData.height, textureData.internalFormat,
                textureData.srcType, textureData.video);
        }
        gl.uniform1i(programData.uniformData.uDiffuseSampler, 0);
    }
}
function _BindShaderAttributes(attrKey, buffer, compCount, type=GL.FLOAT, normalize=false, stride=0, offset=0){
    gl.bindBuffer(GL.ARRAY_BUFFER, buffer);
    gl.vertexAttribPointer(attrKey, compCount, type, normalize, stride, offset);
    gl.enableVertexAttribArray(attrKey);
}
I've contemplated using pre-defined counters for all for loops to avoid the var i=0; allocation, but the gain from that seems hardly worth the effort.
Side note: the source video is actually larger than 4K, but with anything above 4K the FPS grinds down to about 10-12.
Obligatory: The key functionality above is extracted from a larger WebGL rendering framework I wrote that itself runs pretty damn fast already. The reason I'm not 'just using' Three, AFrame, or other such common libraries is that they do not have an ATO from the DOD, whereas in-house developed code is ok.
Update 9/9/21: At some point when Chrome updated from 90 to 93, the WebGL performance of texSubImage2D dropped dramatically, resulting in 100+ ms per frame of execution regardless of CPU/GPU capability. Changing to use texImage2D now results in around 16 ms per frame. In addition, shifting from RGB to RGB565 offers up a few ms of performance while minimally sacrificing color.
I'd still love to hear from GL/WebGL experts as to what else I can do to improve performance.

How To Set Up Byte Alignment From an MTLBuffer to a 2D MTLTexture?

I have an array of float values that represents a 2D image (think of data from a CCD) that I ultimately want to render into an MTLView. This is on macOS, but I'd like to be able to apply the same approach on iOS at some point. I initially create an MTLBuffer with the data:
NSData *floatData = ...;
id<MTLBuffer> metalBuffer = [device newBufferWithBytes:floatData.bytes
length:floatData.length
options:MTLResourceCPUCacheModeDefaultCache | MTLResourceStorageModeManaged];
From here, I run the buffer through a few compute pipelines. Next, I want to create an RGB MTLTexture object to pass to a few CIFilter/MPS filters and then display. It seems to make sense to create a texture that uses the already created buffer as backing to avoid making another copy. (I've successfully used textures with a pixel format of MTLPixelFormatR32Float.)
// create texture with scaled buffer - this is a wrapper, i.e. it shares memory with the buffer
MTLTextureDescriptor *desc;
desc = [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatR32Float
width:width
height:height
mipmapped:NO];
desc.usage = MTLTextureUsageShaderRead;
desc.storageMode = scaledBuffer.storageMode; // must match buffer
id<MTLTexture> scaledTexture = [scaledBuffer newTextureWithDescriptor:desc
offset:0
bytesPerRow:imageWidth * sizeof(float)];
The image dimensions are 242x242. When I run this I get:
validateNewTexture:89: failed assertion `BytesPerRow of a buffer-backed
texture with pixelFormat(MTLPixelFormatR32Float) must be aligned to 256 bytes,
found bytesPerRow(968)'
I know I need to use:
NSUInteger alignmentBytes = [self.device minimumLinearTextureAlignmentForPixelFormat:MTLPixelFormatR32Float];
How do I define the buffer such that the bytes are properly aligned?
More generally, is this the appropriate approach for this kind of data? This is the stage where I effectively convert the float data into something that has color. To clarify, this is my next step:
// render into RGB texture
MPSImageConversion *imageConversion = [[MPSImageConversion alloc] initWithDevice:self.device
srcAlpha:MPSAlphaTypeAlphaIsOne
destAlpha:MPSAlphaTypeAlphaIsOne
backgroundColor:nil
conversionInfo:NULL];
[imageConversion encodeToCommandBuffer:commandBuffer
sourceImage:scaledTexture
destinationImage:intermediateRGBTexture];
where intermediateRGBTexture is a 2D texture defined with MTLPixelFormatRGBA16Float to take advantage of EDR.
If it's important to you that the texture share the same backing memory as the buffer, and you want the texture to reflect the actual image dimensions, you need to ensure that the data in the buffer is correctly aligned from the start.
Rather than copying the source data all at once, you need to ensure the buffer has room for all of the aligned data, then copy it one row at a time.
NSUInteger rowAlignment = [self.device minimumLinearTextureAlignmentForPixelFormat:MTLPixelFormatR32Float];
NSUInteger sourceBytesPerRow = imageWidth * sizeof(float);
NSUInteger bytesPerRow = AlignUp(sourceBytesPerRow, rowAlignment);
id<MTLBuffer> metalBuffer = [self.device newBufferWithLength:bytesPerRow * imageHeight
options:MTLResourceCPUCacheModeDefaultCache];
const uint8_t *sourceData = floatData.bytes;
uint8_t *bufferData = metalBuffer.contents;
for (int i = 0; i < imageHeight; ++i) {
memcpy(bufferData + (i * bytesPerRow), sourceData + (i * sourceBytesPerRow), sourceBytesPerRow);
}
Where AlignUp is your alignment function or macro of choice. Something like this:
static inline NSUInteger AlignUp(NSUInteger n, NSInteger alignment) {
return ((n + alignment - 1) / alignment) * alignment;
}
It's up to you to determine whether the added complexity is worth saving a copy, but this is one way to achieve what you want.

What is the syntax of ImageResize()

I have data that change in size and want to display them in the same window. The command
void ImageResize( BasicImage im, Number num_dim, Number... )
seems like a potential fit, but the syntax is not clear at all.
Let's say I have a 512x5 data set and now it needs to be 367x5.
The ", Number..." part indicates that this command takes a variable number of parameters, all of them interpreted as number parameters. Commands which do this usually use one of their other parameters to specify how many such parameters follow.
A typical example for this is also the SliceN command.
In this particular case, the command not only allows you to change the size of the dimensions of the image, but also the number of dimensions. It is a very useful command, for example, to change a 2D image into a 3D stack or the like.
The command ImageResize( BasicImage im, Number num_dim, Number... ) does several things:
It replaces im in place, so the meta-data, display and window remain the same.
It adjusts the dimension calibration when the dimension size is changed. Here, the assumption is that the field of view before and after the resize is the same. (The command can be used to easily scale images, as shown in the example below.)
All values of the image im are set to zero. (If you need to keep the values, you need to act on an image clone!)
Example 1: Resizing an image with bilinear interpolation
image before := GetFrontImage()
number sx, sy
before.GetSize(sx,sy)
number factor = 1.3
image after := before.ImageClone()
after.ImageResize( 2, factor*sx, factor*sy ) // Adjusts the empty container with meta-data
after = warp(before, icol/factor, irow/factor ) // interpolate data
after.ShowImage()
Example 2: Extend 2D image into 3D stack
number sx = 100
number sy = 100
image img := RealImage("2D",4,sx,sy)
img = iradius* Random()
img.ShowImage()
OKDialog("Now into a stack...")
number sz = 10
img.ImageResize(3,sx,sy,sz) // All values are zero now!
img = iradius * Random()

Change bitmap size without creating a new bitmap

I created a bitmap using CreateDIBSection and specified .biWidth = 100; .biHeight = 100, like in this pseudo-code:
pBitmapInfo->bmiHeader.biWidth = 100;
pBitmapInfo->bmiHeader.biHeight = 100;
....
CreateDIBSection(DibDC, pBitmapInfo, DIB_RGB_COLORS, 0, 0, 0);
Later, I want to reuse this bitmap and just change its size to 300x100 (and possibly clear the old image, because I don't need it anymore). Many people say I need to create a new bitmap with the new size and delete the old one, but I expected there would be some way to reuse the old bitmap. I don't want to recreate the bitmap because doing so repeatedly hurts performance. So is there any way to change the bitmap size without creating a new bitmap?
If you are worried about performance, it is indeed not a good idea to keep destroying and creating bitmaps.
There is, however, an easier solution: simply create a pool of bitmaps in predefined sizes and use bitmaps from the pool as needed.
If you have a long-lived DC, you can use:
hBitmap100x100 = CreateCompatibleBitmap(MyDC, 100,100);
hBitmap300x300 = CreateCompatibleBitmap(MyDC, 300,300);
If you keep changing DCs, then use DIB sections instead:
void *bits100 = NULL, *bits300 = NULL;
hBitmap100x100 = CreateDIBSection(DibDC, pBitmapInfo100x100, DIB_RGB_COLORS, &bits100, NULL, 0);
hBitmap300x300 = CreateDIBSection(DibDC, pBitmapInfo300x300, DIB_RGB_COLORS, &bits300, NULL, 0);
Just keep reusing these over and over.
You can even have a dozen of them in an array if you like.
You create them at program startup and dispose of them when done.
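As a rough sketch of the reuse pattern (assuming a hypothetical target screenDC and the pooled hBitmap300x300 from above), you select a pooled bitmap into a memory DC, draw into just the region you need, and blit only that region:
// Select the pooled bitmap into a memory DC and use only a 300x100 region of it.
HDC memDC = CreateCompatibleDC(screenDC);
HBITMAP oldBitmap = (HBITMAP)SelectObject(memDC, hBitmap300x300);

// Clear just the area you are about to use, then draw into it.
RECT rc = { 0, 0, 300, 100 };
FillRect(memDC, &rc, (HBRUSH)GetStockObject(WHITE_BRUSH));
// ... draw into memDC ...

// Copy only the used area; the rest of the pooled bitmap is simply ignored.
BitBlt(screenDC, 0, 0, 300, 100, memDC, 0, 0, SRCCOPY);

SelectObject(memDC, oldBitmap);
DeleteDC(memDC);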

CreatePatternBrush and screen color depth

I am creating a brush using CreatePatternBrush with a bitmap created with CreateBitmap.
The bitmap is 1 pixel wide and 24 pixels tall. I have the RGB value for each pixel, so I create an array of RGBQUADs and pass that to CreateBitmap.
This works fine when the screen color depth is 32bpp, since the bitmap I create is also 32bpp.
When the screen color depth is not 32bpp, this fails, and I understand why it does, since I should be creating a compatible bitmap instead.
It seems I should use CreateCompatibleBitmap instead, but how do I put the pixel data I have into that bitmap?
I have also read about CreateDIBPatternBrushPt, CreateDIBitmap, CreateDIBSection, etc.
I don't understand what a DIB section is, and I find the subject generally confusing.
I do understand that I need a bitmap with the same color depth as the screen, but how do I create it having only the 32bpp pixel data?
You could create a DIB, because a Device-Independent Bitmap can be used independently of the screen color depth. See CreateDIBSection().
How can you create it having only the 32bpp pixel data? A DIB can be created with 32bpp data. As you can read in the documentation:
The CreateDIBSection function creates a DIB that applications can write to directly. The function gives you a pointer to the location of the bitmap bit values.
If hSection is NULL, the system allocates memory for the DIB. If the function succeeds, the return value is a handle to the newly created DIB, and *ppvBits points to the bitmap bit values.
Try something like this:
VOID *ppvBits = NULL;
BITMAPINFO BitmapInfo;
memset(&BitmapInfo, 0, sizeof(BITMAPINFOHEADER));
BitmapInfo.bmiHeader.biSize = sizeof(BITMAPINFOHEADER);
BitmapInfo.bmiHeader.biWidth = 1;
BitmapInfo.bmiHeader.biHeight = 24;
BitmapInfo.bmiHeader.biPlanes = 1;
BitmapInfo.bmiHeader.biBitCount = 32;
BitmapInfo.bmiHeader.biCompression = BI_RGB;
HBITMAP hBitmap = CreateDIBSection(hDC, &BitmapInfo, DIB_RGB_COLORS, &ppvBits, NULL, 0);
In our case, ppvBits points to 1 * 24 * (32 / 8) = 96 allocated bytes.
It is important to know that if biHeight is positive, the bitmap is a bottom-up DIB and its origin is the lower-left corner. See BITMAPINFOHEADER Structure for more info.
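From there, the remaining steps might look roughly like this, assuming pixelColors is your existing array of 24 RGBQUAD values (note the bottom-up row order, because biHeight is positive):
// Copy the 24 pixel values into the DIB section's bits.
memcpy(ppvBits, pixelColors, 24 * sizeof(RGBQUAD));

// The DIB section carries its own 32bpp format, so the brush no longer
// depends on the screen color depth.
HBRUSH hBrush = CreatePatternBrush(hBitmap);
// ... use the brush, then clean up ...
DeleteObject(hBrush);
DeleteObject(hBitmap);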
I solved it by using CreateCompatibleBitmap and SetPixel. Not the best option I guess, but it works.
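For reference, a minimal sketch of that fallback, assuming colors is a hypothetical array of 24 COLORREF values:
// Create a screen-compatible bitmap and set each of the 24 pixels individually.
HDC screenDC = GetDC(NULL);
HDC memDC = CreateCompatibleDC(screenDC);
HBITMAP hBitmap = CreateCompatibleBitmap(screenDC, 1, 24);
HBITMAP oldBitmap = (HBITMAP)SelectObject(memDC, hBitmap);

for (int y = 0; y < 24; ++y) {
    SetPixel(memDC, 0, y, colors[y]);
}

SelectObject(memDC, oldBitmap);
HBRUSH hBrush = CreatePatternBrush(hBitmap);

DeleteDC(memDC);
ReleaseDC(NULL, screenDC);
// DeleteObject the brush and the bitmap when you are done with them.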

Resources