I currently have the problem that a library creates a DX11 texture with a BGRA pixel format.
But the displaying library can only display RGBA correctly. (This means the red and blue channels are swapped in the rendered image.)
After looking around I found a simple for-loop that solves the problem, but its performance is not very good and it scales badly with higher resolutions. I'm new to DirectX, so maybe I just missed a simple function that does the conversion.
// Get the image data
unsigned char* pDest = view->image->getPixels();
// Prepare source texture
ID3D11Texture2D* pTexture = static_cast<ID3D11Texture2D*>( tex );
// Get context
ID3D11DeviceContext* pContext = NULL;
// Copy data, fast operation
pContext->CopySubresourceRegion(texStaging, 0, 0, 0, 0, tex, 0, nullptr);
// Create mapping
HRESULT hr = pContext->Map( texStaging, 0, D3D11_MAP_READ, 0, &mapped );
if ( FAILED( hr ) )
{
    return;
}
// Calculate size
const size_t size = _width * _height * 4;
// Access pixel data
unsigned char* pSrc = static_cast<unsigned char*>( mapped.pData );
// Offsets
int offsetSrc = 0;
int offsetDst = 0;
// Padding bytes at the end of each mapped row
int rowOffset = mapped.RowPitch - _width * 4;
// Loop through it, BGRA to RGBA conversion
for (int row = 0; row < _height; ++row)
{
    for (int col = 0; col < _width; ++col)
    {
        pDest[offsetDst] = pSrc[offsetSrc+2];
        pDest[offsetDst+1] = pSrc[offsetSrc+1];
        pDest[offsetDst+2] = pSrc[offsetSrc];
        pDest[offsetDst+3] = pSrc[offsetSrc+3];
        offsetSrc += 4;
        offsetDst += 4;
    }
    // Adjust offset
    offsetSrc += rowOffset;
}
// Unmap texture
pContext->Unmap( texStaging, 0 );
Texture2D txDiffuse : register(t0);
SamplerState texSampler : register(s0);
struct VSScreenQuadOutput
{
    float4 Position : SV_POSITION;
    float2 TexCoords0 : TEXCOORD0;
};
float4 PSMain(VSScreenQuadOutput input) : SV_Target
{
    return txDiffuse.Sample(texSampler, input.TexCoords0).rgba;
}

Obviously, iterating over a texture on your CPU is not the most efficient way. If you know that the colors in a texture are always swapped like that and you don't want to modify the texture itself in your C++ code, the most straightforward way is to do it in the pixel shader. When you sample the texture, simply swap the channels there (for example, use the .bgra swizzle instead of .rgba on the sampled value). You won't even notice any performance drop.
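If the shader of the displaying library cannot be changed and the swap has to stay on the CPU, the loop from the question can at least be written so that it handles row padding explicitly. The following is a minimal sketch, not code from either library: the function name ConvertBgraToRgba and its parameters are hypothetical, the source pitch is assumed to be the RowPitch reported by Map, and the destination is assumed to be tightly packed.

```cpp
#include <cstddef>
#include <cstdint>

// Copies a BGRA image into a tightly packed RGBA buffer.
// srcRowPitch is the distance in bytes between two source rows (for a mapped
// D3D11 texture this would be D3D11_MAPPED_SUBRESOURCE::RowPitch).
// The destination is assumed to have no padding between rows.
void ConvertBgraToRgba(const std::uint8_t* src, std::size_t srcRowPitch,
                       std::uint8_t* dst, std::size_t width, std::size_t height)
{
    for (std::size_t row = 0; row < height; ++row)
    {
        const std::uint8_t* s = src + row * srcRowPitch; // start of source row
        std::uint8_t* d = dst + row * width * 4;         // start of destination row
        for (std::size_t col = 0; col < width; ++col, s += 4, d += 4)
        {
            d[0] = s[2]; // R
            d[1] = s[1]; // G
            d[2] = s[0]; // B
            d[3] = s[3]; // A
        }
    }
}
```

Computing each row start from the pitch avoids carrying a running padding offset, so the function behaves the same whether or not the driver pads the rows.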


Maximum float value in 10-bit image in WIC

I'm trying to convert an HDR image that I load as a float array to a 10-bit DWORD with WIC.
The pixel format of the loaded file is GUID_WICPixelFormat128bppPRGBAFloat, so I get an array of 4 floats per pixel.
When I try to convert these to 10 bits as follows:
struct RGBX
{
    unsigned int b : 10;
    unsigned int g : 10;
    unsigned int r : 10;
    int a : 2;
} rgbx;
(which is the format requested by the NVIDIA encoding library for 10-bit rgb),
then I assume I have to divide each of the floats by 1024.0f in order to fit them into the 10 bits of a DWORD.
However, I notice that some of the floats are > 1, which means that their range is not [0,1] as it is when the image is 8-bit.
What would their range be? How do I store a floating-point color in a 10-bit integer?
I'm trying to use NVIDIA's HDR encoder, which requires ARGB10 data like the structure above.
How is the 10-bit information of a color stored as a floating-point number?
By the way, I tried to convert with WIC, but the conversion from GUID_WICPixelFormat128bppPRGBAFloat to GUID_WICPixelFormat32bppR10G10B10A2 fails.
HRESULT ConvertFloatTo10(const float* f, int wi, int he, std::vector<DWORD>& out)
{
    CComPtr<IWICBitmap> b;
    wbfact->CreateBitmapFromMemory(wi, he, GUID_WICPixelFormat128bppPRGBAFloat, wi * 16, wi * he * 16, (BYTE*)f, &b);
    CComPtr<IWICFormatConverter> wf;
    wbfact->CreateFormatConverter(&wf);
    wf->Initialize(b, GUID_WICPixelFormat32bppR10G10B10A2, WICBitmapDitherTypeNone, 0, 0, WICBitmapPaletteTypeCustom);
    // This last call fails with 0x88982f50: The component cannot be found.
}
Edit: I found a paper (https://hal.archives-ouvertes.fr/hal-01704278/document), is this relevant to this question?
Floating-point color content that is greater than the 0..1 range is High Dynamic Range (HDR) content. If you trivially convert it to 10:10:10:2 UNORM then you are using 'clipping' for values over 1. This doesn't give good results.
SDR 10:10:10 or 8:8:8
You should instead use tone-mapping, which converts the HDR signal to an SDR (Standard Dynamic Range, a.k.a. 0..1) signal before or as part of the conversion to 10:10:10:2.
There are many different approaches to tone-mapping, but a common 'generic' solution is the Reinhard tone-mapping operator. Here's an implementation using DirectXTex.
std::unique_ptr<ScratchImage> timage(new (std::nothrow) ScratchImage);
if (!timage)
{
    wprintf(L"\nERROR: Memory allocation failed\n");
    return 1;
}
// Compute max luminosity across all images
XMVECTOR maxLum = XMVectorZero();
hr = EvaluateImage(image->GetImages(), image->GetImageCount(), image->GetMetadata(),
    [&](const XMVECTOR* pixels, size_t w, size_t y)
    {
        for (size_t j = 0; j < w; ++j)
        {
            static const XMVECTORF32 s_luminance = { { { 0.3f, 0.59f, 0.11f, 0.f } } };
            XMVECTOR v = *pixels++;
            v = XMVector3Dot(v, s_luminance);
            maxLum = XMVectorMax(v, maxLum);
        }
    });
if (FAILED(hr))
{
    wprintf(L" FAILED [tonemap maxlum] (%08X%ls)\n", static_cast<unsigned int>(hr), GetErrorDesc(hr));
    return 1;
}
maxLum = XMVectorMultiply(maxLum, maxLum);
hr = TransformImage(image->GetImages(), image->GetImageCount(), image->GetMetadata(),
    [&](XMVECTOR* outPixels, const XMVECTOR* inPixels, size_t w, size_t y)
    {
        for (size_t j = 0; j < w; ++j)
        {
            XMVECTOR value = inPixels[j];
            const XMVECTOR scale = XMVectorDivide(
                XMVectorAdd(g_XMOne, XMVectorDivide(value, maxLum)),
                XMVectorAdd(g_XMOne, value));
            const XMVECTOR nvalue = XMVectorMultiply(value, scale);
            value = XMVectorSelect(value, nvalue, g_XMSelect1110);
            outPixels[j] = value;
        }
    }, *timage);
if (FAILED(hr))
{
    wprintf(L" FAILED [tonemap apply] (%08X%ls)\n", static_cast<unsigned int>(hr), GetErrorDesc(hr));
    return 1;
}
UPDATE: If you are trying to convert HDR floating-point content to an "HDR10" signal, then you need to do:
Color-space rotate from Rec.709 or P3D65 to Rec.2020.
Normalize for 'paper white' / 10,000 nits.
Apply the ST.2084 gamma curve.
Quantize to 10-bit.
// HDTV to UHDTV (Rec.709 color primaries into Rec.2020)
const XMMATRIX c_from709to2020 =
{
    0.6274040f, 0.0690970f, 0.0163916f, 0.f,
    0.3292820f, 0.9195400f, 0.0880132f, 0.f,
    0.0433136f, 0.0113612f, 0.8955950f, 0.f,
    0.f, 0.f, 0.f, 1.f
};
// DCI-P3-D65 https://en.wikipedia.org/wiki/DCI-P3 to UHDTV (DCI-P3-D65 color primaries into Rec.2020)
const XMMATRIX c_fromP3D65to2020 =
{
    0.753845f, 0.0457456f, -0.00121055f, 0.f,
    0.198593f, 0.941777f, 0.0176041f, 0.f,
    0.047562f, 0.0124772f, 0.983607f, 0.f,
    0.f, 0.f, 0.f, 1.f
};
// Custom Rec.709 into Rec.2020
const XMMATRIX c_fromExpanded709to2020 =
{
    0.6274040f, 0.0457456f, -0.00121055f, 0.f,
    0.3292820f, 0.941777f, 0.0176041f, 0.f,
    0.0433136f, 0.0124772f, 0.983607f, 0.f,
    0.f, 0.f, 0.f, 1.f
};
inline float LinearToST2084(float normalizedLinearValue)
{
    const float ST2084 = pow((0.8359375f + 18.8515625f * pow(abs(normalizedLinearValue), 0.1593017578f)) / (1.0f + 18.6875f * pow(abs(normalizedLinearValue), 0.1593017578f)), 78.84375f);
    return ST2084; // Don't clamp between [0..1], so we can still perform operations on scene values higher than 10,000 nits
}
// ST.2084 is defined relative to a peak of 10,000 nits
const XMVECTORF32 c_MaxNitsFor2084 = { { { 10000.0f, 10000.0f, 10000.0f, 1.f } } };
// You can adjust this up to 10000.f
float paperWhiteNits = 200.f;
hr = TransformImage(image->GetImages(), image->GetImageCount(), image->GetMetadata(),
    [&](XMVECTOR* outPixels, const XMVECTOR* inPixels, size_t w, size_t y)
    {
        const XMVECTOR paperWhite = XMVectorReplicate(paperWhiteNits);
        for (size_t j = 0; j < w; ++j)
        {
            XMVECTOR value = inPixels[j];
            XMVECTOR nvalue = XMVector3Transform(value, c_from709to2020);
            // Some people prefer the look of using c_fromP3D65to2020
            // or c_fromExpanded709to2020 instead.
            // Convert to ST.2084
            nvalue = XMVectorDivide(XMVectorMultiply(nvalue, paperWhite), c_MaxNitsFor2084);
            XMFLOAT4A tmp;
            XMStoreFloat4A(&tmp, nvalue);
            tmp.x = LinearToST2084(tmp.x);
            tmp.y = LinearToST2084(tmp.y);
            tmp.z = LinearToST2084(tmp.z);
            nvalue = XMLoadFloat4A(&tmp);
            value = XMVectorSelect(value, nvalue, g_XMSelect1110);
            outPixels[j] = value;
        }
    }, *timage);
You should really take a look at texconv.
Reinhard et al., "Photographic tone reproduction for digital images", ACM Transactions on Graphics, Volume 21, Issue 3 (July 2002). ACM DL.
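The last two HDR10 steps (apply the ST.2084 curve, quantize to 10 bits) and the packing asked about in the question can be sketched without DirectXMath. This is a minimal illustration under stated assumptions, not the NVIDIA encoder's documented contract: the helper names are hypothetical, the input is assumed to be linear Rec.2020 with 1.0 meaning paper white, and the bit order assumes the compiler allocates the RGBX bit-fields from the least significant bit, as MSVC does. Note that quantizing means multiplying a [0,1] value by 1023, not dividing by 1024.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// SMPTE ST.2084 (PQ) encoding of a linear value normalized so that 1.0 == 10,000 nits.
inline float LinearToST2084(float n)
{
    const float p = std::pow(std::fabs(n), 0.1593017578f);
    return std::pow((0.8359375f + 18.8515625f * p) / (1.0f + 18.6875f * p), 78.84375f);
}

// Maps an encoded value in [0, 1] to a 10-bit code in [0, 1023].
inline std::uint32_t QuantizeTo10Bit(float encoded)
{
    const float clamped = std::min(std::max(encoded, 0.0f), 1.0f);
    return static_cast<std::uint32_t>(std::lround(clamped * 1023.0f));
}

// Packs 10-bit channels in the bit order of the RGBX struct from the question
// (assumption: b in bits 0-9, g in bits 10-19, r in bits 20-29, a in bits 30-31).
inline std::uint32_t PackB10G10R10A2(std::uint32_t r, std::uint32_t g,
                                     std::uint32_t b, std::uint32_t a)
{
    return (b & 0x3FFu) | ((g & 0x3FFu) << 10) | ((r & 0x3FFu) << 20) | ((a & 0x3u) << 30);
}

// Linear value (1.0 == paper white) -> 10-bit PQ code.
inline std::uint32_t EncodeChannelHdr10(float linear, float paperWhiteNits)
{
    return QuantizeTo10Bit(LinearToST2084(linear * paperWhiteNits / 10000.0f));
}
```

A pixel would then be packed as PackB10G10R10A2(EncodeChannelHdr10(r, 200.f), EncodeChannelHdr10(g, 200.f), EncodeChannelHdr10(b, 200.f), 3) after the color-space rotation to Rec.2020.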
@ChuckWalbourn's answer is helpful; however, I don't want to tone-map to [0,1], as there is no point in tone-mapping to SDR and then going to 10-bit HDR.
What I think is correct is to scale to [0,4] instead, by first using g_XMFour:
const XMVECTOR scale = XMVectorDivide(
XMVectorAdd(g_XMFour, XMVectorDivide(v, maxLum)),
XMVectorAdd(g_XMFour, v));
then using a specialized 10-bit store which scales by 255 instead of 1023:
void XMStoreUDecN4a(DirectX::PackedVector::XMUDECN4* pDestination,DirectX::FXMVECTOR V)
{
    using namespace DirectX;
    static const XMVECTOR Scale = { 255.0f, 255.0f, 255.0f, 3.0f };
    XMVECTOR N = XMVectorClamp(V, XMVectorZero(), g_XMFour);
    N = XMVectorMultiply(N, Scale);
    pDestination->v = ((uint32_t)DirectX::XMVectorGetW(N) << 30) |
        (((uint32_t)DirectX::XMVectorGetZ(N) & 0x3FF) << 20) |
        (((uint32_t)DirectX::XMVectorGetY(N) & 0x3FF) << 10) |
        (((uint32_t)DirectX::XMVectorGetX(N) & 0x3FF));
}
And then a specialized 10-bit load which divides by 255 instead of 1023:
DirectX::XMVECTOR XMLoadUDecN4a(DirectX::PackedVector::XMUDECN4* pSource)
{
    using namespace DirectX;
    fourx vectorOut;
    uint32_t Element;
    Element = pSource->v & 0x3FF;
    vectorOut.r = (float)Element / 255.f;
    Element = (pSource->v >> 10) & 0x3FF;
    vectorOut.g = (float)Element / 255.f;
    Element = (pSource->v >> 20) & 0x3FF;
    vectorOut.b = (float)Element / 255.f;
    vectorOut.a = (float)(pSource->v >> 30) / 3.f;
    const DirectX::XMVECTORF32 j = { vectorOut.r,vectorOut.g,vectorOut.b,vectorOut.a };
    return j;
}

Correct RGB values for AVFrame

I have to fill the ffmpeg AVFrame->data from a cairo surface's pixel data. I have this code:
/* Image info and pixel data */
width = cairo_image_surface_get_width( surface );
height = cairo_image_surface_get_height( surface );
stride = cairo_image_surface_get_stride( surface );
pix = cairo_image_surface_get_data( surface );
for( row = 0; row < height; row++ )
{
    data = pix + row * stride;
    for( col = 0; col < width; col++ )
    {
        img->video_frame->data[0][row * img->video_frame->linesize[0] + col] = data[0];
        img->video_frame->data[1][row * img->video_frame->linesize[1] + col] = data[1];
        //img->video_frame->data[2][row * img->video_frame->linesize[2] + col] = data[2];
        data += 4;
    }
}
But the colors in the exported video are wrong. The original heart is red. Can someone point me in the right direction? The encode.c example is useless sadly and on the Internet there is a lot of confusion about Y, Cr and Cb which I really don't understand. Please feel free to ask for more details. Many thanks.
You need to use libswscale to convert the source image data from cairo's 32-bit RGB layout to YUV420P.
Something like:
int width = cairo_image_surface_get_width( surface );
int height = cairo_image_surface_get_height( surface );
int stride = cairo_image_surface_get_stride( surface );
uint8_t *pix = cairo_image_surface_get_data( surface );
uint8_t *data[1] = { pix };
int linesize[1] = { stride };
/* Cairo stores 4 bytes per pixel in native-endian order, which corresponds to
   AV_PIX_FMT_RGB32 rather than the 3-byte AV_PIX_FMT_RGB24. */
struct SwsContext *sws_ctx = sws_getContext(width, height, AV_PIX_FMT_RGB32,
                                            width, height, AV_PIX_FMT_YUV420P,
                                            SWS_BILINEAR, NULL, NULL, NULL);
sws_scale(sws_ctx, data, linesize, 0, height,
          img->video_frame->data, img->video_frame->linesize);
See the example here: scaling_video
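As background on the Y, Cb and Cr confusion: in a YUV420P frame, data[0] holds one luma (Y) byte per pixel, while data[1] and data[2] hold the Cb and Cr chroma planes at half resolution in both directions, so copying R, G and B bytes into them cannot give the right colors. The per-pixel math that a converter performs can be sketched as below. This is a commonly used integer approximation of the BT.601 limited-range matrix, shown only for illustration; RgbToYCbCr601 is a hypothetical helper, and libswscale's actual coefficients and rounding may differ.

```cpp
#include <cstdint>

struct YCbCr { std::uint8_t y, cb, cr; };

// Integer approximation of the BT.601 limited-range ("video range") conversion:
// Y ends up in [16, 235]; Cb and Cr end up in [16, 240], with 128 meaning "no color".
YCbCr RgbToYCbCr601(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    const int y = ((66 * r + 129 * g + 25 * b + 128) >> 8) + 16;
    // 32768 == 128 << 8: the +128 chroma offset is added before the shift
    // so the shifted value is never negative.
    const int cb = (-38 * r - 74 * g + 112 * b + 128 + 32768) >> 8;
    const int cr = (112 * r - 94 * g - 18 * b + 128 + 32768) >> 8;
    return { static_cast<std::uint8_t>(y),
             static_cast<std::uint8_t>(cb),
             static_cast<std::uint8_t>(cr) };
}
```

Pure red (255, 0, 0) comes out as Y = 82, Cb = 90, Cr = 240: most of the "redness" lives in the Cr plane, which is why a frame with an unwritten data[2] plane cannot show a red heart.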

How to set alpha value for all pixels in a bitmap using MFC or GDI or GDI+

I am working in an MFC application. I created a bitmap using a memory DC and want to save it to a DIB file.
I found this code to be most elegant so far:
void Save(CBitmap * bitmap) {
    CImage image;
    image.Save("bla.bmp", Gdiplus::ImageFormatBMP);
}
The resulting file is 32 BPP colorspace with all alpha values set to '0'.
Now I want to use the bitmap as a toolbar bitmap:
But all the icons are gone.
MFC internally calls PreMultiplyAlpha() when importing the bitmap.
Then RGB components of all pixels are '0'. Effectively the whole bitmap was zeroed.
How can I set the alpha value for each pixel to '0xFF' before saving?
I tried:
void Save(CBitmap * bitmap) {
    CImage image;
    image.AlphaBlend(myBitmapDC, 0, 0);
    image.Save("bla.bmp", Gdiplus::ImageFormatBMP);
}
But that affects only the RGB values of the pixels.
So far I have resisted iterating over each pixel and modifying the bitmap's memory directly. I'm asking for an elegant solution, maybe a one-liner.
Use GetDIBits to read 32-bit pixel data, and loop through the bits to set alpha to 0xFF.
bool Save(CBitmap *bitmap)
{
    if(!bitmap)
        return false;
    BITMAP bm;
    bitmap->GetBitmap(&bm);
    if(bm.bmBitsPixel < 16)
        return false; //needs palette
    DWORD size = bm.bmWidth * bm.bmHeight * 4;
    BITMAPINFOHEADER bih = { sizeof(bih), bm.bmWidth, bm.bmHeight, 1, 32, BI_RGB };
    BITMAPFILEHEADER bfh = { 'MB', 54 + size, 0, 0, 54 };
    CClientDC dc(0);
    std::vector<BYTE> vec(size, 0xFF);
    int test = GetDIBits(dc, *bitmap, 0, bm.bmHeight, &vec[0],
        (BITMAPINFO*)&bih, DIB_RGB_COLORS);
    // Every fourth byte is the alpha channel of a 32-bit pixel
    for(DWORD i = 0; i < size; i += 4)
        vec[i + 3] = 0xFF;
    CFile fout;
    if(fout.Open(filename, CFile::modeCreate | CFile::modeWrite))
    {
        fout.Write(&bfh, sizeof(bfh));
        fout.Write(&bih, sizeof(bih));
        fout.Write(&vec[0], size);
        return true;
    }
    return false;
}
As an alternative (but I am not sure if this is reliable) initialize the memory with 0xFF. GetDIBits will set the RGB part but won't overwrite the alpha values:
std::vector<BYTE> vec(size, 0xFF);
Or using GDI+
bool Save(CBitmap *bitmap)
{
    if(!bitmap)
        return false;
    BITMAP bm;
    bitmap->GetBitmap(&bm);
    if(bm.bmBitsPixel < 16)
        return false; //needs palette
    Gdiplus::GdiplusStartupInput tmp;
    ULONG_PTR token;
    Gdiplus::GdiplusStartup(&token, &tmp, NULL);
    Gdiplus::Bitmap *src = Gdiplus::Bitmap::FromHBITMAP(*bitmap, NULL);
    // Clone into a 32-bit ARGB bitmap so that every pixel carries an alpha value
    Gdiplus::Bitmap *dst = src->Clone(0, 0, src->GetWidth(), src->GetHeight(),
        PixelFormat32bppARGB);
    LPCOLESTR clsid_bmp = L"{557cf400-1a04-11d3-9a73-0000f81ef32e}";
    CLSID clsid;
    CLSIDFromString(clsid_bmp, &clsid);
    bool result = dst->Save(L"file.bmp", &clsid) == 0;
    delete src;
    delete dst;
    return result;
}
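To see why the toolbar icons vanished when alpha was 0, and why forcing alpha to 0xFF fixes it, it helps to look at what premultiplication does to a single pixel. The sketch below is only an illustration of the general technique: the Bgra struct and Premultiply function are hypothetical, and the exact rounding used inside MFC may differ.

```cpp
#include <cstdint>

// One 32-bit pixel in the byte order used by 32-bpp DIBs.
struct Bgra { std::uint8_t b, g, r, a; };

// Premultiplies the color channels by alpha with rounding: c' = c * a / 255.
Bgra Premultiply(Bgra p)
{
    auto mul = [&p](std::uint8_t c) {
        return static_cast<std::uint8_t>((c * p.a + 127) / 255);
    };
    return { mul(p.b), mul(p.g), mul(p.r), p.a };
}
```

With a = 0 every color channel becomes 0, which matches the fully zeroed bitmap described in the question; with a = 0xFF the colors pass through unchanged.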

How do I create a texture 3d programmatically?

I am trying to create a texture3d programmatically, but I don't really understand how it is done. Should each slice of the texture be a subresource? This is what I am trying to do, but it is not working:
// Create texture3d
const int32 cWidth = 6;
const int32 cHeight = 7;
const int32 cDepth = 3;
desc.Width = cWidth;
desc.Height = cHeight;
desc.MipLevels = 1;
desc.Depth = cDepth;
desc.Format = DXGI_FORMAT_R32G32B32A32_FLOAT;
desc.Usage = D3D11_USAGE_DEFAULT;
desc.BindFlags = D3D11_BIND_RENDER_TARGET;
desc.CPUAccessFlags = 0;
desc.MiscFlags = 0;
const uint32 bytesPerPixel = 4;
uint32 sliceSize = cWidth*cHeight*bytesPerPixel;
float tex3d[cWidth*cHeight*cDepth];
memset(tex3d, 0x00, sizeof(tex3d));
uint32 colorIndex = 0;
for (uint32 depthCount = 0; depthCount<depthSize; depthCount++)
{
    for (uint32 ii=0; ii<cHeight; ii++)
    {
        for (uint32 jj=0; jj<cWidth; jj++)
        {
            // Add some dummy color
            tex3d[colorIndex++] = 1.f;
            tex3d[colorIndex++] = 0.f;
            tex3d[colorIndex++] = 1.f;
            tex3d[colorIndex++] = 0.f;
        }
    }
}
D3D11_SUBRESOURCE_DATA initData[cDepth] = {0};
uint8 *pMem = (uint8*)tex3d;
// What do I pass here? Each slice?
for (uint32 depthCount = 0; depthCount<depthSize; depthCount++)
{
    initData[depthCount].pSysMem = static_cast<const void*>(pMem);
    initData[depthCount].SysMemPitch = static_cast<UINT>(sliceSize); // not sure
    initData[depthCount].SysMemSlicePitch = static_cast<UINT>(sliceSize); // not sure
    pMem += sliceSize;
}
ID3D11Texture3D* tex = nullptr;
hr = m_d3dDevice->CreateTexture3D(&desc, &initData[0], &tex);
ID3D11RenderTargetView *pRTV = nullptr;
hr = m_d3dDevice->CreateRenderTargetView(tex, nullptr, &pRTV);
This creates the texture, but it gives me only 1 subresource. Should it be 3?
I looked at this article, but it refers to texture2d:
D3D11: Creating a cube map from 6 images
If anyone has a snippet of code that works, I'd like to take a look.
In Direct3D, 3D textures are laid out such that sub-resources are mipmap levels. Each mipmap level contains 1/2 as many slices as the previous, but in this case you only have 1 mipmap LOD, so you will only have 1 subresource (containing 3 slices).
As for the pitch, SysMemPitch is the number of bytes between rows in each image slice (cWidth * bytesPerPixel assuming you tightly pack this). SysMemSlicePitch is the number of bytes between 2D slices (cWidth * cHeight * bytesPerPixel). Thus, the memory for each mipmap needs to be arranged as a series of 2D images with the same dimensions.
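The pitch arithmetic described above can be sketched independently of Direct3D. The helper names below are hypothetical, and the buffer is assumed to be tightly packed. One detail worth checking in the question's code: DXGI_FORMAT_R32G32B32A32_FLOAT uses four 32-bit floats per texel, so bytesPerTexel is 16 rather than 4, and the source array needs four floats per texel.

```cpp
#include <cstddef>

// Byte layout of one mip level of a tightly packed 3D texture.
struct VolumeLayout
{
    std::size_t rowPitch;   // bytes between rows      (value for SysMemPitch)
    std::size_t slicePitch; // bytes between 2D slices (value for SysMemSlicePitch)
    std::size_t totalBytes; // size of the buffer that pSysMem would point to
};

VolumeLayout ComputeVolumeLayout(std::size_t width, std::size_t height,
                                 std::size_t depth, std::size_t bytesPerTexel)
{
    VolumeLayout layout;
    layout.rowPitch = width * bytesPerTexel;
    layout.slicePitch = layout.rowPitch * height;
    layout.totalBytes = layout.slicePitch * depth;
    return layout;
}

// Byte offset of texel (x, y, z) inside that buffer.
std::size_t TexelOffset(const VolumeLayout& layout, std::size_t x, std::size_t y,
                        std::size_t z, std::size_t bytesPerTexel)
{
    return z * layout.slicePitch + y * layout.rowPitch + x * bytesPerTexel;
}
```

For the 6 x 7 x 3 texture from the question this gives a row pitch of 96 bytes, a slice pitch of 672 bytes and 2016 bytes in total, all described by a single D3D11_SUBRESOURCE_DATA entry because there is only one mip level.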

Image flips, OpenGL output to JPEG using libjpeg

The code below converts OpenGL output to a JPEG image using libjpeg, but the resulting image is flipped vertically.
The code works perfectly otherwise, and I don't know why the final image is flipped.
unsigned char *pdata = new unsigned char[width*height*3];
glReadPixels(0, 0, width, height, GL_RGB, GL_UNSIGNED_BYTE, pdata);
FILE *outfile;
if ((outfile = fopen("sample.jpeg", "wb")) == NULL) {
    printf("can't open %s\n", "sample.jpeg");
    exit(1);
}
struct jpeg_compress_struct cinfo;
struct jpeg_error_mgr jerr;
cinfo.err = jpeg_std_error(&jerr);
jpeg_create_compress(&cinfo);
jpeg_stdio_dest(&cinfo, outfile);
cinfo.image_width = width;
cinfo.image_height = height;
cinfo.input_components = 3;
cinfo.in_color_space = JCS_RGB;
jpeg_set_defaults(&cinfo);
/* set the quality [0..100] */
jpeg_set_quality (&cinfo, 100, true);
jpeg_start_compress(&cinfo, true);
JSAMPROW row_pointer;
int row_stride = width * 3;
while (cinfo.next_scanline < cinfo.image_height) {
    row_pointer = (JSAMPROW) &pdata[cinfo.next_scanline*row_stride];
    jpeg_write_scanlines(&cinfo, &row_pointer, 1);
}
jpeg_finish_compress(&cinfo);
fclose(outfile);
jpeg_destroy_compress(&cinfo);
OpenGL's coordinate system has its origin in the lower left corner of the image, whereas libjpeg assumes that the origin is in the upper left corner. Write the rows in reverse order to fix your code:
while (cinfo.next_scanline < cinfo.image_height)
{
    row_pointer = (JSAMPROW) &pdata[(cinfo.image_height-1-cinfo.next_scanline)*row_stride];
    jpeg_write_scanlines(&cinfo, &row_pointer, 1);
}
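If the same pixel buffer is also handed to other top-down consumers, an alternative to reversing the index is to flip the buffer once after glReadPixels. The following is a minimal sketch with a hypothetical helper name; it assumes the rows are tightly packed, i.e. rowStride bytes apart with no extra padding.

```cpp
#include <algorithm>
#include <cstddef>

// Reverses the row order of an image in place by swapping the first row with
// the last, the second with the second to last, and so on.
void FlipRowsInPlace(unsigned char* pixels, std::size_t rowStride, std::size_t height)
{
    for (std::size_t top = 0, bottom = height; top + 1 < bottom; ++top)
    {
        --bottom;
        std::swap_ranges(pixels + top * rowStride,
                         pixels + (top + 1) * rowStride,
                         pixels + bottom * rowStride);
    }
}
```

Called as FlipRowsInPlace(pdata, row_stride, height), it would let the original write loop stay unchanged, at the cost of one extra pass over the image.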
