Correct RGB values for AVFrame - ffmpeg

I have to fill ffmpeg's AVFrame->data from a cairo surface's pixel data. I have this code:
/* Image info and pixel data */
width  = cairo_image_surface_get_width( surface );
height = cairo_image_surface_get_height( surface );
stride = cairo_image_surface_get_stride( surface );
pix    = cairo_image_surface_get_data( surface );
for( row = 0; row < height; row++ )
{
    data = pix + row * stride;
    for( col = 0; col < width; col++ )
    {
        img->video_frame->data[0][row * img->video_frame->linesize[0] + col] = data[0];
        img->video_frame->data[1][row * img->video_frame->linesize[1] + col] = data[1];
        //img->video_frame->data[2][row * img->video_frame->linesize[2] + col] = data[2];
        data += 4;
    }
    img->video_frame->pts++;
}
But the colors in the exported video are wrong (the original heart is red). Can someone point me in the right direction? The encode.c example is sadly no help, and on the Internet there is a lot of confusion about Y, Cb and Cr, which I really don't understand. Please feel free to ask for more details. Many thanks.

You need to use libswscale to convert the source image data to YUV420P. Note that a cairo image surface is normally CAIRO_FORMAT_ARGB32, which on a little-endian machine is laid out in memory as BGRA (your data += 4 confirms four bytes per pixel), so the source format to hand swscale is AV_PIX_FMT_BGRA rather than RGB24.
Something like:
int width  = cairo_image_surface_get_width( surface );
int height = cairo_image_surface_get_height( surface );
int stride = cairo_image_surface_get_stride( surface );
uint8_t *pix = cairo_image_surface_get_data( surface );

const uint8_t *data[1] = { pix };
int linesize[1] = { stride };

/* CAIRO_FORMAT_ARGB32 is BGRA in memory on little-endian machines */
struct SwsContext *sws_ctx = sws_getContext( width, height, AV_PIX_FMT_BGRA,
                                             width, height, AV_PIX_FMT_YUV420P,
                                             SWS_BILINEAR, NULL, NULL, NULL );
sws_scale( sws_ctx, data, linesize, 0, height,
           img->video_frame->data, img->video_frame->linesize );
sws_freeContext( sws_ctx );
See the scaling_video example in the FFmpeg documentation.
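Note that this assumes img->video_frame has already been allocated as a YUV420P frame with its data planes in place. If it has not, here is a minimal allocation sketch (assuming a reasonably recent FFmpeg with av_frame_get_buffer; the width/height variables mirror the ones above):

AVFrame *frame = av_frame_alloc();
frame->format = AV_PIX_FMT_YUV420P;
frame->width  = width;
frame->height = height;
/* allocates frame->data / frame->linesize; 32-byte alignment suits swscale */
if( av_frame_get_buffer( frame, 32 ) < 0 )
{
    /* handle allocation failure */
}
img->video_frame = frame;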

Related

Can't get BITMAPINFOHEADER data to display odd width bmp images correctly

I am trying to display a 24-bit uncompressed bitmap with an odd width using standard Win32 API calls, but it seems like I have a stride problem.
According to MSDN (https://msdn.microsoft.com/en-us/library/windows/desktop/dd318229%28v=vs.85%29.aspx):
"For uncompressed RGB formats, the minimum stride is always the image width in bytes, rounded up to the nearest DWORD. You can use the following formula to calculate the stride:
stride = ((((biWidth * biBitCount) + 31) & ~31) >> 3)"
but this simply does not work for me, and below is the code:
void Init()
{
    pImage = ReadBMP("data\\bird.bmp");
    size_t imgSize = pImage->width * pImage->height * 3;

    BITMAPINFOHEADER bmih;
    bmih.biSize = sizeof(BITMAPINFOHEADER);
    bmih.biBitCount = 24;

    // This is probably where the bug is
    LONG stride = ((((pImage->width * bmih.biBitCount) + 31) & ~31) >> 3);
    //bmih.biWidth = pImage->width;
    bmih.biWidth = stride;

    bmih.biHeight = -((LONG)pImage->height);
    bmih.biPlanes = 1;
    bmih.biCompression = BI_RGB;
    bmih.biSizeImage = 0;
    bmih.biXPelsPerMeter = 1;
    bmih.biYPelsPerMeter = 1;
    bmih.biClrUsed = 0;
    bmih.biClrImportant = 0;

    BITMAPINFO dbmi;
    ZeroMemory(&dbmi, sizeof(dbmi));
    dbmi.bmiHeader = bmih;
    dbmi.bmiColors->rgbBlue = 0;
    dbmi.bmiColors->rgbGreen = 0;
    dbmi.bmiColors->rgbRed = 0;
    dbmi.bmiColors->rgbReserved = 0;

    HDC hdc = ::GetDC(NULL);
    mTestBMP = CreateDIBitmap(hdc,
                              &bmih,
                              CBM_INIT,
                              pImage->pSrc,
                              &dbmi,
                              DIB_RGB_COLORS);
    hdc = ::GetDC(NULL);
}
and here is the drawing function:
RawBMP *pImage;
HBITMAP mTestBMP;

void UpdateScreen(HDC srcHDC)
{
    if (pImage != nullptr && mTestBMP != 0x00)
    {
        HDC hdc = CreateCompatibleDC(srcHDC);
        SelectObject(hdc, mTestBMP);
        BitBlt(srcHDC,
               0,              // x
               0,              // y
               // I tried passing the stride here and it did not work either
               pImage->width,  // width of the image
               pImage->height, // height
               hdc,
               0,              // x and
               0,              // y of upper left corner
               SRCCOPY);
        DeleteDC(hdc);
    }
}
If I pass the original image width (an odd number) instead of the stride,
LONG stride = ((((pImage->width * bmih.biBitCount) + 31) & ~31) >> 3);
//bmih.biWidth = stride;
bmih.biWidth = pImage->width;
the picture looks skewed, and if I pass the stride according to MSDN, nothing shows up at all because the stride is too large. Any clues? Thank you!
Thanks to Jonathan for the solution: the pixel data has to be copied row by row, with each destination row padded out to a DWORD boundary. More or less the code for 24-bit uncompressed images:
const uint32_t bitCount = 24;

// Round every row up to the nearest DWORD; the formula covers all widths,
// so no separate odd/even case is needed (an even width such as 6 still
// needs padding, since 6 * 3 = 18 bytes is not DWORD-aligned)
LONG strideInBytes = ((((width * bitCount) + 31) & ~31) >> 3);

// allocate the new buffer
unsigned char *pBuffer = new unsigned char[strideInBytes * height];
memset(pBuffer, 0xaa, strideInBytes * height);

// Copy row by row, leaving the padding bytes at the end of each row
for (uint32_t yy = 0; yy < height; yy++)
{
    uint32_t rowSizeInBytes = width * 3;
    unsigned char *pDest = &pBuffer[yy * strideInBytes];
    unsigned char *pSrc  = &pData[yy * rowSizeInBytes];
    memcpy(pDest, pSrc, rowSizeInBytes);
}

rawBMP->pSrc   = pBuffer;
rawBMP->width  = width;
rawBMP->height = height;
rawBMP->stride = strideInBytes;
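As a quick sanity check of the formula with a hypothetical 101-pixel-wide, 24-bit image: 101 * 24 = 2424 bits, 2424 + 31 = 2455, 2455 & ~31 = 2432 bits, and 2432 >> 3 = 304 bytes, while the raw row is only 101 * 3 = 303 bytes, so exactly one padding byte takes each row to the next DWORD boundary.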

DX11 convert pixel format BGRA to RGBA

I currently have the problem that a library creates a DX11 texture with a BGRA pixel format, but the displaying library can only display RGBA correctly (that is, the red and blue channels are swapped in the rendered image).
After looking around I found a simple for-loop that solves the problem, but its performance is poor and scales badly with higher resolutions. I'm new to DirectX, and maybe I just missed a simple function to do the conversion.
// Get the image data
unsigned char* pDest = view->image->getPixels();

// Prepare source texture
ID3D11Texture2D* pTexture = static_cast<ID3D11Texture2D*>( tex );

// Get context
ID3D11DeviceContext* pContext = NULL;
dxDevice11->GetImmediateContext(&pContext);

// Copy data into the staging texture, fast operation
pContext->CopySubresourceRegion(texStaging, 0, 0, 0, 0, tex, 0, nullptr);

// Create mapping
D3D11_MAPPED_SUBRESOURCE mapped;
HRESULT hr = pContext->Map( texStaging, 0, D3D11_MAP_READ, 0, &mapped );
if ( FAILED( hr ) )
{
    return;
}

// Calculate size
const size_t size = _width * _height * 4;

// Access pixel data
unsigned char* pSrc = static_cast<unsigned char*>( mapped.pData );

// Offsets
int offsetSrc = 0;
int offsetDst = 0;
// Mapped rows are padded to RowPitch bytes; this is the padding after each row
int rowOffset = mapped.RowPitch - _width * 4;

// Loop through it, BGRA to RGBA conversion
for (int row = 0; row < _height; ++row)
{
    for (int col = 0; col < _width; ++col)
    {
        pDest[offsetDst]   = pSrc[offsetSrc+2];
        pDest[offsetDst+1] = pSrc[offsetSrc+1];
        pDest[offsetDst+2] = pSrc[offsetSrc];
        pDest[offsetDst+3] = pSrc[offsetSrc+3];
        offsetSrc += 4;
        offsetDst += 4;
    }
    // Skip the row padding
    offsetSrc += rowOffset;
}

// Unmap texture
pContext->Unmap( texStaging, 0 );
Solution:
Texture2D txDiffuse : register(t0);
SamplerState texSampler : register(s0);

struct VSScreenQuadOutput
{
    float4 Position : SV_POSITION;
    float2 TexCoords0 : TEXCOORD0;
};

float4 PSMain(VSScreenQuadOutput input) : SV_Target
{
    // the .bgra swizzle swaps the red and blue channels while sampling
    return txDiffuse.Sample(texSampler, input.TexCoords0).bgra;
}
Obviously, iterating over a texture on your CPU is not the most efficient way. If you know that the colors in a texture are always swapped like that, and you don't want to modify the texture itself in your C++ code, the most straightforward way is to do it in the pixel shader: when you sample the texture, simply swizzle the channels there, as in the solution above. You won't even notice any performance drop.
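If the conversion does have to stay on the CPU, a vectorized byte shuffle is much cheaper than the scalar loop above. Here is a minimal sketch using the SSSE3 pshufb intrinsic (an assumption-laden illustration: it reuses the _width, _height, pSrc, pDest and mapped variables from the question, and assumes _width * 4 is a multiple of 16 and that SSSE3 is available):

#include <tmmintrin.h> // SSSE3 intrinsics

// Shuffle mask that swaps bytes 0 and 2 (B and R) inside every 4-byte pixel
const __m128i swapBR = _mm_setr_epi8(  2,  1,  0,  3,
                                       6,  5,  4,  7,
                                      10,  9,  8, 11,
                                      14, 13, 12, 15 );
for (int row = 0; row < _height; ++row)
{
    const unsigned char* src = pSrc  + row * mapped.RowPitch;
    unsigned char*       dst = pDest + row * _width * 4;
    // 4 pixels (16 bytes) per iteration
    for (int col = 0; col < _width * 4; col += 16)
    {
        __m128i px = _mm_loadu_si128( reinterpret_cast<const __m128i*>( src + col ) );
        _mm_storeu_si128( reinterpret_cast<__m128i*>( dst + col ),
                          _mm_shuffle_epi8( px, swapBR ) );
    }
}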

ffmpeg: RGB to YUV conversion loses color and scale

I am trying to convert RGB frames to YUV420P format in ffmpeg/libav. Below is the conversion code. The converted image loses all color information, and the scale also changes significantly. Does anybody have an idea how to handle this? I am completely new to ffmpeg/libav!
// Did we get a video frame?
if(frameFinished)
{
    i++;
    sws_scale(img_convert_ctx, (const uint8_t * const *)pFrame->data,
              pFrame->linesize, 0, pCodecCtx->height,
              pFrameRGB->data, pFrameRGB->linesize);
    //==============================================================
    AVFrame *pFrameYUV = avcodec_alloc_frame();

    // Determine required buffer size and allocate buffer
    int numBytes2 = avpicture_get_size(PIX_FMT_RGB24, pCodecCtx->width,
                                       pCodecCtx->height);
    uint8_t *buffer = (uint8_t *)av_malloc(numBytes2*sizeof(uint8_t));
    avpicture_fill((AVPicture *)pFrameYUV, buffer, PIX_FMT_RGB24,
                   pCodecCtx->width, pCodecCtx->height);

    rgb_to_yuv_ctx = sws_getContext(pCodecCtx->width, pCodecCtx->height,
                                    PIX_FMT_RGB24,
                                    pCodecCtx->width, pCodecCtx->height,
                                    PIX_FMT_RGB24,
                                    SWS_BICUBIC, NULL, NULL, NULL);
    sws_scale(rgb_to_yuv_ctx, pFrameRGB->data, pFrameRGB->linesize, 0,
              pCodecCtx->height, pFrameYUV->data, pFrameYUV->linesize);
    sws_freeContext(rgb_to_yuv_ctx);

    SaveFrame(pFrameYUV, pCodecCtx->width, pCodecCtx->height, i);

    av_free(buffer);
    av_free(pFrameYUV);
}
Well, for starters, I will assume that where you have:
rgb_to_yuv_ctx = sws_getContext(pCodecCtx->width, pCodecCtx->height,
                                PIX_FMT_RGB24,
                                pCodecCtx->width, pCodecCtx->height,
                                PIX_FMT_RGB24,
                                SWS_BICUBIC, NULL, NULL, NULL);
You really intended:
rgb_to_yuv_ctx = sws_getContext(pCodecCtx->width, pCodecCtx->height,
                                PIX_FMT_RGB24,
                                pCodecCtx->width, pCodecCtx->height,
                                PIX_FMT_YUV420P,
                                SWS_BICUBIC, NULL, NULL, NULL);
I'm also not sure why you are calling sws_scale twice!

YUV420P is a planar format, meaning all three channels are stored separately. Where RGB is stored interleaved:
RGBRGBRGB
YUV420P is stored like:
YYYYYYYYYYYYYYYY..UUUUUUUUUU..VVVVVVVV
So swscale requires you to give it three plane pointers. Next, you want your line stride to be a multiple of 16 or 32 so the processor's vector units can be used. And finally, the dimensions of the Y plane need to be divisible by two, because the U and V planes are a quarter the size of the Y plane (half its width and half its height).
So, let's rewrite this:
#define RNDTO2(X)  ( (X) & 0xFFFFFFFE )
#define RNDTO32(X) ( ( (X) % 32 ) ? ( ( (X) + 32 ) & 0xFFFFFFE0 ) : (X) )

if(frameFinished)
{
    static struct SwsContext *swsCtx = NULL;

    int width    = RNDTO2 ( pCodecCtx->width );
    int height   = RNDTO2 ( pCodecCtx->height );
    int ystride  = RNDTO32 ( width );
    int uvstride = RNDTO32 ( width / 2 );
    int ysize    = ystride * height;
    int uvsize   = uvstride * ( height / 2 );
    int size     = ysize + ( 2 * uvsize );

    uint8_t *pFrameYUV = (uint8_t *)malloc( size );
    uint8_t *plane[] = { pFrameYUV, pFrameYUV + ysize, pFrameYUV + ysize + uvsize, NULL };
    int stride[]     = { ystride, uvstride, uvstride, 0 };

    swsCtx = sws_getCachedContext ( swsCtx, pCodecCtx->width, pCodecCtx->height,
                                    pCodecCtx->pix_fmt, width, height, AV_PIX_FMT_YUV420P,
                                    SWS_LANCZOS | SWS_ACCURATE_RND, NULL, NULL, NULL );
    /* the source is the decoded frame, in the decoder's own pixel format */
    sws_scale ( swsCtx, pFrame->data, pFrame->linesize, 0,
                pCodecCtx->height, plane, stride );
    /* ... use the YUV planes, then free( pFrameYUV ) ... */
}
I also switched your algorithm to SWS_LANCZOS | SWS_ACCURATE_RND, which will give you better-looking images; change it back if it is too slow. And I used the pixel format from the source frame instead of assuming it is RGB all the time.
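As an aside, instead of computing the plane sizes and offsets by hand, libavutil can do the allocation for you. A small sketch (assuming libavutil's av_image_alloc from libavutil/imgutils.h, which fills a data/linesize pair for a given pixel format and alignment):

#include <libavutil/imgutils.h>

uint8_t *dst_data[4];
int dst_linesize[4];
/* one buffer holding all three YUV420P planes, rows aligned to 32 bytes */
if ( av_image_alloc( dst_data, dst_linesize, width, height,
                     AV_PIX_FMT_YUV420P, 32 ) < 0 )
{
    /* handle allocation failure */
}
/* ... pass dst_data / dst_linesize to sws_scale as the destination ... */
av_freep( &dst_data[0] );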

Image flips, OpenGL output to JPEG using libjpeg

The code below converts OpenGL output to a JPEG image using libjpeg, but the resulting image is flipped vertically. The code otherwise works perfectly; I don't know why the final image is flipped.
unsigned char *pdata = new unsigned char[width*height*3];
glReadPixels(0, 0, width, height, GL_RGB, GL_UNSIGNED_BYTE, pdata);

FILE *outfile;
if ((outfile = fopen("sample.jpeg", "wb")) == NULL) {
    printf("can't open %s\n", "sample.jpeg");
    exit(1);
}

struct jpeg_compress_struct cinfo;
struct jpeg_error_mgr jerr;
cinfo.err = jpeg_std_error(&jerr);
jpeg_create_compress(&cinfo);
jpeg_stdio_dest(&cinfo, outfile);

cinfo.image_width = width;
cinfo.image_height = height;
cinfo.input_components = 3;
cinfo.in_color_space = JCS_RGB;
jpeg_set_defaults(&cinfo);

/* set the quality [0..100] */
jpeg_set_quality(&cinfo, 100, TRUE);
jpeg_start_compress(&cinfo, TRUE);

JSAMPROW row_pointer;
int row_stride = width * 3;
while (cinfo.next_scanline < cinfo.image_height) {
    row_pointer = (JSAMPROW) &pdata[cinfo.next_scanline * row_stride];
    jpeg_write_scanlines(&cinfo, &row_pointer, 1);
}

jpeg_finish_compress(&cinfo);
fclose(outfile);
jpeg_destroy_compress(&cinfo);
OpenGL's coordinate system has its origin in the lower left corner of the image, while libjpeg assumes the origin is in the upper left corner. Make the following change to fix your code:
while (cinfo.next_scanline < cinfo.image_height)
{
    row_pointer = (JSAMPROW) &pdata[(cinfo.image_height - 1 - cinfo.next_scanline) * row_stride];
    jpeg_write_scanlines(&cinfo, &row_pointer, 1);
}
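One more stride-related caveat with the original read-back: glReadPixels packs rows to a 4-byte boundary by default, so if width * 3 is not a multiple of 4, the rows in pdata carry padding that the scanline loop does not account for. Setting the pack alignment to 1 before reading avoids this:

/* request tightly packed rows so width * 3 really is the row stride */
glPixelStorei(GL_PACK_ALIGNMENT, 1);
glReadPixels(0, 0, width, height, GL_RGB, GL_UNSIGNED_BYTE, pdata);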

Taking snapshot of contents in CGL?

I want to create an image out of a Core OpenGL context. I used the following code, but it creates a black image. So I guess I cannot use glReadPixels there? Any other suggestions, please?
int myDataLength = 320 * 480 * 4;  // 320 x 480 RGBA pixels

// allocate array and read pixels into it.
GLubyte *buffer = (GLubyte *) malloc(myDataLength);
glReadPixels(0, 0, 320, 480, GL_RGBA, GL_UNSIGNED_BYTE, buffer);

// gl renders "upside down" so swap top to bottom into new array.
// there's gotta be a better way, but this works.
GLubyte *buffer2 = (GLubyte *) malloc(myDataLength);
for(int y = 0; y < 480; y++)
{
    for(int x = 0; x < 320 * 4; x++)
    {
        buffer2[(479 - y) * 320 * 4 + x] = buffer[y * 4 * 320 + x];
    }
}

// make data provider with data.
CGDataProviderRef provider = CGDataProviderCreateWithData(NULL, buffer2, myDataLength, NULL);

// prep the ingredients
int bitsPerComponent = 8;
int bitsPerPixel = 32;
int bytesPerRow = 4 * 320;
CGColorSpaceRef colorSpaceRef = CGColorSpaceCreateDeviceRGB();
CGBitmapInfo bitmapInfo = kCGBitmapByteOrderDefault;
CGColorRenderingIntent renderingIntent = kCGRenderingIntentDefault;

// make the cgimage
CGImageRef image = CGImageCreate(320, 480, bitsPerComponent, bitsPerPixel, bytesPerRow,
                                 colorSpaceRef, bitmapInfo, provider, NULL, false, renderingIntent);
// PRINT image... It's black!!!
CGDataProviderRelease(provider);
free(buffer);
free(buffer2);
Before you do a glReadPixels call you must:
- set proper packing (see the glPixelStorei reference page)
- select the right buffer to read from with glReadBuffer (front after swapping, back before swapping; I recommend swapping first and reading from the front buffer)
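A minimal sketch of those two steps in front of the question's read (GL_FRONT assumes the buffers have already been swapped):

glReadBuffer(GL_FRONT);              /* read the buffer that was just presented */
glPixelStorei(GL_PACK_ALIGNMENT, 1); /* tightly packed rows, no 4-byte padding */
glReadPixels(0, 0, 320, 480, GL_RGBA, GL_UNSIGNED_BYTE, buffer);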
