I'm trying to dump YUV420 data into the AVFrame structure of FFmpeg. From the link below:
http://ffmpeg.org/doxygen/trunk/structAVFrame.html, I can derive that I need to put my data into
data[AV_NUM_DATA_POINTERS]
using
linesize[AV_NUM_DATA_POINTERS].
The YUV data I'm trying to dump is YUV420 and the picture size is 416x240. So how do I dump/map this YUV data to the AVFrame structure's variables? I know that linesize represents the stride, i.e. I suppose the width of my picture. I have tried some combinations but do not get the output. I kindly request you to help me map the buffer. Thanks in advance.
AVFrame can be interpreted as an AVPicture to fill the data and linesize fields. The easiest way to fill these fields is to use the avpicture_fill function.
How you fill in the AVFrame's Y, U and V buffers depends on your input data and on what you want to do with the frame (do you want to write into the AVFrame and erase the initial data, or keep a copy?).
If the buffers are large enough (at least linesize[0] * height for the Y data, linesize[1 or 2] * height/2 for the U/V data), you can use the input buffers directly:
// Initialize the AVFrame
AVFrame* frame = av_frame_alloc();
frame->width = width;
frame->height = height;
frame->format = AV_PIX_FMT_YUV420P;
// Initialize frame->linesize
avpicture_fill((AVPicture*)frame, NULL, frame->format, frame->width, frame->height);
// Set frame->data pointers manually
frame->data[0] = inputBufferY;
frame->data[1] = inputBufferU;
frame->data[2] = inputBufferV;
// Or if your Y, U, V buffers are contiguous and have the correct size, simply use:
// avpicture_fill((AVPicture*)frame, inputBufferYUV, frame->format, frame->width, frame->height);
If you want/need to manipulate a copy of input data, you need to compute the needed buffer size, and copy input data in it.
// Initialize the AVFrame
AVFrame* frame = av_frame_alloc();
frame->width = width;
frame->height = height;
frame->format = AV_PIX_FMT_YUV420P;
// Allocate a buffer large enough for all data
int size = avpicture_get_size(frame->format, frame->width, frame->height);
uint8_t* buffer = (uint8_t*)av_malloc(size);
// Initialize frame->linesize and frame->data pointers
avpicture_fill((AVPicture*)frame, buffer, frame->format, frame->width, frame->height);
// Copy data from the 3 input buffers
memcpy(frame->data[0], inputBufferY, frame->linesize[0] * frame->height);
memcpy(frame->data[1], inputBufferU, frame->linesize[1] * frame->height / 2);
memcpy(frame->data[2], inputBufferV, frame->linesize[2] * frame->height / 2);
Once you are done with the AVFrame, do not forget to free it with av_frame_free (and any buffer allocated by av_malloc).
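For example, in the copy case above (a minimal cleanup sketch using the frame and buffer variables from the second snippet):
// Free the AVFrame structure itself; this does not free input buffers you own elsewhere
av_frame_free(&frame);
// Free the buffer that was allocated with av_malloc
av_free(buffer);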
A helper like the following computes the size of each plane from the pixel format descriptor:
int ff_get_format_plane_size(enum AVPixelFormat fmt, int plane, int scanLine, int height)
{
    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(fmt);
    if (desc)
    {
        int h = height;
        if (plane == 1 || plane == 2)
        {
            h = FF_CEIL_RSHIFT(height, desc->log2_chroma_h);
        }
        return h * scanLine;
    }
    else
        return AVERROR(EINVAL);
}
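For instance (a hypothetical usage sketch for the 416x240 YUV420P picture from the question, assuming frame->linesize has already been filled in):
// Hypothetical usage: plane sizes for a 416x240 YUV420P picture
int ySize = ff_get_format_plane_size(AV_PIX_FMT_YUV420P, 0, frame->linesize[0], 240); // Y plane
int uSize = ff_get_format_plane_size(AV_PIX_FMT_YUV420P, 1, frame->linesize[1], 240); // U plane, height halved internally
int vSize = ff_get_format_plane_size(AV_PIX_FMT_YUV420P, 2, frame->linesize[2], 240); // V plane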
I have to fill the ffmpeg AVFrame->data from a cairo surface pixel data. I have this code:
/* Image info and pixel data */
width = cairo_image_surface_get_width( surface );
height = cairo_image_surface_get_height( surface );
stride = cairo_image_surface_get_stride( surface );
pix = cairo_image_surface_get_data( surface );
for( row = 0; row < height; row++ )
{
    data = pix + row * stride;
    for( col = 0; col < width; col++ )
    {
        img->video_frame->data[0][row * img->video_frame->linesize[0] + col] = data[0];
        img->video_frame->data[1][row * img->video_frame->linesize[1] + col] = data[1];
        //img->video_frame->data[2][row * img->video_frame->linesize[2] + col] = data[2];
        data += 4;
    }
    img->video_frame->pts++;
}
But the colors in the exported video are wrong. The original heart is red. Can someone point me in the right direction? The encode.c example is sadly useless, and on the Internet there is a lot of confusion about Y, Cr and Cb, which I really don't understand. Please feel free to ask for more details. Many thanks.
You need to use libswscale to convert the source image data from RGB24 to YUV420P.
Something like:
int width = cairo_image_surface_get_width( surface );
int height = cairo_image_surface_get_height( surface );
int stride = cairo_image_surface_get_stride( surface );
uint8_t *pix = cairo_image_surface_get_data( surface );
uint8_t *data[1] = { pix };
int linesize[1] = { stride };
struct SwsContext *sws_ctx = sws_getContext(width, height, AV_PIX_FMT_RGB24,
                                            width, height, AV_PIX_FMT_YUV420P,
                                            SWS_BILINEAR, NULL, NULL, NULL);
sws_scale(sws_ctx, (const uint8_t * const *)data, linesize, 0, height,
          img->video_frame->data, img->video_frame->linesize);
sws_freeContext(sws_ctx);
See the example here: scaling_video
I use FFmpeg to decode an H264 stream. After I get the decoded YUV420 frames, I want to convert them to RGB24.
struct SwsContext * ctx = NULL;
// frame is AVFrame in YUV420 obtained from decoder. It has all three strides and seem to be valid.
if (ctx == NULL)
{
    ctx = sws_getContext(frame->width, frame->height, frame->format, frame->width, frame->height,
                         AV_PIX_FMT_RGB24, SWS_BICUBIC, 0, 0, 0);
}
AVFrame* frame2 = av_frame_alloc();
int num_bytes = av_image_get_buffer_size(AV_PIX_FMT_RGB24, frame->width, frame->height, 32);
uint8_t* frame2_buffer = (uint8_t *)av_malloc(num_bytes * sizeof(uint8_t));
int size = av_image_fill_arrays(frame2->data, frame2->linesize, frame2_buffer, AV_PIX_FMT_RGB24, frame->width, frame->height, 32);
int height_of_output = sws_scale(ctx, frame->data, frame->linesize, 0, frame->height, frame2->data, frame2->linesize);
callbackFullRGB(state, frameIndex, 0, frame2->data[0], num_bytes, (__int32)frame2->format, (__int32)frame2->width, (__int32)frame2->height);
av_frame_free(&frame2);
However, frame2 has no resolution set, its pixel format is -1, and its data buffer is empty. My input is 1280x720, and the stride is set to 3840 for the output frame, which is correct. sws_scale also returns 720 as a result - no errors, no exceptions.
What might be wrong?
My question is: what should I do when I use real-time time stretching?
I understand that changing the rate changes the number of output samples.
For example, if I stretch audio with a 2.0 coefficient, the output buffer is twice as big.
So, what should I do if I implement reverb, delay or real-time time stretching?
For example, my input buffer is 1024 samples. Then I stretch the audio with a 2.0 coefficient. Now my buffer is 2048 samples.
In the code below, which uses the Superpowered audio stretch, everything works as long as I do not change the rate. When I change the rate, the sound is distorted and the speed does not actually change.
return ^AUAudioUnitStatus(AudioUnitRenderActionFlags *actionFlags,
                          const AudioTimeStamp *timestamp,
                          AVAudioFrameCount frameCount,
                          NSInteger outputBusNumber,
                          AudioBufferList *outputBufferListPtr,
                          const AURenderEvent *realtimeEventListHead,
                          AURenderPullInputBlock pullInputBlock) {
    pullInputBlock(actionFlags, timestamp, frameCount, 0, renderABLCapture);
    Float32 *sampleDataInLeft = (Float32 *)renderABLCapture->mBuffers[0].mData;
    Float32 *sampleDataInRight = (Float32 *)renderABLCapture->mBuffers[1].mData;
    Float32 *sampleDataOutLeft = (Float32 *)outputBufferListPtr->mBuffers[0].mData;
    Float32 *sampleDataOutRight = (Float32 *)outputBufferListPtr->mBuffers[1].mData;
    SuperpoweredAudiobufferlistElement inputBuffer;
    inputBuffer.samplePosition = 0;
    inputBuffer.startSample = 0;
    inputBuffer.samplesUsed = 0;
    inputBuffer.endSample = frameCount;
    inputBuffer.buffers[0] = SuperpoweredAudiobufferPool::getBuffer(frameCount * 8 + 64);
    inputBuffer.buffers[1] = inputBuffer.buffers[2] = inputBuffer.buffers[3] = NULL;
    SuperpoweredInterleave(sampleDataInLeft, sampleDataInRight, (Float32 *)inputBuffer.buffers[0], frameCount);
    timeStretch->setRateAndPitchShift(1.0f, -2);
    timeStretch->setSampleRate(48000);
    timeStretch->process(&inputBuffer, outputBuffers);
    if (outputBuffers->makeSlice(0, outputBuffers->sampleLength)) {
        int numSamples = 0;
        int samplesOffset = 0;
        while (true) {
            Float32 *timeStretchedAudio = (Float32 *)outputBuffers->nextSliceItem(&numSamples);
            if (!timeStretchedAudio) break;
            SuperpoweredDeInterleave(timeStretchedAudio, sampleDataOutLeft + samplesOffset, sampleDataOutRight + samplesOffset, numSamples);
            samplesOffset += numSamples;
        }
        outputBuffers->clear();
    }
    return noErr;
};
So, how can I create my Audio Unit render block when my input and output buffers have different sample counts (reverb, delay or time stretch)?
If your process creates more samples than fit in the buffer provided by the audio callback, you have to save those samples and play them later, mixing them into the output of a subsequent audio unit callback if necessary.
Often circular buffers are used to decouple input, processing, and output sample rates or buffer sizes.
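For example, a minimal single-channel ring buffer sketch in plain C (hypothetical names, not from any particular SDK) that holds leftover time-stretched samples between render callbacks might look like this:
#include <stddef.h>
// Minimal single-channel float ring buffer (sketch, not production code).
// Capacity should be several times the maximum callback buffer size.
typedef struct {
    float *data;
    size_t capacity;   // total slots
    size_t readPos;    // next index to read
    size_t writePos;   // next index to write
    size_t count;      // samples currently stored
} RingBuffer;
// Push samples produced by the time stretcher; drops samples if the buffer is full.
static void ring_write(RingBuffer *rb, const float *src, size_t n) {
    for (size_t i = 0; i < n && rb->count < rb->capacity; i++) {
        rb->data[rb->writePos] = src[i];
        rb->writePos = (rb->writePos + 1) % rb->capacity;
        rb->count++;
    }
}
// Pull exactly n samples for the audio unit output; zero-pads on underrun.
static void ring_read(RingBuffer *rb, float *dst, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (rb->count > 0) {
            dst[i] = rb->data[rb->readPos];
            rb->readPos = (rb->readPos + 1) % rb->capacity;
            rb->count--;
        } else {
            dst[i] = 0.0f; // underrun: output silence
        }
    }
}
In the render block above, the de-interleaved output of the time stretcher would be pushed into one such buffer per channel, and exactly frameCount samples would be pulled back out for the output AudioBufferList, no matter how many samples the stretcher produced in that particular callback.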
I am trying to convert RGB frames to YUV420P format in ffmpeg/libav. Following is the code for the conversion, and also the images before and after conversion. The converted image loses all color information, and the scale also changes significantly. Does anybody have an idea how to handle this? I am completely new to ffmpeg/libav!
// Did we get a video frame?
if(frameFinished)
{
    i++;
    sws_scale(img_convert_ctx, (const uint8_t * const *)pFrame->data,
              pFrame->linesize, 0, pCodecCtx->height,
              pFrameRGB->data, pFrameRGB->linesize);
    //==============================================================
    AVFrame *pFrameYUV = avcodec_alloc_frame();
    // Determine required buffer size and allocate buffer
    int numBytes2 = avpicture_get_size(PIX_FMT_RGB24, pCodecCtx->width,
                                       pCodecCtx->height);
    uint8_t *buffer = (uint8_t *)av_malloc(numBytes2*sizeof(uint8_t));
    avpicture_fill((AVPicture *)pFrameYUV, buffer, PIX_FMT_RGB24,
                   pCodecCtx->width, pCodecCtx->height);
    rgb_to_yuv_ctx = sws_getContext(pCodecCtx->width, pCodecCtx->height,
                                    PIX_FMT_RGB24,
                                    pCodecCtx->width,pCodecCtx->height,
                                    PIX_FMT_RGB24,
                                    SWS_BICUBIC, NULL,NULL,NULL);
    sws_scale(rgb_to_yuv_ctx, pFrameRGB->data, pFrameRGB->linesize, 0,
              pCodecCtx->height, pFrameYUV->data, pFrameYUV->linesize);
    sws_freeContext(rgb_to_yuv_ctx);
    SaveFrame(pFrameYUV, pCodecCtx->width, pCodecCtx->height, i);
    av_free(buffer);
    av_free(pFrameYUV);
}
Well, for starters, I will assume that where you have:
rgb_to_yuv_ctx = sws_getContext(pCodecCtx->width, pCodecCtx->height,
PIX_FMT_RGB24,
pCodecCtx->width,pCodecCtx->height,
PIX_FMT_RGB24,
SWS_BICUBIC, NULL,NULL,NULL);
You really intended:
rgb_to_yuv_ctx = sws_getContext(pCodecCtx->width, pCodecCtx->height,
PIX_FMT_RGB24,
pCodecCtx->width,pCodecCtx->height,
PIX_FMT_YUV420P,
SWS_BICUBIC, NULL,NULL,NULL);
I'm also not sure why you are calling swscale twice!
YUV is a planar format. This means all three channels are stored independently. Where RGB is stored like:
RGBRGBRGB
YUV420P is stored like:
YYYYYYYYYYYYYYYY..UUUUUUUUUU..VVVVVVVV
So swscale requires you to give it three pointers.
Next, you want your line stride to be a multiple of 16 or 32 so the vector units of the processor can be used. And finally, the dimensions of the Y plane need to be divisible by two (because the U and V planes are a quarter of the size of the Y plane).
So, let's rewrite this:
#define RNDTO2(X) ( (X) & 0xFFFFFFFE )
#define RNDTO32(X) ( ( (X) % 32 ) ? ( ( (X) + 32 ) & 0xFFFFFFE0 ) : (X) )
if(frameFinished)
{
    static struct SwsContext *swsCtx = NULL;
    int width = RNDTO2 ( pCodecCtx->width );
    int height = RNDTO2 ( pCodecCtx->height );
    int ystride = RNDTO32 ( width );
    int uvstride = RNDTO32 ( width / 2 );
    int ysize = ystride * height;
    int uvsize = uvstride * ( height / 2 );
    int size = ysize + ( 2 * uvsize );
    uint8_t *pFrameYUV = (uint8_t *)malloc( size );
    uint8_t *plane[] = { pFrameYUV, pFrameYUV + ysize, pFrameYUV + ysize + uvsize, 0 };
    int stride[] = { ystride, uvstride, uvstride, 0 };
    swsCtx = sws_getCachedContext ( swsCtx, pCodecCtx->width, pCodecCtx->height,
                                    pCodecCtx->pix_fmt, width, height, AV_PIX_FMT_YUV420P,
                                    SWS_LANCZOS | SWS_ACCURATE_RND, NULL, NULL, NULL );
    sws_scale ( swsCtx, pFrame->data, pFrame->linesize, 0,
                pCodecCtx->height, plane, stride );
}
I also switched your algorithm to use SWS_LANCZOS | SWS_ACCURATE_RND. This will give you better looking images. Change it back if it is too slow. I also used the pixel format from the source frame instead of assuming it is RGB all the time.
I have a raw image buffer in the RGB format. I need to draw it to a CGContext so that I get a new buffer in the ARGB format. I accomplish this in the following way:
Create a data provider out of the raw buffer using CGDataProviderCreateWithData, and then create an image out of the data provider with the CGImageCreate API.
Then I write this image back to a CGBitmapContext using CGContextDrawImage.
Instead of creating an intermediate image, is there any way of writing the buffer directly to the CGContext so that I can avoid the image creation phase?
Thanks
If all you want is to take RGB data with no alpha component and turn it into ARGB data with full opacity (alpha = 1.0 at all points), why not just copy the data yourself into a new buffer?
// assuming 24-bit RGB (1 byte per color component)
unsigned char *rgb = /* ... */;
size_t rgb_bytes = /* ... */;
const size_t bpp_rgb = 3; // bytes per pixel - rgb
const size_t bpp_argb = 4; // bytes per pixel - argb
const size_t npixels = rgb_bytes / bpp_rgb;
unsigned char *argb = malloc(npixels * bpp_argb);
for (size_t i = 0; i < npixels; ++i) {
    const size_t argbi = bpp_argb * i;
    const size_t rgbi = bpp_rgb * i;
    argb[argbi] = 0xFF; // alpha - full opacity
    argb[argbi + 1] = rgb[rgbi]; // r
    argb[argbi + 2] = rgb[rgbi + 1]; // g
    argb[argbi + 3] = rgb[rgbi + 2]; // b
}
If you are using a CGBitmapContext then you can get a pointer to the bitmap buffer using the CGBitmapContextGetData() function. You can then write your data directly to the buffer.
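For example, a minimal sketch (assuming an 8-bit-per-component ARGB context, and that the image width, height and the argb buffer from the snippet above are available):
const size_t width = /* image width */, height = /* image height */;
// Let CoreGraphics allocate the backing store for an ARGB bitmap context
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
CGContextRef ctx = CGBitmapContextCreate(NULL, width, height, 8, width * 4, colorSpace,
                                         (CGBitmapInfo)kCGImageAlphaPremultipliedFirst);
CGColorSpaceRelease(colorSpace);
unsigned char *dst = CGBitmapContextGetData(ctx);       // pointer to the context's pixel buffer
size_t dstStride = CGBitmapContextGetBytesPerRow(ctx);  // may be padded beyond width * 4
for (size_t row = 0; row < height; row++) {
    // Copy one row of ARGB pixels straight into the context's backing buffer
    memcpy(dst + row * dstStride, argb + row * width * 4, width * 4);
}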