Decoder crashes after ffmpeg upgrade - ffmpeg

Recently I upgraded ffmpeg from 0.9 to 1.0 (tested on Win7x64 and on iOS), and now avcodec_decode_video2 segfaults. Long story short: the crash occurs every time the video dimensions change (e.g. from 320x240 to 160x120 or vice versa).
I receive an MPEG-4 video stream from a proprietary source and decode it like this:
// once, during initialization:
AVCodec *codec_ = avcodec_find_decoder(CODEC_ID_MPEG4);
AVCodecContext *ctx_ = avcodec_alloc_context3(codec_); // returns a pointer
avcodec_open2(ctx_, codec_, 0);
AVPacket packet_;
av_init_packet(&packet_);
AVFrame *picture_ = avcodec_alloc_frame(); // returns a pointer
// on every frame:
int got_picture;
packet_.size = size;
packet_.data = (uint8_t *)buffer;
avcodec_decode_video2(ctx_, picture_, &got_picture, &packet_);
Again, all the above had worked flawlessly until I upgraded to 1.0. Now every time the frame dimensions change - avcodec_decode_video2 crashes. Note that I don't assign width/height in AVCodecContext - neither in the beginning, nor when the stream changes - can it be the reason?
I'd appreciate any idea!
Update: setting ctx_->width and ctx_->height doesn't help.
Update2: just before the crash I get the following log messages:
mpeg4, level 24: "Found 2 unreleased buffers!".
level 8: "Assertion i < avci->buffer_count failed at libavcodec/utils.c:603"
Update3: upgrading to 1.1.2 fixed this crash. The decoder is again able to cope with dimension changes on the fly.

You can try filling in AVPacket::side_data. When the frame size changes, the codec picks up the new dimensions from it (see the apply_param_change function in libavcodec/utils.c).
This structure can be filled as follows:
int my_ff_add_param_change(AVPacket *pkt, int32_t width, int32_t height)
{
uint32_t flags = 0;
int size = 4 * 3;
uint8_t *data;
if (!pkt)
return AVERROR(EINVAL);
flags = AV_SIDE_DATA_PARAM_CHANGE_DIMENSIONS;
data = av_packet_new_side_data(pkt, AV_PKT_DATA_PARAM_CHANGE, size);
if (!data)
return AVERROR(ENOMEM);
((uint32_t*)data)[0] = flags;
((uint32_t*)data)[1] = width;
((uint32_t*)data)[2] = height;
return 0;
}
You need to call this function every time the size changes.
I think this feature appeared recently. I didn't know about it until I looked at the new ffmpeg sources.
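The function above stores the three values through a uint32_t cast, which uses the host byte order. As far as I can tell from the ffmpeg sources, apply_param_change reads these fields as little-endian, so a byte-order-independent way to build the 12-byte payload may be safer. The sketch below is plain C with no FFmpeg dependency; put_le32, pack_dimension_change and the PARAM_CHANGE_* names are hypothetical helpers, and the flag value 0x0008 is assumed to match AV_SIDE_DATA_PARAM_CHANGE_DIMENSIONS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: same value as AV_SIDE_DATA_PARAM_CHANGE_DIMENSIONS in libavcodec. */
#define PARAM_CHANGE_DIMENSIONS   0x0008u
#define PARAM_CHANGE_PAYLOAD_SIZE 12

/* Store a 32-bit value least-significant byte first, whatever the host byte order is. */
static void put_le32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v & 0xff);
    dst[1] = (uint8_t)((v >> 8) & 0xff);
    dst[2] = (uint8_t)((v >> 16) & 0xff);
    dst[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Fill a buffer of at least 12 bytes with flags, width and height.
 * Returns the number of bytes written. */
static size_t pack_dimension_change(uint8_t *data, size_t size,
                                    uint32_t width, uint32_t height)
{
    assert(data != NULL && size >= PARAM_CHANGE_PAYLOAD_SIZE);
    put_le32(data, PARAM_CHANGE_DIMENSIONS);
    put_le32(data + 4, width);
    put_le32(data + 8, height);
    return PARAM_CHANGE_PAYLOAD_SIZE;
}
```

For a switch to 320x240 this writes the bytes 08 00 00 00, 40 01 00 00, F0 00 00 00. In real code the buffer would be the one returned by av_packet_new_side_data.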
UPD
As you write, the easiest method to solve the problem is to perform codec restart. Just call avcodec_close / avcodec_open2

I just ran into the same issue when my frames were changing size on the fly. However, calling avcodec_close/avcodec_open2 is superfluous. A cleaner way is to just reset your AVPacket data structure before the call to avcodec_decode_video2. Here is the code:
av_init_packet(&packet_);
The key here is that this call resets the optional fields of AVPacket to their defaults (it does not touch data and size, which you still set yourself). Check the docs for more info.

Related

VideoToolbox hardware encoded I frame not clear on Intel Mac

When I capture video from the camera on an Intel Mac and use VideoToolbox to hardware-encode the raw pixel buffers into H.264 slices, the encoded I-frames are not clear, so the video looks blurry every several seconds. These are the properties I set:
self.bitrate = 1000000;
self.frameRate = 20;
int interval_second = 2;
NSDictionary *compressionProperties = @{
    (id)kVTCompressionPropertyKey_ProfileLevel: (id)kVTProfileLevel_H264_High_AutoLevel,
    (id)kVTCompressionPropertyKey_RealTime: @YES,
    (id)kVTCompressionPropertyKey_AllowFrameReordering: @NO,
    (id)kVTCompressionPropertyKey_H264EntropyMode: (id)kVTH264EntropyMode_CABAC,
    (id)kVTCompressionPropertyKey_PixelTransferProperties: @{
        (id)kVTPixelTransferPropertyKey_ScalingMode: (id)kVTScalingMode_Trim,
    },
    (id)kVTCompressionPropertyKey_AverageBitRate: @(self.bitrate),
    (id)kVTCompressionPropertyKey_ExpectedFrameRate: @(self.frameRate),
    (id)kVTCompressionPropertyKey_MaxKeyFrameInterval: @(self.frameRate * interval_second),
    (id)kVTCompressionPropertyKey_MaxKeyFrameIntervalDuration: @(interval_second),
    (id)kVTCompressionPropertyKey_DataRateLimits: @[@(self.bitrate / 8), @1.0],
};
result = VTSessionSetProperties(self.compressionSession, (CFDictionaryRef)compressionProperties);
if (result != noErr) {
    NSLog(@"VTSessionSetProperties failed: %d", (int)result);
    return;
} else {
    NSLog(@"VTSessionSetProperties succeeded");
}
These are very strange compression settings. Do you really need short GOP and very strict data rate limits?
I very much suspect you just copied some code off the internet without having any idea what it does. If that's the case, just set interval_second = 300 and remove kVTCompressionPropertyKey_DataRateLimits completely.
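To see why these settings are tight, it may help to work out the byte budget they imply. DataRateLimits takes pairs of (bytes, seconds), so the question's value caps every one-second window at bitrate / 8 bytes, which is exactly the average rate. The sketch below is only illustrative arithmetic; the function names are hypothetical and nothing here calls VideoToolbox.

```c
#include <assert.h>

/* Average number of bytes available per frame at a given bitrate and frame rate. */
static long average_bytes_per_frame(long bitrate_bps, int fps)
{
    assert(bitrate_bps > 0 && fps > 0);
    return bitrate_bps / 8 / fps;
}

/* Hard cap in bytes for one DataRateLimits window of window_seconds. */
static long window_byte_cap(long bitrate_bps, int window_seconds)
{
    assert(bitrate_bps > 0 && window_seconds > 0);
    return bitrate_bps / 8 * window_seconds;
}

/* Number of frames between forced key frames. */
static int frames_per_gop(int fps, int interval_seconds)
{
    assert(fps > 0 && interval_seconds > 0);
    return fps * interval_seconds;
}
```

With the question's values (1000000 bit/s, 20 fps, 2-second interval) this gives 6250 bytes per frame on average, a cap of 125000 bytes per second, and a key frame every 40 frames. Since the cap equals the average, the encoder has no headroom for an I-frame, which generally needs more bytes than a P-frame at the same quality; that would be consistent with the periodic blur described.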

FFmpeg transcoded sound (AAC) stops after half video time

I have a strange problem in my C/C++ FFmpeg transcoder, which takes an input MP4 (varying input codecs) and produces an output MP4 (x264, baseline & AAC LC @44100 sample rate with libfdk_aac):
The resulting mp4 video has fine images (x264) and the audio (AAC LC) works fine as well, but it only plays until exactly halfway through the video.
The audio is not slowed down, not stretched and doesn't stutter. It just stops right in the middle of the video.
One hint may be that the input file has a sample rate of 22050 and 22050/44100 is 0.5, but I really don't get why this would make the sound just stop after half the time. I'd expect such an error leading to sound being at the wrong speed. Everything works just fine if I don't try to enforce 44100 and instead just use the incoming sample_rate.
Another guess would be that the pts calculation doesn't work. But the audio sounds just fine (until it stops) and I do exactly the same for the video part, where it works flawlessly. "Exactly", as in the same code, but "audio"-variables replaced with "video"-variables.
FFmpeg reports no errors during the whole process. I also flush the decoders/encoders/interleaved_writing after all the packet reading from the input is done. It works well for the video so I doubt there is much wrong with my general approach.
Here are the functions of my code (stripped of the error handling & other class stuff):
AudioCodecContext Setup
outContext->_audioCodec = avcodec_find_encoder(outContext->_audioTargetCodecID);
outContext->_audioStream =
avformat_new_stream(outContext->_formatContext, outContext->_audioCodec);
outContext->_audioCodecContext = outContext->_audioStream->codec;
outContext->_audioCodecContext->channels = 2;
outContext->_audioCodecContext->channel_layout = av_get_default_channel_layout(2);
outContext->_audioCodecContext->sample_rate = 44100;
outContext->_audioCodecContext->sample_fmt = outContext->_audioCodec->sample_fmts[0];
outContext->_audioCodecContext->bit_rate = 128000;
outContext->_audioCodecContext->strict_std_compliance = FF_COMPLIANCE_EXPERIMENTAL;
outContext->_audioCodecContext->time_base =
(AVRational){1, outContext->_audioCodecContext->sample_rate};
outContext->_audioStream->time_base = (AVRational){1, outContext->_audioCodecContext->sample_rate};
int retVal = avcodec_open2(outContext->_audioCodecContext, outContext->_audioCodec, NULL);
Resampler Setup
outContext->_audioResamplerContext =
swr_alloc_set_opts( NULL, outContext->_audioCodecContext->channel_layout,
outContext->_audioCodecContext->sample_fmt,
outContext->_audioCodecContext->sample_rate,
_inputContext._audioCodecContext->channel_layout,
_inputContext._audioCodecContext->sample_fmt,
_inputContext._audioCodecContext->sample_rate,
0, NULL);
int retVal = swr_init(outContext->_audioResamplerContext);
Decoding
decodedBytes = avcodec_decode_audio4( _inputContext._audioCodecContext,
_inputContext._audioTempFrame,
&p_gotAudioFrame, &_inputContext._currentPacket);
Converting (only if decoding produced a frame, of course)
int retVal = swr_convert( outContext->_audioResamplerContext,
outContext->_audioConvertedFrame->data,
outContext->_audioConvertedFrame->nb_samples,
(const uint8_t**)_inputContext._audioTempFrame->data,
_inputContext._audioTempFrame->nb_samples);
Encoding (only if decoding produced a frame, of course)
outContext->_audioConvertedFrame->pts =
av_frame_get_best_effort_timestamp(_inputContext._audioTempFrame);
// Init the new packet
av_init_packet(&outContext->_audioPacket);
outContext->_audioPacket.data = NULL;
outContext->_audioPacket.size = 0;
// Encode
int retVal = avcodec_encode_audio2( outContext->_audioCodecContext,
&outContext->_audioPacket,
outContext->_audioConvertedFrame,
&p_gotPacket);
// Set pts/dts time stamps for writing interleaved
av_packet_rescale_ts( &outContext->_audioPacket,
outContext->_audioCodecContext->time_base,
outContext->_audioStream->time_base);
outContext->_audioPacket.stream_index = outContext->_audioStream->index;
Writing (only if encoding produced a packet, of course)
int retVal = av_interleaved_write_frame(outContext->_formatContext, &outContext->_audioPacket);
I am quite out of ideas about what would cause such a behaviour.
So, I finally managed to figure things out myself.
The problem was indeed in the difference of the sample_rate.
You'd assume that a call to swr_convert() would give you all the samples you need for converting the audio frame when called like I did.
Of course, that would be too easy.
Instead, you need to call swr_convert (potentially) multiple times per frame and buffer its output, if required. Then you need to grab a single frame from the buffer and that is what you will have to encode.
Here is my new convertAudioFrame function:
// Calculate number of output samples
int numOutputSamples = av_rescale_rnd(
swr_get_delay(outContext->_audioResamplerContext, _inputContext._audioCodecContext->sample_rate)
+ _inputContext._audioTempFrame->nb_samples,
outContext->_audioCodecContext->sample_rate,
_inputContext._audioCodecContext->sample_rate,
AV_ROUND_UP);
if (numOutputSamples == 0)
{
return;
}
uint8_t* tempSamples;
av_samples_alloc( &tempSamples, NULL,
outContext->_audioCodecContext->channels, numOutputSamples,
outContext->_audioCodecContext->sample_fmt, 0);
int retVal = swr_convert( outContext->_audioResamplerContext,
&tempSamples,
numOutputSamples,
(const uint8_t**)_inputContext._audioTempFrame->data,
_inputContext._audioTempFrame->nb_samples);
// Write to audio fifo
if (retVal > 0)
{
retVal = av_audio_fifo_write(outContext->_audioFifo, (void**)&tempSamples, retVal);
}
av_freep(&tempSamples);
// Get a frame from audio fifo
int samplesAvailable = av_audio_fifo_size(outContext->_audioFifo);
if (samplesAvailable > 0)
{
retVal = av_audio_fifo_read(outContext->_audioFifo,
(void**)outContext->_audioConvertedFrame->data,
outContext->_audioCodecContext->frame_size);
// We got a frame, so also set its pts
if (retVal > 0)
{
p_gotConvertedFrame = 1;
if (_inputContext._audioTempFrame->pts != AV_NOPTS_VALUE)
{
outContext->_audioConvertedFrame->pts = _inputContext._audioTempFrame->pts;
}
else if (_inputContext._audioTempFrame->pkt_pts != AV_NOPTS_VALUE)
{
outContext->_audioConvertedFrame->pts = _inputContext._audioTempFrame->pkt_pts;
}
}
}
I basically call this function until there are no more frames in the audio fifo buffer.
So, the audio was only half as long because I only encoded as many frames as I decoded, whereas I actually needed to encode twice as many frames because of the doubled sample_rate.
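The frame-count arithmetic behind this can be checked without FFmpeg. The sketch below is a toy model: output_samples repeats the rounding-up calculation from the av_rescale_rnd call (without its overflow protection), SampleFifo only counts buffered samples instead of storing them, and all names are hypothetical. It also assumes the resampler returns exactly the computed number of samples, which holds for 22050 to 44100 but is only an upper bound in general.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on output samples for one resampler call:
 * ceil((delay + in_samples) * out_rate / in_rate). */
static int64_t output_samples(int64_t delay, int64_t in_samples,
                              int out_rate, int in_rate)
{
    assert(in_rate > 0 && out_rate > 0 && delay >= 0 && in_samples >= 0);
    return ((delay + in_samples) * out_rate + in_rate - 1) / in_rate;
}

/* Toy FIFO that only tracks how many samples are buffered. */
typedef struct { int64_t buffered; } SampleFifo;

static void fifo_write(SampleFifo *f, int64_t n) { f->buffered += n; }

/* Remove one full encoder frame if enough samples are buffered.
 * Returns 1 if a frame was taken, 0 otherwise. */
static int fifo_read_frame(SampleFifo *f, int frame_size)
{
    if (f->buffered < frame_size)
        return 0;
    f->buffered -= frame_size;
    return 1;
}

/* Count the encoder frames produced from a number of decoded input frames. */
static int count_encoded_frames(int decoded_frames, int in_frame_samples,
                                int in_rate, int out_rate, int enc_frame_size)
{
    SampleFifo fifo = { 0 };
    int encoded = 0;
    for (int i = 0; i < decoded_frames; i++) {
        fifo_write(&fifo, output_samples(0, in_frame_samples, out_rate, in_rate));
        /* Drain every complete frame before decoding the next input frame. */
        while (fifo_read_frame(&fifo, enc_frame_size))
            encoded++;
    }
    return encoded;
}
```

For 100 decoded frames of 1024 samples at 22050 Hz and an encoder frame size of 1024 at 44100 Hz, count_encoded_frames returns 200; encoding only one frame per decoded frame would give 100, i.e. half the duration. Note that this sketch waits until a full frame is buffered before reading, which is a choice on my part: encoders with a fixed frame size, such as AAC, generally expect full frames for everything but the last one.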

Is it possible to decode MPEG4 frames without delay with ffmpeg?

I use ffmpeg's MPEG4 decoder. Among others, the decoder has the CODEC_CAP_DELAY capability, which means it will give me decoded frames with a latency of one frame.
I have a set of MPEG4 (I- and P-) frames from an AVI file and feed them to the ffmpeg decoder. For the very first I-frame the decoder gives me nothing, but it decodes the frames successfully. I can force the decoder to return the decoded frame with a second call to avcodec_decode_video2 with null data (flushing it), but if I do so for each frame I get artifacts in the first group of pictures (e.g. the second decoded P-frame is gray).
If I do not force the ffmpeg decoder to give me the decoded frame right away, it works flawlessly and without artifacts.
Question: is it possible to get a decoded frame without giving the decoder the next frame, and without artifacts?
Small example of how decoding is implemented for each frame:
// decode
int got_frame = 0;
int err = 0;
int tries = 5;
do
{
err = avcodec_decode_video2(m_CodecContext, m_Frame, &got_frame, &m_Packet);
/* some codecs, such as MPEG, transmit the I and P frame with a
latency of one frame. You must do the following to have a
chance to get the last frame of the video */
m_Packet.data = NULL;
m_Packet.size = 0;
--tries;
}
while (err >= 0 && got_frame == 0 && tries > 0);
But as I said that gave me artifacts for the first gop.
Use the "-flags +low_delay" option (or in code, set AVCodecContext.flags |= CODEC_FLAG_LOW_DELAY).
I tested several options, and "-flags low_delay" and "-probesize 32" matter more than the others. The code below worked for me.
AVDictionary* avDic = nullptr;
av_dict_set(&avDic, "flags", "low_delay", 0);
av_dict_set(&avDic, "probesize", "32", 0);
const int errorCode = avformat_open_input(&pFormatCtx, mUrl.c_str(), nullptr, &avDic);
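To illustrate what the one-frame latency and the low_delay flag mean for the caller, here is a toy model in plain C. It is not how libavcodec is implemented; ToyDecoder and its functions are hypothetical, and frames are represented only by integer ids. Without low_delay the model holds each frame until the next one arrives (or until a flush); with low_delay it hands each frame back immediately.

```c
#include <assert.h>
#include <stddef.h>

#define NO_FRAME (-1)

typedef struct {
    int low_delay;  /* non-zero: output each frame as soon as it is decoded */
    int held;       /* id of the frame waiting to be output, or NO_FRAME */
} ToyDecoder;

static void toy_init(ToyDecoder *d, int low_delay)
{
    assert(d != NULL);
    d->low_delay = low_delay;
    d->held = NO_FRAME;
}

/* Feed one frame id, or NO_FRAME to flush.
 * Returns the id of the frame that comes out, or NO_FRAME if none does. */
static int toy_decode(ToyDecoder *d, int frame_id)
{
    int out;
    assert(d != NULL);
    if (d->low_delay)
        return frame_id;   /* nothing is held back */
    out = d->held;         /* the previously fed frame comes out now */
    d->held = frame_id;    /* the new frame waits for the next call */
    return out;
}
```

In this model, feeding frames 0 and 1 without low_delay returns NO_FRAME and then 0, and a flush returns 1; with low_delay, feeding frame 0 returns 0 right away.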

FFMPEG streaming RTP: time base not set

I'm trying to create a small demo to get a feeling for streaming programmatically with ffmpeg. I'm using the code from this question as a basis. I can compile my code, but when I try to run it I always get this error:
[rtp @ 0xbeb480] time base not set
The thing is, I have set the time base parameters. I even tried setting them for the stream (and the codec associated with the stream) as well, even though this should not be necessary as far as I understand it. This is the relevant section in my code:
AVCodec* codec = avcodec_find_encoder(AV_CODEC_ID_H264);
AVCodecContext* c = avcodec_alloc_context3(codec);
c->pix_fmt = AV_PIX_FMT_YUV420P;
c->flags = CODEC_FLAG_GLOBAL_HEADER;
c->width = WIDTH;
c->height = HEIGHT;
c->time_base.den = FPS;
c->time_base.num = 1;
c->gop_size = FPS;
c->bit_rate = BITRATE;
avcodec_open2(c, codec, NULL);
struct AVStream* stream = avformat_new_stream(avctx, codec);
// TODO: causes an error
avformat_write_header(avctx, NULL);
The error occurs when calling "avformat_write_header" near the end. All methods that can fail (like avcodec_open2) are checked, I just removed the checks to make the code more readable.
Digging through google and the ffmpeg source code didn't yield any useful results. I think it's really basic, but I'm stuck. Who can help me?
You are setting these values on the wrong codec context.
The streams created by avformat_new_stream() have their own internal codec contexts, the one you created with avcodec_alloc_context3() is unnecessary and has no effect on the workings of avformat_write_header().
To set the variables correctly, set them this way:
AVCodec* codec = avcodec_find_encoder(AV_CODEC_ID_H264);
struct AVStream* stream = avformat_new_stream(avctx, codec);
stream->codec->pix_fmt = AV_PIX_FMT_YUV420P;
stream->codec->flags = CODEC_FLAG_GLOBAL_HEADER;
stream->codec->width = WIDTH;
stream->codec->height = HEIGHT;
stream->codec->time_base = (AVRational){1,FPS};
stream->codec->gop_size = FPS;
stream->codec->bit_rate = BITRATE;
That solved this particular problem for me. I also applied the other answer given here, since that's how I have it set, though your method of setting the time_base probably would have worked too if you had been talking to the correct codec context.
Try:
c->time_base = (AVRational) {1, FPS};
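For context on what the time base does: a timestamp is a count of time_base units, so with a time base of 1/FPS the frame index is the timestamp, and the muxer rescales it to its own clock. The sketch below shows that conversion in plain C. Rational is a hypothetical stand-in for AVRational and rescale_ts mirrors the idea of av_rescale_q (round to nearest) for non-negative values, without its overflow protection. The 90 kHz target is the clock rate commonly used for video over RTP.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for AVRational. */
typedef struct { int num; int den; } Rational;

/* Convert ts from time base `from` to time base `to`, rounding to nearest.
 * Non-negative timestamps only; no overflow protection. */
static int64_t rescale_ts(int64_t ts, Rational from, Rational to)
{
    assert(ts >= 0 && from.num > 0 && from.den > 0 && to.num > 0 && to.den > 0);
    int64_t num = (int64_t)from.num * to.den;
    int64_t den = (int64_t)from.den * to.num;
    return (ts * num + den / 2) / den;
}
```

With FPS = 25, frame 1 in a 1/25 time base becomes 3600 ticks of a 1/90000 clock, and frame 25 becomes 90000 ticks, i.e. one second. If the time base is left unset, there is no way to do this conversion, which is what the muxer is complaining about.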

What is a 10.6-compatible means of recording video frames to a movie without using the QuickTime API?

I'm updating an application to be 64-bit-compatible, but I'm having a little difficulty with our movie recording code. We have a FireWire camera that feeds YUV frames into our application, which we process and encode out to disk within an MPEG4 movie. Currently, we are using the C-based QuickTime API to do this (using Image Compression Manager, etc.), but the old QuickTime API does not have support for 64 bit.
My first attempt was to use QTKit's QTMovie and encode individual frames using -addImage:forDuration:withAttributes:, but that requires the creation of an NSImage for each frame (which is computationally expensive) and it does not do temporal compression, so it doesn't generate the most compact files.
I'd like to use something like QTKit Capture's QTCaptureMovieFileOutput, but I can't figure out how to feed raw frames into that which aren't associated with a QTCaptureInput. We can't use our camera directly with QTKit Capture because of our need to manually control the gain, exposure, etc. for it.
On Lion, we now have the AVAssetWriter class in AVFoundation which lets you do this, but I still have to target Snow Leopard for the time being, so I'm trying to find a solution that works there as well.
Therefore, is there a way to do non-QuickTime frame-by-frame recording of video that is more efficient than QTMovie's -addImage:forDuration:withAttributes: and produces file sizes comparable to what the older QuickTime API can?
In the end, I decided to go with the approach suggested by TiansHUo, and use libavcodec for the video compression here. Based on the instructions by Martin here, I downloaded the FFmpeg source and built a 64-bit compatible version of the necessary libraries using
./configure --disable-gpl --arch=x86_64 --cpu=core2 --enable-shared --disable-amd3dnow --enable-memalign-hack --cc=llvm-gcc
make
sudo make install
This creates the LGPL shared libraries for the 64-bit Core2 processors in the Mac. At first I could not find a way to make the library run without crashing when the MMX optimizations were enabled, so I built with them disabled, which slowed down encoding somewhat. After some experimentation, I found that I could build a 64-bit version of the library which had MMX optimizations enabled and was stable on the Mac by using the above configuration options. This is much faster when encoding than the library built with MMX disabled.
Note that if you use these shared libraries, you should make sure you follow the LGPL compliance instructions on FFmpeg's site to the letter.
In order to get these shared libraries to function properly when placed in the proper folder within my Mac application bundle, I needed to use install_name_tool to adjust the internal search paths in these libraries to point to their new location in the Frameworks directory within the application bundle:
install_name_tool -id @executable_path/../Frameworks/libavutil.51.9.1.dylib libavutil.51.9.1.dylib
install_name_tool -id @executable_path/../Frameworks/libavcodec.53.7.0.dylib libavcodec.53.7.0.dylib
install_name_tool -change /usr/local/lib/libavutil.dylib @executable_path/../Frameworks/libavutil.51.9.1.dylib libavcodec.53.7.0.dylib
install_name_tool -id @executable_path/../Frameworks/libavformat.53.4.0.dylib libavformat.53.4.0.dylib
install_name_tool -change /usr/local/lib/libavutil.dylib @executable_path/../Frameworks/libavutil.51.9.1.dylib libavformat.53.4.0.dylib
install_name_tool -change /usr/local/lib/libavcodec.dylib @executable_path/../Frameworks/libavcodec.53.7.0.dylib libavformat.53.4.0.dylib
install_name_tool -id @executable_path/../Frameworks/libswscale.2.0.0.dylib libswscale.2.0.0.dylib
install_name_tool -change /usr/local/lib/libavutil.dylib @executable_path/../Frameworks/libavutil.51.9.1.dylib libswscale.2.0.0.dylib
Your specific paths may vary. This adjustment lets them work from within the application bundle without having to install them in /usr/local/lib on the user's system.
I then had my Xcode project link against these libraries, and I created a separate class to handle the video encoding. This class takes in raw video frames (in BGRA format) through the videoFrameToEncode property and encodes them within the movieFileName file as MPEG4 video in an MP4 container. The code is as follows:
SPVideoRecorder.h
#import <Foundation/Foundation.h>
#include "libavcodec/avcodec.h"
#include "libavformat/avformat.h"
#include "libswscale/swscale.h"
uint64_t getNanoseconds(void);
@interface SPVideoRecorder : NSObject
{
NSString *movieFileName;
CGFloat framesPerSecond;
AVCodecContext *codecContext;
AVStream *videoStream;
AVOutputFormat *outputFormat;
AVFormatContext *outputFormatContext;
AVFrame *videoFrame;
AVPicture inputRGBAFrame;
uint8_t *pictureBuffer;
uint8_t *outputBuffer;
unsigned int outputBufferSize;
int frameColorCounter;
unsigned char *videoFrameToEncode;
dispatch_queue_t videoRecordingQueue;
dispatch_semaphore_t frameEncodingSemaphore;
uint64_t movieStartTime;
}
@property(readwrite, assign) CGFloat framesPerSecond;
@property(readwrite, assign) unsigned char *videoFrameToEncode;
@property(readwrite, copy) NSString *movieFileName;
// Movie recording control
- (void)startRecordingMovie;
- (void)encodeNewFrameToMovie;
- (void)stopRecordingMovie;
@end
SPVideoRecorder.m
#import "SPVideoRecorder.h"
#include <sys/time.h>
@implementation SPVideoRecorder
uint64_t getNanoseconds(void)
{
struct timeval now;
gettimeofday(&now, NULL);
return now.tv_sec * NSEC_PER_SEC + now.tv_usec * NSEC_PER_USEC;
}
#pragma mark -
#pragma mark Initialization and teardown
- (id)init;
{
if (!(self = [super init]))
{
return nil;
}
/* must be called before using avcodec lib */
avcodec_init();
/* register all the codecs */
avcodec_register_all();
av_register_all();
av_log_set_level( AV_LOG_ERROR );
videoRecordingQueue = dispatch_queue_create("com.sonoplot.videoRecordingQueue", NULL);
frameEncodingSemaphore = dispatch_semaphore_create(1);
return self;
}
#pragma mark -
#pragma mark Movie recording control
- (void)startRecordingMovie;
{
dispatch_async(videoRecordingQueue, ^{
NSLog(@"Start recording to file: %@", movieFileName);
const char *filename = [movieFileName UTF8String];
// Use an MP4 container, in the standard QuickTime format so it's readable on the Mac
outputFormat = av_guess_format("mov", NULL, NULL);
if (!outputFormat) {
NSLog(@"Could not set output format");
}
outputFormatContext = avformat_alloc_context();
if (!outputFormatContext)
{
NSLog(@"avformat_alloc_context Error!");
}
outputFormatContext->oformat = outputFormat;
snprintf(outputFormatContext->filename, sizeof(outputFormatContext->filename), "%s", filename);
// Add a video stream to the MP4 file
videoStream = av_new_stream(outputFormatContext,0);
if (!videoStream)
{
NSLog(@"av_new_stream Error!");
}
// Use the MPEG4 encoder (other DiVX-style encoders aren't compatible with this container, and x264 is GPL-licensed)
AVCodec *codec = avcodec_find_encoder(CODEC_ID_MPEG4);
if (!codec) {
fprintf(stderr, "codec not found\n");
exit(1);
}
codecContext = videoStream->codec;
codecContext->codec_id = codec->id;
codecContext->codec_type = AVMEDIA_TYPE_VIDEO;
codecContext->bit_rate = 4800000;
codecContext->width = 640;
codecContext->height = 480;
codecContext->pix_fmt = PIX_FMT_YUV420P;
// codecContext->time_base = (AVRational){1,(int)round(framesPerSecond)};
// videoStream->time_base = (AVRational){1,(int)round(framesPerSecond)};
codecContext->time_base = (AVRational){1,200}; // Set it to 200 FPS so that we give a little wiggle room when recording at 50 FPS
videoStream->time_base = (AVRational){1,200};
// codecContext->max_b_frames = 3;
// codecContext->b_frame_strategy = 1;
codecContext->qmin = 1;
codecContext->qmax = 10;
// codecContext->mb_decision = 2; // -mbd 2
// codecContext->me_cmp = 2; // -cmp 2
// codecContext->me_sub_cmp = 2; // -subcmp 2
codecContext->keyint_min = (int)round(framesPerSecond);
// codecContext->flags |= CODEC_FLAG_4MV; // 4mv
// codecContext->flags |= CODEC_FLAG_LOOP_FILTER;
codecContext->i_quant_factor = 0.71;
codecContext->qcompress = 0.6;
// codecContext->max_qdiff = 4;
codecContext->flags2 |= CODEC_FLAG2_FASTPSKIP;
if(outputFormat->flags & AVFMT_GLOBALHEADER)
{
codecContext->flags |= CODEC_FLAG_GLOBAL_HEADER;
}
// Open the codec
if (avcodec_open(codecContext, codec) < 0)
{
NSLog(@"Couldn't initialize the codec");
return;
}
// Open the file for recording
if (avio_open(&outputFormatContext->pb, outputFormatContext->filename, AVIO_FLAG_WRITE) < 0)
{
NSLog(@"Couldn't open file");
return;
}
// Start by writing the video header
if (avformat_write_header(outputFormatContext, NULL) < 0)
{
NSLog(@"Couldn't write video header");
return;
}
// Set up the video frame and output buffers
outputBufferSize = 400000;
outputBuffer = malloc(outputBufferSize);
int size = codecContext->width * codecContext->height;
int pictureBytes = avpicture_get_size(PIX_FMT_YUV420P, codecContext->width, codecContext->height);
pictureBuffer = (uint8_t *)av_malloc(pictureBytes);
videoFrame = avcodec_alloc_frame();
videoFrame->data[0] = pictureBuffer;
videoFrame->data[1] = videoFrame->data[0] + size;
videoFrame->data[2] = videoFrame->data[1] + size / 4;
videoFrame->linesize[0] = codecContext->width;
videoFrame->linesize[1] = codecContext->width / 2;
videoFrame->linesize[2] = codecContext->width / 2;
avpicture_alloc(&inputRGBAFrame, PIX_FMT_BGRA, codecContext->width, codecContext->height);
frameColorCounter = 0;
movieStartTime = getNanoseconds();
});
}
- (void)encodeNewFrameToMovie;
{
// NSLog(@"Encode frame");
if (dispatch_semaphore_wait(frameEncodingSemaphore, DISPATCH_TIME_NOW) != 0)
{
return;
}
dispatch_async(videoRecordingQueue, ^{
// CFTimeInterval previousTimestamp = CFAbsoluteTimeGetCurrent();
frameColorCounter++;
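if (codecContext == NULL)
{
// Release the semaphore so that later frames are not blocked
dispatch_semaphore_signal(frameEncodingSemaphore);
return;
}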
// Take the input BGRA texture data and convert it to a YUV 4:2:0 planar frame
avpicture_fill(&inputRGBAFrame, videoFrameToEncode, PIX_FMT_BGRA, codecContext->width, codecContext->height);
struct SwsContext * img_convert_ctx = sws_getContext(codecContext->width, codecContext->height, PIX_FMT_BGRA, codecContext->width, codecContext->height, PIX_FMT_YUV420P, SWS_FAST_BILINEAR, NULL, NULL, NULL);
sws_scale(img_convert_ctx, (const uint8_t* const *)inputRGBAFrame.data, inputRGBAFrame.linesize, 0, codecContext->height, videoFrame->data, videoFrame->linesize);
// The conversion context is created per frame, so free it here to avoid leaking it
sws_freeContext(img_convert_ctx);
// Encode the frame
int out_size = avcodec_encode_video(codecContext, outputBuffer, outputBufferSize, videoFrame);
// Generate a packet and insert in the video stream
if (out_size != 0)
{
AVPacket videoPacket;
av_init_packet(&videoPacket);
if (codecContext->coded_frame->pts != AV_NOPTS_VALUE)
{
uint64_t currentFrameTime = getNanoseconds();
videoPacket.pts = av_rescale_q(((uint64_t)currentFrameTime - (uint64_t)movieStartTime) / 1000ull/*codecContext->coded_frame->pts*/, AV_TIME_BASE_Q/*codecContext->time_base*/, videoStream->time_base);
// NSLog(@"Frame time %lld, converted time: %lld", ((uint64_t)currentFrameTime - (uint64_t)movieStartTime) / 1000ull, videoPacket.pts);
}
if(codecContext->coded_frame->key_frame)
{
videoPacket.flags |= AV_PKT_FLAG_KEY;
}
videoPacket.stream_index = videoStream->index;
videoPacket.data = outputBuffer;
videoPacket.size = out_size;
int ret = av_write_frame(outputFormatContext, &videoPacket);
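if (ret < 0)
{
av_log(outputFormatContext, AV_LOG_ERROR, "%s","Error while writing frame.\n");
av_free_packet(&videoPacket);
// Release the semaphore before bailing out so that later frames are not blocked
dispatch_semaphore_signal(frameEncodingSemaphore);
return;
}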
av_free_packet(&videoPacket);
}
// CFTimeInterval frameDuration = CFAbsoluteTimeGetCurrent() - previousTimestamp;
// NSLog(#"Frame duration: %f ms", frameDuration * 1000.0);
dispatch_semaphore_signal(frameEncodingSemaphore);
});
}
- (void)stopRecordingMovie;
{
dispatch_async(videoRecordingQueue, ^{
// Write out the video trailer
if (av_write_trailer(outputFormatContext) < 0)
{
av_log(outputFormatContext, AV_LOG_ERROR, "%s","Error while writing trailer.\n");
exit(1);
}
// Close out the file
if (!(outputFormat->flags & AVFMT_NOFILE))
{
avio_close(outputFormatContext->pb);
}
// Free up all movie-related resources
avcodec_close(codecContext);
av_free(codecContext);
codecContext = NULL;
av_free(pictureBuffer); // allocated with av_malloc, so it must be released with av_free
free(outputBuffer);
av_free(videoFrame);
av_free(outputFormatContext);
av_free(videoStream);
});
}
#pragma mark -
#pragma mark Accessors
@synthesize framesPerSecond, videoFrameToEncode, movieFileName;
@end
This works under Lion and Snow Leopard in a 64-bit application. It records at the same bitrate as my previous QuickTime-based approach, with overall lower CPU usage.
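The frame buffer setup in startRecordingMovie relies on the planar YUV 4:2:0 layout: a full-resolution Y plane followed by U and V planes at half resolution in each direction. The sketch below spells out that arithmetic in plain C. Yuv420pLayout and yuv420p_layout are hypothetical names, and the layout assumes even dimensions and tightly packed rows with no padding, as the code above does.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t offset[3];    /* byte offset of the Y, U and V planes in the buffer */
    int    linesize[3];  /* bytes per row in each plane */
    size_t total;        /* total buffer size in bytes */
} Yuv420pLayout;

/* Tightly packed planar YUV 4:2:0 layout.
 * Assumes even width and height and no row padding. */
static Yuv420pLayout yuv420p_layout(int width, int height)
{
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    Yuv420pLayout l;
    size_t luma = (size_t)width * (size_t)height;
    l.offset[0] = 0;                /* Y: width x height */
    l.offset[1] = luma;             /* U: (width/2) x (height/2) */
    l.offset[2] = luma + luma / 4;  /* V: (width/2) x (height/2) */
    l.linesize[0] = width;
    l.linesize[1] = width / 2;
    l.linesize[2] = width / 2;
    l.total = luma + luma / 2;      /* 1.5 bytes per pixel */
    return l;
}
```

For the 640x480 frames used here, the U plane starts at byte 307200, the V plane at byte 384000, and the whole buffer is 460800 bytes.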
Hopefully, this will help out someone else in a similar situation.
I asked a very similar question of a QuickTime engineer last month at WWDC and they basically suggested using a 32-bit helper process...
I know that's not what you wanted to hear. ;)
Yes, there is (at least) a way to do non-QuickTime frame-by-frame recording of video that is more efficient and produces files comparable to Quicktime.
The open-source library libavcodec is perfect for your case of video encoding. It is used in very popular open-source and commercial software and libraries (for example: mplayer, google chrome, imagemagick, opencv). It also provides a huge number of options to tweak and numerous file formats (all important formats and lots of exotic formats). It is efficient and produces files at all kinds of bit-rates.
From Wikipedia:
libavcodec is a free software/open source LGPL-licensed library of
codecs for encoding and decoding video and audio data.[1] It is
provided by FFmpeg project or Libav project.[2] [3] libavcodec is an
integral part of many open-source multimedia applications and
frameworks. The popular MPlayer, xine and VLC media players use it as
their main, built-in decoding engine that enables playback of many
audio and video formats on all supported platforms. It is also used by
the ffdshow tryouts decoder as its primary decoding library.
libavcodec is also used in video editing and transcoding applications
like Avidemux, MEncoder or Kdenlive for both decoding and encoding.
libavcodec is particular in that it contains decoder and sometimes
encoder implementations of several proprietary formats, including ones
for which no public specification has been released. This reverse
engineering effort is thus a significant part of libavcodec
development. Having such codecs available within the standard
libavcodec framework gives a number of benefits over using the
original codecs, most notably increased portability, and in some cases
also better performance, since libavcodec contains a standard library
of highly optimized implementations of common building blocks, such as
DCT and color space conversion. However, even though libavcodec
strives for decoding that is bit-exact to the official implementation,
bugs and missing features in such reimplementations can sometimes
introduce compatibility problems playing back certain files.
You can choose to import FFmpeg directly into your Xcode project.
Another solution is to directly pipe your frames into the FFmpeg
executable.
The FFmpeg project is a fast, accurate multimedia transcoder which can
be applied in a variety of scenarios on OS X.
FFmpeg (libavcodec included) can be compiled on the Mac
http://jungels.net/articles/ffmpeg-howto.html
FFmpeg (libavcodec included) can be also compiled in 64 bits on snow leopard
http://www.martinlos.com/?p=41
FFmpeg supports a huge number of video and audio codecs:
http://en.wikipedia.org/wiki/Libavcodec#Implemented_video_codecs
Note that libavcodec and FFmpeg are LGPL-licensed, which means that you have to mention that you've used them, but you don't need to open source your project.

Resources