I'm trying to generate a square wave using Core Audio with the code below. The resulting AIFF file contains nothing, whatever arguments I pass. The sample is from Learning Core Audio by Kevin Avila: http://www.amazon.com/Learning-Core-Audio-Hands-On-Programming/dp/0321636848
Any ideas? Please help!
main.m looks as follows:
#import <Foundation/Foundation.h>
#import <AudioToolbox/AudioToolbox.h>
#define SAMPLE_RATE 44100
#define DURATION 50.0
#define FILENAME_FORMAT @"%0.3f-square.aif"
int main (int argc, const char * argv[]) {
@autoreleasepool {
if (argc < 2) {
printf ("Usage: CAToneFileGenerator n\n(where n is tone in Hz)");
return -1;
double hz = atof(argv[1]);
assert (hz > 0);
NSLog (@"generating %f hz tone", hz);
NSString *fileName = [NSString stringWithFormat: FILENAME_FORMAT, hz];
NSString *filePath = [[[NSFileManager defaultManager] currentDirectoryPath]
stringByAppendingPathComponent: fileName]; NSURL *fileURL = [NSURL fileURLWithPath: filePath];
// Prepare the format
AudioStreamBasicDescription asbd;
memset(&asbd, 0, sizeof(asbd));
asbd.mSampleRate = SAMPLE_RATE;
asbd.mFormatID = kAudioFormatLinearPCM;
asbd.mFormatFlags = kAudioFormatFlagIsBigEndian | kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
asbd.mBitsPerChannel = 16;
asbd.mChannelsPerFrame = 1;
asbd.mFramesPerPacket = 1;
asbd.mBytesPerFrame = 2; asbd.mBytesPerPacket = 2;
// Set up the file
AudioFileID audioFile;
OSStatus audioErr = noErr;
audioErr = AudioFileCreateWithURL((__bridge CFURLRef)fileURL, kAudioFileAIFFType, &asbd, kAudioFileFlags_EraseFile, &audioFile);
assert (audioErr == noErr);
// Start writing samples
long maxSampleCount = SAMPLE_RATE;
long sampleCount = 0;
UInt32 bytesToWrite = 2;
double wavelengthInSamples = SAMPLE_RATE / hz;
while (sampleCount < maxSampleCount) {
for (int i=0; i<wavelengthInSamples; i++) {
// Square wave
SInt16 sample;
if (i < wavelengthInSamples/2) {
sample = CFSwapInt16HostToBig (SHRT_MAX); } else {
sample = CFSwapInt16HostToBig (SHRT_MIN); }
audioErr = AudioFileWriteBytes(audioFile, false,
sampleCount*2, &bytesToWrite, &sample);
assert (audioErr == noErr); sampleCount++;
audioErr = AudioFileClose(audioFile); assert (audioErr == noErr);
NSLog (@"wrote %ld samples", sampleCount);
return 0;
Pretty sure your AudioFileClose() should not be inside the for-loop. That should go outside the loop, after you've written all the samples.
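A minimal sketch in plain C of the loop shape this answer describes. It is not the book's code: `write_square_wave` is a hypothetical helper that stores big-endian 16-bit samples in a memory buffer instead of calling AudioFileWriteBytes, and a comment marks the single place where the file would be closed. The extra bound in the inner loop is my addition so the sketch never writes past the buffer.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: fills out[] with big-endian 16-bit square-wave samples.
   out must hold at least 2 * max_samples bytes. Returns the samples written. */
static long write_square_wave(uint8_t *out, long max_samples,
                              double sample_rate, double hz) {
    assert(hz > 0);
    double wavelength_in_samples = sample_rate / hz;
    long sample_count = 0;

    while (sample_count < max_samples) {
        for (int i = 0; i < wavelength_in_samples && sample_count < max_samples; i++) {
            /* First half of each cycle is high, second half is low. */
            int16_t sample = (i < wavelength_in_samples / 2) ? SHRT_MAX : SHRT_MIN;
            uint16_t bits = (uint16_t)sample;
            out[2 * sample_count]     = (uint8_t)(bits >> 8);   /* high byte first */
            out[2 * sample_count + 1] = (uint8_t)(bits & 0xFF);
            sample_count++;
        }
    }
    /* Only here, after both loops have finished, would the file be closed. */
    return sample_count;
}
```

In the real program the stores into `out` correspond to the AudioFileWriteBytes call at offset `sampleCount*2`, and the close happens once after the `while` loop.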
Please check against the downloadable code at the book's home page. I just downloaded and ran it, and it works as described in the book (although, since it was built for Lion, you will have to update the target SDK in the project settings if you're on Mountain Lion or Mavericks)
--Chris (invalidname)
I am trying to use MPSImageIntegral to calculate the sum of some elements in an MTLTexture. This is what I'm doing:
std::vector<float> integralSumData;
for(int i = 0; i < 10; i++)
MTLTextureDescriptor *textureDescriptor = [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatR32Float
width:(integralSumData.size()) height:1 mipmapped:NO];
textureDescriptor.usage = MTLTextureUsageShaderRead | MTLTextureUsageShaderWrite;
id<MTLTexture> texture = [_device newTextureWithDescriptor:textureDescriptor];
// Calculate the number of bytes per row in the image.
NSUInteger bytesPerRow = integralSumData.size() * sizeof(float);
MTLRegion region = {
    { 0, 0, 0 }, // MTLOrigin
    { integralSumData.size(), 1, 1 } // MTLSize
};
// Copy the bytes from the data object into the texture
[texture replaceRegion:region
MTLTextureDescriptor *textureDescriptor2 = [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatR32Float
width:(integralSumData.size()) height:1 mipmapped:NO];
textureDescriptor2.usage = MTLTextureUsageShaderRead | MTLTextureUsageShaderWrite;
id<MTLTexture> outtexture = [_device newTextureWithDescriptor:textureDescriptor2];
// Create a MPS filter.
MPSImageIntegral *integral = [[MPSImageIntegral alloc] initWithDevice: _device];
MPSOffset offset = { 0,0,0};
[integral setOffset:offset];
[integral setEdgeMode:MPSImageEdgeModeZero];
[integral encodeToCommandBuffer:commandBuffer sourceTexture:texture destinationTexture:outtexture];
[commandBuffer commit];
[commandBuffer waitUntilCompleted];
But when I check my outtexture values, they are all zeroes. Am I doing something wrong? Is this the correct way to use MPSImageIntegral?
I'm using the following code to read values written into the outTexture:
float outData[100];
[outtexture getBytes:outData bytesPerRow:bytesPerRow fromRegion:region mipmapLevel:0];
for(int i = 0; i < 100; i++)
std::cout << outData[i] << "\n";
As pointed out by @Matthijis: all I had to do was use a blit encoder (MTLBlitCommandEncoder) to synchronise my MTLTexture before reading it on the CPU, and it worked like a charm!
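Once the readback works, a CPU reference helps to check the values. The sketch below is plain C and assumes that, for a single-row texture, the integral filter yields an inclusive running sum along the row. That is my reading of what an image integral computes and should be verified against the MPS documentation. `integral_row` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference, assuming an inclusive running sum:
   out[i] = in[0] + in[1] + ... + in[i]. */
static void integral_row(const float *in, float *out, size_t n) {
    assert(in != NULL && out != NULL);
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += in[i];
        out[i] = sum;
    }
}
```

Comparing this against the first `integralSumData.size()` floats read from the output texture gives a quick sanity check. Note that only that many values were written, so reading 100 floats goes past the valid data.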
I am trying to get the on-screen position of every visible window on OS X.
In my function get_position(), AXUIElementCopyAttributeValue() returns kAXErrorAttributeUnsupported for every window on screen except the Finder window. Why is this the case, and what am I doing wrong?
int get_position()
CFArrayRef a = CGWindowListCopyWindowInfo(kCGWindowListOptionOnScreenOnly, kCGNullWindowID);
NSArray * arr = CFBridgingRelease(a);
pid_t window_pid = 0;
unsigned long count = [ arr count];
NSMutableDictionary* entry;
for ( unsigned long i = 0; i < count; i++)
//CFTypeRef position;
AXValueRef temp;
CGPoint current_point;
entry = arr[i];
window_pid = [[entry objectForKey:(id)kCGWindowOwnerPID] intValue];
NSString * temp_ns_string = [entry objectForKey:(id)kCGWindowName ];
const char *window_name =[temp_ns_string UTF8String];
printf("%s - ", window_name);
printf("Pid: %i\n", window_pid);
AXUIElementRef window_ref = AXUIElementCreateApplication(window_pid);
AXError error = AXUIElementCopyAttributeValue(window_ref, kAXPositionAttribute, (CFTypeRef *)&temp);
if ((AXValueGetValue(temp, kAXValueCGPointType, &current_point) ))
printf("%s - ", window_name);
printf("Pid: %i - ", window_pid);
printf(" %f,%f\n", current_point.x, current_point.y);
printf("%s - ", window_name);
printf("Pid: %i\n", window_pid);
return 0;
On macOS, each app may have more than one window, and different windows of the same app share one pid.
AXUIElementRef app = AXUIElementCreateApplication(pid);
So you can get the single focused window by passing kAXFocusedWindowAttribute:
AXUIElementRef window;
AXUIElementCopyAttributeValue(app, kAXFocusedWindowAttribute, (CFTypeRef *)&window);
Or you can get all of the app's windows by passing kAXWindowsAttribute:
NSArray *windows;
AXUIElementCopyAttributeValues(app, kAXWindowsAttribute,
(CFArrayRef *) &result
Then get the position or size from the window:
AXValueRef pos;
AXValueRef size;
AXUIElementCopyAttributeValue(window, kAXPositionAttribute, (CFTypeRef *)&pos);
AXUIElementCopyAttributeValue(window, kAXSizeAttribute, (CFTypeRef *)&size);
Now convert the values into an actual CGPoint and CGSize:
CGPoint cpoint = CGPointMake(0, 0);
CGSize csize = CGSizeMake(0, 0);
AXValueGetValue(pos, kAXValueCGPointType, &cpoint);
AXValueGetValue(size, kAXValueCGSizeType, &csize);
I think the above should help.
AXUIElementRef window_ref = AXUIElementCreateApplication(window_pid);
AXUIElementCreateApplication returns an application object. Most applications don't have a position. Some applications return a position but this isn't the position of the window. The position of the window is in the dictionary with key kCGWindowBounds.
I'm using the low-level CoreAudio API for audio capture. The app targets Mac OS X, not iOS.
During testing, from time to time we get a very annoying noise modulated onto the real audio. The phenomenon develops over time: it starts barely noticeable and becomes more and more dominant.
Analyzing the captured audio in Audacity indicates that the end part of each audio packet is wrong.
Here is a sample picture:
The intrusion repeats every 40 ms, which is the configured packetization time (in terms of buffer samples).
Over time the gap becomes larger. Here is another snapshot from the same captured file 10 minutes later. The gap now contains 1460 samples, which is 33 ms of the packet's total 40 ms!
capture callback
OSStatus MacOS_AudioDevice::captureCallback(void *inRefCon,
AudioUnitRenderActionFlags *ioActionFlags,
const AudioTimeStamp *inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList *ioData)
MacOS_AudioDevice* _this = static_cast<MacOS_AudioDevice*>(inRefCon);
// Get the new audio data
OSStatus err = AudioUnitRender(_this->m_AUHAL, ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, _this->m_InputBuffer);
if (err != noErr)
return err;
// ignore callback on unexpected buffer size
if (_this->m_params.bufferSizeSamples != inNumberFrames)
return noErr;
// Deliver audio data
DeviceIOMessage message;
message.bufferSizeBytes = _this->m_deviceBufferSizeBytes;
message.buffer = _this->m_InputBuffer->mBuffers[0].mData;
if (_this->m_callbackFunc)
_this->m_callbackFunc(_this, message);
Open and start capture device:
void MacOS_AudioDevice::openAUHALCapture()
UInt32 enableIO;
AudioStreamBasicDescription streamFormat;
UInt32 size;
SInt32 *channelArr;
std::stringstream ss;
AudioObjectPropertyAddress deviceBufSizeProperty =
AudioComponentDescription cd = {kAudioUnitType_Output, kAudioUnitSubType_HALOutput, kAudioUnitManufacturer_Apple, 0, 0};
AudioComponent HALOutput = AudioComponentFindNext(NULL, &cd);
verify_macosapi(AudioComponentInstanceNew(HALOutput, &m_AUHAL));
// enable input IO
enableIO = 1;
verify_macosapi(AudioUnitSetProperty(m_AUHAL, kAudioOutputUnitProperty_EnableIO, kAudioUnitScope_Input, 1, &enableIO, sizeof(enableIO)));
// disable output IO
enableIO = 0;
verify_macosapi(AudioUnitSetProperty(m_AUHAL, kAudioOutputUnitProperty_EnableIO, kAudioUnitScope_Output, 0, &enableIO, sizeof(enableIO)));
// Setup current device
size = sizeof(AudioDeviceID);
verify_macosapi(AudioUnitSetProperty(m_AUHAL, kAudioOutputUnitProperty_CurrentDevice, kAudioUnitScope_Global, 0, &m_MacDeviceID, sizeof(AudioDeviceID)));
// Set device native buffer length before setting AUHAL stream
size = sizeof(m_originalDeviceBufferTimeFrames);
verify_macosapi(AudioObjectSetPropertyData(m_MacDeviceID, &deviceBufSizeProperty, 0, NULL, size, &m_originalDeviceBufferTimeFrames));
// Get device format
size = sizeof(AudioStreamBasicDescription);
verify_macosapi(AudioUnitGetProperty(m_AUHAL, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input, 1, &streamFormat, &size));
// Setup channel map
assert(m_params.numOfChannels <= streamFormat.mChannelsPerFrame);
channelArr = new SInt32[streamFormat.mChannelsPerFrame];
for (int i = 0; i < streamFormat.mChannelsPerFrame; i++)
channelArr[i] = -1;
for (int i = 0; i < m_params.numOfChannels; i++)
channelArr[i] = i;
verify_macosapi(AudioUnitSetProperty(m_AUHAL, kAudioOutputUnitProperty_ChannelMap, kAudioUnitScope_Input, 1, channelArr, sizeof(SInt32) * streamFormat.mChannelsPerFrame));
delete [] channelArr;
// Setup stream converters
streamFormat.mFormatID = kAudioFormatLinearPCM;
streamFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger;
streamFormat.mFramesPerPacket = m_SamplesPerPacket;
streamFormat.mBitsPerChannel = m_params.sampleDepthBits;
streamFormat.mSampleRate = m_deviceSampleRate;
streamFormat.mChannelsPerFrame = 1;
streamFormat.mBytesPerFrame = 2;
streamFormat.mBytesPerPacket = streamFormat.mFramesPerPacket * streamFormat.mBytesPerFrame;
verify_macosapi(AudioUnitSetProperty(m_AUHAL, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output, 1, &streamFormat, size));
// Setup callbacks
AURenderCallbackStruct input;
input.inputProc = captureCallback;
input.inputProcRefCon = this;
verify_macosapi(AudioUnitSetProperty(m_AUHAL, kAudioOutputUnitProperty_SetInputCallback, kAudioUnitScope_Global, 0, &input, sizeof(input)));
// Calculate the size of the IO buffer (in samples)
if (m_params.bufferSizeMS != -1)
unsigned int desiredSignalsInBuffer = (m_params.bufferSizeMS / (double)1000) * m_deviceSampleRate;
// making sure the value stay in the device's supported range
desiredSignalsInBuffer = std::min<unsigned int>(desiredSignalsInBuffer, m_deviceBufferFramesRange.mMaximum);
desiredSignalsInBuffer = std::max<unsigned int>(m_deviceBufferFramesRange.mMinimum, desiredSignalsInBuffer);
m_deviceBufferFrames = desiredSignalsInBuffer;
// Set device buffer length
size = sizeof(m_deviceBufferFrames);
verify_macosapi(AudioObjectSetPropertyData(m_MacDeviceID, &deviceBufSizeProperty, 0, NULL, size, &m_deviceBufferFrames));
m_deviceBufferSizeBytes = m_deviceBufferFrames * streamFormat.mBytesPerFrame;
m_deviceBufferTimeMS = 1000 * m_deviceBufferFrames/m_deviceSampleRate;
// Calculate number of buffers from channels
size = offsetof(AudioBufferList, mBuffers[0]) + (sizeof(AudioBuffer) * m_params.numOfChannels);
// Allocate input buffer
m_InputBuffer = (AudioBufferList *)malloc(size);
m_InputBuffer->mNumberBuffers = m_params.numOfChannels;
// Pre-malloc buffers for AudioBufferLists
for(UInt32 i = 0; i< m_InputBuffer->mNumberBuffers ; i++)
m_InputBuffer->mBuffers[i].mNumberChannels = 1;
m_InputBuffer->mBuffers[i].mDataByteSize = m_deviceBufferSizeBytes;
m_InputBuffer->mBuffers[i].mData = malloc(m_deviceBufferSizeBytes);
// Update class properties
m_params.sampleRateHz = streamFormat.mSampleRate;
m_params.bufferSizeSamples = m_deviceBufferFrames;
m_params.bufferSizeBytes = m_params.bufferSizeSamples * streamFormat.mBytesPerFrame;
eADMReturnCode MacOS_AudioDevice::start()
{
    eADMReturnCode ret = OK;
    if (!m_isStarted && m_isOpen)
    {
        OSStatus err = AudioOutputUnitStart(m_AUHAL);
        if (err == noErr)
            m_isStarted = true;
        else
            ret = ERROR;
    }
    return ret;
}
Any idea what causes it and how to solve it?
Thanks in advance!
Periodic glitches or dropouts can be caused by not paying attention to or by not fully processing the number of frames sent to each audio callback. Valid buffers don't always contain the expected or same number of samples (inNumberFrames might not equal bufferSizeSamples or the previous inNumberFrames in a perfectly valid audio buffer).
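One way to act on this, sketched in plain C: instead of ignoring callbacks whose frame count differs from the expected size, accumulate whatever arrives and emit a packet each time a full one is available. All names here (`packetizer`, `packetizer_push`, `count_packets`) and the tiny `PACKET_FRAMES` value are hypothetical. Real code would size the packet from the 40 ms packetization time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PACKET_FRAMES 8 /* hypothetical; stands in for 40 ms worth of frames */

typedef void (*packet_fn)(const int16_t *frames, size_t count, void *ctx);

typedef struct {
    int16_t pending[PACKET_FRAMES]; /* frames waiting for a full packet */
    size_t filled;                  /* number of valid frames in pending */
} packetizer;

/* Accepts any number of frames per call and emits fixed-size packets. */
static void packetizer_push(packetizer *p, const int16_t *in, size_t n,
                            packet_fn emit, void *ctx) {
    assert(p != NULL && emit != NULL);
    while (n > 0) {
        size_t room = PACKET_FRAMES - p->filled;
        size_t take = n < room ? n : room;
        memcpy(p->pending + p->filled, in, take * sizeof(int16_t));
        p->filled += take;
        in += take;
        n -= take;
        if (p->filled == PACKET_FRAMES) {
            emit(p->pending, PACKET_FRAMES, ctx);
            p->filled = 0;
        }
    }
}

/* Example consumer: counts the packets it receives. */
static void count_packets(const int16_t *frames, size_t count, void *ctx) {
    (void)frames;
    (void)count;
    (*(size_t *)ctx)++;
}
```

With this shape no captured frames are dropped when inNumberFrames varies; leftover frames simply wait for the next callback.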
It is possible that these types of glitches might be caused by attempting to record at 44.1k on some models of iOS devices that only support 48k audio in hardware.
Some types of glitch might also be caused by any non-hard-real-time code within your m_callbackFunc function (such as any synchronous file reads/writes, OS calls, Objective C message dispatch, GC, or memory allocation/deallocation).
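A common way to keep such work out of the callback is to copy the samples into a preallocated ring buffer and let another thread do the slow processing. The following is a rough single-producer, single-consumer sketch using C11 atomics. `spsc_ring`, `ring_write`, `ring_read` and the small capacity are hypothetical, and it assumes exactly one writer thread (the audio callback) and one reader thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_CAPACITY 8 /* hypothetical; must be a power of two */

typedef struct {
    int16_t data[RING_CAPACITY];
    atomic_size_t head; /* total samples written, advanced by the producer */
    atomic_size_t tail; /* total samples read, advanced by the consumer */
} spsc_ring;

/* Called from the audio callback: no locks, no allocation.
   Returns how many samples fit. */
static size_t ring_write(spsc_ring *r, const int16_t *src, size_t n) {
    assert(r != NULL);
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    size_t free_slots = RING_CAPACITY - (head - tail);
    if (n > free_slots) n = free_slots;
    for (size_t i = 0; i < n; i++)
        r->data[(head + i) % RING_CAPACITY] = src[i];
    atomic_store_explicit(&r->head, head + n, memory_order_release);
    return n;
}

/* Called from the worker thread. Returns how many samples were read. */
static size_t ring_read(spsc_ring *r, int16_t *dst, size_t n) {
    assert(r != NULL);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    size_t available = head - tail;
    if (n > available) n = available;
    for (size_t i = 0; i < n; i++)
        dst[i] = r->data[(tail + i) % RING_CAPACITY];
    atomic_store_explicit(&r->tail, tail + n, memory_order_release);
    return n;
}
```

The callback would call `ring_write` and return immediately, while file writes, logging and memory management happen on the thread that calls `ring_read`.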
Hello, I am trying to learn Core Audio from this book: http://www.amazon.com/Learning-Core-Audio-Hands-On-Programming/dp/0321636848
But when I try to run this code:
#import <Foundation/Foundation.h>
#import <AudioToolbox/AudioToolbox.h>
#define SAMPLE_RATE 44100
#define DURATION 5
#define FILENAME_FORMAT @"%0.03f-test.aif"
int main(int argc, const char * argv[])
return -1;
double hz = 44;
NSLog(@"Generating hz tone:%f",hz);
NSString* fileName = [NSString stringWithFormat:FILENAME_FORMAT, hz];
NSString* filePath = [[[NSFileManager defaultManager]currentDirectoryPath] stringByAppendingPathComponent:fileName];
NSURL* fileURL = [NSURL fileURLWithPath:filePath];
AudioStreamBasicDescription asbd;
memset(&asbd, 0, sizeof(asbd));
asbd.mSampleRate = SAMPLE_RATE;
asbd.mFormatID = kAudioFormatLinearPCM;
asbd.mFormatFlags = kAudioFormatFlagIsBigEndian | kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
asbd.mBitsPerChannel = 16;
asbd.mChannelsPerFrame = 1;
asbd.mFramesPerPacket = 1;
asbd.mBytesPerFrame = 2;
asbd.mBytesPerPacket = 2;
AudioFileID audioFile;
OSStatus audioErr = noErr;
audioErr = AudioFileCreateWithURL((__bridge CFURLRef)fileURL, kAudioFileAIFFType, &asbd, kAudioFileFlags_EraseFile, &audioFile);
assert(audioErr == noErr);
long maxSampleCount = SAMPLE_RATE * DURATION;
long sampleCount = 0;
UInt32 bytesToWrite = 2;
double waveLengthInSamples = SAMPLE_RATE / hz;
while(sampleCount < maxSampleCount)
for(int i=0;i<waveLengthInSamples;i++)
SInt16 sample;
sample = CFSwapInt16BigToHost(SHRT_MAX);
sample = CFSwapInt16BigToHost(SHRT_MIN);
audioErr = AudioFileWriteBytes(audioFile, false, sampleCount*2, &bytesToWrite, &sample);
assert(audioErr == noErr);
audioErr = AudioFileClose(audioFile);
assert(audioErr == noErr);
return 0;
The program exits with this error code: Program ended with exit code: 255
Can anyone help me? I downloaded the sample code and the same error occurs. I am using Xcode 5 and a 64-bit MacBook.
Thanks for your help.
It looks like you've modified the book's code to explicitly set the tone to 44Hz.
double hz = 44;
However, the original code expected you to input the tone as a command line parameter. These lines are checking for that parameter, and return -1 (or 255) when no parameter is found.
return -1;
Remove those two lines to remove the parameter check.
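As an alternative to deleting the check, a small sketch in plain C that keeps the command line parameter optional and falls back to a default tone. It also shows why a return value of -1 is reported as 255: only the low 8 bits of the status are reported. `tone_from_args` and `status_seen_by_shell` are hypothetical helpers, not part of the book's code.

```c
#include <assert.h>
#include <stdlib.h>

/* Uses argv[1] as the tone when it is present and positive,
   otherwise returns the fallback. */
static double tone_from_args(int argc, const char *argv[], double fallback_hz) {
    assert(fallback_hz > 0);
    if (argc < 2)
        return fallback_hz;
    double hz = atof(argv[1]);
    return hz > 0 ? hz : fallback_hz;
}

/* Only the low 8 bits of main's return value reach the caller,
   so -1 shows up as 255. */
static int status_seen_by_shell(int returned) {
    return returned & 0xFF;
}
```

In main this would replace the hard-coded value, for example `double hz = tone_from_args(argc, argv, 44);`.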
I want to write an encoder with ffmpeg that can put I-frames (keyframes) at positions I choose. Where can I find tutorials or reference material for it?
Is it possible to do this with mencoder or any other open-source encoder? I want to encode an H.263 file. I am writing under and for Linux.
You'll need to look at the libavcodec documentation - specifically, at avcodec_encode_video(). I found that the best available documentation is in the ffmpeg header files and the API sample source code that's provided with the ffmpeg source. Specifically, look at libavcodec/api-example.c or even ffmpeg.c.
To force an I frame, you'll need to set the pict_type member of the picture you're encoding to 1: 1 is an I frame, 2 is a P frame, and I don't remember what's the code for a B frame off the top of my head... Also, the key_frame member needs to be set to 1.
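A sketch in plain C of how the per-frame decision could look when keyframes are wanted at chosen positions. The types below are local stand-ins, not FFmpeg's. The numeric values mirror the picture-type enum in libavutil as far as I know (none 0, I 1, P 2, B 3) and should be checked against the headers of your FFmpeg version. `mark_frame` and the `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins, NOT FFmpeg types. Values assumed to match libavutil. */
enum sketch_pict_type {
    SKETCH_PICT_NONE = 0, /* leave the choice to the encoder */
    SKETCH_PICT_I = 1,
    SKETCH_PICT_P = 2,
    SKETCH_PICT_B = 3
};

struct sketch_frame {
    int64_t pts;
    int key_frame;
    enum sketch_pict_type pict_type;
};

/* Requests an I-frame when the frame's pts is in the wanted list,
   otherwise leaves the frame type undecided. */
static void mark_frame(struct sketch_frame *f,
                       const int64_t *keyframe_pts, size_t count) {
    assert(f != NULL);
    f->key_frame = 0;
    f->pict_type = SKETCH_PICT_NONE;
    for (size_t i = 0; i < count; i++) {
        if (keyframe_pts[i] == f->pts) {
            f->key_frame = 1;
            f->pict_type = SKETCH_PICT_I;
            return;
        }
    }
}
```

With real FFmpeg frames the same two assignments would go to the frame's `pict_type` and `key_frame` members before the encode call.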
Some introductory material is available here and here, but I don't really know how good it is.
You'll need to be careful how you allocate the frame objects that the API calls require. api-example.c is your best bet as far as that goes, in my opinion. Look for the function video_encode_example() - it's concise and illustrates all the important things you need to worry about - pay special attention to the second call to avcodec_encode_video() that passes a NULL picture argument - it's required to get the last frames of video since MPEG video is encoded out of sequence and you may end up with a delay of a few frames.
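The reason for that extra call with a NULL picture can be illustrated with a toy model in plain C. This is not FFmpeg code: `toy_encoder` is a hypothetical stand-in that holds back a fixed number of frames, the way an encoder that reorders frames delays its output. The point is the loop shape: feed every frame, then keep calling with NULL until nothing more comes out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_DELAY 2 /* hypothetical latency, in frames */

/* Toy stand-in for an encoder that holds frames back; not an FFmpeg type. */
typedef struct {
    int held[TOY_DELAY];
    int count;
} toy_encoder;

/* Moves the oldest held frame into *out. */
static void toy_pop(toy_encoder *e, int *out) {
    *out = e->held[0];
    memmove(e->held, e->held + 1, (size_t)(e->count - 1) * sizeof(int));
    e->count--;
}

/* Feeds one frame, or NULL to flush. Returns 1 when *out holds an output. */
static int toy_encode(toy_encoder *e, const int *frame, int *out) {
    int got_output = 0;
    if (frame == NULL) {
        if (e->count > 0) {
            toy_pop(e, out);
            got_output = 1;
        }
        return got_output;
    }
    if (e->count == TOY_DELAY) {
        toy_pop(e, out);
        got_output = 1;
    }
    e->held[e->count++] = *frame;
    return got_output;
}

/* Encodes n frames and then drains the delayed ones. Returns outputs written. */
static int encode_all(const int *frames, int n, int *outputs) {
    assert(n >= 0);
    toy_encoder enc = { {0}, 0 };
    int written = 0, out;
    for (int i = 0; i < n; i++)
        if (toy_encode(&enc, &frames[i], &out))
            outputs[written++] = out;
    while (toy_encode(&enc, NULL, &out)) /* the flush step */
        outputs[written++] = out;
    return written;
}
```

Without the final `while` loop, the last `TOY_DELAY` frames would never be written, which is the same symptom as skipping the NULL call in the real API.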
An up-to-date version of api-example.c can be found at http://ffmpeg.org/doxygen/trunk/doc_2examples_2decoding_encoding_8c-example.html
It does the entire video encoding in a single and relatively short function. So this is probably a good place to start. Compile and run it. And then start modifying it until it does what you want.
It also has audio encoding and audio & video decoding examples.
GStreamer has decent documentation, has bindings for a number of languages (although the native API is C), and supports any video format you can find plugins for, including H.263 via gstreamer-ffmpeg.
You will need the libavcodec library. As a first step, I think you can learn how it is used from the ffplay.c file inside the ffmpeg source code. It will tell you a lot. You can also check my video project at rtstegvideo.sourceforge.net.
Hope this helps.
If you're a Java programmer, then use Xuggler.
Minimal runnable example on FFmpeg 2.7
Based on Ori Pessach's answer, below is a minimal example that requests frames in the repeating pattern I, P, B, P.
The key parts of the code that control frame type are:
c = avcodec_alloc_context3(codec);
/* Minimal distance of I-frames. This is the maximum value allowed,
or else we get a warning at runtime. */
c->keyint_min = 600;
/* Or else it defaults to 0 and B-frames are not allowed. */
c->max_b_frames = 1;
frame->key_frame = 0;
switch (frame->pts % 4) {
    case 0:
        frame->key_frame = 1;
        frame->pict_type = AV_PICTURE_TYPE_I;
        break;
    case 1:
    case 3:
        frame->pict_type = AV_PICTURE_TYPE_P;
        break;
    case 2:
        frame->pict_type = AV_PICTURE_TYPE_B;
        break;
}
We can then verify the frame type with:
ffprobe -select_streams v \
-show_frames \
-show_entries frame=pict_type \
-of csv \
as mentioned at: https://superuser.com/questions/885452/extracting-the-index-of-key-frames-from-a-video-using-ffmpeg
FFmpeg enforced some rules even when I tried to override them:
the first frame is an I-frame
a B-frame cannot be placed before an I-frame (TODO why?)
Preview of generated output.
#include <libavcodec/avcodec.h>
#include <libavutil/imgutils.h>
#include <libavutil/opt.h>
#include <libswscale/swscale.h>
static AVCodecContext *c = NULL;
static AVFrame *frame;
static AVPacket pkt;
static FILE *file;
struct SwsContext *sws_context = NULL;
/*
Convert RGB24 array to YUV. Save directly to the `frame`,
modifying its `data` and `linesize` fields
*/
static void ffmpeg_encoder_set_frame_yuv_from_rgb(uint8_t *rgb) {
const int in_linesize[1] = { 3 * c->width };
sws_context = sws_getCachedContext(sws_context,
c->width, c->height, AV_PIX_FMT_RGB24,
c->width, c->height, AV_PIX_FMT_YUV420P,
0, 0, 0, 0);
sws_scale(sws_context, (const uint8_t * const *)&rgb, in_linesize, 0,
c->height, frame->data, frame->linesize);
/*
Generate 2 different images with four colored rectangles, each 25 frames long:
Image 1:
    black | red
    green | blue
Image 2:
    yellow | red
    green  | white
*/
uint8_t* generate_rgb(int width, int height, int pts, uint8_t *rgb) {
int x, y, cur;
rgb = realloc(rgb, 3 * sizeof(uint8_t) * height * width);
for (y = 0; y < height; y++) {
for (x = 0; x < width; x++) {
cur = 3 * (y * width + x);
rgb[cur + 0] = 0;
rgb[cur + 1] = 0;
rgb[cur + 2] = 0;
if ((frame->pts / 25) % 2 == 0) {
if (y < height / 2) {
if (x < width / 2) {
/* Black. */
} else {
rgb[cur + 0] = 255;
} else {
if (x < width / 2) {
rgb[cur + 1] = 255;
} else {
rgb[cur + 2] = 255;
} else {
if (y < height / 2) {
rgb[cur + 0] = 255;
if (x < width / 2) {
rgb[cur + 1] = 255;
} else {
rgb[cur + 2] = 255;
} else {
if (x < width / 2) {
rgb[cur + 1] = 255;
rgb[cur + 2] = 255;
} else {
rgb[cur + 0] = 255;
rgb[cur + 1] = 255;
rgb[cur + 2] = 255;
return rgb;
/* Allocate resources and write header data to the output file. */
void ffmpeg_encoder_start(const char *filename, int codec_id, int fps, int width, int height) {
AVCodec *codec;
int ret;
codec = avcodec_find_encoder(codec_id);
if (!codec) {
fprintf(stderr, "Codec not found\n");
c = avcodec_alloc_context3(codec);
if (!c) {
fprintf(stderr, "Could not allocate video codec context\n");
c->bit_rate = 400000;
c->width = width;
c->height = height;
c->time_base.num = 1;
c->time_base.den = fps;
/* I, P, B frame placement parameters. */
c->gop_size = 600;
c->max_b_frames = 1;
c->keyint_min = 600;
c->pix_fmt = AV_PIX_FMT_YUV420P;
if (codec_id == AV_CODEC_ID_H264)
av_opt_set(c->priv_data, "preset", "slow", 0);
if (avcodec_open2(c, codec, NULL) < 0) {
fprintf(stderr, "Could not open codec\n");
file = fopen(filename, "wb");
if (!file) {
fprintf(stderr, "Could not open %s\n", filename);
frame = av_frame_alloc();
if (!frame) {
fprintf(stderr, "Could not allocate video frame\n");
frame->format = c->pix_fmt;
frame->width = c->width;
frame->height = c->height;
ret = av_image_alloc(frame->data, frame->linesize, c->width, c->height, c->pix_fmt, 32);
if (ret < 0) {
fprintf(stderr, "Could not allocate raw picture buffer\n");
/*
Write trailing data to the output file
and free resources allocated by ffmpeg_encoder_start.
*/
void ffmpeg_encoder_finish(void) {
uint8_t endcode[] = { 0, 0, 1, 0xb7 };
int got_output, ret;
do {
ret = avcodec_encode_video2(c, &pkt, NULL, &got_output);
if (ret < 0) {
fprintf(stderr, "Error encoding frame\n");
if (got_output) {
fwrite(pkt.data, 1, pkt.size, file);
} while (got_output);
fwrite(endcode, 1, sizeof(endcode), file);
/*
Encode one frame from an RGB24 input and save it to the output file.
Must be called after ffmpeg_encoder_start, and ffmpeg_encoder_finish
must be called after the last call to this function.
*/
void ffmpeg_encoder_encode_frame(uint8_t *rgb) {
int ret, got_output;
pkt.data = NULL;
pkt.size = 0;
switch (frame->pts % 4) {
    case 0:
        frame->key_frame = 1;
        frame->pict_type = AV_PICTURE_TYPE_I;
        break;
    case 1:
    case 3:
        frame->key_frame = 0;
        frame->pict_type = AV_PICTURE_TYPE_P;
        break;
    case 2:
        frame->key_frame = 0;
        frame->pict_type = AV_PICTURE_TYPE_B;
        break;
}
ret = avcodec_encode_video2(c, &pkt, frame, &got_output);
if (ret < 0) {
fprintf(stderr, "Error encoding frame\n");
if (got_output) {
fwrite(pkt.data, 1, pkt.size, file);
/* Represents the main loop of an application which generates one frame per loop. */
static void encode_example(const char *filename, int codec_id) {
    int pts;
    int width = 320;
    int height = 240;
    uint8_t *rgb = NULL;
    ffmpeg_encoder_start(filename, codec_id, 25, width, height);
    for (pts = 0; pts < 100; pts++) {
        frame->pts = pts;
        rgb = generate_rgb(width, height, pts, rgb);
        /* Encode the frame that was just generated. */
        ffmpeg_encoder_encode_frame(rgb);
    }
    /* Flush delayed frames and write the trailer. */
    ffmpeg_encoder_finish();
}

int main(void) {
    /* Needed on old FFmpeg releases such as 2.7; deprecated in later versions. */
    avcodec_register_all();
    encode_example("tmp.h264", AV_CODEC_ID_H264);
    encode_example("tmp.mpg", AV_CODEC_ID_MPEG1VIDEO);
    /* TODO: is this encoded correctly? Possible to view it without container? */
    /*encode_example("tmp.vp8", AV_CODEC_ID_VP8);*/
    return 0;
}
Tested on Ubuntu 15.10. GitHub upstream.
Do you really want to do this?
In most cases, you are better off just controlling the global parameters of AVCodecContext.
FFmpeg does smart things like using a keyframe if the new frame is completely different from the previous one, and not much would be gained from differential encoding.
For example, if we set just:
c->keyint_min = 600;
then we get exactly 4 key-frames on the above example, which is logical since there are 4 abrupt frame changes on the generated video.
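The count of 4 can be checked with a little arithmetic, sketched here in plain C. The sketch models only the generated input, which switches image every 25 frames, and counts the first frame plus every frame whose image differs from the previous one. It is not a model of the encoder's actual scene-change logic, and both function names are hypothetical.

```c
#include <assert.h>

/* Which of the two generated images is shown at this pts
   (same rule as in generate_rgb: the image flips every 25 frames). */
static int image_index(int pts) {
    return (pts / 25) % 2;
}

/* Counts the first frame plus every abrupt image change. */
static int count_expected_keyframes(int total_frames) {
    assert(total_frames >= 0);
    int count = 0;
    for (int pts = 0; pts < total_frames; pts++) {
        if (pts == 0 || image_index(pts) != image_index(pts - 1))
            count++;
    }
    return count;
}
```

For the 100 frames of the example this counts frames 0, 25, 50 and 75, that is, the initial frame plus three image switches.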