User custom private data in AVPacket - ffmpeg

Is there a way to attach some user private data to an AVPacket before decoding, so that the input AVPacket can be matched with the decoded output AVFrame? Something like AVFrame::opaque.
Specifically, decoding an H.264 bitstream can reorder frames when B-frames are present, and I would like to identify which AVPacket was decoded into which AVFrame.

Thanks to @Gyan I was able to solve the issue with the following code in the main decoding loop.
static uint64_t privateId = 0;
// Allocate a dictionary and add the key/value record
AVDictionary *frameDict = NULL;
av_dict_set(&frameDict, "private_id", std::to_string(privateId++).c_str(), 0);
// Pack the dictionary so it can be attached to the AVPacket as side data
int frameDictSize = 0;
uint8_t *frameDictData = av_packet_pack_dictionary(frameDict, &frameDictSize);
// Free the dictionary, which is no longer needed
av_dict_free(&frameDict);
// Add the side data to the AVPacket which will be decoded
av_packet_add_side_data(&avPacket, AVPacketSideDataType::AV_PKT_DATA_STRINGS_METADATA, frameDictData, frameDictSize);
// Do the actual decoding
...
// Free the side data from the packet
av_packet_free_side_data(&avPacket);
// Obtain private_id from the decoded frame's metadata
AVDictionaryEntry *entry = av_dict_get(avFrame->metadata, "private_id", NULL, 0);
uint64_t decodedPrivateId = entry ? std::stoull(entry->value) : 0;
// Free the dictionary attached to the decoded frame
av_dict_free(&avFrame->metadata);
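For completeness, the elided decoding step could look roughly like the sketch below when using the send/receive API; codecCtx is a placeholder name for the decoder context, which is not shown in the snippet above:

    // Sketch only, assuming the avcodec_send_packet / avcodec_receive_frame API.
    if (avcodec_send_packet(codecCtx, &avPacket) == 0) {
        while (avcodec_receive_frame(codecCtx, avFrame) == 0) {
            // avFrame->metadata now carries the "private_id" entry that was
            // packed into the matching packet's side data above
        }
    }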

Related

Decoding of char array data

I have a mesh from which I need to read the vertex positions, but all I can get is a buffer with that data, which seemingly comes back as a UTF-8 char array.
Currently I'm getting the data from the buffer into the array I mentioned and writing it into a char*, but I can't get the decoding right, or so it seems.
The following code reads the data from the buffer:
char* GetDataFromIBuffer(Windows::Storage::Streams::IBuffer^ container)
{
    unsigned int bufferLength = container->Length;
    auto dataReader = Windows::Storage::Streams::DataReader::FromBuffer(container);
    Platform::Array<unsigned char>^ managedBytes =
        ref new Platform::Array<unsigned char>(bufferLength);
    dataReader->ReadBytes(managedBytes);
    char* bytes = new char[bufferLength];
    for (unsigned int i = 0; i < bufferLength; i++)
    {
        if (managedBytes[i] == '\0')
        {
            bytes[i] = '0';
        }
        else
        {
            bytes[i] = managedBytes[i];
        }
    }
    return bytes;
}
I can see the data in debug mode, but I need a way to make it readable and write it into a file, from which I can copy the mesh data and draw the mesh in a separate program.
A debug-mode screenshot attached to the question shows the data as it appears in the array.
Be careful not to mix up text encodings and data types.
char is a type often used for buffers because it has the size of a byte, but that doesn't mean the data contained in the buffer is text.
Your debug view seems to confirm that the data inside your buffer is not text, because when interpreted as text it gives weird characters such as 'ÿ', '^', etc.
UTF-8 is a way to encode Unicode text, so it has nothing to do with binary data.
You need to find a way to cast your buffer data into the internal type of the data; that type should be documented wherever you got the data from (maybe it's just an array of floats?).
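For example, if the buffer actually holds 32-bit floats (purely an assumption; the real layout depends on the mesh API), it could be reinterpreted roughly like this, reusing managedBytes and bufferLength from the question's code:

    // Assumption: the buffer stores IEEE-754 floats as x, y, z triplets.
    // Verify this against the mesh API's documentation before relying on it.
    const float* positions = reinterpret_cast<const float*>(managedBytes->Data);
    unsigned int floatCount = bufferLength / sizeof(float);
    for (unsigned int i = 0; i + 2 < floatCount; i += 3)
    {
        float x = positions[i], y = positions[i + 1], z = positions[i + 2];
        // write x, y, z to a file here
    }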

Format of microphone audio passed to call back in mac OS X core audio example

I need access to the audio data from the microphone on a MacBook. I have an example program for recording microphone data based on the one in "Learning Core Audio". When I run this program and break in the callback routine, I can see the inBuffer pointer and the mAudioData pointer. However, I am having a heck of a time making sense of the data. I've tried casting the void* mAudioData pointer to SInt16, to SInt32 and to float, and tried a number of endian conversions, all with nonsense-looking results. What I need to know definitively is the number format of the data in the buffer. The example actually works, writing microphone data to a file which I can play, so I know that real audio is being recorded.
AudioStreamBasicDescription recordFormat;
memset(&recordFormat, 0, sizeof(recordFormat));
//recordFormat.mFormatID = kAudioFormatMPEG4AAC;
recordFormat.mFormatID = kAudioFormatLinearPCM;
recordFormat.mChannelsPerFrame = 2;
recordFormat.mBitsPerChannel = 16;
recordFormat.mBytesPerPacket = recordFormat.mBytesPerFrame = recordFormat.mChannelsPerFrame * sizeof(SInt16);
recordFormat.mFramesPerPacket = 1;

MyGetDefaultInputDeviceSampleRate(&recordFormat.mSampleRate);

UInt32 propSize = sizeof(recordFormat);
CheckError(AudioFormatGetProperty(kAudioFormatProperty_FormatInfo,
                                  0,
                                  NULL,
                                  &propSize,
                                  &recordFormat),
           "AudioFormatProperty failed");

// set up queue
AudioQueueRef queue = {0};
CheckError(AudioQueueNewInput(&recordFormat,
                              MyAQInputCallback,
                              &recorder,
                              NULL,
                              kCFRunLoopCommonModes,
                              0,
                              &queue),
           "AudioQueueNewInput failed");

UInt32 size = sizeof(recordFormat);
CheckError(AudioQueueGetProperty(queue,
                                 kAudioConverterCurrentOutputStreamDescription,
                                 &recordFormat,
                                 &size),
           "Couldn't get queue's format");

CoreAudio: how to retrieve actual sampling rate of raw data?

When attempting to play AAC-HE content in an mp4 container, the sampling rate reported in the mp4 container appears to be half of the actual sampling rate.
E.g. it appears as 24kHz instead of 48kHz.
Using the FFmpeg AAC decoder, the actual sampling rate can be retrieved by simply decoding an audio packet using
avcodec_decode_audio4
and looking at AVCodecContext::sample_rate, which will be updated appropriately. From that it's easy to adapt the output.
With the CoreAudio decoder, I would use an AudioConverterRef, set the input and output AudioStreamBasicDescription,
and call AudioConverterFillComplexBuffer.
As the converter performs all the required internal conversion, including resampling, this works fine, but it plays the content after resampling it to 24kHz (as that's what the input AudioStreamBasicDescription contains).
Would there be a way to retrieve the actual sampling rate as found by the decoder (rather than the demuxer), in a similar fashion as one can with FFmpeg?
I would prefer to avoid losing audio quality if at all possible, and not downmix the data.
Thanks
Found this:
https://developer.apple.com/library/ios/qa/qa1639/_index.html
which explains how to retrieve the higher quality stream.
The resulting code is as follows:
AudioStreamBasicDescription inputFormat;
AudioFormatListItem* formatListPtr = NULL;
UInt32 propertySize;
OSStatus rv = noErr;

rv = AudioFileStreamGetPropertyInfo(mStream,
                                    kAudioFileStreamProperty_FormatList,
                                    &propertySize,
                                    NULL);
if (rv == noErr) {
  // allocate memory for the format list items
  formatListPtr = static_cast<AudioFormatListItem*>(malloc(propertySize));
  if (!formatListPtr) {
    LOG("Error %d constructing AudioConverter", rv);
    mCallback->Error();
    return;
  }

  // get the list of Audio Format List Items
  rv = AudioFileStreamGetProperty(mStream,
                                  kAudioFileStreamProperty_FormatList,
                                  &propertySize,
                                  formatListPtr);
  if (rv == noErr) {
    UInt32 itemIndex;
    UInt32 indexSize = sizeof(itemIndex);

    // get the index number of the first playable format -- this index number will be for
    // the highest quality layer the platform is capable of playing
    rv = AudioFormatGetProperty(kAudioFormatProperty_FirstPlayableFormatFromList,
                                propertySize,
                                formatListPtr,
                                &indexSize,
                                &itemIndex);
    if (rv != noErr) {
      free(formatListPtr);
      LOG("Error %d retrieving best format for AudioConverter", rv);
      return;
    }
    // copy the format item at the index we want returned
    inputFormat = formatListPtr[itemIndex].mASBD;
  }
  free(formatListPtr);
} else {
  // Fill in the input format description from the stream.
  nsresult rv = AppleUtils::GetProperty(mStream,
                                        kAudioFileStreamProperty_DataFormat,
                                        &inputFormat);
  if (NS_FAILED(rv)) {
    LOG("Error %d retrieving default format for AudioConverter", rv);
    return;
  }
}

// Fill in the output format manually.
PodZero(&mOutputFormat);
mOutputFormat.mFormatID = kAudioFormatLinearPCM;
mOutputFormat.mSampleRate = inputFormat.mSampleRate;
mOutputFormat.mChannelsPerFrame = inputFormat.mChannelsPerFrame;
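From there, creating the converter described earlier might look roughly like the following sketch; the remaining mOutputFormat fields (interleaved signed 16-bit PCM) and the mConverter name are assumptions, not part of the original code:

  // Sketch: finish describing the LPCM output and create the converter.
  // The flag/size choices below assume interleaved signed 16-bit samples.
  mOutputFormat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
  mOutputFormat.mBitsPerChannel = 16;
  mOutputFormat.mBytesPerFrame = mOutputFormat.mChannelsPerFrame * sizeof(SInt16);
  mOutputFormat.mFramesPerPacket = 1;
  mOutputFormat.mBytesPerPacket = mOutputFormat.mBytesPerFrame;

  AudioConverterRef mConverter = NULL;
  OSStatus status = AudioConverterNew(&inputFormat, &mOutputFormat, &mConverter);
  if (status != noErr) {
    LOG("Error %d creating AudioConverter", status);
  }
  // AudioConverterFillComplexBuffer can then be driven with mConverter,
  // producing output at the decoder-reported sample rate.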

Decoder crashes after ffmpeg upgrade

Recently I upgraded ffmpeg from 0.9 to 1.0 (tested on Win7 x64 and on iOS), and now avcodec_decode_video2 segfaults. Long story short: the crash occurs every time the video dimensions change (e.g. from 320x240 to 160x120 or vice versa).
I receive an mpeg4 video stream from a proprietary source and decode it like this:
// once, during initialization:
AVCodec *codec_ = avcodec_find_decoder(CODEC_ID_MPEG4);
AVCodecContext *ctx_ = avcodec_alloc_context3(codec_);
avcodec_open2(ctx_, codec_, 0);
AVPacket packet_;
av_init_packet(&packet_);
AVFrame *picture_ = avcodec_alloc_frame();
// on every frame:
int got_picture;
packet_.size = size;
packet_.data = (uint8_t *)buffer;
avcodec_decode_video2(ctx_, picture_, &got_picture, &packet_);
Again, all of the above worked flawlessly until I upgraded to 1.0. Now, every time the frame dimensions change, avcodec_decode_video2 crashes. Note that I don't assign width/height in the AVCodecContext - neither at the beginning, nor when the stream changes - could that be the reason?
I'd appreciate any ideas!
Update: setting ctx_->width and ctx_->height doesn't help.
Update 2: just before the crash I get the following log messages:
mpeg4, level 24: "Found 2 unreleased buffers!".
level 8: "Assertion i < avci->buffer_count failed at libavcodec/utils.c:603"
Update 3: upgrading to 1.1.2 fixed this crash. The decoder is once again able to cope with dimension changes on the fly.
You can try filling in AVPacket::side_data. If the frame size changes, the codec receives the new dimensions from it (see the apply_param_change function in libavcodec/utils.c).
This structure can be filled as follows:
int my_ff_add_param_change(AVPacket *pkt, int32_t width, int32_t height)
{
    uint32_t flags = 0;
    int size = 4 * 3;
    uint8_t *data;

    if (!pkt)
        return AVERROR(EINVAL);

    flags = AV_SIDE_DATA_PARAM_CHANGE_DIMENSIONS;

    data = av_packet_new_side_data(pkt, AV_PKT_DATA_PARAM_CHANGE, size);
    if (!data)
        return AVERROR(ENOMEM);

    ((uint32_t*)data)[0] = flags;
    ((uint32_t*)data)[1] = width;
    ((uint32_t*)data)[2] = height;

    return 0;
}
You need to call this function every time the size changes.
I think this feature appeared only recently. I didn't know about it until I looked at the new ffmpeg sources.
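For example, in the decode loop from the question it could be invoked like this (newWidth/newHeight/lastWidth/lastHeight are hypothetical variables holding the dimensions reported by the proprietary source; they are not part of the original code):

    // Sketch: attach a dimension-change notification whenever the source
    // reports a new frame size, then decode as before.
    if (newWidth != lastWidth || newHeight != lastHeight) {
        my_ff_add_param_change(&packet_, newWidth, newHeight);
        lastWidth = newWidth;
        lastHeight = newHeight;
    }
    avcodec_decode_video2(ctx_, picture_, &got_picture, &packet_);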
UPD
As you write, the easiest way to solve the problem is to restart the codec: just call avcodec_close / avcodec_open2.
I just ran into the same issue when my frames were changing size on the fly. However, calling avcodec_close/avcodec_open2 is superfluous. A cleaner way is to simply reset your AVPacket data structure before the call to avcodec_decode_video2. Here is the code:
av_init_packet(&packet_);
The key here is that this call resets all of the fields of the AVPacket to their defaults. Check the docs for more info.

UINT16 monochrome image to 8bit monochrome Qimage using freeImage

I want to convert a UINT16 monochrome image to an 8-bit image in C++.
I have that image in a
char *buffer;
I'd like to give the new converted buffer to a QImage (Qt).
I'm trying with freeImagePlus
fipImage fimage;
if (fimage.loadFromMemory(...) == false)
    //error
loadFromMemory needs a fipMemoryIO address:
loadFromMemory(fipMemoryIO &memIO, int flag = 0)
So I do
fipImage fimage;
BYTE *buf = (BYTE*)malloc(gimage.GetBufferLength() * sizeof(BYTE));
// 'buf' is empty, I have to fill it with 'buffer' content
// how can I do it?
fipMemoryIO memIO(buf, gimage.GetBufferLength());
fimage.loadFromMemory(memIO);
if (fimage.convertTo8Bits() == true)
    cout << "Good";
Then I would do something like
fimage.saveToMemory(...
or
fimage.saveToHandle(...
I don't understand what a FREE_IMAGE_FORMAT is, which is the first argument to either of those two functions. I can't find information about those types in the freeImage documentation.
Then I'd finish with
imageQt = new QImage(destiny, dimX, dimY, QImage::Format_Indexed8);
How can I fill 'buf' with the content of the initial buffer?
And how do I get the data from the fipImage into a uchar* for a QImage?
Thanks.
The conversion is simple to do in plain old C++; there's no need for external libraries unless they are significantly faster and you care about such a speedup. Below is how I'd do the conversion, at least as a first cut. The data is converted in place inside the input buffer, since the output is smaller than the input.
QImage from16Bit(void * buffer, int width, int height) {
    int size = width*height*2; // length of data in buffer, in bytes
    quint8 * output = reinterpret_cast<quint8*>(buffer);
    const quint16 * input = reinterpret_cast<const quint16*>(buffer);
    if (!size) return QImage();
    do {
        *output++ = *input++ >> 8; // keep the most significant byte of each 16-bit pixel
    } while (size -= 2);
    // the 8-bit pixels now start at the beginning of the buffer
    return QImage(reinterpret_cast<const uchar*>(buffer), width, height, width, QImage::Format_Indexed8);
}
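One usage note (reusing the buffer/dimX/dimY names from the question, which is an assumption about the caller): this QImage constructor does not copy the pixel data, so either keep the buffer alive as long as the image is used, or take a deep copy:

    // Hypothetical caller, reusing the question's buffer/dimX/dimY names:
    QImage view = from16Bit(buffer, dimX, dimY);
    QImage imageQt = view.copy(); // deep copy, safe even after 'buffer' is freed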
