H264: decode a series of NAL units with ffmpeg

I tried to decode a series of NAL units with ffmpeg (libavcodec), but I get a "no frame" error. I produced the NAL units following the guide at How does one encode a series of images into H264 using the x264 C API?. I tried the following strategy for decoding:
avcodec_init();
avcodec_register_all();
AVCodec* pCodec;
pCodec=avcodec_find_decoder(CODEC_ID_H264);
AVCodecContext* pCodecContext;
pCodecContext=avcodec_alloc_context();
avcodec_open(pCodecContext,pCodec);
AVFrame *pFrame;
pFrame=avcodec_alloc_frame();
//for every NAL unit:
int frameFinished=0;
//nalData2 is nalData without the first 4 bytes (the 0x00000001 start code)
avcodec_decode_video(pCodecContext,pFrame,&frameFinished,(uint8_t*) nalData2,nalLength);
I passed every NAL unit I got to this code, but frameFinished stays 0. I guess there must be something wrong with the pCodecContext setup. Can someone send me a working example?
Thank you

Check out this tutorial. It should be able to decode any video type including H.264:
http://dranger.com/ffmpeg/
I don't know exactly what is causing the problem, but I suspect it has something to do with the fact that you are not using av_read_frame from libavformat to parse out a frame's worth of data at a time. H.264 can split a frame into multiple slices and therefore multiple NAL units.
I am pretty sure the x264 encoder does not do this by default and produces one NAL unit per frame. However, there are NAL units carrying other stream information that also need to be fed to the decoder. When I experimented with this in the past, the data av_read_frame parsed out for a single frame sometimes contained multiple NAL units. I would suggest following the tutorial closely and seeing if that works.
Another thing: I think you do need to pass the first 4 bytes of the NAL unit to avcodec_decode_video, if those bytes are the start code you are talking about (0x00000001). Having inspected the output of av_read_frame, the start codes are still present in the data when it is passed to the decoder.
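To make that concrete, here is a minimal sketch of the tutorial's approach, letting libavformat split the stream into frame-sized packets. It is hedged: it uses an FFmpeg 3.x-era API (avcodec_decode_video2 and the since-deprecated stream->codec field) to stay close to the era of the question, and the function name and stream-0-is-video assumption are mine, not from the thread:

#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>

int decode_file(const char *path)
{
    av_register_all();
    AVFormatContext *fmt = NULL;
    if (avformat_open_input(&fmt, path, NULL, NULL) < 0)
        return -1;
    if (avformat_find_stream_info(fmt, NULL) < 0)
        return -1;

    AVCodecContext *ctx = fmt->streams[0]->codec;   /* assumes stream 0 is the video stream */
    AVCodec *codec = avcodec_find_decoder(ctx->codec_id);
    if (!codec || avcodec_open2(ctx, codec, NULL) < 0)
        return -1;

    AVFrame *frame = av_frame_alloc();
    AVPacket pkt;
    int got = 0;
    while (av_read_frame(fmt, &pkt) >= 0) {
        /* pkt.data still contains the 0x00000001 start codes and may hold
           several NAL units (parameter sets plus slices) for one frame */
        avcodec_decode_video2(ctx, frame, &got, &pkt);
        if (got) {
            /* frame now holds a decoded picture */
        }
        av_packet_unref(&pkt);
    }
    av_frame_free(&frame);
    avformat_close_input(&fmt);
    return 0;
}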

Try this after the codec context instantiation code:
if (pCodec->capabilities & CODEC_CAP_TRUNCATED)
    pCodecContext->flags |= CODEC_FLAG_TRUNCATED; /* we may send incomplete frames */
pCodecContext->flags2 |= CODEC_FLAG2_CHUNKS;      /* packets may end mid-frame; there is no
                                                     capability bit for this, so set it unconditionally */

Related

How to tell ffmpeg to skip the last NAL unit?

I'm new to ffmpeg and am trying to use it to record streaming video (e.g. recording YouTube streams). However, sometimes, I guess due to network issues, the last NAL unit is corrupted, and when I want to concat multiple videos together the error below occurs and the process exits.
[NULL # 0x558551957ec0] Invalid NAL unit size (41974 > 39166).bitrate=12309.0kbits/s speed=21.5x
[NULL # 0x558551957ec0] missing picture in access unit with size 39182
[concat # 0x55855194c700] h264_mp4toannexb filter failed to receive output packet
../filelist.txt: Invalid data found when processing input
So I'm wondering if there's a way to tell ffmpeg to skip the last NAL unit (or skip any invalid NAL unit) without re-encoding the entire video? Thanks in advance!
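No answer was posted for this one, but as a hedged starting point (an assumption, not advice from this thread): ffmpeg's -err_detect and -fflags +discardcorrupt options ask the demuxer/decoder to tolerate or drop corrupt packets, and -c copy avoids re-encoding. Something along these lines might work for the concat step; the file names are placeholders:

ffmpeg -err_detect ignore_err -fflags +discardcorrupt -f concat -safe 0 -i filelist.txt -c copy output.mp4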

NAL type STAP-A: retrieving the SPS and PPS

I wrote an RTP server to receive the RTP packets sent by the command ffmpeg -i test.mp4 -f rtp rtp://ip:port (the client), and the server receives NAL type 24 (STAP-A).
I want the server to retrieve the SPS and PPS from that first NAL unit (type 24) itself, instead of taking them from the ffmpeg command output.
Is it possible that the SPS and PPS are aggregated in one NAL unit?
For example:
[RTP header][STAP-A nal header (type 24)][nal1 size][nal1 header][nal1 payload][nal2 size][nal2 header][nal2 payload]...
thanks
It's highly likely that the STAP-A contains the SPS and PPS: these NAL units usually appear at the beginning of the stream, are small, and so can be aggregated into a STAP-A. If the IDR is small enough it might also be part of the STAP, but usually it is too big and is sent separately.
The best way to verify this is to split the STAP-A into the original NAL units (see RFC 6184) and check for types 7 (SPS) and 8 (PPS).
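For illustration, a minimal sketch of that splitting step in C, following RFC 6184 section 5.7.1. The function name and the assumption that payload points just past the RTP header are mine:

#include <stdint.h>
#include <stdio.h>
#include <stddef.h>

void split_stap_a(const uint8_t *payload, size_t len)
{
    if (len < 1 || (payload[0] & 0x1F) != 24)      /* not a STAP-A */
        return;
    size_t pos = 1;                                /* skip the STAP-A NAL header byte */
    while (pos + 2 <= len) {
        uint16_t size = (uint16_t)((payload[pos] << 8) | payload[pos + 1]); /* 16-bit NALU size */
        pos += 2;
        if (size == 0 || pos + size > len)
            break;
        uint8_t type = payload[pos] & 0x1F;        /* first byte of the NALU is its header */
        printf("aggregated NAL unit: type %u, %u bytes\n", type, size);    /* 7 = SPS, 8 = PPS */
        pos += size;                               /* advance to the next size field */
    }
}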

convert h264 avcc stream to SampleBuffer in iOS 8

How do I convert an H.264 AVCC stream to a CMSampleBuffer in iOS 8? I need the sample buffer to run a decompression session.
The data has to be in MPEG-4 (AVCC) format. You will need an algorithm that reads the raw byte stream looking for NAL separator codes (usually 00000001) and then inspects the NAL unit header in the next byte.
Typical NAL unit header bytes are 67 (SPS), 68 (PPS), 41 (P frame) and 65 (IDR/I frame).
Generally the first will be the SPS; capture those bytes in an NSMutableData, then capture the PPS in another. Use CMVideoFormatDescriptionCreateFromH264ParameterSets with the SPS and PPS data to create a CMFormatDescription.
Use the CMFormatDescription to create a VTDecompressionSession using VTDecompressionSessionCreate.
When you have these, copy each NAL unit payload into a CMBlockBuffer, making sure to replace the separator code with a 4-byte big-endian length (the length of the NAL unit including its header byte), and then, using this and the format description you created before, create a CMSampleBuffer.
You now have everything you need to use VTDecompressionSessionDecodeFrame.
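As a hedged sketch of the first of those steps (C, CoreMedia, iOS 8+): building the CMFormatDescription from captured SPS/PPS. The function name is mine, and the caller is assumed to have already stripped the start codes, so sps/pps point at the NAL unit bytes beginning 0x67/0x68:

#include <CoreMedia/CoreMedia.h>

CMVideoFormatDescriptionRef formatFromParameterSets(const uint8_t *sps, size_t spsSize,
                                                    const uint8_t *pps, size_t ppsSize)
{
    const uint8_t *paramSets[2] = { sps, pps };    /* SPS first, then PPS */
    const size_t paramSizes[2] = { spsSize, ppsSize };
    CMVideoFormatDescriptionRef fmt = NULL;
    OSStatus status = CMVideoFormatDescriptionCreateFromH264ParameterSets(
        kCFAllocatorDefault,
        2,                 /* two parameter sets */
        paramSets, paramSizes,
        4,                 /* the 4-byte length prefix you will write into the CMBlockBuffers */
        &fmt);
    return (status == noErr) ? fmt : NULL;
}

The resulting description then goes to VTDecompressionSessionCreate, and the 4-byte length-prefix size declared here must match the one you write in place of the start codes.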

FFMPEG: Frame parameter initializations in HEVC decoder

I'm going through the HEVC decoder integrated in FFMPEG, trying to understand its flow and how it works.
By flow, I mean the part of the code where it reads the various parameters of the input .bin file: where it reads the resolution, where it decides the fps to play at, the output display format (yuv420p), and so on.
Initially I suspected the HEVC demuxer at /libavformat/hevcdec.c, which does the input file reading. There is a probe function used to decide which decoder to select for the input bin stream, and further there is FF_DEF_RAWVIDEO_DEMUXER. Is this where the resolution and other parameters are read from the input file?
Secondly, I suspect the HEVC parser at /libavcodec/hevc_parser.c, but I think it only parses frame data, i.e. finds the end of a frame. Is this assumption right?
Any suggestions or pointers would be really helpful. Thanks in advance.
To understand more specifically what is going on in the decoder, it's better to start your study with the HEVC/H.265 standard (http://www.itu.int/rec/T-REC-H.265). It contains all the information you need to find where the resolution, fps, etc. are signalled.
If you want to get more details from FFMPEG, here are some hints:
/libavcodec/hevc_parser.c contains the H.265 Annex B parser, which converts the byte stream into a series of NAL units. Each NAL unit has its own format and is parsed according to its NAL unit type.
If you are looking for basic properties of the video sequence, you may be interested in SPS (Sequence Parameter Set) parsing. Its format is described in section 7.3.2.2.1 of the standard, and the function ff_hevc_decode_nal_sps in /libavcodec/hevc_ps.c extracts the SPS parameters from the bitstream.
Note: I was talking about FFMPEG version 2.5.3. Code structure can be different for other versions.
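If you only need the end result rather than the internals, here is a hedged sketch using the public API showing where those values surface after the demuxer/parser/SPS path has run. It assumes a newer FFmpeg with AVCodecParameters (3.1+), not the 2.5.3 tree discussed above:

#include <libavformat/avformat.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    if (argc < 2)
        return 1;
    AVFormatContext *fmt = NULL;
    if (avformat_open_input(&fmt, argv[1], NULL, NULL) < 0)  /* probes the raw .bin stream */
        return 1;
    if (avformat_find_stream_info(fmt, NULL) < 0)            /* runs the parser/SPS decoding */
        return 1;
    AVStream *st = fmt->streams[0];
    printf("%dx%d, fps %d/%d\n",
           st->codecpar->width, st->codecpar->height,
           st->avg_frame_rate.num, st->avg_frame_rate.den);
    avformat_close_input(&fmt);
    return 0;
}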

Parsing NAL units using FFMPEG

I am new to MPEG-4 and taking baby steps to learn it. I am using FFMPEG as reference.
I understand that all MPEG-4 video is encoded into NAL units, and that in FFMPEG the av_read_frame() function returns one NAL unit. Am I right? Is a frame a NAL unit (though it can be a combination of multiple NALs)?
I also saw that h264_parser.c implements a function called h264_parse, which calls parse_nal_units() internally. If I need to get NAL units, how can I use parse_nal_units from my main function?
What does av_parser_parse2() do? Does it return decoded NAL units?
Alternatively, FFMPEG has the -vbsf h264_mp4toannexb switch to dump raw NAL units. Can somebody help me understand how I can use the same from my main function?
Please help me out here...
-ash5
For question 1:
The following article has links that will help you understand what NALs are.
In h264 NAL units means frame.?
NALs are divided into several types; depending on the type, a NAL can contain decoding parameters (SPS, PPS), enhancement information (SEI) or video samples (slice header and data). A common sequence in a broadcast transport stream would be SPS, PPS, SEI, slice_header(), slice_data(), SEI, slice_header(), slice_data(), and so on.
You probably don't need to understand ISO 14496-10 section 7.3 "Syntax in tabular form" for your application.
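If you do want to split the units yourself, here is a minimal hedged sketch of an Annex B scanner (the function name is mine; real code would also need to handle emulation-prevention bytes inside the payload):

#include <stdint.h>
#include <stdio.h>
#include <stddef.h>

void list_nal_units(const uint8_t *buf, size_t len)
{
    /* scan for the 3-byte 0x000001 start code; the 4-byte 0x00000001 form
       also matches because its last three bytes are the same pattern */
    for (size_t i = 0; i + 3 < len; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
            uint8_t type = buf[i + 3] & 0x1F;   /* low 5 bits of the NAL header */
            printf("NAL unit at offset %zu, type %u\n", i, type);
            i += 2;                             /* resume scanning after the start code */
        }
    }
}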
