I am new to MPEG-4 and taking baby steps to learn it. I am using FFmpeg as a reference.
I understand that all MPEG-4 video is encoded into NAL units. With respect to FFmpeg, does av_read_frame() return one NAL unit? Is a frame a NAL unit (though it can be a combination of multiple NALs)?
I also saw that h264_parser.c implements a function called h264_parse(), which calls parse_nal_units() internally. If I need to get NAL units, how can I use parse_nal_units() from my main function?
What does the av_parser_parse2() function do? Does it return decoded NAL units?
Alternatively, FFmpeg has the -vbsf h264_mp4toannexb switch to dump raw NAL units. Can somebody help me understand how I can use the same from my main function?
Please help me out here...
-ash5
For question 1:
The following article has links that will help you understand what NALs are.
In h264 NAL units means frame?
NALs are divided into several types, and depending on the type can contain decoding parameters (SPS, PPS), enhancement information (SEI) and video samples (slice header and data). A common sequence from a broadcast transport stream would be SPS, PPS, SEI, slice_header(), slice_data(), SEI, slice_header(), slice_data() *
You probably don't need to understand ISO 14496-10 section 7.3 "Syntax in tabular form" for your application.
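To make the NAL sequence above concrete, here is a minimal sketch in plain C (no FFmpeg calls) that walks an H.264 Annex B byte stream and reports the nal_unit_type of each NAL unit. The function names next_nal_start and list_nal_types are made up for illustration; the sketch assumes the stream uses 00 00 01 or 00 00 00 01 start codes and it does not strip emulation prevention bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Find the next 00 00 01 start code at or after `pos`.
 * Returns the index of the first byte after the start code, or `len` if none.
 * A 4-byte start code (00 00 00 01) is matched through its last three bytes. */
static size_t next_nal_start(const uint8_t *buf, size_t len, size_t pos)
{
    for (size_t i = pos; i + 3 <= len; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
            return i + 3;
    }
    return len;
}

/* Store the nal_unit_type (low 5 bits of the first NAL byte) of each NAL unit
 * found in an Annex B buffer. Returns the number of NAL units found; at most
 * `max_types` entries are written to `types`. */
size_t list_nal_types(const uint8_t *buf, size_t len,
                      uint8_t *types, size_t max_types)
{
    assert(buf != NULL && types != NULL);
    size_t count = 0;
    size_t pos = next_nal_start(buf, len, 0);
    while (pos < len) {
        if (count < max_types)
            types[count] = buf[pos] & 0x1F; /* 7 = SPS, 8 = PPS, 6 = SEI, 5 = IDR slice, 1 = non-IDR slice */
        count++;
        pos = next_nal_start(buf, len, pos);
    }
    return count;
}
```

Running this over the first few kilobytes of a raw .h264 file should show the SPS, PPS, SEI, slice pattern described above, with several slice NAL units possibly belonging to one frame.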
Related
I'm new to ffmpeg and trying to use it to record a streaming video (e.g. recording YouTube streams). However, I guess that sometimes, due to network issues, the last NAL unit is corrupted, and when I try to concat multiple videos together, the error below occurs and the process exits.
[NULL @ 0x558551957ec0] Invalid NAL unit size (41974 > 39166).bitrate=12309.0kbits/s speed=21.5x
[NULL @ 0x558551957ec0] missing picture in access unit with size 39182
[concat @ 0x55855194c700] h264_mp4toannexb filter failed to receive output packet
../filelist.txt: Invalid data found when processing input
So I'm wondering if there's a way to tell ffmpeg to skip the last NAL unit (or skip any invalid NAL unit) without re-encoding the entire video? Thanks in advance!
I am using ffmpeg to stream an H.264-encoded AVI file to a player, and
the player supports only packetization mode 0 (single NAL unit mode).
But ffmpeg always uses packetization mode 1 and sends the FU-A NAL unit
type; the player does not play the video on receiving an FU-A payload.
It just displays a blank screen. I understand that non-interleaved
mode supports both single NAL unit types (1-23) and FU-A, but how
can I force ffmpeg to use only single NAL unit mode? Can someone
help me?
I'm assuming you mean H.264 over RTP here. FFmpeg's RTP muxer can be forced to use mode 0 with the flag -rtpflags h264_mode0. However, if you are seeing the FU-A type (28), chances are that some NAL units can't fit in a single RTP packet, and mode 0 won't work for them.
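To see which kind of payload the player is actually receiving, a small classifier can help. This is only a sketch in plain C, not FFmpeg code: the enum and the function names classify_h264_rtp_payload and fits_single_nal_mode are hypothetical, and it assumes the RTP header has already been stripped so that payload[0] is the first payload byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum h264_rtp_kind {
    H264_RTP_SINGLE_NAL, /* types 1-23: allowed in packetization mode 0 */
    H264_RTP_STAP_A,     /* type 24: aggregation packet, mode 1 */
    H264_RTP_FU_A,       /* type 28: fragmentation unit, mode 1 */
    H264_RTP_OTHER       /* anything else, or an empty payload */
};

/* Classify an H.264 RTP payload by the low 5 bits of its first byte. */
enum h264_rtp_kind classify_h264_rtp_payload(const uint8_t *payload, size_t len)
{
    assert(payload != NULL);
    if (len == 0)
        return H264_RTP_OTHER;
    uint8_t type = payload[0] & 0x1F;
    if (type >= 1 && type <= 23)
        return H264_RTP_SINGLE_NAL;
    if (type == 24)
        return H264_RTP_STAP_A;
    if (type == 28)
        return H264_RTP_FU_A;
    return H264_RTP_OTHER;
}

/* Mode 0 can only carry a NAL unit that fits whole into one RTP payload.
 * `max_payload` is whatever payload size your network path allows. */
int fits_single_nal_mode(size_t nal_size, size_t max_payload)
{
    return nal_size <= max_payload;
}
```

If most of the encoder's NAL units fail the fits_single_nal_mode check for your payload size, mode 0 is unlikely to work unless the encoder is configured to produce smaller slices.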
I wrote an RTP server to receive the RTP packets sent by the command ffmpeg -i test.mp4 -f rtp rtp://ip:port (the client), and the server receives NAL type 24 (STAP-A).
I want the server to retrieve the SPS and PPS from the first NAL (type 24) instead of getting that information from the ffmpeg command.
Is it possible that the SPS and PPS are aggregated in one NAL?
for example
[RTP header][nal header(type 24)][nal1 header][nal1 size][nal1 payload][nal2 header][nal2 size][nal2 payload]...
thanks
It's highly likely that the STAP-A consists of the SPS and PPS: these NAL units are usually at the beginning of the stream, are small, and can be aggregated into a STAP-A. If the IDR is small enough, it might also be part of the STAP-A, but usually it is too big and will be sent separately.
The best way to verify this is to split the STAP-A into the original NAL units (see RFC 6184) and check for types 7 (SPS) and 8 (PPS).
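The split can be sketched in a few lines of plain C. Note that in RFC 6184 each aggregated unit is preceded by a 16-bit size in network byte order, and that size covers the NAL header byte as well, so the layout is [STAP-A header][size][NAL header + payload][size][NAL header + payload]... The function name stap_a_nal_types is made up for this example, and the RTP header is assumed to be stripped already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Walk a STAP-A payload and store the nal_unit_type of each aggregated NAL.
 * Returns the number of NAL units found, or -1 if the payload is not a
 * STAP-A or a size field is zero or runs past the end of the buffer.
 * At most `max_types` entries are written to `types`. */
int stap_a_nal_types(const uint8_t *payload, size_t len,
                     uint8_t *types, size_t max_types)
{
    assert(payload != NULL && types != NULL);
    if (len < 1 || (payload[0] & 0x1F) != 24)
        return -1;

    size_t pos = 1; /* skip the STAP-A NAL header byte */
    int count = 0;
    while (pos + 2 <= len) {
        /* 16-bit NAL unit size, big-endian */
        size_t nal_size = ((size_t)payload[pos] << 8) | payload[pos + 1];
        pos += 2;
        if (nal_size == 0 || nal_size > len - pos)
            return -1;
        if ((size_t)count < max_types)
            types[count] = payload[pos] & 0x1F; /* 7 = SPS, 8 = PPS */
        count++;
        pos += nal_size;
    }
    return count;
}
```

If the first STAP-A of the stream yields types 7 and 8, the bytes of each unit (nal_size bytes starting at the NAL header) are the SPS and PPS you are looking for.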
I'm going through the HEVC decoder integrated in FFmpeg. I'm trying to understand its flow and how it works.
By flow, I mean the part of the code where it reads the various parameters of the input .bin file: where it reads the resolution, where it decides the fps at which to play, the output display format (yuv420p), and so on.
Initially, I suspected the HEVC demuxer at /libavformat/hevcdec.c. This file handles reading the input file. There is a probe function, which is used to detect which decoder to select when decoding the input bin stream. There is also FF_DEF_RAWVIDEO_DEMUXER. Is this where the resolution and other parameters are read from the input file?
Secondly, I suspected the HEVC parser at /libavcodec/hevc_parser.c, but I think it just parses the frame data, that is, finds the end of each frame. Is this assumption right?
Any suggestions will be really helpful to me. Thanks in advance.
To understand more specifically what is going on in the decoder, it's better to start your study with the HEVC/H.265 standard (http://www.itu.int/rec/T-REC-H.265). It contains all the information you need to find where the resolution, fps, etc. are stored.
If you want more details from FFmpeg, here are some hints:
/libavcodec/hevc_parser.c contains the H.265 Annex B parser, which converts the byte stream into a series of NAL units. Each NAL unit has its own format and should be parsed depending on its NAL unit type.
If you are looking for basic properties of the video sequence, you may be interested in SPS (Sequence Parameter Set) parsing. Its format is described in section 7.3.2.2.1 of the standard, and the function ff_hevc_decode_nal_sps in /libavcodec/hevc_ps.c extracts the SPS parameters from the bitstream.
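As a small illustration of "parsed depending on its NAL unit type", here is a sketch in plain C that decodes the two-byte H.265 NAL unit header so you can spot the SPS (type 33) in a stream. This is not FFmpeg's code; the struct and the function name parse_hevc_nal_header are made up for this example, and the input is assumed to start right after the Annex B start code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct hevc_nal_header {
    uint8_t nal_unit_type; /* 32 = VPS, 33 = SPS, 34 = PPS */
    uint8_t nuh_layer_id;
    uint8_t temporal_id;   /* nuh_temporal_id_plus1 minus 1 */
};

/* Decode the 16-bit H.265 NAL unit header:
 * forbidden_zero_bit (1) | nal_unit_type (6) | nuh_layer_id (6) |
 * nuh_temporal_id_plus1 (3).
 * Returns 0 on success, -1 if the header is too short or invalid. */
int parse_hevc_nal_header(const uint8_t *nal, size_t len,
                          struct hevc_nal_header *out)
{
    assert(nal != NULL && out != NULL);
    if (len < 2 || (nal[0] & 0x80))
        return -1; /* need two bytes, and forbidden_zero_bit must be 0 */

    uint8_t tid_plus1 = nal[1] & 0x07;
    if (tid_plus1 == 0)
        return -1; /* nuh_temporal_id_plus1 must not be 0 */

    out->nal_unit_type = (uint8_t)((nal[0] >> 1) & 0x3F);
    out->nuh_layer_id  = (uint8_t)(((nal[0] & 0x01) << 5) | (nal[1] >> 3));
    out->temporal_id   = (uint8_t)(tid_plus1 - 1);
    return 0;
}
```

Once a NAL unit with type 33 is found, its payload is what an SPS parser such as the one in hevc_ps.c reads the resolution and related fields from.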
Note: this describes FFmpeg version 2.5.3. The code structure may differ in other versions.
I tried to decode a series of NAL units with ffmpeg (libavcodec), but I get a "no frame" error. I produced the NAL units by following the guideline in "How does one encode a series of images into H264 using the x264 C API?". I tried the following strategy for decoding:
avcodec_init();
avcodec_register_all();
AVCodec* pCodec;
pCodec=lpavcodec_find_decoder(CODEC_ID_H264);
AVCodecContext* pCodecContext;
pCodecContext=lpavcodec_alloc_context();
avcodec_open(pCodecContext,pCodec);
AVFrame *pFrame;
pFrame=avcodec_alloc_frame();
//for every nal unit:
int frameFinished=0;
//nalData2 is nalData without the first 4 bytes
avcodec_decode_video(pCodecContext,pFrame,&frameFinished,(uint8_t*) nalData2,nalLength);
I passed all units I got to this code but frameFinished stays 0. I guess there must be something wrong with the pCodecContext setup. Can someone send me a working example?
Thank you
Check out this tutorial. It should be able to decode any video type including H.264:
http://dranger.com/ffmpeg/
I don't know exactly what is causing the problem, but I suspect it has something to do with the fact that you are not using av_read_frame from libavformat to parse out a frame's worth of data at a time. H.264 can split a frame into multiple slices and therefore into multiple NAL units.
I am pretty sure the x264 encoder does not do this by default and produces one slice NAL unit per frame. However, there are NAL units with other stream information that also need to be fed to the decoder. I have experimented with this in the past, and when av_read_frame parses out a frame's worth of data, it sometimes contains multiple NAL units. I would suggest following the tutorial closely and seeing if that works.
Another thing: I think you do need to pass the first 4 bytes of the NAL unit into avcodec_decode_video, if those bytes are the start code (0x00000001). Having investigated the output from av_read_frame, the start codes are still in the data when it is passed to the decoder.
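If the start codes have already been stripped, they can be put back before the data is handed to the decoder. Below is a minimal sketch in plain C; the helper name write_annexb_nal is made up for this example, and it assumes each input buffer holds exactly one NAL unit without a start code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy one NAL unit into `dst`, prefixed with the 4-byte Annex B start code
 * 00 00 00 01. Returns the number of bytes written, or 0 if `dst` is too
 * small to hold the start code plus the NAL unit. */
size_t write_annexb_nal(uint8_t *dst, size_t dst_cap,
                        const uint8_t *nal, size_t nal_len)
{
    static const uint8_t start_code[4] = {0x00, 0x00, 0x00, 0x01};
    assert(dst != NULL && nal != NULL);
    if (dst_cap < sizeof start_code || nal_len > dst_cap - sizeof start_code)
        return 0;
    memcpy(dst, start_code, sizeof start_code);
    memcpy(dst + sizeof start_code, nal, nal_len);
    return sizeof start_code + nal_len;
}
```

Calling this once per NAL unit and appending the results gives a buffer in the same shape as the data described above, with every NAL unit preceded by its start code.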
Try this after the codec context instantiation code:
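if (pCodec->capabilities & CODEC_CAP_TRUNCATED)
    pCodecContext->flags |= CODEC_FLAG_TRUNCATED; /* We may send incomplete frames */
/* Note: CODEC_FLAG2_CHUNKS is a flags2 value, not a capability bit, so this test is questionable as written. */
if (pCodec->capabilities & CODEC_FLAG2_CHUNKS)
    pCodecContext->flags2 |= CODEC_FLAG2_CHUNKS; /* Input may be cut at packet boundaries rather than frame boundaries */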