FFmpeg: what does the "global_header" flag do?

According to the description, it places global headers in extradata instead of every keyframe.
But what's the actual purpose? Is it useful e.g. for streaming?
I guess that the resulting video file will be marginally shorter, but more prone to corruption (no redundancy = file unplayable if main header corrupted or only partially downloaded)?
Or maybe it somehow improves decoding a little, since the headers are truly global and cannot change from keyframe to keyframe?
Thanks!

By extradata FFmpeg means out-of-band, as opposed to in-band. The behavior of the flag is format-specific. It is useful for headers that are not expected to change, because it reduces the overhead.
Example: for H.264 in MP4, the SPS and PPS are stored in the avcC atom. For the same H.264 stream in, say, MPEG-TS, the parameter sets are repeated in the bitstream.
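If you are muxing through the FFmpeg libraries yourself, a minimal sketch of how the flag is usually requested looks like this (ofmt_ctx, enc_ctx and the helper name are illustrative; the two constants are the current FFmpeg names):

#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>

/* Formats such as MP4/MOV set AVFMT_GLOBALHEADER, meaning they want the
 * stream headers once, in extradata, rather than repeated in-band.
 * Set the matching encoder flag before opening the encoder. */
static void request_global_header(AVFormatContext *ofmt_ctx, AVCodecContext *enc_ctx)
{
    if (ofmt_ctx->oformat->flags & AVFMT_GLOBALHEADER)
        enc_ctx->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
}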

Related

Do any of the FFMPEG libraries have a method for determining the frame type (I, P or B) of a frame WITHOUT first decoding?

Every once in a while this comes up and I've never been able to find a good solution, but I figured I'd ask the experts: is there a video frame parser in FFMPEG that can determine frame type WITHOUT resorting to decoding?
Each codec has its particular syntax, and the decoder is the bespoke component that can work with that syntax. Now, is there an operational mode where the decoders analyze and log surface-level parameters of the payload without entering the code path that generates raster output? In general, no.
There is a -debug option (http://www.ffmpeg.org/ffmpeg-all.html#Codec-Options) which, when set, causes certain decoders (mainly the native MPEG decoders) to log some metadata, but this still invokes the decoder. For modern codecs there is the trace_headers bitstream filter, which can be used in streamcopy mode. It parses and prints all the parameter sets and slice headers; you can dump these to a file and inspect them.
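For example, a command along these lines (the filter's output goes to the log, so redirect stderr to a file) should dump the headers without producing any decoded pictures:
ffmpeg -i input.mp4 -c:v copy -bsf:v trace_headers -f null - 2> headers.txt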

How does Cobalt package an Opus track?

When the audio codec is Opus, some extra parameters are very important for our integration.
Is there a way to get codec delay, seek preroll and codec private?
When SB_API_VERSION is not less than SB_AUDIO_SPECIFIC_CONFIG_AS_POINTER, the 'codec private' data for Opus is already passed to Starboard.
Since I am not sure whether the audio samples have already been preprocessed with 'codec delay' and 'seek preroll', is it unnecessary for the audio decoder to use them?
Opus metadata is stored in AudioDecoderConfig::extra_data() and passed into SbPlayerCreate() via SbMediaAudioHeader::audio_specific_config.
You may parse it using code similar to the ParseOpusHeader function inside "media/filters/opus_audio_decoder.cc".
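For reference, the start of that codec private data is the OpusHead structure defined in RFC 7845. A rough sketch of pulling out the fields relevant here (pre-skip, which drives the codec delay; the Opus seek preroll is a fixed 80 ms) could look like the following; the struct and function names are illustrative, not taken from the Cobalt sources:

#include <stdint.h>
#include <string.h>

/* Illustrative struct; field layout follows RFC 7845, section 5.1. */
struct OpusHead {
    uint8_t  channels;
    uint16_t pre_skip;          /* in 48 kHz samples */
    uint32_t input_sample_rate; /* informational; Opus always decodes at 48 kHz */
    int16_t  output_gain;       /* Q7.8 dB */
    uint8_t  mapping_family;
};

/* Returns 0 on success, -1 if the buffer is not a valid OpusHead. */
static int parse_opus_head(const uint8_t *data, size_t size, struct OpusHead *out)
{
    if (size < 19 || memcmp(data, "OpusHead", 8) != 0)
        return -1;
    out->channels          = data[9];
    out->pre_skip          = data[10] | (data[11] << 8);            /* little-endian */
    out->input_sample_rate = data[12] | (data[13] << 8) |
                             ((uint32_t)data[14] << 16) | ((uint32_t)data[15] << 24);
    out->output_gain       = (int16_t)(data[16] | (data[17] << 8));
    out->mapping_family    = data[18];
    return 0;
}

/* Codec delay in nanoseconds = pre_skip * 1e9 / 48000; seek preroll is 80 ms. */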
Unfortunately |audio_specific_config| is an array of 8 bytes in COBALT_9, so the extra bytes of the Opus metadata are missing. There are several solutions to this:
1. Remove support for Opus, as it is optional per the 2017 requirements, and use AAC instead.
2. Use an Opus decoder that doesn't need the metadata.
3. Wait until COBALT_11 is released, which removes the size limit on |audio_specific_config|. This may not be feasible given your 2017 release schedule.
4. Increase the size of SbMediaAudioHeader::audio_specific_config to a larger number (say, 1024). This will make your future rebases slightly harder.

FFMPEG: Frame parameter initializations in HEVC decoder

I'm going through the HEVC decoder integrated in FFMPEG, trying to understand its flow and how it works.
By flow, I mean the part of the code where it reads the various parameters of the input .bin file: where it reads the resolution, where it decides the fps at which to play, the output display format (e.g. yuv420p), and so on.
My first suspect was the HEVC demuxer at /libavformat/hevcdec.c, which does the input file reading. There is a probe function used to detect which decoder to select for the input bin stream, and there is also FF_DEF_RAWVIDEO_DEMUXER. Is it there that the resolution and other parameters are read from the input file?
My second suspect is the HEVC parser at /libavcodec/hevc_parser.c, but I think it is only parsing frame data, i.e. finding the end of each frame. Is that assumption right?
Any suggestions would be really helpful. Thanks in advance.
To understand more specifically what is going on in the decoder, it's better to start your study with the HEVC/H.265 standard (http://www.itu.int/rec/T-REC-H.265). It contains all the information you need in order to find where the resolution, fps, etc. are signalled.
If you want to get more details from FFMPEG, here are some hints:
/libavcodec/hevc_parser.c contains H.265 Annex B parser, which converts byte stream into series of NAL units. Each NAL unit has its own format and should be parsed depending on its NAL unit type.
If you are looking for basic properties of video sequence, you may be interested in SPS (Sequence Parameter Set) parsing. Its format is described in section 7.3.2.2.1 of the standard and there is a function ff_hevc_decode_nal_sps located in /libavcodec/hevc_ps.c which extracts SPS parameters from the bitstream.
Note: I was talking about FFMPEG version 2.5.3. Code structure can be different for other versions.
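If the goal is simply to see what resolution, frame rate, and pixel format FFmpeg ends up with, a small sketch using the public API is often easier than tracing the decoder internals. The function name here is illustrative, and it uses the codecpar API, which postdates the 2.5.3 release discussed above; note also that a raw Annex B .bin stream carries no container timing, so the reported frame rate is whatever the raw demuxer assumed or was told:

#include <stdio.h>
#include <libavformat/avformat.h>
#include <libavutil/rational.h>

/* Let libavformat/libavcodec do the probing and SPS parsing, then read the
 * results back from the stream parameters. Error handling is minimal. */
int print_stream_properties(const char *filename)
{
    AVFormatContext *fmt = NULL;
    if (avformat_open_input(&fmt, filename, NULL, NULL) < 0)
        return -1;
    if (avformat_find_stream_info(fmt, NULL) < 0) {
        avformat_close_input(&fmt);
        return -1;
    }
    for (unsigned i = 0; i < fmt->nb_streams; i++) {
        AVCodecParameters *par = fmt->streams[i]->codecpar;
        if (par->codec_type == AVMEDIA_TYPE_VIDEO)
            printf("resolution: %dx%d, fps: %g\n",
                   par->width, par->height,
                   av_q2d(fmt->streams[i]->avg_frame_rate));
    }
    avformat_close_input(&fmt);
    return 0;
}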

What is the minimum amount of metadata needed to stream only video, using libx264 to encode at the server and libffmpeg to decode at the client?

I want to stream video (no audio) from a server to a client. I will encode the video using libx264 and decode it with ffmpeg. I plan to use fixed settings (at the very least they will be known in advance by both the client and the server). I was wondering if I can avoid wrapping the compressed video in a container format (like mp4 or mkv).
Right now I am able to encode my frames using x264_encoder_encode. I get a compressed frame back, and I can do that for every frame. What extra information (if anything at all) do I need to send to the client so that ffmpeg can decode the compressed frames, and more importantly, how can I obtain it with libx264? I assume I may need to generate NAL information (x264_nal_encode?). Having an idea of the minimum necessary to get the video across, and how to put the pieces together, would be really helpful.
I found out that the minimum amount of information is the NAL units from each frame; this gives you a raw H.264 stream. If I write this to a file, I can watch it using VLC by giving it a .h264 extension.
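A rough outline of getting those NAL units out of libx264 (names follow x264.h; error handling, rate control settings, colorspace conversion, and flushing of delayed frames are omitted, so treat it as a sketch rather than a working encoder):

#include <stdint.h>
#include <stdio.h>
#include <x264.h>

/* Open an encoder, emit the stream headers (SPS/PPS/SEI) once, then append
 * the Annex B payload of each encoded frame. With b_annexb at its default,
 * p_payload already contains start codes, so the concatenated output is a
 * playable raw .h264 stream. */
int encode_to_raw_h264(FILE *out, int width, int height)
{
    x264_param_t param;
    x264_param_default_preset(&param, "veryfast", "zerolatency");
    param.i_width  = width;
    param.i_height = height;

    x264_t *enc = x264_encoder_open(&param);
    if (!enc)
        return -1;

    x264_nal_t *nals;
    int i_nals;

    /* SPS/PPS/SEI headers: send these to the client first. */
    x264_encoder_headers(enc, &nals, &i_nals);
    for (int i = 0; i < i_nals; i++)
        fwrite(nals[i].p_payload, 1, nals[i].i_payload, out);

    /* Per frame: fill pic_in.img with YUV data, set pic_in.i_pts, then encode. */
    x264_picture_t pic_in, pic_out;
    x264_picture_alloc(&pic_in, param.i_csp, width, height);
    /* ... fill pic_in.img.plane[0..2] here ... */
    int size = x264_encoder_encode(enc, &nals, &i_nals, &pic_in, &pic_out);
    if (size > 0)
        fwrite(nals[0].p_payload, 1, size, out); /* NAL payloads are contiguous */

    x264_picture_clean(&pic_in);
    x264_encoder_close(enc);
    return 0;
}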
I can also open such a file using ffmpeg, but if I want to stream it, then it makes more sense to use RTSP, and a good open source library for that is Live555: http://www.live555.com/liveMedia/
In their FAQ they mention how to send the output from your encoder to Live555, and there is source for both a client and a server. I have yet to finish coding this, but it seems like a reasonable solution.

Extracting frames from MP4/FLV?

I know it's possible with FFMPEG, but what do I do if I have a partial file (e.g. without the beginning and the end)? Is it possible to extract some frames from it?
The command
ffmpeg -ss 00:00:25 -t 00:00:00.04 -i YOURMOVIE.MP4 -r 25.0 YOURIMAGE%04d.jpg
will extract frames
beginning at second 25 [-ss 00:00:25]
stopping after 0.04 second [-t 00:00:00.04]
reading from input file YOURMOVIE.MP4
using only 25.0 frames per second, i.e. one frame every 1/25 seconds [-r 25.0]
as JPEG images with the names YOURIMAGE%04d.jpg, where %04d is a 4-digit auto-incrementing number with leading zeros
Check your movie's framerate before applying the [-r] option (the same applies to [-t]), unless you want to extract frames at a custom rate.
I've never tried this with a truncated (corrupted?) input file, though. Worth a try.
This could be VERY difficult. The MP4 file format includes a 'moov' atom which has pointers to the audio and video 'samples'. If the fragment of the MP4 file you have does not contain the moov atom, your job will be much more complicated. You'd have to develop logic to examine the 'mdat' atom (which contains all the audio and video samples) and use educated guesses to find the audio and video boundaries.
Even worse, without the moov atom, you won't have the SPS and PPS needed to decode the slices. You'd have to synthesize replacements; if you know the codec used to create the MP4, then you might be able to copy the SPS and PPS from a similarly encoded file; if not, it could be a painful process of trial and error, because the syntax of the slices (the H.264 encoded pictures) is dependent upon values specified in the SPS and PPS.
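A quick way to check what the fragment still contains is to walk the top-level box structure. Here is a minimal sketch (the function name is illustrative; it assumes the data starts on a box boundary and skips 64-bit 'largesize' and run-to-end boxes, so it is a diagnostic aid rather than a real parser):

#include <stdint.h>
#include <stdio.h>

/* Walk top-level MP4 boxes and report their types: 4-byte big-endian size,
 * then a 4-character type such as 'ftyp', 'moov', 'mdat'. */
int list_top_level_atoms(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;
    uint8_t hdr[8];
    while (fread(hdr, 1, 8, f) == 8) {
        uint32_t size = ((uint32_t)hdr[0] << 24) | (hdr[1] << 16) |
                        (hdr[2] << 8) | hdr[3];
        printf("atom '%c%c%c%c', %u bytes\n",
               hdr[4], hdr[5], hdr[6], hdr[7], (unsigned)size);
        if (size < 8)
            break;                            /* largesize/to-end not handled */
        if (fseek(f, (long)size - 8, SEEK_CUR) != 0)
            break;
    }
    fclose(f);
    return 0;
}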
