What exactly is fragmented MP4 (fMP4)? How is it different from normal MP4? - media

Media Source Extensions (MSE) require fragmented MP4 for playback in the browser.

A fragmented MP4 contains a series of segments which can be requested individually if your server supports byte-range requests.
Boxes aka Atoms
All MP4 files use an object-oriented format made up of boxes, also known as atoms.
You can view a representation of the boxes in your MP4 using an online tool such as MP4 Parser or, if you're using Windows, MP4 Explorer. Let's compare a normal MP4 with one that is fragmented:
Non-Fragmented MP4
This screenshot (from MP4 Parser) shows an MP4 that hasn't been fragmented and quite simply has one massive mdat (Movie Data) box.
If we were building a video player that supports adaptive bitrate, we might need to know the byte position of the 10-second mark in a 0.5 Mbps and a 1 Mbps file in order to switch the video source between the two files at that moment. Determining this exact byte position within one massive mdat in each respective file is not trivial.
Fragmented MP4
This screenshot shows a fragmented MP4 which has been segmented using MP4Box with the onDemand profile.
You'll notice the sidx box and a series of moof+mdat boxes. The sidx is the Segment Index and stores metadata describing the precise byte-range locations of the moof+mdat segments.
Essentially, you can independently load the sidx (its byte range will be defined in the accompanying .mpd Media Presentation Description file) and then choose which segments you'd like to subsequently load and add to the MSE SourceBuffer.
Importantly, each segment is created at a regular interval of your choosing (e.g. every 5 seconds), so the segments can have temporal alignment across files of different bitrates, making it easy to adapt the bitrate during playback.
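To make this concrete, here is a minimal sketch in browser TypeScript of how such a file is fed to MSE: fetch the initialization segment and one moof+mdat segment by byte range, then append them to a SourceBuffer. The file name, byte ranges, and codec string are placeholders; in a real player the init and sidx ranges come from the MPD, and the segment ranges come from the parsed sidx.

```typescript
// Minimal MSE sketch: append fMP4 init + media segments fetched by byte range.
const video = document.querySelector("video") as HTMLVideoElement;
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener("sourceopen", async () => {
  const sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.64001f"');

  // Fetch an arbitrary byte range of the fragmented MP4 (server must support Range).
  const fetchRange = async (url: string, start: number, end: number) => {
    const res = await fetch(url, { headers: { Range: `bytes=${start}-${end}` } });
    return new Uint8Array(await res.arrayBuffer());
  };

  const url = "video_1mbps.mp4";                        // hypothetical onDemand-profile fMP4
  const init = await fetchRange(url, 0, 999);           // ftyp + moov (range from the MPD)
  const firstSeg = await fetchRange(url, 1000, 49999);  // first moof + mdat (range from sidx)

  // Append the init segment first, then the media segment once the buffer is idle.
  sourceBuffer.addEventListener("updateend", () => {
    if (!sourceBuffer.updating && mediaSource.readyState === "open") {
      sourceBuffer.appendBuffer(firstSeg);
    }
  }, { once: true });
  sourceBuffer.appendBuffer(init);
});
```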

Media File Formats
Media data streams are wrapped in a container format. The container includes the physical data of the media but also metadata that is necessary for playback; for example, it signals to the video player the codec used, the subtitle tracks, etc. In video streaming there are two main formats used for the storage and presentation of multimedia content: MPEG-2 Transport Streams (MPEG-2 TS) [25] and the ISO Base Media File Format (ISOBMFF) [24] (MP4 and fragmented MP4).
MPEG-2 Transport Streams are specified by [25] and were designed for broadcasting video over satellite networks. However, Apple adopted the format for its adaptive streaming protocol, making it an important format. In MPEG-2 TS, audio, video and subtitle streams are multiplexed together.
MP4 and fragmented MP4 (fMP4) are both part of the MPEG-4 Part 12 standard, which covers the ISOBMFF. MP4 is the best-known multimedia container format and is widely supported across operating systems and devices. The structure of an MP4 video file is shown in figure 2.2a. As shown, an MP4 consists of different boxes, each with a different functionality; these boxes are the basic building blocks of every MP4 container. For example, the file type box ('ftyp') specifies the compatible brands (specifications) of the file. MP4 files have a Movie Box ('moov') that contains metadata of the media file and sample tables ('stbl') that are important for timing and indexing the media samples. There is also a Media Data Box ('mdat') that contains the corresponding samples. In the fragmented container, shown in figure 2.2b, media samples are interleaved using Movie Fragment boxes ('moof'), each of which contains the sample table for its specific fragment (mdat box).
Ref : https://repository.tudelft.nl/islandora/object/uuid%3Ae06cde4c-1514-4a8d-90be-7e10eee5aac1
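As a rough illustration of the box structure described above, the sketch below (Node.js TypeScript, with a placeholder file name) walks the top-level boxes of an MP4 by reading each box's 4-byte big-endian size and 4-byte type. Against a non-fragmented file it prints something like ftyp, moov, mdat; against a fragmented one it prints ftyp, moov, sidx and then repeating moof/mdat pairs.

```typescript
// List the top-level ISOBMFF boxes of an MP4/fMP4 file.
import { readFileSync } from "node:fs";

const buf = readFileSync("movie.mp4");   // placeholder file name
let offset = 0;

while (offset + 8 <= buf.length) {
  let size = buf.readUInt32BE(offset);                    // 32-bit box size
  const type = buf.toString("ascii", offset + 4, offset + 8); // 4-char box type

  if (size === 1) {                                       // size==1: 64-bit "largesize" follows
    size = Number(buf.readBigUInt64BE(offset + 8));
  } else if (size === 0) {                                // size==0: box runs to end of file
    size = buf.length - offset;
  }
  if (size < 8) break;                                    // guard against corrupt data

  console.log(`${type}  ${size} bytes at offset ${offset}`);
  offset += size;
}
```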

Related

MediaConvert introducing Video Audio Duration Mismatch in HLS segments

When I used MediaConvert to package a video file into HLS, I noticed that the resulting TS files have mismatched video and audio durations. For example, when I set the segment length to 6, the resulting TS file has a video duration of 6.006000 and an audio duration of 5.994667.
How can we ensure MediaConvert produces HLS TS files with the same video and audio duration? What settings should be used?
We need the video and audio durations to match because these HLS segments will be replaced with ads by MediaTailor. We are encountering a few SSAI stream playout glitches because of this, especially on Safari.
Good question.
Regarding the control of video segment lengths in AWS Elemental MediaConvert:
Video segment duration accuracy can be increased with the following settings:
Check the HLS output group settings, under Advanced / Manifest duration format, and set it to 'Integer'. Also set Segment length control to 'Exact'. This will help ensure that all video segments (except possibly the last one, which may be short because the content ends) are of the specified duration. Related: the 'Minimum final segment length' setting ensures that any final segment shorter than the specified minimum duration is appended to the previous segment instead. This avoids very short segments, which can cause playback issues on some players.
There is no explicit control for the duration of stand-alone audio segments beyond the HLS output group settings. By default, MediaConvert will pad or crop the end of the audio content to equal the duration of the video content in the input. This behavior can also be adjusted.
Regarding the "SSAI stream playout glitches" you are seeing from MediaTailor endpoints, we suggest that you open a new support case with AWS Premium Support from your AWS Console. To speed the investigation, please include a session ID with timestamps and/or or a HAR file browser log of the issue.

Lua script for mpv - different duration for each file in a directory

I searched for and tried possible solutions for a Lua script that auto-loops some images from one directory. The result should be that these images are played by mpv (the media player), each with a different duration.
I know there is an autoload script that picks up every image, but it shows each one for just 1 second.
https://github.com/mpv-player/mpv/blob/master/TOOLS/lua/autoload.lua
(working on Windows 10, where the script directory is C:\Users\Username\AppData\Roaming\mpv\scripts)
The following is not an exact answer but is somewhat related. I have often needed an image slideshow where the images are shown for variable durations, most often accompanied by audio. These solutions worked for me.
The Matroska format is very helpful for this. In mpv, I accomplished it with a Lua script, with the images as attachments and the duration list given in a tag. I don't use that approach actively because I cannot distribute it to others; instead I found the following approach more portable.
This is the concept: you create an MJPEG video from all the JPEGs you want to show, then have a video player play it with a variable frame rate, specifying how long each frame should be shown. Only some container formats allow variable frame rates; Matroska does. So wrap your MJPEG-encoded video, along with the timing information, in a Matroska container.
You can extract the JPEG images from MJPEG without any loss.
I used these tools on Linux. I am not sure if they exist for Windows. They are open-source tools.
This approach uses the variable frame rate ability of the Matroska container format.
Make an MJPEG video of all the JPEGs in the sequence you want.
You can use the ffmpeg tool to do that. Be careful with file naming: any gap in the number sequence is unforgivable for ffmpeg. You may need to specify a container format for the MJPEG-encoded video; .mkv (Matroska) works, and I think other formats can also be used. I used .mkv.
Create a time-sequence file (see the sketch after these steps). Refer to the Matroska container timestamp file format. I used the version-2 format, in which you specify the time for each frame in milliseconds, one line per image frame, with the first line being a header specifying the version.
Create a Matroska container using mkvtoolnix-gui:
Add the MJPEG-encoded video file.
Specify the timestamp file.
Create an .mkv file.
The tool will extract the MJPEG-encoded video from the input container and, using the timestamps, create a new .mkv container.
Playing this .mkv container will show the images for the required durations. In the future, if required, you can extract the images from the MJPEG-encoded format without any loss.
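Here is a small sketch (Node.js TypeScript) of generating that version-2 timestamp file from a list of per-image durations: each output line is the cumulative timestamp, in milliseconds, at which a frame should appear. Note that older mkvtoolnix versions call the header line "# timecode format v2" rather than "# timestamp format v2", so check which one your version expects.

```typescript
// Turn per-image durations (seconds) into a Matroska "version 2" timestamp file.
import { writeFileSync } from "node:fs";

const durationsSec = [5, 2.5, 10, 3];   // how long each image should stay on screen

const lines = ["# timestamp format v2"]; // older mkvtoolnix: "# timecode format v2"
let tMs = 0;
for (const d of durationsSec) {
  lines.push(tMs.toFixed(3));            // timestamp at which this frame is shown
  tMs += d * 1000;
}
// Depending on the muxer, the last frame's display duration may need an extra
// trailing timestamp or a default frame duration; check mkvmerge's behavior.

writeFileSync("timestamps.txt", lines.join("\n") + "\n");
```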

How to stream binary files using MPEG-2 TS

I am working on a software system to stream videos over MPEG-2 TS using FFmpeg. I have got an additional requirement to stream PDF files. Can someone point me to relevant links / ideas?
Simply putting the PDF contents into the TS payload did not work for me. I have created the necessary PAT and PMT. Do I need to set a PCR in the adaptation field in the PMT?

Streaming videos in multiple-bitrates without manually creating video files in each bitrate

I want to have a media file that I can stream at multiple bitrates, using FFmpeg (for encoding and multiple-bitrate generation) and Flash Media Server (for streaming).
In "LIVE BROADCASTING", ffmpeg made multiple-bitrate videos from a single-bitrate source, but there were no files for the different bitrates: a file was created for a given bitrate when a viewer requested that stream, and when the request terminated the generated file was deleted.
So I searched in Flash Media Server and found hds-vod, but with hds-vod I would have to create one file for every bitrate. For example, if I have 2000 videos in my archive in HD quality (1024 kbps), I would have to make 4 additional videos at different bitrates from each one, leaving me with 10,000 videos in total.
So to have 2000 videos in five bitrates (1024k, 760k, 320k, 145k, 64k) I would need 10,000 files, but I only have space for 2000 videos on my server, not 10,000 video files.
I want to stream the videos "ON-DEMAND" from my server without the different bitrate files being generated and kept like this.
Does anyone have any advice?
Thank you
Well, you will have to decode and re-encode the video each time you want to generate a version at a different bitrate. It is up to you whether to save the result of the re-encoding into a file or just stream it. I would save it into a file, because:
It makes no sense to waste CPU cycles re-encoding the same video again and again if it is watched more than once, or if several users are watching the same video.
Your machine might not be powerful enough to do the re-encoding in real time while keeping the proper frame rate, especially with HD videos and especially if you have multiple users.
This is why it is better to re-encode the videos and store the files in advance. Storage is cheap nowadays.
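As a sketch of the "encode in advance and store" approach, the following Node.js TypeScript script invokes ffmpeg once per target bitrate. The bitrates, resolutions, file names, and encoder flags are illustrative only and should be tuned for your own content and player.

```typescript
// Pre-encode one source file into several bitrate renditions with ffmpeg.
import { execFileSync } from "node:child_process";

const source = "archive/video_hd.mp4";     // hypothetical 1024 kbps source
const renditions = [
  { bitrate: "1024k", height: 720 },
  { bitrate: "760k",  height: 576 },
  { bitrate: "320k",  height: 360 },
  { bitrate: "145k",  height: 240 },
  { bitrate: "64k",   height: 144 },
];

for (const r of renditions) {
  execFileSync("ffmpeg", [
    "-i", source,
    "-c:v", "libx264", "-b:v", r.bitrate,
    "-vf", `scale=-2:${r.height}`,          // keep aspect ratio, force even width
    "-c:a", "aac", "-b:a", "96k",
    `archive/video_${r.bitrate}.mp4`,
  ], { stdio: "inherit" });
}
```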

What is the minimum amount of metadata needed to stream video only, using libx264 to encode at the server and libffmpeg to decode at the client?

I want to stream video (no audio) from a server to a client. I will encode the video using libx264 and decode it with ffmpeg. I plan to use fixed settings (at the very least they will be known in advance by both the client and the server). I was wondering if I can avoid wrapping the compressed video in a container format (like mp4 or mkv).
Right now I am able to encode my frames using x264_encoder_encode. I get a compressed frame back, and I can do that for every frame. What extra information (if anything at all) do I need to send to the client so that ffmpeg can decode the compressed frames, and, more importantly, how can I obtain it with libx264? I assume I may need to generate NAL information (x264_nal_encode?). Having an idea of what is the minimum necessary to get the video across, and how to put the pieces together, would be really helpful.
I found out that the minimum amount of information is the NAL units from each frame; this gives me a raw H.264 stream. If I write this to a file, I can watch it using VLC by giving it a .h264 extension.
I can also open such a file using ffmpeg, but if I want to stream it, it makes more sense to use RTSP, and a good open-source library for that is Live555: http://www.live555.com/liveMedia/
Their FAQ explains how to send the output from your encoder to Live555, and there is source code for both a client and a server. I have yet to finish coding this, but it seems like a reasonable solution.
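To illustrate what that raw stream actually contains, here is a sketch (Node.js TypeScript, placeholder file name) that splits an Annex B H.264 file into NAL units by scanning for the 0x000001 / 0x00000001 start codes and reports each unit's type. SPS (type 7) and PPS (type 8) are the parameter sets the decoder needs, which is why they have to reach the client along with the slice NALs.

```typescript
// Split a raw Annex B H.264 stream into NAL units and print their types.
import { readFileSync } from "node:fs";

const data = readFileSync("stream.h264");   // placeholder file name

// Find every start code and remember where the NAL payload begins.
const nals: { codePos: number; payloadPos: number }[] = [];
for (let i = 0; i + 3 < data.length; i++) {
  if (data[i] === 0 && data[i + 1] === 0) {
    if (data[i + 2] === 1) {                           // 3-byte start code 00 00 01
      nals.push({ codePos: i, payloadPos: i + 3 });
      i += 2;
    } else if (data[i + 2] === 0 && data[i + 3] === 1) { // 4-byte start code 00 00 00 01
      nals.push({ codePos: i, payloadPos: i + 4 });
      i += 3;
    }
  }
}

nals.forEach((nal, n) => {
  const end = n + 1 < nals.length ? nals[n + 1].codePos : data.length;
  const nalType = data[nal.payloadPos] & 0x1f;         // low 5 bits of the NAL unit header
  console.log(`NAL ${n}: type ${nalType}, ${end - nal.payloadPos} bytes`);
});
```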

Resources