MediaConvert introducing Video Audio Duration Mismatch in HLS segments - aws-media-convert

When I use MediaConvert to package a video file into HLS, I'm seeing that the resulting TS files have mismatched video and audio durations. For example, with a segment size of 6, a resulting TS file has a video duration of 6.006000 and an audio duration of 5.994667.
How can we ensure MediaConvert produces HLS TS files with the same video and audio duration? What settings should be used?
We need the video and audio durations to match because these HLS segments will be replaced with ads by MediaTailor, and we are encountering a few SSAI stream playout glitches because of the mismatch, especially on Safari.

Good question.
Regarding the control of video segment lengths in AWS Elemental MediaConvert:
Video segment duration accuracy can be improved with the following settings:
In the HLS output group settings, under Advanced, set 'Manifest duration format' to 'Integer', and set 'Segment length control' to 'Exact'. This helps ensure that every video segment (except possibly the last one, which may be short because the content ends) is the specified duration. Relatedly, the 'Minimum final segment length' setting appends any final segment shorter than the specified minimum to the previous segment instead, which avoids very short segments that can cause playback issues on some players.
There is no explicit control over the duration of stand-alone audio segments beyond the HLS output group settings. By default, MediaConvert pads or trims the end of the audio content to match the duration of the video content in the input; this behavior can also be adjusted.
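For reference, here is a minimal sketch (not a complete job) of how those settings can look when creating a MediaConvert job with boto3. The endpoint URL, region, role, and destination are placeholders, and the audio padding/trimming behavior mentioned above is expressed here as the per-output container setting AudioDuration; if any field name does not match your API version, check the HlsGroupSettings and M3u8Settings documentation.
import boto3
# Account-specific endpoint is a placeholder; look it up with describe_endpoints.
mc = boto3.client(
    "mediaconvert",
    region_name="us-east-1",
    endpoint_url="https://abcd1234.mediaconvert.us-east-1.amazonaws.com",
)
hls_group_settings = {
    "Type": "HLS_GROUP_SETTINGS",
    "HlsGroupSettings": {
        "Destination": "s3://my-bucket/hls/",     # placeholder
        "SegmentLength": 6,                       # target segment duration in seconds
        "SegmentLengthControl": "EXACT",          # exact-length segments instead of GOP multiples
        "MinSegmentLength": 2,                    # fold a short final segment into the previous one
        "ManifestDurationFormat": "INTEGER",      # integer durations in the manifest
    },
}
# Per-output container settings: pad/trim audio so each TS segment's
# audio duration tracks the video duration.
ts_container_settings = {
    "Container": "M3U8",
    "M3u8Settings": {"AudioDuration": "MATCH_VIDEO_DURATION"},
}
# These fragments plug into Settings["OutputGroups"][n]["OutputGroupSettings"]
# and Settings["OutputGroups"][n]["Outputs"][m]["ContainerSettings"] of a
# create_job request, alongside your inputs, codec settings, and IAM role.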
Regarding the "SSAI stream playout glitches" you are seeing from MediaTailor endpoints, we suggest opening a new support case with AWS Premium Support from your AWS Console. To speed up the investigation, please include a session ID with timestamps and/or a HAR file capture from the browser showing the issue.

Related

Concatenating Smooth Streaming output to a single MP4 file - problems with A/V sync. What is CodecPrivateData?

I have a video in fragmented form which is an output of an Azure Media Services Live Event (Smooth Streaming).
I'm trying to concatenate the segments to get a single MP4 file, but I've run into an A/V sync problem: no matter what I do (time-shifting, speeding up, slowing down, using FFmpeg filters), the audio delay keeps drifting. To get the output MP4 file, I tried concatenating the segments of the video and audio streams (both at the OS file level and with FFmpeg) and then muxing them with FFmpeg.
I've tried everything I found on the web and I always end up with exactly the same result. What's important, when I play the source from the manifest file, it's all fine. That made me skim through the manifest once again, and I realized there's a CodecPrivateData value which I'm not using anywhere in the process. What is it? Could it somehow help solve my problem?
Mystery solved: the manifest file contains the list of stream discontinuities, which need to be taken into account when concatenating the streams.
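For anyone hitting the same thing, here is a minimal sketch of how one might spot those discontinuities in a Smooth Streaming client manifest. It assumes the usual StreamIndex / c-element layout with 't' (start time) and 'd' (duration) attributes, ignores 'r' repeat counts, and uses a placeholder file name; adjust for your actual manifest.
import xml.etree.ElementTree as ET

def find_discontinuities(manifest_path):
    """Walk each StreamIndex and report chunks whose start time 't' does not
    line up with the accumulated durations of the chunks before it."""
    root = ET.parse(manifest_path).getroot()
    for stream in root.iter("StreamIndex"):
        name = stream.get("Name") or stream.get("Type")
        expected = None
        for c in stream.iter("c"):
            t = c.get("t")
            d = int(c.get("d", "0"))
            start = int(t) if t is not None else (expected if expected is not None else 0)
            if expected is not None and start != expected:
                print(f"{name}: discontinuity, expected t={expected}, got t={start}")
            expected = start + d

# find_discontinuities("manifest.ismc")  # placeholder path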

Capture Video from Public Web Video Feed

I've unsuccessfully mucked around with this on my own and need help.
Given the public web camera feed at https://itsvideo.arlingtonva.us:8011/live/cam58.stream/playlist.m3u8, I'd like to be able to capture the video feed into an MP4 or MPG file with a reasonably accurate timestamp using the Windows command line (so I can put it into a batch script, etc.).
This is probably easy for someone who is already a wiz with VLC or FFmpeg or some such tool.
Additional wish list items would be to call up a higher resolution stream for a shorter duration (so as to balance I/O impact) and/or to just get still images instead of the video offered.
For instance, the m3u8 file has the following parameters:
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-STREAM-INF:BANDWIDTH=214105,CODECS="avc1.100.40",RESOLUTION=352x288
chunklist_w977413411.m3u8
Would there be a way to substitute any of these to increase the resolution and reduce the video duration in a corresponding way so that net I/O is the same? Or even to just get a still image, whether higher res or not?
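One common approach, sketched here with ffmpeg driven from Python (the same ffmpeg invocation can go straight into a .bat file): copy the stream as-is for a fixed duration into a file whose name carries the capture timestamp. The duration and output names are placeholders, and ffmpeg is assumed to be on the PATH.
import subprocess
from datetime import datetime, timezone

PLAYLIST = "https://itsvideo.arlingtonva.us:8011/live/cam58.stream/playlist.m3u8"
DURATION_SECONDS = 30  # how long to capture (placeholder)

stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
outfile = f"cam58_{stamp}.mp4"

# Copy the stream without re-encoding for DURATION_SECONDS.
subprocess.run(
    ["ffmpeg", "-y", "-i", PLAYLIST, "-t", str(DURATION_SECONDS), "-c", "copy", outfile],
    check=True,
)

# Still image instead of video (uncomment to use):
# subprocess.run(["ffmpeg", "-y", "-i", PLAYLIST, "-frames:v", "1",
#                 f"cam58_{stamp}.jpg"], check=True)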

What exactly is Fragmented mp4(fMP4)? How is it different from normal mp4?

Media Source Extension (MSE) needs fragmented mp4 for playback in the browser.
A fragmented MP4 contains a series of segments which can be requested individually if your server supports byte-range requests.
Boxes aka Atoms
All MP4 files use an object oriented format that contains boxes aka atoms.
You can view a representation of the boxes in your MP4 using an online tool such as MP4 Parser or, if you're using Windows, MP4 Explorer. Let's compare a normal MP4 with one that is fragmented:
Non-Fragmented MP4
This screenshot (from MP4 Parser) shows an MP4 that hasn't been fragmented and quite simply has one massive mdat (Movie Data) box.
If we were building a video player that supports adaptive bitrate, we might need to know the byte position of the 10 sec mark in a 0.5Mbps and a 1Mbps file in order to switch the video source between the two files at that moment. Determining this exact byte position within one massive mdat in each respective file is not trivial.
Fragmented MP4
This screenshot shows a fragmented MP4 which has been segmented using MP4Box with the onDemand profile.
You'll notice the sidx and the series of moof+mdat boxes. The sidx is the Segment Index and stores metadata with the precise byte-range locations of the moof+mdat segments.
Essentially, you can independently load the sidx (its byte range will be defined in the accompanying .mpd Media Presentation Descriptor file) and then choose which segments you'd like to subsequently load and add to the MSE SourceBuffer.
Importantly, each segment is created at a regular interval of your choosing (e.g. every 5 seconds), so the segments can have temporal alignment across files of different bitrates, making it easy to adapt the bitrate during playback.
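If you don't want to use a GUI tool, a rough sketch like the following walks the top-level boxes of a local file using only the 8-byte size/type headers defined by ISOBMFF (file name is a placeholder). Running it on a fragmented and a non-fragmented MP4 makes the difference visible: one large mdat versus sidx followed by moof+mdat pairs.
import struct

def list_top_level_boxes(path):
    """Print the type and size of each top-level box in an MP4 file."""
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            size, box_type = struct.unpack(">I4s", header)
            name = box_type.decode("latin-1")
            if size == 0:  # box extends to the end of the file
                print(f"{name}  (extends to end of file)")
                break
            if size == 1:  # 64-bit "largesize" follows the type field
                size = struct.unpack(">Q", f.read(8))[0]
                body = size - 16
            else:
                body = size - 8
            print(f"{name}  {size} bytes")
            f.seek(body, 1)  # skip the box body

# list_top_level_boxes("video.mp4")  # placeholder filename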
Media File Formats
Media data streams are wrapped in a container format. The container includes the physical data of the media but also metadata that is necessary for playback; for example, it signals to the video player the codec used, the subtitle tracks, etc. In video streaming there are two main formats used for the storage and presentation of multimedia content: MPEG-2 Transport Streams (MPEG-2 TS) [25] and the ISO Base Media File Format (ISOBMFF) [24] (MP4 and fragmented MP4).
MPEG-2 Transport Streams are specified by [25] and were designed for broadcasting video through satellite networks. However, Apple adopted the format for its adaptive streaming protocol, making it an important format. In MPEG-2 TS, the audio, video and subtitle streams are multiplexed together.
MP4 and fragmented MP4 (fMP4) are both part of the MPEG-4 Part 12 standard that covers the ISOBMFF. MP4 is the best-known multimedia container format and is widely supported across operating systems and devices. The structure of an MP4 video file is shown in figure 2.2a. As shown, MP4 consists of different boxes, each with a different functionality; these boxes are the basic building block of every MP4 container. For example, the file type box ('ftyp') specifies the compatible brands (specifications) of the file. MP4 files have a Movie Box ('moov') that contains metadata of the media file and sample tables ('stbl') that are important for timing and indexing the media samples. There is also a Media Data Box ('mdat') that contains the corresponding samples. In the fragmented container, shown in figure 2.2b, media samples are interleaved using Movie Fragment boxes ('moof'), each of which contains the sample table for its specific fragment (mdat box).
Ref : https://repository.tudelft.nl/islandora/object/uuid%3Ae06cde4c-1514-4a8d-90be-7e10eee5aac1
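To make the multiplexing point above concrete, here is a small sketch that reads a local .ts segment (file name is a placeholder) and counts packets per PID, relying only on the fixed 188-byte packet size and 0x47 sync byte defined by the MPEG-2 TS spec; each elementary stream (audio, video, subtitles) shows up as its own PID.
TS_PACKET_SIZE = 188  # every MPEG-2 TS packet is 188 bytes and starts with 0x47

def summarize_ts(path, max_packets=10000):
    """Count packets per PID to see how the streams are multiplexed."""
    pids = {}
    with open(path, "rb") as f:
        for _ in range(max_packets):
            packet = f.read(TS_PACKET_SIZE)
            if len(packet) < TS_PACKET_SIZE:
                break
            if packet[0] != 0x47:
                print("lost sync: packet does not start with 0x47")
                break
            # PID is the low 5 bits of byte 1 plus all 8 bits of byte 2 (13 bits).
            pid = ((packet[1] & 0x1F) << 8) | packet[2]
            pids[pid] = pids.get(pid, 0) + 1
    for pid, count in sorted(pids.items()):
        print(f"PID 0x{pid:04X}: {count} packets")

# summarize_ts("segment.ts")  # placeholder filename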

FFmpegFrameRecorder calls record() 20 times but the resulting mp4 file only has 2 frames

I'm using FFmpegFrameRecorder to create an MP4 (H.264) video from the camera preview. My recorder configuration is as follows.
recorder = new FFmpegFrameRecorder(filePath, width, height);
recorder.setVideoCodec(avcodec.AV_CODEC_ID_H264);   // encode video as H.264
recorder.setFormat("mp4");                          // MP4 container
recorder.setFrameRate(VIDEO_FPS);
recorder.setVideoBitrate(16384);                    // target bitrate of 16384 bps
recorder.setPixelFormat(avutil.AV_PIX_FMT_YUV420P); // YUV 4:2:0 input frames
For the rest, I followed the sample code RecordActivity.java closely and was able to verify that
recorder.record(yuvIplimage)
gets called 20 (or more) times, which should create an mp4 with 20 frames. However, the resulting mp4 file, when opened, only has 2 frames (the first two frames of the preview)! I have no idea what has caused this behavior. Any help would be greatly appreciated. Thank you.
Long Le
I figured it out: the issue was that I didn't know what I was doing. I was new to javacv, and I assumed, based on this stackoverflow entry, that the number of frames in the resulting video would equal the number of record() calls. However, that is not the case with video encoding, especially with H.264. I figured this out by trying MPEG-4 encoding instead, which definitely produced more than 2 frames. H.264 seems to require a minimum number of input frames and hence is not well suited to generating short (<1 minute) video clips (which is my application). One solution is to switch to MPEG-4 encoding; however, most browsers that play .mp4 files do not support MPEG-4 Part 2 encoding. Another solution is to keep H.264 but minimize compression by adding the following configuration:
recorder.setVideoQuality(0); // maximum quality; replaces recorder.setVideoBitrate(16384);
recorder.setVideoOption("preset", "veryfast"); // or ultrafast or fast, etc.

Streaming videos in multiple-bitrates without manually creating video files in each bitrate

I want to have a media file that I can stream at multiple bitrates using FFMPEG (for encoding and multiple bitrate generation) and Flash Media Server (for streaming).
In "LIVE BROADCASTING" ffmpeg made multiple bitrate videos from a single bitrate source but there were no files for the different bitrates. A file would be created for different bitrates when a viewer requested that bitrate streaming video but when a request was terminated the generated file was deleted.
So I searched in Flash Media Server and found (hds-vod), but in hds-vod I should create one file for every bitrate, for example if I have 2000 videos in my archive with HD quality (1024 kbps) I should make 4 videos with different bitrates from one video and together I have created 10,000 videos.
So to have 2000 videos in five bitrates (1024k, 760k, 320k, 145k, 64k), and now I have 10,000 videos. I have space for 2000 videos and I don't have free space in my server for 10,000 video files.
I want to stream "ON-DEMAND" videos in my server and not have the different bitrate files be continually generated like this.
Does anyone have any advice?
Thank you
Well, you will have to decode and re-encode the video each time you want to generate a version at a different bitrate. It is up to you whether you save the result of the re-encoding to a file or just stream it. I would save it to a file, because:
It makes no sense to waste CPU cycles re-encoding the same video again and again if it is watched more than once, or if several users are watching the same video.
Your machine might not be powerful enough to re-encode in real time while keeping the proper frame rate, especially with HD videos and especially with multiple concurrent users.
This is why it is better to re-encode the videos and store the files in advance. Storage is cheap nowadays.
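If you do pre-encode and store the renditions, a minimal batch sketch along these lines (assuming ffmpeg with libx264 and aac available on the PATH; input file name and audio bitrate are placeholders) generates the five video bitrates mentioned in the question, ready to be packaged by the server afterwards.
import subprocess

RENDITIONS = ["1024k", "760k", "320k", "145k", "64k"]  # bitrates from the question
SOURCE = "input.mp4"  # placeholder

for bitrate in RENDITIONS:
    out = f"input_{bitrate}.mp4"
    subprocess.run(
        ["ffmpeg", "-y", "-i", SOURCE,
         "-c:v", "libx264", "-b:v", bitrate,  # re-encode video at the target bitrate
         "-c:a", "aac", "-b:a", "96k",        # re-encode audio at a fixed bitrate (placeholder)
         out],
        check=True,
    )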
