ffmpeg: Extract unknown data stream from video container

I have a .MOV container which contains the following tracks:
Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 3840x2160 [SAR 1:1 DAR 16:9], 100619 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
Metadata:
creation_time : 2020-07-21T22:48:24.000000Z
handler_name : DJI.AVC
encoder : AVC encoder
Stream #0:1(eng): Data: none (priv / 0x76697270), 87 kb/s
Metadata:
creation_time : 2020-07-21T22:48:24.000000Z
handler_name : DJI.Meta
Stream #0:2(eng): Subtitle: mov_text (text / 0x74786574), 2 kb/s (default)
Metadata:
creation_time : 2020-07-21T22:48:24.000000Z
handler_name : DJI.Subtitle
As you can see, stream #0:1, whose handler is DJI.Meta, is in an unknown data format. I just want to extract the raw data of this stream to a file. So this is the ffmpeg command I tried:
ffmpeg -i .\DJI_0001.MOV -map 0:1 metadata
But using this command results in the following error:
Unable to find a suitable output format for 'metadata'
metadata: Invalid argument
How can I tell ffmpeg that the data should not be formatted, so that only the raw data is extracted?

Use
ffmpeg -i input -map 0:d -c copy -copy_unknown -f data raw.bin
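Here -map 0:d selects the data stream(s), -c copy together with -copy_unknown keeps the packets untouched even though ffmpeg does not recognize the codec, and -f data forces the raw data muxer so the output filename no longer needs a recognizable extension. Applied to the file from the question it would look roughly like this (the output filename is just an example):
ffmpeg -i DJI_0001.MOV -map 0:d -c copy -copy_unknown -f data DJI_0001_meta.bin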

Related

The SSIM values calculated using FFMPEG are not what I expected

I'm trying to encode an m2ts (MPEG-2 Transport Stream) file to mp4 (H.264) and get the ssim value.
I did get some SSIM values, but the results were contrary to my expectations.
Are my command options for ffmpeg wrong?
Encoding and SSIM calculation commands
# encode
$ ffmpeg -hide_banner -fflags +discardcorrupt -i input.m2ts \
-c:v libx264 -crf <CRF> -preset:v medium \
-c:a copy -bsf:a aac_adtstoasc \
output_ff_crf-<CRF>.mp4
# calculate ssim
$ ffmpeg -hide_banner -i <A> -i <B> \
-lavfi "[0:v]settb=AVTB,setpts=PTS-STARTPTS[main];[1:v]settb=AVTB,setpts=PTS-STARTPTS[ref];[main][ref]ssim" \
-f null -
The SSIM results
(a) A=input.m2ts, B=input.m2ts, ssim=0.973266
(b) A=input.m2ts, B=output_ff_crf-0.mp4, ssim=0.813347
(c) A=input.m2ts, B=output_ff_crf-30.mp4, ssim=0.819897
(d) A=output_ff_crf-0.mp4, B=output_ff_crf-0.mp4, ssim=1.000000
(e) A=output_ff_crf-0.mp4, B=output_ff_crf-30.mp4, ssim=0.972911
(d)(e): These are what I expected.
(a): The files are the same, but ssim≠1.
(b)(c): SSIMs with CRF=0 and CRF=30 have almost the same value, although the image quality is different.
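One way to check whether the low scores in (b) and (c) come from an overall quality difference or from frame misalignment is to write per-frame SSIM values with the ssim filter's stats_file option and compare the first few frames. This is only a diagnostic sketch, with filenames taken from case (b):
# per-frame SSIM values (frame number, per-plane and aggregate SSIM) are written to ssim.log
$ ffmpeg -hide_banner -i input.m2ts -i output_ff_crf-0.mp4 \
-lavfi "[0:v]settb=AVTB,setpts=PTS-STARTPTS[main];[1:v]settb=AVTB,setpts=PTS-STARTPTS[ref];[main][ref]ssim=stats_file=ssim.log" \
-f null -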
In the case of HandBrakeCLI
To determine if there was a problem with input.m2ts, I ran HandBrakeCLI with almost the same parameters as ffmpeg.
# encode
HandBrakeCLI --verbose --format av_mp4 --encoder x264 --quality <CRF> --x264-preset medium \
--aencoder copy \
--input input.m2ts --output output_hb_crf-<CRF>.mp4
# calculate ssim (same as ffmpeg)
$ ffmpeg -hide_banner -i <A> -i <B> \
-lavfi "[0:v]settb=AVTB,setpts=PTS-STARTPTS[main];[1:v]settb=AVTB,setpts=PTS-STARTPTS[ref];[main][ref]ssim" \
-f null -
(b') A=input.m2ts, B=output_hb_crf-0.mp4, ssim=0.999999
(c') A=input.m2ts, B=output_hb_crf-30.mp4, ssim=0.972886
(d') A=output_hb_crf-0.mp4, B=output_hb_crf-0.mp4, ssim=1.000000
(e') A=output_hb_crf-0.mp4, B=output_hb_crf-30.mp4, ssim=0.972886
It's all as I expected (although (b') is not exactly ssim=1.0).
Therefore, I don't see a problem with input.m2ts.
Information about the video files and tools
ffprobe results
input.m2ts
[mpeg2video @ 0x5655577c1680] Invalid frame dimensions 0x0.
Last message repeated 1 times
[mpegts @ 0x5655577bd080] start time for stream 2 is not set in estimate_timings_from_pts
[mpegts @ 0x5655577bd080] PES packet size mismatch
Input #0, mpegts, from 'input.m2ts':
Duration: 00:30:02.68, start: 39593.392600, bitrate: 19019 kb/s
Program 211
Stream #0:0[0x140]: Video: mpeg2video (Main) ([2][0][0][0] / 0x0002), yuv420p(tv, bt709, top first), 1920x1080 [SAR 1:1 DAR 16:9], 29.97 fps, 29.97 tbr, 90k tbn, 59.94 tbc
Stream #0:1[0x141]: Audio: aac (LC) ([15][0][0][0] / 0x000F), 48000 Hz, stereo, fltp, 255 kb/s
Stream #0:2[0x138]: Data: bin_data ([6][0][0][0] / 0x0006)
Unsupported codec with id 100359 for input stream 2
output_ff_crf-0.mp4
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'output_ff_crf-0.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf57.83.100
Duration: 00:30:02.67, start: 0.000000, bitrate: 109301 kb/s
Stream #0:0(und): Video: h264 (High 4:4:4 Predictive) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 109040 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 252 kb/s (default)
Metadata:
handler_name : SoundHandler
output_hb_crf-0.mp4
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'output_hb_crf-0.mp4':
Metadata:
major_brand : mp42
minor_version : 512
compatible_brands: isomiso2avc1mp41
creation_time : 2020-05-17T06:22:06.000000Z
encoder : HandBrake 1.1.0 2018042400
Duration: 00:30:02.22, start: 0.000000, bitrate: 109661 kb/s
Stream #0:0(und): Video: h264 (High 4:4:4 Predictive) (avc1 / 0x31637661), yuv420p(tv, bt709), 1920x1080 [SAR 1:1 DAR 16:9], 109405 kb/s, 29.97 fps, 29.97 tbr, 90k tbn, 180k tbc (default)
Metadata:
creation_time : 2020-05-17T06:22:06.000000Z
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 252 kb/s (default)
Metadata:
creation_time : 2020-05-17T06:22:06.000000Z
handler_name : Stereo
Tools
version
ffmpeg: 3.4.6-0ubuntu0.18.04.1
HandBrakeCLI: 1.1.0
ldd
$ ldd /usr/bin/ffmpeg
...
libx264.so.152 => /usr/lib/x86_64-linux-gnu/libx264.so.152 (0x00007efbf1f33000)
...
$ ldd /usr/bin/HandBrakeCLI
...
libx264.so.152 => /usr/lib/x86_64-linux-gnu/libx264.so.152 (0x00007efbfb38f000)
...
ffmpeg and HandBrakeCLI are using the same libx264.

Scale2ref then join two video clips using ffmpeg

I have two video (with audio) clips that I want to join. The first clip has the following format:
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 358 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: ac3 (ac-3 / 0x332D6361), 48000 Hz, stereo, fltp, 192 kb/s (default)
Metadata:
handler_name : SoundHandler
Side data:
audio service type: main
And the second:
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 320x240, 88 kb/s, 8 fps, 8 tbr, 16384 tbn, 16 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: ac3 (ac-3 / 0x332D6361), 11025 Hz, mono, fltp, 96 kb/s (default)
Metadata:
handler_name : SoundHandler
Side data:
audio service type: main
I want to join the two clips, the first followed by the second, while keeping the format options of the second clip.
Based on the documentation and other questions, I have arrived at the following command:
ffmpeg -i secondClip.mp4 -i firstClip.mpg -filter_complex "[1:v:0][0:v:0]scale2ref=oh*mdar:ih[2nd][ref],[2nd][1:a:0][ref][0:a:0]concat=n=2:v=1:a=1[outv][outa]" -map "[outv]" -map "[outa]" output.mp4
This gives the following errors:
Stream mapping:
Stream #0:0 (h264) -> scale2ref:ref
Stream #0:1 (ac3) -> concat:in1:a0
Stream #1:0 (mpeg2video) -> scale2ref:default
Stream #1:1 (mp2) -> concat:in0:a0
concat:out:v0 -> Stream #0:0 (libx264)
concat:out:a0 -> Stream #0:1 (aac)
Press [q] to stop, [?] for help
[Parsed_concat_1 @ 0000017896b6f400] Input link in1:v0 parameters (size 320x240, SAR 0:1) do not match the corresponding output link in0:v0 parameters (426x240, SAR 640:639)
[Parsed_concat_1 @ 0000017896b6f400] Failed to configure output pad on Parsed_concat_1
Error reinitializing filters!
Failed to inject frame into filter network: Invalid argument
Error while processing the decoded data for stream #0:0
Conversion failed!
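The concat error above says the two video segments still differ in frame size (426x240 vs 320x240) and SAR (640:639 vs 0:1), and the concat filter needs those to match. A minimal, untested sketch of one way around it, assuming the second clip's 320x240 frame is the target, is to let scale2ref resize the first clip to the reference size and reset the SAR on both branches before concatenating (the [first], [v0], [v1] labels are arbitrary). Note that this stretches the 16:9 clip to 4:3, so scaling plus padding to the reference size may be preferable:
ffmpeg -i secondClip.mp4 -i firstClip.mpg -filter_complex "[1:v:0][0:v:0]scale2ref[first][ref];[first]setsar=1[v0];[ref]setsar=1[v1];[v0][1:a:0][v1][0:a:0]concat=n=2:v=1:a=1[outv][outa]" -map "[outv]" -map "[outa]" output.mp4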

Embedding timed text metadata in MP4

Is it possible to manually embed timed text metadata into MP4 files?
I have a TTML / SRT file with the metadata. I just need to embed the text data without re-encoding the video/audio.
EDIT:
We used to do the metadata injection on the Wowza server which we use for live streaming. What I need to do is manually inject the metadata into prerecorded MP4 files without running the video through Wowza.
Here is one such video file that was processed by Wowza:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'metadata-back.mp4':
Metadata:
major_brand : f4v
minor_version : 0
compatible_brands: isommp42m4v
creation_time : 2015-04-16 11:12:39
Duration: 00:00:11.70, start: 0.000000, bitrate: 1373 kb/s
Stream #0:0(eng): Video: h264 (Baseline) (avc1 / 0x31637661), yuv420p(tv), 640x480 [SAR 1:1 DAR 4:3], 1352 kb/s, 28.60 fps, 30 tbr, 90k tbn, 60 tbc (default)
Metadata:
creation_time : 2015-04-16 11:12:39
handler_name : WowzaStreamingEngine
encoder : WowzaStreamingEngine
Stream #0:1(eng): Audio: speex (spex / 0x78657073), 16000 Hz, mono, s16, 17 kb/s (default)
Metadata:
creation_time : 2015-04-16 11:12:39
handler_name : WowzaStreamingEngine
Stream #0:2(eng): Data: none (amf0 / 0x30666D61), 0 kb/s (default)
Metadata:
creation_time : 2015-04-16 11:12:39
handler_name : WowzaStreamingEngine
Now if I run the command ffmpeg -i new-meta.mp4 -i sub.srt -c copy -c:s mov_text -movflags +faststart out.mp4 and then run ffmpeg -i out.mp4, I get this:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'out.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf56.4.101
Duration: 00:00:07.27, start: 0.000000, bitrate: 925 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1366x768 [SAR 1:1 DAR 683:384], 920 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s (default)
Metadata:
handler_name : SubtitleHandler
Now, as you can see, the text is embedded with a different codec (is this the right term?). Also, I don't see an audio track either.
I hope my question is clear enough. I need a way to embed metadata (from SRT/TTML) into an MP4 video; it should be embedded in AMF format (again, is this the right term?).
ffmpeg -i in.mp4 -i subs.srt -c copy -c:s mov_text -movflags +faststart out.mp4
Support for 3GPP TS 26.245 Timed Text ("mov_text") in MP4 may vary according to the player.
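If the original audio (and other) tracks also need to be kept, one option is to map the streams explicitly. This is only a sketch: it assumes new-meta.mp4 actually contains an audio stream, and it deliberately does not map a Wowza amf0 data track (copying that would additionally need -map 0:d and -copy_unknown):
ffmpeg -i new-meta.mp4 -i sub.srt -map 0:v -map 0:a -map 1:s -c copy -c:s mov_text -movflags +faststart out.mp4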

ffprobe stream selection for encoding

When I run ffmpeg, I can see the "default" audio and video streams:
Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 960x400 [SAR 1:1 DAR 12:5], 3859 kb/s, 23.98 fps, 23.98 tbr, 24k tbn, 47.95 tbc (default)
Metadata:
creation_time : 2013-05-03 22:50:47
handler_name : GPAC ISO Video Handler
Stream #0:1(und): Audio: aac (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 92 kb/s (default)
Metadata:
creation_time : 1970-01-01 00:00:00
handler_name : SoundHandler
Stream #0:2: Video: mjpeg, yuvj420p(pc), 675x1000 [SAR 72:72 DAR 27:40], 90k tbr, 90k tbn, 90k tbc
As I understand it, these are the streams ffmpeg selects as input when encoding if the map option is not set.
How can I get the "default" streams using ffprobe?
Sorry for my English.
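ffprobe reports a per-stream default disposition flag, so one way to list which streams carry it is a sketch along these lines:
$ ffprobe -v error -show_entries stream=index,codec_type:stream_disposition=default -of json input.mp4
In the JSON output, each stream object then carries a disposition block whose default field is 1 or 0. Note that ffmpeg's automatic stream selection does not strictly follow this flag: it picks the highest-resolution video stream and the audio stream with the most channels.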

FFmpeg streaming to Akamai like FMLE does

I am trying to stream video from my webcam to the AkamaiHD service using ffmpeg (like it is implemented in Flash Media Live Encoder):
ffmpeg -f dshow -i video="Webcam C110" -s 640x360 -aspect 16:9 -profile:v baseline -pix_fmt yuv420p -vcodec libx264 -f flv "rtmp://..."
...
Input #0, dshow, from 'video=Webcam C110':
Duration: N/A, start: 31296.194000, bitrate: N/A
Stream #0:0: Video: rawvideo (YUY2 / 0x32595559), yuyv422, 640x480, 30 tbr, 10000k tbn, 30 tbc
Output #0
Metadata:
encoder : Lavf55.8.102
Stream #0:0: Video: h264 (libx264) ([7][0][0][0] / 0x0007), yuv420p, 640x360 [SAR 1:1 DAR 16:9], q=-1--1, 1k tbn, 30 tbc
Stream mapping:
Stream #0:0 -> #0:0 (rawvideo -> libx264)
....
The video is streamed, but when I try to view it at http://mediapm.edgesuite.net/edgeflash/public/zeri/debug/Main.html?url=myplayback_url/manifest.f4m
it is not displayed.
I've found out that if the video is recorded using FMLE and restreamed to Akamai, then the HLS stream is played.
ffmpeg -re -i sample.f4v -c copy -f flv "rtmp://..."
....
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'sample.f4v':
Metadata:
major_brand : f4v
minor_version : 0
compatible_brands: isommp42m4v
creation_time : 2018-10-06 09:23:33
Duration: 00:01:01.77, start: 0.460000, bitrate: 718 kb/s
Stream #0:0(eng): Video: h264 (Baseline) (avc1 / 0x31637661), yuv420p, 640x360 [SAR 1:1 DAR 16:9], 624 kb/s, 30 tbr, 1k tbn, 48 tbc
Metadata:
creation_time : 2018-10-06 09:23:33
handler_name : MainConcept
Output #0
Metadata:
major_brand : f4v
minor_version : 0
compatible_brands: isommp42m4v
encoder : Lavf55.8.102
Stream #0:0(eng): Video: h264 ([7][0][0][0] / 0x0007), yuv420p, 640x360 [SAR 1:1 DAR 16:9], q=2-31, 624 kb/s, 1k tbn, 1k tbc
Metadata:
creation_time : 2018-10-06 09:23:33
handler_name : MainConcept
Stream mapping:
Stream #0:0 -> #0:0 (copy)
....
It seems that the problem is with the H.264 codec configuration, but I have not found a solution yet.
Could you please advise how I can implement streaming like FMLE using ffmpeg?
I think for Akamai you need authentication for the RTMP entry point, which I don't believe is supported by the librtmp used by ffmpeg.
Have a look at this thread over at Wowza for more information.
If you think it is a codec issue, then instead of copying the codec you could try encoding with -c:v libx264 as the video codec and -c:a libfaac for audio.
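Putting that suggestion together, a re-encoding variant of the restream command might look like the sketch below (libfaac was the usual AAC encoder in ffmpeg builds of that era; newer builds use -c:a aac instead, and if sample.f4v has no audio stream the audio option is simply ignored):
ffmpeg -re -i sample.f4v -c:v libx264 -profile:v baseline -pix_fmt yuv420p -c:a libfaac -f flv "rtmp://..."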
