I want to concatenate two videos with ffmpeg. To concatenate without re-encoding, both videos must have exactly the same format (same codec, width, height, color format, time base, frame rate, etc.).
One video comes from a GoPro 11 (I do not want to re-encode or modify it), and the other is encoded with ffmpeg.
ffprobe of the GoPro 11 video shows:
Stream #0:0[0x1](und): Video: hevc (Main) (hvc1 / 0x31637668), yuvj420p(pc, bt709), 1920x1080 [SAR 1:1 DAR 16:9], 38030 kb/s, 23.98 fps, 23.98 tbr, 24k tbn (default)
Metadata:
handler_name : VideoHandler
vendor_id : [0][0][0][0]
Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 193 kb/s (default)
Metadata:
handler_name : SoundHandler
vendor_id : [0][0][0][0]
"streams": [
{
"index": 0,
"codec_name": "hevc",
"codec_long_name": "H.265 / HEVC (High Efficiency Video Coding)",
"profile": "Main",
"codec_type": "video",
"codec_tag_string": "hvc1",
"codec_tag": "0x31637668",
"width": 1920,
"height": 1080,
"coded_width": 1920,
"coded_height": 1080,
"closed_captions": 0,
"film_grain": 0,
"has_b_frames": 0,
"sample_aspect_ratio": "1:1",
"display_aspect_ratio": "16:9",
"pix_fmt": "yuvj420p",
"level": 153,
"color_range": "pc",
"color_space": "bt709",
"color_transfer": "bt709",
"color_primaries": "bt709",
"chroma_location": "left",
"refs": 1,
"id": "0x1",
"r_frame_rate": "24000/1001",
"avg_frame_rate": "2400/91",
"time_base": "1/24000",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 10010,
"duration": "0.417083",
"bit_rate": "38030596",
"nb_frames": "11",
The second video is generated with ffmpeg using this command:
ffmpeg -i in.MP4 -pix_fmt yuvj420p -tag:v hvc1 -c:v libx265 -x265-params "level=5.1:bframes=0" out.mp4 -y
ffprobe of the generated video shows:
Duration: 00:00:06.13, start: 0.000000, bitrate: 913 kb/s
Stream #0:0[0x1](und): Video: hevc (Main) (hvc1 / 0x31637668), yuvj420p(pc, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 774 kb/s, 23.98 fps, 23.98 tbr, 24k tbn (default)
Metadata:
handler_name : VideoHandler
vendor_id : [0][0][0][0]
encoder : Lavc59.37.100 libx265
Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 129 kb/s (default)
Metadata:
handler_name : SoundHandler
vendor_id : [0][0][0][0]
"streams": [
{
"index": 0,
"codec_name": "hevc",
"codec_long_name": "H.265 / HEVC (High Efficiency Video Coding)",
"profile": "Main",
"codec_type": "video",
"codec_tag_string": "hvc1",
"codec_tag": "0x31637668",
"width": 1920,
"height": 1080,
"coded_width": 1920,
"coded_height": 1080,
"closed_captions": 0,
"film_grain": 0,
"has_b_frames": 0,
"sample_aspect_ratio": "1:1",
"display_aspect_ratio": "16:9",
"pix_fmt": "yuvj420p",
"level": 153,
"color_range": "pc",
"color_space": "bt709",
"color_transfer": "bt709",
"color_primaries": "bt709",
"chroma_location": "left",
"field_order": "progressive",
"refs": 1,
"id": "0x1",
"r_frame_rate": "24000/1001",
"avg_frame_rate": "24000/1001",
"time_base": "1/24000",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 147147,
"duration": "6.131125",
"bit_rate": "774658",
"nb_frames": "147",
which is very similar but not identical: there is one extra line
"field_order": "progressive",
that does not appear in the ffprobe output of the GoPro 11 video.
When I try to concatenate the two videos, the final video does not work, because the input videos are not exactly the same.
Why does the GoPro 11 video not have the field_order parameter?
What can I change in the ffmpeg command to generate a video without field_order progressive?
Is field_order the only encoding difference that prevents the concat?
The first line of each ffprobe output shows the difference too:
hevc (Main) (hvc1 / 0x31637668), yuvj420p(pc, bt709), 1920x1080 [SAR 1:1 DAR 16:9], 38030 kb/s, 23.98 fps, 23.98 tbr, 24k tbn (default)
hevc (Main) (hvc1 / 0x31637668), yuvj420p(pc, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 774 kb/s, 23.98 fps, 23.98 tbr, 24k tbn (default)
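To answer the last question it may help to diff the stream parameters directly rather than comparing the dumps by eye. Note that the two JSON dumps above also differ in avg_frame_rate (2400/91 vs 24000/1001), not only in field_order. The following is a minimal sketch, assuming a POSIX shell with ffprobe on the PATH; the field list is an assumption about what usually has to match and is not exhaustive, and the file names in the usage comment are hypothetical.

```shell
# Fields that commonly have to match for a stream-copy concat
# (an assumed, non-exhaustive list).
FIELDS="codec_name,profile,codec_tag_string,width,height,pix_fmt,level,color_range,color_space,field_order,r_frame_rate,avg_frame_rate,time_base"

# Print the selected fields of the first video stream, one key=value per line.
probe_fields() {
    ffprobe -v error -select_streams v:0 \
        -show_entries "stream=$FIELDS" \
        -of default=noprint_wrappers=1 "$1"
}

# Print only the lines that differ between two files.
# Prints nothing and returns 0 when all selected fields match.
compare_streams() {
    a=$(mktemp) && b=$(mktemp) || return 1
    probe_fields "$1" > "$a"
    probe_fields "$2" > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}

# Usage (hypothetical file names):
# compare_streams gopro.mp4 out.mp4
```

Any line that diff prints is a candidate mismatch worth fixing on the ffmpeg-encoded side before trying the concat again.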
Related
When I tried to concat mp4 videos with the ffmpeg concat demuxer, the resulting video was out of sync starting from the end of the first video.
Although both videos use the same resolution, frame rate and 128k audio bit rate, that was not sufficient:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'WhatsApp_Video_2021-10-20_at_6.45.53_AM-128kb.mp4':
Duration: 00:00:24.92, start: 0.000000, bitrate: 4098 kb/s
Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, bt709), 1280x720 [SAR 1:1 DAR 16:9], 3970 kb/s, 30 fps, 30 tbr, 19200 tbn, 60 tbc (default)
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 129 kb/s (default)
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'TOUT_N_ETAIT_PAS_ENCORE_ACCOMPLI_A_LA_CROIX_DR_N_GUESSAN_MICHEL__136-140__BzQKTvNYwlg__youtube_com.mp4':
Duration: 00:51:43.23, start: 0.000000, bitrate: 1932 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, smpte170m/bt470bg/bt709), 1280x720 [SAR 1:1 DAR 16:9], 1796 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc (default)
Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 127 kb/s (default)
What are the common encoding parameters needed to losslessly concat mp4 videos with the ffmpeg concat demuxer?
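The two inputs above differ in H.264 profile (Main vs High) and in time base (19200 vs 15360 tbn), so one possible approach is to re-encode each input to a shared set of parameters and then join the results with stream copy. The sketch below assumes a POSIX shell and an ffmpeg build with libx264; the specific values (High profile, 30 fps, a 15360 track timescale, 44100 Hz stereo AAC at 128k) are illustrative assumptions taken from the probe output, not a verified minimal set.

```shell
# Re-encode one input to a shared set of parameters.
# All values here are illustrative assumptions.
normalize() {
    ffmpeg -y -i "$1" \
        -c:v libx264 -profile:v high -pix_fmt yuv420p -r 30 \
        -video_track_timescale 15360 \
        -c:a aac -ar 44100 -ac 2 -b:a 128k \
        "$2"
}

# Write a concat-demuxer list file: one "file '<path>'" line per input.
# Assumes the paths contain no single quotes.
write_concat_list() {
    list=$1
    shift
    : > "$list"
    for f in "$@"; do
        printf "file '%s'\n" "$f" >> "$list"
    done
}

# Join the listed files without re-encoding.
concat_copy() {
    ffmpeg -y -f concat -safe 0 -i "$1" -c copy "$2"
}

# Usage (hypothetical file names):
# normalize first.mp4 first_norm.mp4
# normalize second.mp4 second_norm.mp4
# write_concat_list list.txt first_norm.mp4 second_norm.mp4
# concat_copy list.txt joined.mp4
```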
I have a video with a resolution of 1920x1080 (16:9 aspect ratio). When played, it is padded with black bars on all sides. How can I remove the black bars to get a clean 1920x1080 video?
Below are the audio and video details:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'Maths Logic.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.19.102
Duration: 00:43:11.24, start: 0.000000, bitrate: 1475 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 1405 kb/s, 25 fps, 25 tbr, 90k tbn, 50 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 64 kb/s (default)
Metadata:
handler_name : SoundHandler
Use the cropdetect filter to get crop values:
ffmpeg -i input.mp4 -vf cropdetect -frames:v 3 -f null -
...
[Parsed_cropdetect_0 @ 0x559116cfe440] x1:240 x2:1679 y1:56 y2:1078 w:1440 h:1008 x:240 y:64 pts:2 t:2.000000 crop=1440:1008:240:64
Then use the crop filter to remove the black bars, the scale filter to upscale back to a height of 1080, and the pad filter to fill in the missing area and restore the 16:9 aspect ratio:
ffmpeg -i input.mp4 -vf "crop=1440:1008:240:64,scale=1920:1080:force_original_aspect_ratio=decrease,pad=1920:1080:-1:-1" -c:a copy output.mp4
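The crop value does not have to be copied by hand from the log. As a minimal sketch, assuming a POSIX shell with sed and tail, the last crop=W:H:X:Y suggestion can be pulled out of the cropdetect output like this (the function names are hypothetical):

```shell
# Extract the last crop=W:H:X:Y suggestion from cropdetect log text on stdin.
last_crop() {
    sed -n 's/.*\(crop=[0-9:]*\).*/\1/p' | tail -n 1
}

# Run cropdetect on a file and print the final suggestion.
# ffmpeg writes its log to stderr, hence the 2>&1.
# $1 = input file, $2 = number of frames to analyse (defaults to 3)
detect_crop() {
    ffmpeg -hide_banner -i "$1" -vf cropdetect -frames:v "${2:-3}" -f null - 2>&1 | last_crop
}

# Usage:
# crop=$(detect_crop input.mp4)
# ffmpeg -i input.mp4 -vf "$crop,scale=1920:1080:force_original_aspect_ratio=decrease,pad=1920:1080:-1:-1" -c:a copy output.mp4
```

Analysing more than 3 frames may give a more reliable value if the first frames are dark.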
I'm trying to encode an m2ts (MPEG-2 Transport Stream) file to mp4 (H.264) and measure the SSIM value.
I did get some SSIM values, but the results were contrary to my expectations.
Are any of my ffmpeg command options wrong?
Encoding and SSIM calculation commands
# encode
$ ffmpeg -hide_banner -fflags +discardcorrupt -i input.m2ts \
-c:v libx264 -crf <CRF> -preset:v medium \
-c:a copy -bsf:a aac_adtstoasc \
output_ff_crf-<CRF>.mp4
# calculate ssim
$ ffmpeg -hide_banner -i <A> -i <B> \
-lavfi "[0:v]settb=AVTB,setpts=PTS-STARTPTS[main];[1:v]settb=AVTB,setpts=PTS-STARTPTS[ref];[main][ref]ssim" \
-f null -
The results of the SSIM
(a) A=input.m2ts, B=input.m2ts, ssim=0.973266
(b) A=input.m2ts, B=output_ff_crf-0.mp4, ssim=0.813347
(c) A=input.m2ts, B=output_ff_crf-30.mp4, ssim=0.819897
(d) A=output_ff_crf-0.mp4, B=output_ff_crf-0.mp4, ssim=1.000000
(e) A=output_ff_crf-0.mp4, B=output_ff_crf-30.mp4, ssim=0.972911
(d)(e): These are what I expected.
(a): The files are the same, but ssim≠1.
(b)(c): SSIMs with CRF=0 and CRF=30 have almost the same value, although the image quality is different.
In the case of HandBrakeCLI
To determine if there was a problem with input.m2ts, I ran HandBrakeCLI with almost the same parameters as ffmpeg.
# encode
HandBrakeCLI --verbose --format av_mp4 --encoder x264 --quality <CRF> --x264-preset medium \
--aencoder copy \
--input input.m2ts --output output_hb_crf-<CRF>.mp4
# calculate ssim (same as ffmpeg)
$ ffmpeg -hide_banner -i <A> -i <B> \
-lavfi "[0:v]settb=AVTB,setpts=PTS-STARTPTS[main];[1:v]settb=AVTB,setpts=PTS-STARTPTS[ref];[main][ref]ssim" \
-f null -
(b') A=input.m2ts, B=output_hb_crf-0.mp4, ssim=0.999999
(c') A=input.m2ts, B=output_hb_crf-30.mp4, ssim=0.972886
(d') A=output_hb_crf-0.mp4, B=output_hb_crf-0.mp4, ssim=1.000000
(e') A=output_hb_crf-0.mp4, B=output_hb_crf-30.mp4, ssim=0.972886
It's all as I expected (although case (a) still does not give ssim=1.0).
Therefore, I don't see a problem with input.m2ts.
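One thing that could be checked at this point is whether the files being compared decode to the same number of frames. The ssim filter pairs up frames from its two inputs, so a different number or timing of decoded frames would make it compare unrelated pictures. This is only a diagnostic sketch, assuming a POSIX shell with ffprobe on the PATH; it is not a confirmed explanation of the results above, and the function names are hypothetical.

```shell
# Count the frames that actually decode from the first video stream.
# -count_frames decodes the whole stream, so this can be slow on long files.
count_frames() {
    ffprobe -v error -select_streams v:0 -count_frames \
        -show_entries stream=nb_read_frames -of csv=p=0 "$1"
}

# Print both counts and return 0 only if they are equal.
same_frame_count() {
    n1=$(count_frames "$1")
    n2=$(count_frames "$2")
    echo "$1: $n1 frames, $2: $n2 frames"
    [ "$n1" = "$n2" ]
}

# Usage:
# same_frame_count input.m2ts output_ff_crf-0.mp4
```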
Information about the video files and tools
Results of the ffprobe
input.m2ts
[mpeg2video @ 0x5655577c1680] Invalid frame dimensions 0x0.
Last message repeated 1 times
[mpegts @ 0x5655577bd080] start time for stream 2 is not set in estimate_timings_from_pts
[mpegts @ 0x5655577bd080] PES packet size mismatch
Input #0, mpegts, from 'input.m2ts':
Duration: 00:30:02.68, start: 39593.392600, bitrate: 19019 kb/s
Program 211
Stream #0:0[0x140]: Video: mpeg2video (Main) ([2][0][0][0] / 0x0002), yuv420p(tv, bt709, top first), 1920x1080 [SAR 1:1 DAR 16:9], 29.97 fps, 29.97 tbr, 90k tbn, 59.94 tbc
Stream #0:1[0x141]: Audio: aac (LC) ([15][0][0][0] / 0x000F), 48000 Hz, stereo, fltp, 255 kb/s
Stream #0:2[0x138]: Data: bin_data ([6][0][0][0] / 0x0006)
Unsupported codec with id 100359 for input stream 2
output_ff_crf-0.mp4
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'output_ff_crf-0.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf57.83.100
Duration: 00:30:02.67, start: 0.000000, bitrate: 109301 kb/s
Stream #0:0(und): Video: h264 (High 4:4:4 Predictive) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 109040 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 252 kb/s (default)
Metadata:
handler_name : SoundHandler
output_hb_crf-0.mp4
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'output_hb_crf-0.mp4':
Metadata:
major_brand : mp42
minor_version : 512
compatible_brands: isomiso2avc1mp41
creation_time : 2020-05-17T06:22:06.000000Z
encoder : HandBrake 1.1.0 2018042400
Duration: 00:30:02.22, start: 0.000000, bitrate: 109661 kb/s
Stream #0:0(und): Video: h264 (High 4:4:4 Predictive) (avc1 / 0x31637661), yuv420p(tv, bt709), 1920x1080 [SAR 1:1 DAR 16:9], 109405 kb/s, 29.97 fps, 29.97 tbr, 90k tbn, 180k tbc (default)
Metadata:
creation_time : 2020-05-17T06:22:06.000000Z
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 252 kb/s (default)
Metadata:
creation_time : 2020-05-17T06:22:06.000000Z
handler_name : Stereo
Tools
version
ffmpeg: 3.4.6-0ubuntu0.18.04.1
HandBrakeCLI: 1.1.0
ldd
$ ldd /usr/bin/ffmpeg
...
libx264.so.152 => /usr/lib/x86_64-linux-gnu/libx264.so.152 (0x00007efbf1f33000)
...
$ ldd /usr/bin/HandBrakeCLI
...
libx264.so.152 => /usr/lib/x86_64-linux-gnu/libx264.so.152 (0x00007efbfb38f000)
...
ffmpeg and HandBrakeCLI are using the same libx264.
I concatenated multiple videos using the following ffmpeg command:
ffmpeg -i video1.avi video2.avi -f concat -c copy -safe 0 -o concat.mov
Is there any chance that the original files could be split back out of concat.mov? Since this was a concat with stream copy, are there any markers in concat.mov that I could use?
Update: original video codecs:
Input #0, avi, from 'DVR___2017-08-10_09.17.56.AVI':
Metadata:
encoder : DVR ZIR32 SW: 1.1.001
Duration: 00:20:00.00, start: 0.000000, bitrate: 897 kb/s
Stream #0:0: Video: mjpeg (MJPG / 0x47504A4D), yuvj422p(pc, bt470bg/unknown/unknown), 640x360 [SAR 1:1 DAR 16:9], 703 kb/s, 5 fps, 5 tbr, 5 tbn, 5 tbc
Stream #0:1: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 12000 Hz, mono, s16, 192 kb/s
Concatenated file:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'DVR___2017-08-07_19.55.36.AVI.mov':
Metadata:
major_brand : qt
minor_version : 512
compatible_brands: qt
encoder : Lavf57.71.100
Duration: 00:52:58.80, start: 0.000000, bitrate: 875 kb/s
Stream #0:0(eng): Video: mjpeg (jpeg / 0x6765706A), yuvj422p(pc, bt470bg/unknown/unknown), 640x360 [SAR 1:1 DAR 16:9], 682 kb/s, 5 fps, 5 tbr, 10240 tbn, 10240 tbc (default)
Metadata:
handler_name : DataHandler
Stream #0:1(eng): Audio: pcm_s16le (sowt / 0x74776F73), 12000 Hz, mono, s16, 192 kb/s (default)
Metadata:
handler_name : DataHandler
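Nothing in the ffprobe output of the concatenated file above hints at per-file markers, so one fallback is to cut at known offsets. The sketch below assumes a POSIX shell with ffmpeg on the PATH and assumes that the duration of each original is known from some other source; the function name, the output file names and the 20-minute boundaries in the usage comment are hypothetical. Since MJPEG has no inter-frame dependencies, cutting with stream copy should be possible at any frame.

```shell
# Copy one time range out of a file without re-encoding.
# $1 = input, $2 = start (seconds or HH:MM:SS), $3 = duration, $4 = output
cut_copy() {
    ffmpeg -y -ss "$2" -i "$1" -t "$3" -c copy "$4"
}

# Usage, assuming (hypothetically) that the first two originals
# were exactly 20 minutes long each:
# cut_copy concat.mov 0 00:20:00 part1.mov
# cut_copy concat.mov 00:20:00 00:20:00 part2.mov
```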
When I run ffmpeg, I can see the "default" audio and video streams:
Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 960x400 [SAR 1:1 DAR 12:5], 3859 kb/s, 23.98 fps, 23.98 tbr, 24k tbn, 47.95 tbc (default)
Metadata:
creation_time : 2013-05-03 22:50:47
handler_name : GPAC ISO Video Handler
Stream #0:1(und): Audio: aac (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 92 kb/s (default)
Metadata:
creation_time : 1970-01-01 00:00:00
handler_name : SoundHandler
Stream #0:2: Video: mjpeg, yuvj420p(pc), 675x1000 [SAR 72:72 DAR 27:40], 90k tbr, 90k tbn, 90k tbc
As I understand it, these are the streams that ffmpeg selects as input when encoding if the -map option is not set.
How can I get the "default" streams using ffprobe?
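One way this could be done is to ask ffprobe for each stream's disposition and keep only the streams whose default flag is 1. This is a minimal sketch, assuming a POSIX shell with awk and assuming that the CSV writer prints one row per stream in the order index,codec_type,default; the function names are hypothetical.

```shell
# Keep only rows whose third CSV column (the default disposition flag) is 1.
# Input rows are assumed to look like: index,codec_type,default
filter_default() {
    awk -F, '$3 == 1 { print $1 "," $2 }'
}

# Print "index,codec_type" for every stream flagged as default.
list_default_streams() {
    ffprobe -v error \
        -show_entries stream=index,codec_type:stream_disposition=default \
        -of csv=p=0 "$1" | filter_default
}

# Usage (hypothetical file name):
# list_default_streams input.mp4
```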
Sorry for English