How to trim webm video while preserving transparency - ffmpeg

I want to trim a transparent webm video using ffmpeg. Here's the ffprobe result for that video:
Input #0, matroska,webm, from 'template.webm':
Metadata:
ENCODER : Lavf58.29.100
Duration: 00:00:05.24, start: -0.002000, bitrate: 2856 kb/s
Stream #0:0: Video: vp8, yuv420p(progressive), 1573x900, SAR 1:1 DAR 1573:900, 30 fps, 30 tbr, 1k tbn, 1k tbc (default)
Metadata:
ALPHA_MODE : 1
ENCODER : Lavc58.54.100 libvpx
DURATION : 00:00:05.240000000
Stream #0:1: Audio: opus, 48000 Hz, mono, fltp
Metadata:
ENCODER : Lavc58.54.100 libopus
DURATION : 00:00:05.241000000
I tried
ffmpeg -i template.webm -ss 1 -to 3 -c copy trimmed.webm
but the trimmed video doesn't start (or sometimes end) at the exact times given in the command, so I tried re-encoding the video using libvpx:
ffmpeg -i template.webm -ss 1 -to 3 -c:v libvpx -c:a copy -crf 30 -b:v 0 trimmed.webm
This solved the timing issue, but the output video loses its transparency. Here's the ffprobe output:
Input #0, matroska,webm, from 'trimmed.webm':
Metadata:
ENCODER : Lavf57.83.100
Duration: 00:00:02.00, start: -0.001000, bitrate: 1395 kb/s
Stream #0:0: Video: vp8, yuv420p(progressive), 1573x900, SAR 1:1 DAR 1573:900, 30 fps, 30 tbr, 1k tbn, 1k tbc (default)
Metadata:
ALPHA_MODE : 1
ENCODER : Lavc57.107.100 libvpx
DURATION : 00:00:02.000000000
Stream #0:1: Audio: opus, 48000 Hz, mono, fltp
Metadata:
ENCODER : Lavc58.54.100 libopus
DURATION : 00:00:02.001000000
How should I trim the video while preserving the transparency? Moreover, a fast solution will be extremely helpful.

The native, built-in FFmpeg VP8 decoder does not yet support alpha/transparency. Use libvpx to decode:
ffmpeg -c:v libvpx -i template.webm -ss 1 -to 3 -c:v libvpx -c:a copy -crf 30 -b:v 0 trimmed.webm
If you get the error "Transparency encoding with auto_alt_ref does not work", add the -auto-alt-ref 0 output option, or change the -c:v libvpx output option to -c:v libvpx-vp9.
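As a rough sketch, the command above can be wrapped in a small shell helper that already includes -auto-alt-ref 0. The names run and trim_alpha are hypothetical, and the helper defaults to a dry run that only prints the command, so nothing is encoded until DRY_RUN=0 is set:

```shell
#!/bin/sh
# Sketch only: "run" and "trim_alpha" are illustrative names, not ffmpeg features.
# DRY_RUN=1 (the default here) prints the command; DRY_RUN=0 executes it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# trim_alpha INPUT START END OUTPUT
trim_alpha() {
    # -c:v libvpx before -i selects the libvpx decoder, which reads the alpha plane.
    # -auto-alt-ref 0 avoids the "Transparency encoding with auto_alt_ref" error.
    run ffmpeg -c:v libvpx -i "$1" -ss "$2" -to "$3" \
        -c:v libvpx -auto-alt-ref 0 -crf 30 -b:v 0 \
        -c:a copy "$4"
}

trim_alpha template.webm 1 3 trimmed.webm
```

Running the script with DRY_RUN=0 performs the actual trim with the same options as the answer above.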

Related

FFMPEG - concatenating mp4s from different sources - unable to stop "Non-monotonous DTS in output stream" warning

I need to concatenate mp4 files from different sources, this means some of the variables are out of my control such as timebase, aspect ratio and encoding. So to get around this I re-encode and attempt to standardise the files before concatenating them. Unfortunately, despite this I get Non-monotonous DTS in output stream warnings during the concatenation stage, and the output video seems to always have broken audio/video syncing by the last segment.
I know there are a lot of other questions out there about resolving the warning above, and I've been through them all and reviewed the documentation, but unfortunately I've still been unable to solve it.
I think the thing which I don't understand is: if I have mp4s from different sources, what exactly do I need to do to ensure that the files will always neatly concatenate together?
What I've tried so far
The script I'm using to standardise the mp4 files before concatenation is the following (it amends resolution, frame rate, timebase, audio bitrate, video bitrate, audio encoding and video encoding):
ffmpeg -y -i $1 -vf 'scale=1280:720:force_original_aspect_ratio=1,pad=1280:720:(ow-iw)/2:(oh-ih)/2' -r 30 -video_track_timescale 90000 -b:a 128K -b:v 1200K -c:a aac -c:v libx264 $2
Here's the ffprobe output on two of the files, there are some differences but I'm not sure if they are significant?
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'intro.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.12.100
Duration: 00:00:08.98, start: 0.000000, bitrate: 1210 kb/s
Stream #0:0(eng): Video: h264 (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 1069 kb/s, 30 fps, 30 tbr, 90k tbn, 60 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(eng): Audio: aac (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 132 kb/s (default)
Metadata:
handler_name : SoundHandler
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'middle.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.12.100
Duration: 00:00:59.72, start: 0.000000, bitrate: 1200 kb/s
Stream #0:0(und): Video: h264 (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 1063 kb/s, 30 fps, 30 tbr, 90k tbn, 60 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 128 kb/s (default)
Metadata:
handler_name : SoundHandler
They all have normal video and audio at this point.
After that I concatenate them and add a watermark using the following (it sucks that I need to re-encode here):
ffmpeg -y \
-f concat \
-safe 0 \
-i $INFILES \
-c:v libx264 \
-c:a copy \
-preset fast \
-vf drawtext=enable="'between(t, $DRAW_TEXT_DELAY, $DRAW_TEXT_DURATION)': fontfile=$FONT_DIR/$FONT: text='$TEXT': fontcolor=$FONTCOLOR: fontsize=$FONTSIZE: $POSITION" \
$OUTFILE
INFILES is a path to a text file formatted like:
file /usr/src/app/data/test/out/intro.mp4
file /usr/src/app/data/test/out/middle.mp4
file /usr/src/app/data/test/out/outro.mp4
What am I missing here? Is there a way to debug this further?
Your audio streams have distinct sampling rates, and may have distinct channel count as well. Also, compressed MPEG audio streams will introduce slight async upon concat.
Use
ffmpeg -y -i $1 -vf 'scale=1280:720:force_original_aspect_ratio=1,pad=1280:720:(ow-iw)/2:(oh-ih)/2,setsar=1,format=yuv420p' -r 30 -c:v libx264 -b:v 1200K -ac 2 -ar 48000 -c:a pcm_s16le -video_track_timescale 90000 $2
to standardize, but save to MOV.
Then during concat, change -c:a copy to -c:a aac.
There are three methods to concatenate files in FFmpeg.
Demuxer (You are using this)
This method can be used to concat files with the same parameters, like codecs, size, PAR, etc.
$ ffmpeg -f concat -i files.txt [...] output.mp4
Protocol
Same as the first one, but on top of that, this method is useful for files that can be copied together bitwise. It doesn't involve re-encoding (some formats support this, like MPEG-TS or some lossless formats).
$ ffmpeg -i "concat:FILE_0| ... |FILE_N" [...] output.mp4
Filter
If you have videos with different codecs, you have to use this method:
$ ffmpeg -i <FILE_0> ... -i <FILE_N> [...] -filter_complex "[0:0][0:1]...[<N>:0][<N>:1] concat=n=<N+1>:v=1:a=1[v_out][a_out]" -map "[v_out]" -map "[a_out]" output.mp4
The concat filter decodes the inputs, and the joined streams are then re-encoded with a single set of parameters. It also takes care of the audio streams. Here n is the number of segments, so it is N+1 for FILE_0 through FILE_N. Note that the concat filter expects all segments to have the same resolution, so inputs with different sizes need to be scaled first, but this should be a good start.
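For the filter method, the filtergraph string grows with the number of inputs, so it can be easier to generate it. The sketch below assumes each input contributes its first video and first audio stream; build_concat_graph is a hypothetical helper name, and the labels v_out and a_out simply follow the example above:

```shell
#!/bin/sh
# build_concat_graph COUNT: print a concat filtergraph for COUNT inputs,
# taking the first video and first audio stream of each input.
build_concat_graph() {
    count=$1
    graph=""
    i=0
    while [ "$i" -lt "$count" ]; do
        graph="${graph}[${i}:v:0][${i}:a:0]"
        i=$((i + 1))
    done
    # n must equal the number of segments being joined.
    printf '%sconcat=n=%s:v=1:a=1[v_out][a_out]\n' "$graph" "$count"
}

build_concat_graph 3
# prints [0:v:0][0:a:0][1:v:0][1:a:0][2:v:0][2:a:0]concat=n=3:v=1:a=1[v_out][a_out]
```

It could then be used as: ffmpeg -i intro.mp4 -i middle.mp4 -i outro.mp4 -filter_complex "$(build_concat_graph 3)" -map "[v_out]" -map "[a_out]" output.mp4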

ffmpeg http slower than downloading first

When trying to get a clip of a video from a remote source
Input source:
ffprobe version 3.3.2-static http://johnvansickle.com/ffmpeg/ Copyright (c) 2007-2017 the FFmpeg developers
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'http://website.com/video.mp4':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: mp42mp41
creation_time : 2017-07-13T15:44:58.000000Z
Duration: 00:57:32.42, start: 0.000000, bitrate: 1939 kb/s
Stream #0:0(eng): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv), 720x480 [SAR 10:11 DAR 15:11], 1745 kb/s, 29.97 fps, 29.97 tbr, 29970 tbn, 59.94 tbc (default)
Metadata:
creation_time : 2017-07-13T15:44:58.000000Z
handler_name : Mainconcept MP4 Video Media Handler
encoder : AVC Coding
Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 189 kb/s (default)
Metadata:
creation_time : 2017-07-13T15:44:58.000000Z
handler_name : Mainconcept MP4 Sound Media Handler
Current command:
ffmpeg version 3.3.2-static
ffmpeg.linux -threads 2 -y -ss 3273 -i "http://website.com/video.mp4" -an -movflags +faststart -preset veryfast -codec copy /outputfolder/trimmed_video.mp4
This takes 5m35.102s to create a 45 MB, 2-minute file.
If I download the file using wget first, the download takes 28s and trimming the local copy with ffmpeg takes only 0.243s.
If I add -vn OR -an to the output portion of the command, it completes the trim in about 2.101s, meaning it's faster to download the two streams separately and merge them myself.
Can anyone explain this behaviour and why my first command takes so long when on a lot of other video files it's very fast?
Videos where the command completed fast had been optimized for streaming by YouTube.
User-uploaded media was not optimized.
Running ffmpeg with -movflags +faststart on the files, which moves the moov atom (the index) to the start of the file, fixed the issue server-side.
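To find out which files still need the faststart treatment, one rough heuristic is to check whether the string moov appears before mdat in the file. This is only a sketch: it assumes a grep that supports -a, -b and -o (GNU grep does), it can be fooled if those four bytes happen to occur inside other data, and first_offset and is_faststart are hypothetical helper names:

```shell
#!/bin/sh
# first_offset FILE TAG: byte offset of the first occurrence of TAG (empty if none).
first_offset() {
    LC_ALL=C grep -a -b -o "$2" "$1" | head -n 1 | cut -d: -f1
}

# is_faststart FILE: succeed if 'moov' is found before 'mdat'.
# Heuristic only; it does not parse the MP4 box structure.
is_faststart() {
    moov=$(first_offset "$1" moov)
    mdat=$(first_offset "$1" mdat)
    [ -n "$moov" ] && [ -n "$mdat" ] && [ "$moov" -lt "$mdat" ]
}
```

For example, is_faststart video.mp4 || echo "needs -movflags +faststart" would flag a file whose index comes after the media data.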

duration change after transcode ts

I have a problem with transcoding using ffmpeg.
I want to convert an m3u8 to mp4, so I transcode every ts file first and then concat them into an mp4, but I found that the resulting duration is longer than that of the source file.
The source file is:
http://oc7iy3eta.bkt.clouddn.com/src_20.ts
After transcoding, the test file is:
http://oc7iy3eta.bkt.clouddn.com/test_20.ts
I use the command below to change to 5 fps and a 400k bitrate:
sudo ffmpeg -analyzeduration 2147483647 -probesize 2147483647 -nostdin -y -v warning -i ./src_20.ts -threads 3 -movflags faststart -metadata:s:v rotate=0 -chunk_duration 520000 -video_track_timescale 25000 -pix_fmt yuv420p -copytb 1 -vcodec libx264 -b:v 400000 -minrate 400000 -maxrate 400000 -bufsize 500k -force_key_frames "expr:gte(t,n_forced*2)" -vsync 1 -r 5 -s 544*960 -acodec libfaac -async 1 ./test_20.ts
I use the ffprobe command to see the video info.
Source file info:
Duration: 00:00:01.26, start: 28.346989, bitrate: 921 kb/s
Program 1
Metadata:
service_name : Service01
service_provider: FFmpeg
Stream #0:0[0x100]: Audio: aac ([15][0][0][0] / 0x000F), 44100 Hz, stereo, fltp, 23 kb/s
Stream #0:1[0x101]: Video: h264 (High) ([27][0][0][0] / 0x001B), yuv420p, 544x960, 10.67 tbr, 90k tbn, 180k tbc
test file:
Input #0, mpegts, from 'test_20.ts':
Duration: 00:00:01.62, start: 1.576778, bitrate: 447 kb/s
Program 1
Metadata:
service_name : Service01
service_provider: FFmpeg
Stream #0:0[0x100]: Video: h264 (High) ([27][0][0][0] / 0x001B), yuv420p, 544x960, 5 fps, 5 tbr, 90k tbn, 10 tbc
Stream #0:1[0x101]: Audio: aac ([15][0][0][0] / 0x000F), 44100 Hz, stereo, fltp, 5 kb/s
Question
So we can see that the duration of the source file is 1.26s, but after transcoding, the test file is 1.62s.
Why? Can anybody help?
I suggest you save the m3u8 to a single TS and then transcode that to MP4.
ffmpeg -i in.m3u8 -c copy src.ts
Your current command is transcoding each TS to CFR at half the rate but your source timestamps have some jitter, so due to PTS quantization, there will be a mismatch. A single file transcode will minimize it.
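To get a feel for why per-segment transcoding drifts, here is a deliberately simplified model, not ffmpeg's actual frame duplication/drop logic: assume every output frame lasts exactly 1/fps seconds and a segment is padded up to a whole number of frames. The helper name quantized_duration is hypothetical, and the model does not try to reproduce the exact 1.62s figure, which also depends on audio and container timing:

```shell
#!/bin/sh
# quantized_duration SECONDS FPS: duration after rounding up to whole frames.
# Simplified model of CFR quantization; real muxing also involves audio timing.
quantized_duration() {
    awk -v d="$1" -v fps="$2" 'BEGIN {
        n = int(d * fps)          # whole frames that fit
        if (n < d * fps) n++      # pad the partial frame up (ceil)
        printf "%.2f\n", n / fps
    }'
}

quantized_duration 1.26 5
# prints 1.40
```

In this model each 1.26s segment gains up to one frame duration (0.2s at 5 fps), and that error is paid once per segment when segments are transcoded separately, but only once overall when a single joined file is transcoded.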

FFMPEG add text frames to the start of video

I have some videos either in mp4 or webm format, and I'd like to use ffmpeg to add 4 seconds to the start of each video to display some text in the center with no sound.
Some other requirements:
try to avoid re-encoding the video
need to maintain the quality (resolution, bitrate, etc)
(optional) to make the text fade in/out
I am new to ffmpeg and any help will be appreciated.
thanks in advance
Example ffprobe information for mp4 below:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf55.33.100
Duration: 00:00:03.84, start: 0.042667, bitrate: 1117 kb/s
Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720, 1021 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 140 kb/s (default)
Metadata:
handler_name : SoundHandler
Example webm
Input #0, matroska,webm, from 'input.webm':
Metadata:
encoder : Lavf55.33.100
Duration: 00:00:03.80, start: 0.000000, bitrate: 1060 kb/s
Stream #0:0(eng): Video: vp8, yuv420p, 1280x720, SAR 1:1 DAR 16:9, 30 fps, 30 tbr, 1k tbn, 1k tbc (default)
Stream #0:1(eng): Audio: vorbis, 48000 Hz, stereo, fltp (default)
You'll have to generate a 4 second video with dummy audio matching the parameters of the existing video, including timebase, and then use the concat demuxer with streamcopy.
For the sample files shown in Q:
Step 1 Generate text video
ffmpeg -f lavfi -r 30 -i color=black:1280x720 -f lavfi -i anullsrc -vf "drawtext=fontfile='/path/to/font.ttf':fontcolor=FFFFFF:fontsize=50:text='Your text':x='(main_w-text_w)/2':y='(main_h-text_h)/2',fade=t=in:st=0:d=1,fade=t=out:st=3:d=1" -c:v libx264 -b:v 1000k -pix_fmt yuv420p -video_track_timescale 15360 -c:a aac -ar 48000 -ac 2 -sample_fmt fltp -t 4 intro.mp4
For WebM, replace -c:v libx264 with -c:v libvpx, -c:a aac with -c:a libvorbis and intro.mp4 with intro.webm. You may remove the -video_track_timescale 15360 since, from what I've seen, WebMs tend to use a single timescale.
Step 2 Prepare concat file, say, list.txt
file 'intro.mp4'
file 'input.mp4'
Step 3 Concat
ffmpeg -f concat -i list.txt -c copy -fflags +genpts joined.mp4
The variables important here are video size 1280x720, frame rate -r 30, -pix_fmt yuv420p, sample rate -ar 48000, format -sample_fmt fltp, channel layout -ac 2 and of course, codecs.
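If there are many videos, the list file from Step 2 can be generated rather than typed. A minimal sketch, assuming the quoted form of the concat demuxer's file directive, where a single quote inside a name is written as '\''; write_concat_list is a hypothetical helper name:

```shell
#!/bin/sh
# write_concat_list FILE...: print one "file '...'" line per argument,
# escaping single quotes so names like it's.mp4 stay valid.
write_concat_list() {
    for f in "$@"; do
        escaped=$(printf '%s' "$f" | sed "s/'/'\\\\''/g")
        printf "file '%s'\n" "$escaped"
    done
}

write_concat_list intro.mp4 input.mp4
# prints:
# file 'intro.mp4'
# file 'input.mp4'
```

Redirecting the output, as in write_concat_list intro.mp4 input.mp4 > list.txt, produces the file used in Step 3.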
Short answer is that you cannot encode new data as mp4 or webm and insert it at the front of the video stream. Those formats simply do not work like that. Both of these encoding formats are lossy, so if you decode and encode them again then additional information will be lost/changed by the second encoding. You could do something else, but what you are trying to do will not work.

Demux audio (AMR_WB) and video(H264) from mp4 file using ffmpeg

I want to demux audio (AMR_WB) and video(H264) from an mp4 file. I need to write a program which does this using ffmpeg libraries.
In demuxing.c file which is there in FFMPEG examples i was able to get only the raw formats as the output.
Can i somehow modify that code to get H264 and AMR_WB in encoded format from the mp4 file?
Run ffmpeg twice, each time specifying that just one track be copied to the output.
The example below uses a different mp4, but it gives most of the idea; you will need to adapt it to the specific track types of the video/audio in your container.
MP4 example: demux the h264 and aac tracks to separate outputs (tout1, tout2)
What's in the input?
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'phoneCam_20120902_112701.mp4':
Metadata:
major_brand : isom
minor_version : 0
compatible_brands: isom3gp4
creation_time : 2012-09-02 18:27:14
Duration: 00:00:12.65, start: 0.000000, bitrate: 8011 kb/s
Stream #0:0(eng): Video: h264 (Baseline) (avc1 / 0x31637661), yuv420p, 1280x720, 7707 kb/s, SAR 65536:65536 DAR 16:9, 28.64 fps, 29.83 tbr, 90k tbn, 180k tbc
Metadata:
creation_time : 2012-09-02 18:27:14
handler_name : VideoHandle
Stream #0:1(eng): Audio: aac (mp4a / 0x6134706D), 48000 Hz, mono, s16, 96 kb/s
Pass 1: extract just the video
ffmpeg -i phoneCam_20120902_112701.mp4 -map 0:0 -c copy tout1.mp4
Pass 2: extract just the audio
ffmpeg -i phoneCam_20120902_112701.mp4 -map 0:1 -c aac -ar 48000 -ab 48000 -strict -2 tout2.3gp
In your program, just run ffmpeg from the CLI or call main() in ffmpeg.c
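For the H264 + AMR_WB case in the question, both tracks can probably be stream-copied in a single run with two outputs, without re-encoding. This is an untested sketch: it assumes your ffmpeg build can write AMR-WB into a .amr file and raw H.264 into a .h264 file, the file names are placeholders, and run and demux_copy are hypothetical helper names. It defaults to a dry run that only prints the command:

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) prints the command; DRY_RUN=0 executes it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# demux_copy INPUT VIDEO_OUT AUDIO_OUT: copy first video and first audio
# stream to separate files without re-encoding.
demux_copy() {
    # h264_mp4toannexb converts MP4-style H.264 to Annex B for a raw .h264 file;
    # recent ffmpeg versions may insert it automatically.
    run ffmpeg -i "$1" \
        -map 0:v:0 -c:v copy -bsf:v h264_mp4toannexb "$2" \
        -map 0:a:0 -c:a copy "$3"
}

demux_copy input.mp4 video.h264 audio.amr
```

Each -map and -c option applies to the output file that follows it, which is how one command can produce both files.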

Resources