Speed up hardcoding subtitles with ffmpeg vs HandBrake

I have a video that I need to hardcode subtitles into for legacy devices. I usually use ffmpeg for this and it works fine, but it's very slow.
This is the command I usually use:
ffmpeg -i test-cut.mp4 -vf "subtitles=test-cut.srt" -c:v libx264 -crf 24 -vsync passthrough -c:a copy test.ffmpeg.mp4
The other day I tried HandBrakeCLI and it was a lot faster on the same file.
I used this HandBrakeCLI command to convert and hardcode. I know it's not like for like, but you would expect ffmpeg to be faster, since it's not converting the audio and really only has to change the video at the frames that need subtitles.
HandBrakeCLI --preset "Very Fast 1080p30" --format av_mp4 -i test-cut.mp4 --srt-file test-cut.srt --srt-burn=1 -o test.handbrake.mp4
Is there a way to have ffmpeg imprint the subtitles only on the parts of the video that need them, instead of converting the whole video?
For example, if I have a minute-long video with one sentence spoken 30 seconds in, and that sentence/subtitle needs to be displayed for 5 seconds, could ffmpeg convert only that 5-second segment and just copy the rest of the video and audio to the new output?
The input video is 720p at 24 FPS. The speed difference is usually that HandBrake runs at about 2x real time and ffmpeg at about 0.7x. I know I'm using a 1080p profile for HandBrake. Both were tested on the same machine with the same video.
Versions
ffmpeg version 4.2.2
HandBrake 1.3.1
video info
original video
CONTAINER......: MPEG-4
SIZE...........: 13.3 MiB
RUNTIME........: 45 s 94 ms
VIDEO CODEC....: avc1, High@L3.1, 8 bits
RESOLUTION.....: 1280x720
BITRATE........: 2 402 kb/s
FRAMERATE......: 24.000 FPS
AUDIO..........: AAC, 2 channels, 66.2 kb/s
handbrake output
CONTAINER......: MPEG-4
SIZE...........: 2.81 MiB
RUNTIME........: 45 s 51 ms
VIDEO CODEC....: x264, avc1, Main@L4, 8 bits
RESOLUTION.....: 1118x692
BITRATE........: 355 kb/s
FRAMERATE......: 24.000 FPS
AUDIO..........: AAC, 2 channels, 160 kb/s
ffmpeg output
CONTAINER......: MPEG-4
SIZE...........: 3.90 MiB
RUNTIME........: 45 s 94 ms
VIDEO CODEC....: x264, avc1, High@L3.1, 8 bits
RESOLUTION.....: 1280x720
BITRATE........: 651 kb/s
FRAMERATE......: 23.976 (23976/1000) FPS
AUDIO..........: AAC, 2 channels, 66.2 kb/s
I would be grateful for any suggestions to make ffmpeg faster at hardcoding subtitles.

For HandBrake, you are using a Very Fast preset. You can also specify a preset with ffmpeg; by default it is set to medium, but you can change it with -preset veryfast.
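As a sketch of what that preset knob covers (the commented command mirrors the question's original command with only the preset added; test-cut.mp4 and test-cut.srt are the question's files and are assumed to exist):

```shell
#!/bin/sh
# x264 presets from fastest to slowest; "medium" is ffmpeg's default
# when no -preset is given. Faster presets encode quicker but
# compress less efficiently at the same CRF.
for p in ultrafast superfast veryfast faster fast medium slow slower veryslow placebo; do
  echo "$p"
done

# Hypothetical command mirroring the original, with a faster preset:
# ffmpeg -i test-cut.mp4 -vf "subtitles=test-cut.srt" \
#        -c:v libx264 -preset veryfast -crf 24 -c:a copy test.ffmpeg.mp4
```

Note that a faster preset typically produces a larger file at the same CRF; it does not change the output quality target, only how hard the encoder works to hit it.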

Related

Recording stream audio with ffmpeg for Cloud Speech-to-Text

Good evening,
I am trying to record audio with the following features:
codec: flac
sampling rate: 16000 Hz
I am testing with the following line of code:
ffmpeg -t 15 -i http://198.15.86.218:9436/stream -codec:a flac -b:a 16k example.flac
But when reviewing the output file, I get the following:
codec: flac
sampling rate: 44000 Hz
Could someone guide me on the correct use of the ffmpeg options?
-b:a is to set the bitrate*. For sampling rate, you have to use -ar.
Use
ffmpeg -t 15 -i http://198.15.86.218:9436/stream -codec:a flac -ar 16k example.flac
*for a lossless codec, bitrate setting is irrelevant.
"Free Lossless Audio Codec" Flac is lossless and hence output bit-rate cannot be controlled precisely. -b:a 16k is actually trying to set output bit-rate of audio to 16k bits per second.
While in your case you need it to be sampled at 16000 Hz. So the correct option would be to use -ar [audio rate]
ffmpeg -t 15 -i http://198.15.86.218:9436/stream -c:a flac -ar 16000 example.flac
If you want to influence the output bit-rate with the FLAC encoder, you can use the option -compression_level 0-15, with 5 being the default. More details on controlling other parameters of the FLAC ffmpeg encoder are in the FFmpeg documentation.
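A small sketch of that trade-off (the stream URL and -ar value are from the question; the stream is assumed reachable, so the ffmpeg commands are shown commented):

```shell
#!/bin/sh
# FLAC's -compression_level trades encode speed for file size;
# higher levels are slower but smaller. Default is 5, per the answer above.
default_level=5
echo "default compression_level: $default_level"

# Hypothetical commands (stream URL from the question, assumed reachable):
# ffmpeg -t 15 -i http://198.15.86.218:9436/stream -c:a flac -ar 16000 \
#        -compression_level 8 smaller_slower.flac
# ffmpeg -t 15 -i http://198.15.86.218:9436/stream -c:a flac -ar 16000 \
#        -compression_level 0 larger_faster.flac
```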

Why is the audio of my mp4 file going out of sync?

I am having a problem converting a wmv file to mp4. I am using x264.exe with this command to get the video stream:
x264 --output temporal.264 --fps 25 --preset slow --bitrate 2048 --vbv-maxrate 2048 --vbv-bufsize 9600 --min-keyint 48 --keyint 48 --scenecut 0 --no-scenecut --pass 1 --video-filter "resize:width=640,height=480" Original.wmv
Then I use ffmpeg.exe to extract the audio stream with this line:
ffmpeg -i .wmv -acodec libfdk_aac -b:a 32000 temporal.aac
Finally I use MP4Box to merge the streams with these lines:
MP4Box -add temporal.264 Final.mp4
MP4Box -add temporal.aac Final.mp4
The problem is that the Final.mp4 audio is out of sync. It starts fine but drifts further out of sync over time.
I run this command:
MP4Box -info 010004470063PE-10022017083824-2_MultiMedia--1.mp4
and I discovered that the computed durations of the two streams are different:
output of command
* Movie Info *
Timescale 600 - 2 tracks
Computed Duration 01:00:03.643 - Indicated Duration 01:00:03.643
Fragmented File: no
File suitable for progressive download (moov before mdat)
File Brand isom - version 1
Compatible brands: isom avc1
Created: GMT Wed Jun 27 16:31:44 2018
Modified: GMT Wed Jun 27 16:31:44 2018
File has root IOD (9 bytes)
Scene PL 0xff - Graphics PL 0xff - OD PL 0xff
Visual PL: AVC/H264 Profile (0x7f)
Audio PL: AAC Profile @ Level 2 (0x29)
No streams included in root OD
Track # 1 Info - TrackID 1 - TimeScale 25000
Media Duration 00:59:57.520 - Indicated Duration 00:59:57.520
Track has 1 edit lists: track duration is 00:59:57.320
Media Info: Language "Undetermined (und)" - Type "vide:avc1" - 89938 samples
Visual Track layout: x=0 y=0 width=640 height=480
MPEG-4 Config: Visual Stream - ObjectTypeIndication 0x21
AVC/H264 Video - Visual Size 640 x 480
AVC Info: 1 SPS - 1 PPS - Profile Main @ Level 3
NAL Unit length bits: 32
Chroma format YUV 4:2:0 - Luma bit depth 8 - chroma bit depth 8
SPS#1 hash: 41EE779BEF2AA71A7131EAFD3C77C7E3BC95FD8E
PPS#1 hash: 086E1D72A40A0E8CF35D102F34A9DF6CD44D6CEF
Self-synchronized
RFC6381 Codec Parameters: avc1.4D401E
Average GOP length: 250 samples
Track # 2 Info - TrackID 2 - TimeScale 44100
Media Duration 01:00:03.644 - Indicated Duration 01:00:03.644
Media Info: Language "Undetermined (und)" - Type "soun:mp4a" - 155196 samples
MPEG-4 Config: Audio Stream - ObjectTypeIndication 0x40
MPEG-4 Audio AAC LC - 2 Channel(s) - SampleRate 44100
Synchronized on stream 1
RFC6381 Codec Parameters: mp4a.40.2
All samples are sync
I am not sure why this is happening, because the original wmv is perfectly synchronized. Any help?
A raw .aac stream has no timestamps; if there are PTS gaps in the source audio, they will be lost.
You have two workarounds:
a) extract to a timed samples container
ffmpeg -i .wmv -acodec libfdk_aac -b:a 32000 -vn temporal.m4a
b) fill in the gaps and extract
ffmpeg -i .wmv -af aresample=async=1 -acodec libfdk_aac -b:a 32000 temporal.aac
Based on Gyan's response, I used ffmpeg to transcode the wmv file to mp4 directly; separating the audio and video streams was a bad idea. In the end I used the following command for transcoding:
ffmpeg -i <input>.wmv -c:v libx264 -preset slow -crf 23 -c:a aac -r 25 -b:a 48k -strict -2 -max_muxing_queue_size 4000 <output>.mp4

MPEG-DASH - Multiplexed Representations Issue

I'm trying to learn ffmpeg, MP4Box, and MPEG-DASH, but I'm running into an issue with the .mp4 I'm using. I'm using ffmpeg to demux the mp4 with this command:
ffmpeg -i test.mp4 -c:v copy -g 72 -an video.mp4 -c:a copy audio.mp4
Once the two files are created, I use MP4Box to segment the files for the dash player using this command:
MP4Box -dash 4000 -frag 1000 -rap -segment-name segment_ output.mp4
Which does create all the files I think I need. Then I point the player to the output_dash.mpd and nothing happens except a ton of messages in the console:
[8] EME detected on this user agent! (ProtectionModel_21Jan2015)
[11] Playback Initialized
[21] [dash.js 2.3.0] MediaPlayer has been initialized
[64] Parsing complete: ( xml2json: 3.42ms, objectiron: 2.61ms, total: 0.00603s)
[65] Manifest has been refreshed at Wed Apr 12 2017 12:16:52 GMT-0600 (MDT)[1492021012.196]
[72] MediaSource attached to element. Waiting on open...
[77] MediaSource is open!
[77] Duration successfully set to: 148.34
[78] Added 0 inline events
[78] No video data.
[79] No audio data.
[79] No text data.
[79] No fragmentedText data.
[79] No embeddedText data.
[80] Multiplexed representations are intentionally not supported, as they are not compliant with the DASH-AVC/264 guidelines
[81] No streams to play.
Here is the MP4Box -info on the video I'm using:
* Movie Info *
Timescale 1000 - Duration 00:02:28.336
Fragmented File no - 2 track(s)
File suitable for progressive download (moov before mdat)
File Brand mp42 - version 512
Created: GMT Wed Feb 6 06:28:16 2036
File has root IOD (9 bytes)
Scene PL 0xff - Graphics PL 0xff - OD PL 0xff
Visual PL: Not part of MPEG-4 Visual profiles (0xfe)
Audio PL: Not part of MPEG-4 audio profiles (0xfe)
No streams included in root OD
iTunes Info:
Name: Rogue One - A Star Wars Story
Artist: Lucasfilm
Genre: Trailer
Created: 2016
Encoder Software: HandBrake 0.10.2 2015060900
Cover Art: JPEG File
Track # 1 Info - TrackID 1 - TimeScale 90000 - Duration 00:02:28.335
Media Info: Language "Undetermined" - Type "vide:avc1" - 3552 samples
Visual Track layout: x=0 y=0 width=1920 height=816
MPEG-4 Config: Visual Stream - ObjectTypeIndication 0x21
AVC/H264 Video - Visual Size 1920 x 816
AVC Info: 1 SPS - 1 PPS - Profile High @ Level 4.1
NAL Unit length bits: 32
Pixel Aspect Ratio 1:1 - Indicated track size 1920 x 816
Self-synchronized
Track # 2 Info - TrackID 2 - TimeScale 44100 - Duration 00:02:28.305
Media Info: Language "English" - Type "soun:mp4a" - 6387 samples
MPEG-4 Config: Audio Stream - ObjectTypeIndication 0x40
MPEG-4 Audio AAC LC - 2 Channel(s) - SampleRate 44100
Synchronized on stream 1
Alternate Group ID 1
I know I need to separate the video and audio and I think that's where my issue is. The command I'm using probably isn't doing the right thing.
Is there a better command to demux my mp4?
Is the MP4Box command I'm using best for segmenting the files?
If I use different files, will they always need to be demuxed?
One thing to mention: if I use the following commands everything works fine, but there is no audio because of -an, which means it's video only:
ffmpeg -i test.mp4 -c:v copy -g 72 -an output.mp4
MP4Box -dash 4000 -frag 1000 -rap -segment-name segment_ output.mp4
UPDATE
I noticed that the video file had no audio stream, but the audio file still contained the video stream, which is why I got the multiplexing error. I thought that might be the issue, so I ran this command to keep the unwanted streams out of the outputs:
ffmpeg -i test.mp4 -c:v copy -g 72 -an video.mp4 -c:a copy -vn audio.mp4
then I run:
MP4Box -dash 4000 -frag 1000 -rap -segment-name segment_ video.mp4 audio.mp4
now I no longer get the Multiplexed representations are intentionally not supported... message, but now I get:
[122] Video Element Error: MEDIA_ERR_SRC_NOT_SUPPORTED
[123] [object MediaError]
[125] Schedule controller stopping for audio
[126] Caught pending play exception - continuing (NotSupportedError: Failed to load because no supported source was found.)
I tried playing the video and audio independently through Chrome and they both work, just not through the dash player. Ugh, this is painful to learn, but I feel like I'm making progress.
I ended up going with Bento4. I'm not sure why I couldn't get MP4Box working, but Bento4 worked very easily and had me up and running within a few hours.
You are not using the right profile. Judging from the logs, I suppose you're using the dash-if player. In that case, you need to use this command:
MP4Box -dash 4000 -frag 1000 -profile dashavc264:onDemand -rap -segment-name segment_ output.mp4
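Putting the question's update and this answer together, a plausible end-to-end sequence looks like this (untested sketch; test.mp4 is the question's file and the tools are assumed installed, so the invocations are shown commented):

```shell
#!/bin/sh
# 1) Demux, keeping each output to a single stream (-an strips audio
#    from the video file, -vn strips video from the audio file):
# ffmpeg -i test.mp4 -c:v copy -g 72 -an video.mp4 -c:a copy -vn audio.mp4

# 2) Segment both files with a DASH-AVC/264-compliant profile so the
#    dash-if player accepts the representations:
profile="dashavc264:onDemand"
echo "using profile: $profile"
# MP4Box -dash 4000 -frag 1000 -profile "$profile" -rap \
#        -segment-name segment_ video.mp4 audio.mp4
```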

ffmpeg not honoring sample rate in opus output

I am capturing a live audio stream to Opus, and no matter what I choose for the audio sample rate, I get 48khz output.
This is my command line
./ffmpeg -f alsa -ar 16000 -i sysdefault:CARD=CODEC \
  -f alsa -ar 16000 -i sysdefault:CARD=CODEC_1 \
  -filter_complex join=inputs=2:channel_layout=stereo:map=0.1-FR\|1.0-FL,asetpts=expr=N/SR/TB \
  -ar 16000 -ab 64k -c:a opus -vbr off -compression_level 5 output.ogg
And this is what ffmpeg responds with:
Output #0, ogg, to 'output.ogg':
  Metadata:
    encoder : Lavf57.48.100
  Stream #0:0: Audio: opus (libopus), 16000 Hz, stereo, s16, delay 104, padding 0, 64 kb/s (default)
    Metadata:
      encoder : Lavc57.54.100 libopus
However, it appears that ffmpeg has lied, because when analysing the file again, I get:
Input #0, ogg, from 'output.ogg':
  Duration: 00:00:03.21, start: 0.000000, bitrate: 89 kb/s
  Stream #0:0: Audio: opus, 48000 Hz, stereo, s16, delay 156, padding 0
    Metadata:
      ENCODER : Lavc57.54.100 libopus
I have tried so many permutations of sample rate, simplifying down to a single audio input etc etc - always with the same result.
Any ideas?
This question should be asked and answered on Super User, since it's about using software instead of programming. But, since I know the answer, I'll post one anyway.
FFmpeg will encode Opus at the sample rate specified. You can verify this in the source code of libopusenc.c.
But FFmpeg will decode Opus at 48 kHz, even if it was encoded at a lower sample rate. You can verify this in libopusdec.c.
This is actually recommended by the Ogg Opus specification (IETF RFC 7845). Section 5.1, item 5 says:
An Ogg Opus player SHOULD select the playback sample rate according to the following procedure:
If the hardware supports 48 kHz playback, decode at 48 kHz.
Otherwise, if the hardware's highest available sample rate is a supported rate, decode at this sample rate.
Otherwise, if the hardware's highest available sample rate is less than 48 kHz, decode at the next higher Opus supported rate above the highest available hardware rate and resample.
Otherwise, decode at 48 kHz and resample.
Since FFmpeg and most hardware support 48 kHz playback, 48 kHz is used for decoding Opus in FFmpeg. The original sample rate is stored in the OpusHead packet of the Ogg container, so you can retrieve it using a parser or different player if you wish, but FFmpeg ignores it and just decodes at 48 kHz.
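For reference, the Opus codec itself only accepts a handful of input sample rates, and the encode-time rate survives in the OpusHead header even though decoding happens at 48 kHz. A small sketch (opusinfo from opus-tools is one way to read that header; output.ogg is the question's file and is assumed to exist, so the inspection commands are commented):

```shell
#!/bin/sh
# Sample rates the Opus codec accepts for input; anything else is
# resampled before encoding. Decoding is always done at 48 kHz.
for rate in 8000 12000 16000 24000 48000; do
  echo "opus input rate: $rate"
done

# Hypothetical inspection commands:
# ffprobe output.ogg    # reports 48000 Hz (the decoder behaviour described above)
# opusinfo output.ogg   # shows the original, encode-time sample rate from OpusHead
```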

How to create m3u8 playlist and mpeg-ts chunks with constant duration by using FFMPEG?

I have an mp4 file (Big Buck Bunny):
Duration: 00:09:56.50
Bitrate: 2048 kb/s
Size: 1280x720
fps: 29.97
I've set constant keyframes every 2 seconds.
I want to prepare this video for HLS.
I use this to generate the m3u8 playlist and the ts chunks:
ffmpeg -i input.mp4 -hls_time 2 out.m3u8
But unfortunately I don't understand how it works.
I thought this command would generate 298 chunks of 2 seconds each, but it generated only 152 chunks with varying lengths (3-9 seconds).
The strangest thing is that it created an m3u8 file with only 5 links to files:
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:9
#EXT-X-MEDIA-SEQUENCE:148
#EXTINF:8.341667,
out148.ts
#EXTINF:7.841167,
out149.ts
#EXTINF:0.967633,
out150.ts
#EXTINF:8.341667,
out151.ts
#EXTINF:7.140467,
out152.ts
#EXT-X-ENDLIST
I thought the m3u8 file was supposed to include all parts of the video. Can somebody explain how to create 298 chunks of 2 seconds each and fill the m3u8 file properly?
To force a keyframe every 2 seconds you can specify the GOP size using -g:
ffmpeg -i input.mp4 -g 60 -hls_time 2 out.m3u8
Where 29.97 fps * 2 s ≈ 60 frames, meaning a keyframe every 60 frames.
Otherwise it will wait to split on a keyframe and the minimum duration will vary.
To keep all segments add -hls_list_size 0, otherwise it keeps just the default value of 5.
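The GOP arithmetic above can be sketched like this (rounding 29.97 fps, i.e. 30000/1001, times 2 seconds to the nearest whole frame; the commented ffmpeg line mirrors the answer's command with -hls_list_size added, and input.mp4 is assumed to exist):

```shell
#!/bin/sh
# GOP size for ~2-second keyframe spacing at 29.97 fps (30000/1001),
# rounded to the nearest whole frame using integer arithmetic.
fps_num=30000
fps_den=1001
seg_secs=2
gop=$(( (fps_num * seg_secs + fps_den / 2) / fps_den ))
echo "GOP size: $gop"   # 60

# Hypothetical command combining the GOP size with a full playlist:
# ffmpeg -i input.mp4 -g "$gop" -hls_time 2 -hls_list_size 0 out.m3u8
```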
