Use FFmpeg to combine dshow audio capture with a still image

On Windows I am trying to capture DirectShow audio and combine it with a still image. I have come up with the following command:
ffmpeg -f dshow -i audio="Microphone (Conexant SmartAudio HD)" -loop 1 -i black2.png -b:a 30k -ac 1 -acodec libfdk_aac -vcodec libx264 -b:v 60k -shortest test.mp4
This nearly works: a video with the image and audio is produced, but the video is created at a much faster rate than the captured audio. So if the audio is captured for 1 minute, a 5 minute video is produced instead of a 1 minute one. The audio plays for the first minute and there is no audio for the remaining 4 minutes; the image displays throughout the video.
Any help appreciated, Thank you.

Try
ffmpeg -f dshow -i audio="Microphone (Conexant SmartAudio HD)" -loop 1 -re -i black2.png -b:a 30k -ac 1 -acodec libfdk_aac -vcodec libx264 -b:v 60k -shortest test.mp4
The -re option limits the speed at which FFmpeg reads the looped image input to real time, so the video is not generated faster than the audio is captured.
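As a quick sanity check (a sketch, assuming the output name above), you can compare the container duration with the audio stream duration using ffprobe; after the fix the two should be roughly equal:
ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 test.mp4
ffprobe -v error -select_streams a:0 -show_entries stream=duration -of default=noprint_wrappers=1:nokey=1 test.mp4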

Related

ffmpeg combine images and audio into video and loop through images until end of audio

I have found this code to combine multiple images and one audio file into a video, but what I want is for the images to loop until the end of the audio. What I mean is: if I have 5 images and each image is shown for 5 seconds, then after 25 seconds the first image is shown again, then the second image, and so on until the end of the audio.
ffmpeg -r 0.2 -i Scan-130802-%04d.jpg -i "1.mp3" \
-vcodec libx264 -vf scale=1920:1080 \
-crf 25 -preset slow -acodec copy video.mp4
Another problem I have is that with the above code the images appear horizontally flipped for some reason.
It's more efficient to do this in two steps.
#1
ffmpeg -framerate 1/5 -i Scan-130802-%04d.jpg -vf "scale=1920:1080,setsar=1" -r 5 -c:v libx264 -crf 25 -preset slow scan-video.mp4
#2
ffmpeg -stream_loop -1 -i scan-video.mp4 -i "1.mp3" -codec copy -shortest -fflags +shortest -max_interleave_delta 200M video.mp4
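For reference, a commented sketch of the same two steps (same file names as above; the commands are unchanged, the comments just explain the flags):
# Step 1: build the slideshow. -framerate 1/5 shows each image for 5 seconds;
# -r 5 outputs a constant 5 fps stream, which players handle better than 0.2 fps.
ffmpeg -framerate 1/5 -i Scan-130802-%04d.jpg -vf "scale=1920:1080,setsar=1" -r 5 -c:v libx264 -crf 25 -preset slow scan-video.mp4
# Step 2: loop the slideshow indefinitely (-stream_loop -1), stream copy both streams,
# and stop at the end of the shorter stream, i.e. the audio (-shortest; -fflags +shortest
# and -max_interleave_delta help the muxer actually stop near the audio end when copying).
ffmpeg -stream_loop -1 -i scan-video.mp4 -i "1.mp3" -codec copy -shortest -fflags +shortest -max_interleave_delta 200M video.mp4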

FFMPEG: How to avoid audio/video desync in output of crossfaded clips when input is variable frame rate video

I'm doing screen recordings of gameplay (Dota 2) using my NVIDIA GeForce Experience hardware recording (NVENC encoder). This creates a variable frame rate output video. My NVIDIA settings are 60 fps at 15000 kbps. I have paid a guy to make a program that generates scripts which, given start/stop timepoints, extract clips from the video and merge them with a crossfade. See the example code below. The script works for many input recordings but often fails: the audio and video are desynchronized (usually the audio is delayed) in many of the clips, by about 0.5 seconds. It seems to fail more when the frame rate dropped more during recording. He does not know how to fix the problem, so I wonder if anyone could point out whether anything in the script (example below) could be fixed?
Processing speed is quite important (making a 10 min 'highlight' video currently takes about 7-10 min), so solutions that increase that time very much are unfortunately not of great interest. His approach has been to work on the audio and video separately and merge them at the end. He already has a program that generates ffmpeg commands for different scenarios (also adding overlays, music, intro/outro), so some easy fixes to his code would be preferable to a dramatic redesign of the logic. But if nothing else can fix the problem, a redesign of the logic is OK. Using tools other than ffmpeg is also OK, but they should be automatable (scripts/CLI) and not increase processing times too much.
Running the program "mediainfo" on the input video shows that framerate dropped quite low for this input video:
Frame rate mode: Variable
Frame rate : 60.000 FPS
Minimum frame rate: 3.059 FPS
Maximum frame rate: 63.739 FPS
Full report here: https://pastebin.com/TX061Wih
The input video can be downloaded from dropbox here (6 GB):
https://www.dropbox.com/s/ftwdgapazbi62pr/fullgame.mp4?dl=0
Here is an example of a script when asked to extract two clips from the input video at 9:57 (41 sec length) and 15:45 (28 sec length) and crossfade-merge them with a 0.5 s crossfade time. There might be some code remnants from options that are not used in this example (overlays, music, intro/outro). Using the input video above, this creates audio/video desync.
6 commands executed in sequence:
ffmpeg.exe -loglevel warning -ss 00:09:57 -i fullgame.mp4 -t 00:00:41 -filter_complex "[0:a]afade=t=out:st=40.5:d=0.5[a1]" -map "[a1]" -y out_temp_00.mp4.wav
ffmpeg.exe -loglevel warning -i fullgame.mp4 -ss 00:09:57 -t 00:00:41 -an -vcodec copy -f mpegts -avoid_negative_ts make_zero -y out_temp_00.mp4.ts
ffmpeg.exe -loglevel warning -ss 00:15:45 -i fullgame.mp4 -t 00:00:28 -filter_complex "[0:a]afade=t=in:st=0:d=0.5[a1]" -map "[a1]" -y out_temp_01.mp4.wav
ffmpeg.exe -loglevel warning -i fullgame.mp4 -ss 00:15:45 -t 00:00:28 -an -vcodec copy -f mpegts -avoid_negative_ts make_zero -y out_temp_01.mp4.ts
ffmpeg.exe -loglevel warning -i out_temp_00.mp4.wav -i out_temp_01.mp4.wav -y -filter_complex "[0:a]adelay=0|0[a0];[1:a]adelay=40500|40500[a1];[a0][a1]amix=inputs=2:dropout_transition=68.5,atrim=duration=68.5[outa0];[outa0]loudnorm[outa]" -map "[outa]" -ar 48000 -acodec aac -strict -2 fullgame_Output.mp4.aac
ffmpeg.exe -loglevel warning -i out_temp_00.mp4.ts -i out_temp_01.mp4.ts -y -i fullgame_Output.mp4.aac -filter_complex "[0:v]trim=start=0.5,setpts=PTS-STARTPTS[0c];[1:v]trim=start=0.5,setpts=PTS-STARTPTS[1c];[0:v]trim=40.5:41,setpts=PTS-STARTPTS[fo];[1:v]trim=0:0.5[fi];[fi]format=pix_fmts=yuva420p,fade=t=in:st=0:d=0.5:alpha=1[z];[fo]format=pix_fmts=yuva420p,fade=t=out:st=0:d=0.5:alpha=1[x];[z]fifo[w];[x]fifo[q];[q][w]overlay[r];[0c][r][1c]concat=n=3[outv]" -map "[outv]" -map 2:a -shortest -acodec copy -vcodec libx264 -preset ultrafast -b 15000k -aspect 1920:1080 fullgame_Output.mp4
P.S.
I already asked for help in an ffmpeg chat room. One guy said he knew what the problem was, but didn't know how to fix it(?):
[00:10] <kepstin> oh, wait, you're using -vcodec copy
[00:10] <kepstin> that explains everything.
[00:10] <kepstin> when you're using -vcodec copy, the start time (set with -ss) is rounded to the nearest keyframe
[00:10] <kepstin> it's not exact
[00:11] <kepstin> depending on the keyframe interval, this will result in possibly quite large shifts
[00:11] <kepstin> (also, your commands are applying audio filters on commands with -an, which is confusing/contradictory)
[00:12] <birdboy88> so the problem is that the audio temporary clips are not being extracted from the same excat timepoints?
[00:13] <kepstin> birdboy88: yeah, your audio is being re-encoded to wav so it's being cut sample-accurate, but the video's not being precisely cut.
[00:16] <birdboy88> kepstin: so I need to use slow seek (?) to extract video accurately? Or somehow extract audio only where there are video keyframes?
[00:17] <kepstin> birdboy88: i don't know how to extract audio starting at video keyframes with ffmpeg cli. You're already doing slow seek, which doesn't help (you should move the -ss option to before the -i option to speed it up)
[00:17] <kepstin> if you want accurate video cutting when saving to a file, you have to re-encode the video
[00:18] <kepstin> (doing this in a single ffmpeg command means you don't have to save to a file, so you can avoid the issue)
[00:18] * kepstin is off for a bit now
EDIT:
Everything is done with the latest ffmpeg version.
I was unable to get Gyan's code to work. It always loses some audio (the audio comes out as either 40.5 or 27.5 seconds, so only one clip's audio is used). This is the only version working for me (the changes were adelay=40500|40500 and amix=inputs=2[a0];[a0]loudnorm):
ffmpeg -i fullgame.mp4 -filter_complex "[0]split=2[vpre][vpost];
[0]asplit=2[apre][apost];
[vpre]trim=start='00:09:57':duration='00:00:41',setpts=PTS-STARTPTS[vpre-t];
[apre]atrim=start='00:09:57':duration='00:00:41',asetpts=PTS-STARTPTS,afade=t=out:st=40.5:d=0.5[apre-t];
[vpost]trim=start='00:15:45':duration='00:00:28',setpts=PTS-STARTPTS,format=yuva420p,fade=t=in:st=0:d=0.5:alpha=1,setpts=PTS+40.5/TB[vpost-t];
[apost]atrim=start='00:15:45':duration='00:00:28',asetpts=PTS-STARTPTS,afade=t=in:st=0:d=0.5,adelay=40500|40500[apost-t];
[vpre-t][vpost-t]overlay[v];
[apre-t][apost-t]amix=inputs=2[a0];[a0]loudnorm[a]" -map "[v]" -map "[a]" -y -c:v libx264 -preset ultrafast -b:v 15000k -aspect 1920:1080 -c:a aac fullgame_Output.mp4
Then I tried a similar setup but with 3 clips. On one machine I got the error "Error while filtering: Cannot allocate memory", and on my 16 GB machine the processing speed is 0.02x! Any way to avoid this? This is the code I tried:
ffmpeg -i fullgame.mp4 -filter_complex "[0]split=3[vpre][vpost][v3];
[0]asplit=3[apre][apost][a3];
[vpre]trim=start=357:duration=41,setpts=PTS-STARTPTS[vpre-t];
[apre]atrim=start=357:duration=41,asetpts=PTS-STARTPTS,afade=t=out:st=40.5:d=0.5[apre-t];
[vpost]trim=start=795:duration=28,setpts=PTS-STARTPTS,format=yuva420p,fade=t=in:st=0:d=0.5:alpha=1,fade=t=out:st=40.5:d=0.5:alpha=1,setpts=PTS+40.5/TB[vpost-t];
[apost]atrim=start=795:duration=28,asetpts=PTS-STARTPTS,afade=t=in:st=0:d=0.5,afade=t=out:st=27.5:d=0.5,adelay=40500|40500[apost-t];
[v3]trim=start=95:duration=30,setpts=PTS-STARTPTS,format=yuva420p,fade=t=in:st=0:d=0.5,setpts=PTS+41+28-0.5/TB[v3-t];
[a3]atrim=start=95:duration=30,asetpts=PTS-STARTPTS,afade=t=in:st=0:d=0.5,adelay=68500|68500[a3-t];
[vpre-t][vpost-t]overlay[v1];
[v1][v3-t]overlay[v];
[apre-t][apost-t][a3-t]amix=inputs=3[a0];
[a0]loudnorm[a]" -map "[v]" -map "[a]" -y -c:v libx264 -preset ultrafast -b:v 15000k -aspect 1920:1080 -c:a aac fullgame_Output.mp4
Just do it in one command.
Besides the keyframe seek issue, which is real, your present sequence has an error in the last command: you have [0:v]trim=start=0.5...[0c], which trims out the first 0.5 seconds and will cause a desync of its own. Since this is the first clip, it should be [0:v]trim=0:40.5.
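That is, the first chain in your last command would read something like (only the trim changes):
[0:v]trim=0:40.5,setpts=PTS-STARTPTS[0c];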
The full single command should be
ffmpeg -i fullgame.mp4 -filter_complex
"[0]split=2[vpre][vpost];[0]asplit=2[apre][apost];
[vpre]trim=start='00:09:57':duration='00:00:41',setpts=PTS-STARTPTS[vpre-t];
[apre]atrim=start='00:09:57':duration='00:00:41',asetpts=PTS-STARTPTS,afade=t=out:st=40.5:d=0.5[apre-t];
[vpost]trim=start='00:15:45':duration='00:00:28',setpts=PTS-STARTPTS,format=yuva420p,fade=t=in:st=0:d=0.5:alpha=1,setpts=PTS+40.5/TB[vpost-t];
[apost]atrim=start='00:15:45':duration='00:00:28',asetpts=PTS-STARTPTS,afade=t=in:st=0:d=0.5[apost-t];
[vpre-t][vpost-t]overlay[v];
[apre-t][apost-t]acrossfade=d=0.5,loudnorm,aresample=48000[a]"
-map "[v]" -map "[a]" -c:v libx264 -preset ultrafast -b:v 15000k -aspect 1920:1080 -c:a aac fullgame_Output.mp4
Your original sequence had -strict -2 for AAC audio encoding. That hasn't been needed since December 2015; if your ffmpeg throws an error without it, you have a very old version. Upgrade first.
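You can check which build you're running with:
ffmpeg -version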
I did not test the above with your file, as it would take too long to filter 16 min of Full HD 60 fps video, but I tested the faster command below and it works fine with the latest git build of ffmpeg:
ffmpeg -ss 00:09:57 -t 00:00:41 -i fullgame.mp4 -ss 00:15:45 -t 00:00:28 -i fullgame.mp4 -filter_complex
"[0]afade=t=out:st=40.5:d=0.5[apre-t];
[1]format=yuva420p,fade=t=in:st=0:d=0.5:alpha=1,setpts=PTS+40.5/TB[vpost-t];
[1]afade=t=in:st=0:d=0.5[apost-t];
[0][vpost-t]overlay[v];
[apre-t][apost-t]acrossfade=d=0.5,loudnorm,aresample=48000:ocl=stereo[a]"
-map "[v]" -map "[a]" -c:v libx264 -preset ultrafast -b:v 15000k -aspect 1920:1080 -c:a aac fullgame_Output.mp4

ffmpeg audio watermark at specific time

I'm looking for a way to add an audio watermark, at a specific time, to a video file (with existing audio). Something like:
ffmpeg -i mainAVfile.mov -i audioWM.wav -filter_complex "[0:a][1:a] amix=inputs=2:enable='between(t,9,10)' [aud]; [0:v][aud]" -c:v libx264 -vf "scale=1280:720:sws_dither=ed:flags=lanczos, setdar=16:9" -c:a libfdk_aac -ac 2 -ab 96k -ar 48000 -af "aformat=channel_layouts=stereo, aresample=async=1000" -threads 0 -y output.mp4
The above command gives me this error: Timeline ('enable' option) not supported with filter 'amix'. amerge didn't work either. I kind of get lost with the filter_complex syntax, specifically with the following conditions:
- On the main AV file, both the audio and video tracks are filtered.
- The watermark should sit between the 9th and 10th second (I have already generated a 1 second, 10k tone file).
- The watermark needs to survive the subsequent audio transcode.
Use
ffmpeg -i mainAVfile.mov -i audioWM.wav
-filter_complex
"[0:a]aformat=channel_layouts=stereo,aresample=async=1000[main];
[1:a]atrim=0:1,adelay=9000|9000[wm];[main][wm]amix=inputs=2"
-vf "scale=1280:720:sws_dither=ed:flags=lanczos,setdar=16:9" -c:v libx264
-c:a libfdk_aac -ac 2 -ar 48000 -b:a 96k
-threads 0 -y output.mp4
It's preferable to perform all filtering in a single filtergraph. But I've kept the video filter as-is.
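For reference, an untested sketch of the same command with the scaling folded into the filtergraph (same inputs and output name assumed):
ffmpeg -i mainAVfile.mov -i audioWM.wav
-filter_complex
"[0:v]scale=1280:720:sws_dither=ed:flags=lanczos,setdar=16:9[v];
[0:a]aformat=channel_layouts=stereo,aresample=async=1000[main];
[1:a]atrim=0:1,adelay=9000|9000[wm];[main][wm]amix=inputs=2[a]"
-map "[v]" -map "[a]" -c:v libx264
-c:a libfdk_aac -ac 2 -ar 48000 -b:a 96k
-threads 0 -y output.mp4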

ffmpeg rtmp stream taking 100% CPU

I am creating a small script to stream an image to an RTMP server, but the FFmpeg command is taking 100% CPU. Please have a look at my code.
ffmpeg -f lavfi -i anullsrc=channel_layout=stereo:sample_rate=44100 -loop 1 -i "Digital-Wallet-.jpg" -t 00:30:00 -r 1 -c:v libx264 -c:a aac -preset:v ultrafast -pix_fmt yuv420p -f flv "rtmp://rtmpserver"
Encoding is CPU intensive. Remove -r 1 and add -framerate 1, -re, and -shortest.
ffmpeg -f lavfi -i anullsrc -loop 1 -framerate 1 -re -i "Digital-Wallet-.jpg" -t 00:30:00 -c:v libx264 -c:a aac -preset:v ultrafast -pix_fmt yuv420p -shortest -f flv "rtmp://rtmpserver"
The default image demuxer frame rate is 25, so your command was unnecessarily converting 25 frames per second to 1 frame per second, which is inefficient. The above changes fix that.
-re slows the reading of the input down to its native frame rate. It is useful for real-time output and live streaming; otherwise ffmpeg will attempt to encode as fast as possible.
I added -shortest to end the output when the shortest stream ends (the image) because anullsrc was set to encode indefinitely.
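If you want to sanity-check the encode locally before pointing it at the RTMP server, the same command can write to a file instead; a sketch (the local file name is just an example, and -re is dropped with a shorter -t so the test finishes quickly):
ffmpeg -f lavfi -i anullsrc -loop 1 -framerate 1 -i "Digital-Wallet-.jpg" -t 00:00:30 -c:v libx264 -c:a aac -preset:v ultrafast -pix_fmt yuv420p -shortest -f flv local_test.flv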

Mix audio/video of different lengths with ffmpeg

I want to mix video from video.mp4 (1 minute duration) and audio from audio.mp3 (10 minute duration) into one output file with a duration of 1 minute. The audio from audio.mp3 should be from the 4 min - 5 min position. How can I do this with ffmpeg?
If video.mp4 has no audio
You can use this command:
ffmpeg -i video.mp4 -ss 00:04:00 -i audio.mp3 -c copy -shortest output.mkv
The audio will be from the 4 minute position (-ss 00:04:00) as requested in the question.
This example will stream copy (re-mux) the video and audio; no re-encoding will happen.
If video.mp4 has audio
You will have to add the -map option as described here: FFmpeg mux video and audio (from another video) - mapping issue.
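For example, a sketch that takes the video from video.mp4 and the audio from audio.mp3 (assuming you want to drop the video's own audio):
ffmpeg -i video.mp4 -ss 00:04:00 -i audio.mp3 -map 0:v:0 -map 1:a:0 -c copy -shortest output.mkv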
If the audio is shorter than the video
Add the apad filter to add silent padding:
ffmpeg -i video.mp4 -ss 00:04:00 -i audio.mp3 -c:v copy -af apad -shortest output.mkv
Note that filtering requires re-encoding, so the audio will be re-encoded in this example.
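If you prefer to choose the audio codec and bitrate for that re-encode explicitly, a sketch (codec and bitrate are just examples):
ffmpeg -i video.mp4 -ss 00:04:00 -i audio.mp3 -c:v copy -af apad -c:a aac -b:a 192k -shortest output.mkv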
