I have a video file with two audio streams, representing two people talking at different times. The two people never talk at the same time, so there is no danger of clipping by summing the audio. I would like to sum the audio into one stream without reducing the volume. The ffmpeg amix filter has an option that would seem to do what I want, but the option does not seem to work. Here are two minimal non-working examples (the audio tracks are [0:2] and [0:3]):
ffmpeg -i input.mkv -map 0:0 -c:v copy \
-filter_complex '[0:2][0:3]amix' \
output.m4v
ffmpeg -i input.mkv -map 0:0 -c:v copy \
-filter_complex '[0:2][0:3]amix=sum=sum' \
output.m4v
The first example diminishes the audio volume. The second example produces a syntax error. I tried other variants like amix=sum and amix=sum=1, but despite the documentation I don't think the sum option exists any more. ffmpeg -h filter=amix does not mention the sum option (ffmpeg version n4.3.1).
My questions:
Can I sum two audio tracks with ffmpeg without losing resolution? (I'd rather not cut the volume in half and scale it up, but if there's no other way I guess I'd accept an answer that sacrifices a bit.)
Is there an easy way to adjust the relative delay of one of the tracks by a few milliseconds?
The sum option was added on 2021-02-04, so it's not in any release yet. You'll have to use a current git build.
To add a delay, use the adelay filter.
Suppose you wanted to delay the first audio by 50ms:
-filter_complex '[0:2]adelay=50:all=1[a1];[a1][0:3]amix=sum=sum'
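Putting this together with the mapping from the question, a full invocation might look like the sketch below (this assumes a git build where amix accepts the sum option, and the same stream layout as above):
ffmpeg -i input.mkv -map 0:0 -c:v copy \
-filter_complex '[0:2]adelay=50:all=1[a1];[a1][0:3]amix=sum=sum' \
output.m4v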
ffmpeg is being run with the same arguments but different inputs in two separate places in my code. The framerate is set to 12fps with -framerate. In one case the output video matches the framerate pretty well every time. In the other case the video is coming back with 1 second of video for every 15 input frames, but again it is consistent. What could possibly be causing the discrepancy?
Arguments:
-y -framerate 12 -itsoffset -654ms -i "C:\path/%06d.JPG" -i "C:\path/audio.mp3" -map 0:v:0 -map 1:a:0 -vf scale=1280:720:force_original_aspect_ratio=decrease,pad=1280:720:-1:-1:color=black -acodec aac -vcodec libx264 -ar 48000 -pix_fmt yuv420p -movflags +faststart "C:\path\output.mp4"
There are dozens of differences in the code that calls it but I can't understand how any of that would influence the framerate. I tried switching out the mp3 with one of an arbitrary different length to see if that had an effect but it did not. That leaves only the image inputs.
edit: So it gets stranger. I reproduced both cases and modified the framerate value for each to check what happened. In almost every case I got more seconds of video than expected. I was doing these tests with 400 frames and 5 seconds of audio to ensure the lack of -shortest wasn't a factor. The only time I got a perfectly accurate framerate was when it was set to 1. In every other case (except the verbatim 12fps in the case that was always working) the video was too short by 5-15%. This makes it a bit of a mystery how either part of my code ever produced good results.
Well, it turns out that simply replacing -framerate with -r fixes the issue. I would still appreciate some clarification about this if anyone can explain it, though.
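For reference, the working variant is the same argument list with just that one substitution (the paths are the same placeholders as above):
-y -r 12 -itsoffset -654ms -i "C:\path/%06d.JPG" -i "C:\path/audio.mp3" -map 0:v:0 -map 1:a:0 -vf scale=1280:720:force_original_aspect_ratio=decrease,pad=1280:720:-1:-1:color=black -acodec aac -vcodec libx264 -ar 48000 -pix_fmt yuv420p -movflags +faststart "C:\path\output.mp4"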
I have 30 mkv files which have multiple audio streams and multiple subtitles.
For each file I am trying to: extract the Dutch audio and subtitles from that file (25 fps), and merge them with another mkv file (23.976216 fps).
With this command it seems I can extract the Dutch audio and subtitles into an mkv:
ffmpeg -y -r 23.976216 -i "S01E01 - Example.mkv" -c copy -map 0:m:language:dut S01E01.mkv
But it does not adjust the fps from 25 to 23.976216.
I think I am going to use mkvmerge to merge the two mkvs, but they need to be the same framerate.
Anyone knows how I could make this work? Thanks! :)
The frame rate of the video has nothing to do with the frame rate of the audio. They are totally independent. In fact there is really no such thing as an audio frame rate (well, there is, but that's a byproduct of the codecs). If you are changing the video frame rate by dropping frames, you are not changing the video's duration, hence you should not change the audio's duration. If you are slowing down the video, you must decode the audio, slow it down (likely with pitch correction) and re-encode it.
Something like this would change the audio speed from the standard PAL rate to the NTSC film rate (the example is valid if your audio track is the 2nd in the list; check with ffmpeg -i video.mkv to confirm):
ffmpeg -i video.mkv -vn -map 0:1 -filter:a atempo=0.95904 -y slowed-down-audio-to-23.976-fps.ac3
(23976/25000 = 0.95904, so this is the tempo factor needed for NTSC films)
Conversely, you can figure out how to speed up NTSC standard frame rate audio to the PAL system (1.0427094).
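A sketch of that reverse direction, using the same placeholder stream index and an analogous output name:
ffmpeg -i video.mkv -vn -map 0:1 -filter:a atempo=1.0427094 -y sped-up-audio-to-25-fps.ac3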
This trick works, for example, should you want to add a better quality audio track obtained from a different source.
First, I have looked at the older questions asking the same, but the responses do not work.
Adding -r 30 or 60 to the input file does not impact the output, nor does setting it for the output, which remains unchanged.
I am handling a very large number of files from 1 to 22 gigs, recorded (with audio) at 30fps, that need to be re-posted as 60fps, with the corresponding speed increase.
I toyed with ffmpeg a bit and came up with this:
-filter_complex "[0:v]setpts=0.50*PTS[v];[0:a]atempo=2.0[a]" -map "[v]" -map "[a]" -vcodec:v libx264
It works fine, but to have to wait out a complete re-encoding of the video and audio to produce the same video with the fps changed seems like an insane waste of time.
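(For reference, that fragment as a complete invocation would look something like the following; the input and output names are placeholders:)
ffmpeg -i input.mp4 -filter_complex "[0:v]setpts=0.50*PTS[v];[0:a]atempo=2.0[a]" -map "[v]" -map "[a]" -vcodec:v libx264 output.mp4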
Am I missing something simple? Is there not a way to -c copy with a new fps playback rate on the resulting file?
(if it still has to recode the audio to maintain sync that's fine, audio is quick enough it doesn't much matter)
I have been trying to use ffmpeg to create a waveform image from an opus file. So far I have found three different methods but cannot seem to determine which one is the best.
The end result is hopefully to have a sound-wave that is only approx. 55px in height. The image will become part of a css background-image.
Adapted from Generating a waveform using ffmpeg:
ffmpeg -i file.opus -filter_complex \
"showwavespic,colorbalance=bs=0.5:gm=0.3:bh=-0.5,drawbox=x=(iw-w)/2:y=(ih-h)/2:w=iw:h=1:color=black#0.5" \
file.png
which produces this image:
Next, I found this one (and my favorite because of the simplicity):
ffmpeg -i test.opus -lavfi showwavespic=split_channels=1:s=1024x800 test.png
And here is what that one looks like:
Finally, this one from FFmpeg Wiki: Waveform, but it seems less efficient using a second utility (gnuplot) rather than just ffmpeg:
ffmpeg -i file.opus -ac 1 -filter:a aresample=4000 -map 0:a -c:a pcm_s16le -f data - | \
gnuplot -e "set terminal png size 525,050; set output 'file.png'; \
unset key; unset tics; unset border; \
set lmargin 0; set rmargin 0; set tmargin 0; set bmargin 0; plot '
Option two is my favorite, but I don't like the margins on the top and bottom of the waveforms.
Option three (using gnuplot) makes the best 'shaped' image for our needs, since the initial spike in sound seems to make the rest almost too small to use (lines tend to almost disappear) when the image is sized at only 50 pixels high.
Any suggestions on how I might best approach this? I really understand very little about any of the options I see, except of course for the size. Note too that I have tens of thousands of these to process, so naturally I want to make a wise choice at the very beginning.
Original and manipulated waveforms.
You can use the compand filter to adjust the dynamic range. drawbox is then used to make the horizontal line.
ffmpeg -i test.opus -filter_complex \
"compand=gain=-6,showwavespic=s=525x50, \
drawbox=x=(iw-w)/2:y=(ih-h)/2:w=iw:h=1:color=white" \
-vframes 1 output.png
It won't be quite as accurate a representation of your audio as the original waveform, but it may be an improvement visually, especially on such a wide scale.
Also see FFmpeg Wiki: Waveform.
I'm trying to play some videos (webm mostly) on some very low-performance hardware. The hardware can barely handle Full HD output.
Since the devices in question are online via a 3G modem only, the video size carries some weight as well. However, right now the playback performance is definitely the more important part.
So, here's the question: Are there any options for avconv to improve playback performance? Or should I simply use another codec instead?
Right now, the command used is something like the following:
avconv \
-i $input_file \
-y \
-vf scale=$scale \
-an \
$output_file
You would want to use ffmpeg instead of avconv (ffmpeg is more active and reliable, in my opinion):
Compile ffmpeg with libvpx support (WebM): guide
I would suggest you use CBR encoding
Set --profile to 3: guide; read about some more options if you want
Generally you would want to lower the frame resolution and frames per second as much as is acceptable for your project requirements, and throw the appropriate bitrate at it.
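A rough sketch combining those suggestions, assuming an ffmpeg built with libvpx; the bitrate, resolution, and frame rate values are placeholders to tune for your hardware (setting minrate, maxrate, and b:v to the same value is one common way to approximate CBR with libvpx):
ffmpeg -i $input_file \
-c:v libvpx -profile:v 3 \
-b:v 500k -minrate 500k -maxrate 500k \
-vf scale=640:-2 -r 24 \
-an \
$output_file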
There is an approach that you can try, which is to shrink the video to half its height while leaving the width alone:
$ avconv -i 01.webm -vf 'scale=w=iw:h=ih/2' -c:v libtheora -c:a copy 01.ogv
For me this produced a file 84% the size of
$ avconv -i 01.webm -c:v libtheora -c:a copy 01.ogv
This way is much better than scaling in width, because it does not damage text that may appear on the screen quite as much (the human brain can, for whatever reason, deal with vertical distortion more easily than with horizontal distortion).
You can also apply the denoise filter hqdn3d, which will make the file size smaller but will not noticeably damage the quality of the video.
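For instance, it can simply be appended to the filter chain from the command above (same placeholder file names):
$ avconv -i 01.webm -vf 'scale=w=iw:h=ih/2,hqdn3d' -c:v libtheora -c:a copy 01.ogv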
The load on the processor of the playing machine can sometimes be hard to predict from one video to the next, but there is a difference between codecs. I've not compared them much, so I can't offer real assistance there.