I have a video file with two audio streams, representing two people talking at different times. The two people never talk at the same time, so there is no danger of clipping by summing the audio. I would like to sum the audio into one stream without reducing the volume. The ffmpeg amix filter has an option that would seem to do what I want, but the option does not seem to work. Here are two minimal non-working examples (the audio tracks are [0:2] and [0:3]):
ffmpeg -i input.mkv -map 0:0 -c:v copy \
-filter_complex '[0:2][0:3]amix' \
output.m4v
ffmpeg -i input.mkv -map 0:0 -c:v copy \
-filter_complex '[0:2][0:3]amix=sum=sum' \
output.m4v
The first example diminishes the audio volume. The second example is a syntax error. I tried other variants like amix=sum and amix=sum=1, but despite the documentation I don't think the sum option exists any more. ffmpeg -h filter=amix does not mention the sum option (ffmpeg version n4.3.1).
My questions:
Can I sum two audio tracks with ffmpeg without losing resolution? (I'd rather not cut the volume in half and scale it back up, but if there's no other way I guess I'd accept an answer that sacrifices a bit.)
Is there an easy way to adjust the relative delay of one of the tracks by a few milliseconds?
The sum option was added on 2021-02-04, so it's not in any release yet. You'll have to use a current git build.
To add a delay, use the adelay filter.
Suppose you wanted to delay the first audio track by 50 ms:
-filter_complex '[0:2]adelay=50:all=1[a1];[a1][0:3]amix=sum=sum'
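Put together, a full invocation might look like the sketch below, based on the stream layout from the question (video in 0:0, the two speech tracks in 0:2 and 0:3); the filenames are placeholders, and this needs a git build of ffmpeg for sum=sum:

```shell
# Sketch: copy the video, delay track 0:2 by 50 ms, then sum it with 0:3.
# Stream indices and filenames are taken from the question; adjust as needed.
ffmpeg -i input.mkv -map 0:0 -c:v copy \
  -filter_complex '[0:2]adelay=50:all=1[a1];[a1][0:3]amix=sum=sum[aout]' \
  -map '[aout]' output.m4v
```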
Currently I am using this command to convert
ffmpeg -i <srcfile> -vcodec libx264 -profile:v main -level 3.1 -preset slower -crf 18 -x264-params ref=4 -acodec copy -movflags +faststart <outfile>
to convert some dashcam footage for viewing on an iOS device.
The above command took about 30 minutes to complete on a 2017 MacBook Pro with 16 GB of RAM.
I want to speed it up. One thing I tried was harnessing the GPU in the computer, so I added the flag -c:v h264_videotoolbox.
It sped things up by a lot: the conversion completes in about 1 minute.
However, when I inspected the output, the GPU version suffers from banding and blurriness.
Here is a screenshot. CPU version on the left and GPU version on the right.
To highlight the difference, here are the same parts of both videos:
Trees in reflections
corrugated iron sheet wall
Is there any switch that I can manipulate to make the GPU version clearer?
This is a simplistic H.264 encoder compared to x264, so you're not going to get the same quality per bitrate. h264_videotoolbox is optimized for speed and does not support -crf.
You can view some options specific to this encoder with ffmpeg -h encoder=h264_videotoolbox, but as they are probably already set to "auto" (I didn't confirm via source code and I don't have the hardware to try it) these additional options may not make much of a difference.
So you'll just have to increase the bitrate, such as with -b:v 8000k.
Or continue to use libx264 with a faster -preset.
I see the question was answered nearly two years ago; jumping in for others who might stumble on this thread. I get great results with VideoToolbox as the encoder, using either the GPU or software to accelerate, depending on which machine I am using.
As already mentioned, setting a constant bitrate, and adjusting it upward is key to producing a result that is nearly indistinguishable from a large source file. A constant bitrate is as effective as two-pass encoding for high-quality output, when paired with other key parameters, and is much quicker than two-pass.
It may seem counter-intuitive, but a computer running on all threads, full throttle, to encode a video won't give you the best results. Several researchers have demonstrated that quality actually goes down if all CPU threads are engaged in encoding; it is better to use fewer threads, and even to throttle ffmpeg with a third-party app (encoding does not slow down significantly, in my experience). So limit threads on newer multithread desktops and laptops.
Common practice for target bitrates (seen on Netflix, Amazon) varies with resolution, naturally: at least 5,000 kbps for 1080p; 3,500 for 720p. For a noticeable improvement in video quality, the encoder bitrate should be set to at least 1.5 times those common-practice bitrates: i.e., 7,500 for 1080p, 5,250 for 720p. Similarly for 4K GoPros or dash cams.
Often I work with large movie files from my Blu-ray library and create slimmed-down versions that are 1/3 to 1/2 the size of the original (a 20 GB original gives way to a file of 8-10 GB with no perceptible loss of quality). Also: framerate. Maintaining the same framerate from source to slimmed-down file is essential, so that parameter is important. Framerate is either 24fps, 25fps, or 30fps for theatrical film, European TV, and North American TV, respectively. (Except that in transferring film to a TV screen, 24fps becomes 23.976fps, in most cases.) Of course 60fps is common for GoPro-like cameras, but here 30fps would be a reasonable choice.
It is this control of framerate and bitrate that keeps ffmpeg in check and gives you predictable, repeatable results, not an errant, gigantic file that is larger than the one you started with.
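That predictability is easy to sanity-check: expected output size is just bitrate times duration. As an illustration, take a 2-hour film at the 8500k video bitrate used in the command below, with the audio track assumed to run around 640 kbit/s (a typical AC-3 track; both figures are only examples):

```shell
# Estimated output size = (video + audio bitrate in kbit/s) * seconds / 8,
# converted to GB. Two hours at 8500k video + ~640k audio lands in the
# 8-10 GB range mentioned above.
awk 'BEGIN {
  video_kbps = 8500; audio_kbps = 640; seconds = 2 * 3600
  gb = (video_kbps + audio_kbps) * seconds / 8 / 1000 / 1000
  printf "estimated size: %.1f GB\n", gb
}'
```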
I work on a Mac, so there may be slight differences on the command line, and here I use VideoToolbox as software encoder, but a typical command reads:
ffmpeg -loglevel error -stats -i source.video -map 0:0 -filter:v fps\=24000/1001 -c:v h264_videotoolbox -b:v 8500k -profile 3 -level 41 -coder cabac -threads 4 -allow_sw:v 1 -map 0:1 -c:a:0 copy -disposition:a:0 default -map 0:6 -c:s:0 copy -disposition:s:0 0 -metadata:g title\="If you want the file title in the metadata, it goes here" -default_mode passthrough 'outfile.mkv'
-loglevel error (to troubleshoot errors)
-stats (provides progress status in the terminal window)
-i infile (source video to transcode)
-map 0:0 (specify each stream in the original to map to output)
-filter:v fps\=24000/1001 (framerate of 23.976, like source)
-c:v h264_videotoolbox (encoder)
-b:v (set bitrate, here I chose 8500k)
-profile 3 -level 41 (h264 profile high, level 4.1)
-coder cabac (cabac coder chosen)
-threads 4 (limit of 4 cpu threads, of 8 on this laptop)
-allow_sw:v 1 (using VideoToolbox software encoding for acceleration; GPU is not enabled)
-map 0:1 -c:a:0 copy -disposition:a:0 default (copies audio stream over, unchanged, as default audio)
-map 0:6 -c:s:0 copy -disposition:s:0 0 (copies subtitle stream over, not as default, i.e., will not play subtitles automatically)
-metadata:g (global metadata, you can reflect filename in metadata)
-default_mode passthrough (allow audio w/o further processing)
outfile (NOTE: no dash precedes the filename/path. I chose the mkv format to hold my multiple streams; mp4 or other formats work just fine, as long as the contents are appropriate for the format.)
In addition to llogan's answer, I'd recommend setting the 'realtime' property to zero (this can increase quality in scenes with motion).
As llogan says, the bitrate option is a good parameter in this situation.
ffmpeg -i input.mov -c:v h264_videotoolbox -b:v {bitrate} -c:a aac output.mp4
If you want to set a 1000 kb/s bitrate, the command looks like this:
ffmpeg -i input.mov -c:v h264_videotoolbox -b:v 1000k -c:a aac output.mp4
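Combining both suggestions, the realtime property can be switched off in the same command. This is a sketch; how much it helps will depend on your footage:

```shell
# Same command with the encoder's realtime property disabled,
# trading some encoding speed for quality in motion-heavy scenes.
ffmpeg -i input.mov -c:v h264_videotoolbox -b:v 1000k -realtime 0 -c:a aac output.mp4
```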
When I transcode video to H265 with following command, I get a bitrate about 600K and the quality is almost the same as the original.
ffmpeg -i data2.mp4 -c:v libx265 -c:a copy d2.mp4
However when I use the hevc_nvenc, I get a very high bitrate (about 2M), I need to have a bitrate as low as possible and keeping almost the same quality.
ffmpeg -i data2.mp4 -c:v hevc_nvenc -c:a copy d3.mp4
It works if I specify the output bitrate, but I want to know how to figure out the proper bitrate?
There is no such thing as a "proper bitrate". You get to choose the bitrate; if you don't, the encoder will choose one for you. In this case, you are using two different encoders, so you get different bitrates. You can change this by adding the -b:v option to ffmpeg.
But that's probably not what you want. You probably want to use a constant quality factor by setting -crf to a value between 0 (great quality, large file) and 51 (bad quality, small file).
Note that hevc_nvenc will almost always produce larger files than libx265 at a given quality, because it is not as efficient an encoder.
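For reference, -crf is a libx265 option; the NVENC encoders expose an analogous constant-quality knob as -cq, used with variable-bitrate mode. A sketch of both (the quality values are illustrative only and are not equivalent between the two encoders):

```shell
# Constant-quality encodes: -crf for libx265, -cq for hevc_nvenc.
# -b:v 0 removes the default bitrate cap so -cq actually governs quality.
ffmpeg -i data2.mp4 -c:v libx265 -crf 26 -c:a copy d2_crf.mp4
ffmpeg -i data2.mp4 -c:v hevc_nvenc -rc vbr -cq 30 -b:v 0 -c:a copy d3_cq.mp4
```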
First, I have looked at the older questions asking the same, but the responses do not work.
Adding -r 30 or 60 to the input file does not impact the output, nor does setting it for the output, which remains unchanged.
I am handling a very large number of files, from 1 to 22 GB, recorded (with audio) at 30fps that need to be re-posted as 60fps, with the corresponding speed increase.
I toyed with ffmpeg a bit and came up with this:
-filter_complex "[0:v]setpts=0.50*PTS[v];[0:a]atempo=2.0[a]" -map "[v]" -map "[a]" -vcodec:v libx264
It works fine, but to have to wait out a complete re-encoding of the video and audio to produce the same video with the fps changed seems like an insane waste of time.
Am I missing something simple? Is there not a way to -c copy with a new fps playback rate on the resulting file?
(if it still has to recode the audio to maintain sync that's fine, audio is quick enough it doesn't much matter)
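For completeness, here is the snippet above expanded into a full command (filenames are placeholders; this still re-encodes both streams, which is exactly the cost the question hopes to avoid):

```shell
# Double the playback speed: halve the video timestamps, double the audio
# tempo, then re-encode both streams.
ffmpeg -i input.mp4 \
  -filter_complex "[0:v]setpts=0.50*PTS[v];[0:a]atempo=2.0[a]" \
  -map "[v]" -map "[a]" -c:v libx264 output.mp4
```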
If using ffmpeg to precisely cut up a video into pieces, for instance using something like:
ffmpeg -i input.mkv -ss 0 -to 30 -c:v libx264 -preset ultrafast -qp 0 -c:a copy p_1.mkv
Would the resulting clip include the last frame? (e.g., frame 1800 for a 60FPS video)
Also, should the audio also be re-encoded to ensure that no audio de-sync happens if I were to concatenated it together with other clips?
If the video is constant frame rate, then it should include 1800 frames. -to stops when the position is reached, i.e. 30.000 s; the 1800th frame in a CFR video will have a PTS of 29.983 s, so it is included.
As for audio: if the audio is shorter than the video, it will remain so after encoding unless you expressly apply padding. Since you seem to be extracting from inside larger videos, you shouldn't encounter that problem.
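The timestamp arithmetic in the answer is easy to check: in a CFR stream the Nth frame (1-based) has PTS (N-1)/fps, so the 1800th frame of a 60 fps clip starts just inside a -to 30 cut:

```shell
# PTS of the 1800th frame at 60 fps: (1800 - 1) / 60 = 29.983 s < 30.000 s.
awk 'BEGIN { printf "PTS of frame 1800: %.3f s\n", (1800 - 1) / 60 }'
```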