Which bitrates are supported by ffmpeg for AAC-LC encoding?
I want to encode 44.1 kHz, 16-bit mono audio. The frame size is set to 1024, but only some bitrates seem to be supported; if I set the bitrate to 320 kbit/s, it doesn't seem to work properly.
For 44.1 kHz mono with 1024-sample frames, AAC-LC's maximum bitrate is:
(6144 bits/block ÷ 1024 samples/block) × 44100 samples/sec × 1 channel = 264.6 kbit/s
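As a sanity check, the ceiling above can be computed directly (a quick sketch; 6144 bits is the maximum the AAC bitstream allows per channel per block):

```python
# Maximum AAC-LC bitrate: the bitstream allows at most 6144 bits
# per channel per 1024-sample block.
MAX_BITS_PER_BLOCK = 6144
SAMPLES_PER_BLOCK = 1024

def max_aac_lc_bitrate(sample_rate: int, channels: int) -> float:
    """Return the maximum AAC-LC bitrate in bit/s."""
    return MAX_BITS_PER_BLOCK / SAMPLES_PER_BLOCK * sample_rate * channels

print(max_aac_lc_bitrate(44100, 1) / 1000)  # 264.6 kbit/s
```

This is why a requested 320 kbit/s for 44.1 kHz mono cannot be honored: it exceeds what the bitstream can carry.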
FFmpeg supports four different AAC-LC encoders. The best quality can be obtained with libfdk_aac, which supports bitrates all the way up to that maximum.
See also Recommended sampling rate and bitrate combinations.
Related
I'm not sure if x264/5 use CPU and if h264/5 use GPU, and also if h265 is basically HEVC_NVENC for NVIDIA GPU acceleration. So, if you could give me more info about these encoding types, it would be great. I understood that, summing up a lot, x26* use the CPU and are slower but more accurate, while h26* are the opposite, and h265 is the most recent and optimal trade-off.
Furthermore, I was trying to convert a video using GPU acceleration and my question is:
Does the following command tell the GPU to use h265 to encode a video, keeping the same audio and upgrading the video to its maximum quality? Furthermore, are there other ways to express the same command?
ffmpeg.exe -hwaccel_output_format cuda -i "input" -c:v hevc_nvenc -preset medium -rc constqp -qp 0 -c:a copy "output"
H.264 and H.265 are video codecs for compressing image frames to video.
H.264 is a very widely supported standard that's been around a long time. It provides good efficiency. H.265 is a newer codec which can reduce your bandwidth requirements for a given quality by around a quarter to a third, for most videos. (Or, you can increase quality for the same bandwidth.) H.265 is a patent-encrusted cluster (even more than H.264), and there are currently three groups who claim you must license with them to use it.
I'm not sure if x264/5 use CPU and if h264/5 use GPU
x264 and x265 are open source encoder implementations of the H.264 and H.265 codecs. They run on the CPU.
and also if h265 is basically HEVC_NVENC for NVIDIA GPU acceleration
Yes, that's basically correct. GPUs with NVENC support can encode H.265 video.
Does the following command tell the GPU to use h265 to encode a video, keeping the same audio and upgrading the video to its maximum quality?...
No. Maximum quality would actually be lossless encoding, which hevc_nvenc apparently supports.
Here's a curious option listed in the man pages of ffmpeg:
-aframes number (output)
Set the number of audio frames to output. This is an obsolete alias for "-frames:a", which you should use instead.
What an 'audio frame' is seems dubious to me. This SO answer says that frame is synonymous with sample, but that can't be what ffmpeg thinks a frame is. Just look at this example, where I resample some audio to 22.05 kHz and a length of exactly 313 frames:
$ ffmpeg -i input.mp3 -frames:a 313 -ar:a 22.05K output.wav
If 'frame' and 'sample' were synonymous, we would expect audio duration to be 0.014 seconds, but the actual duration is 8 seconds. ffmpeg thinks the frame rate of my input is 39.125.
What's going on here? What does ffmpeg think an audio frame really is? How do I go about finding this frame rate of my input audio?
FFmpeg uses an AVFrame structure internally to convey and process all media data in chunks. The number of samples per frame depends on the decoder. For video, a frame consists of all pixel data for one picture, which is a logical grouping, although it can also contain pixel data for two half-pictures of an interlaced video stream.
For audio, decoders of DCT-based codecs typically fill a frame with the number of samples used in the DCT window - that's 1024 for AAC and 576/1152 for MP3, as Brad mentioned, depending on sampling rate. PCM samples are independent so there is no inherent concept of framing and thus frame size. However the samples still need to be accommodated within AVFrames, and ffmpeg defaults to 1024 samples per frame for planar PCM in each buffer (one for each channel).
You can use the ashowinfo filter to display the frame size. You can also use the asetnsamples filter to regroup the data in a custom frame size.
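For intuition, the per-frame durations implied by those frame sizes work out as follows (a quick sketch; duration is simply samples per frame divided by sample rate):

```python
# Frame duration = samples per frame / sample rate.
def frame_duration_ms(samples_per_frame: int, sample_rate: int) -> float:
    """Duration of one audio frame, in milliseconds."""
    return samples_per_frame / sample_rate * 1000

print(frame_duration_ms(1024, 44100))  # AAC at 44.1 kHz: ~23.2 ms
print(frame_duration_ms(1152, 44100))  # MP3 (MPEG-1 Layer III): ~26.1 ms
print(frame_duration_ms(576, 22050))   # MP3 (MPEG-2 Layer III): ~26.1 ms
```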
A "frame" is a bit of an overloaded term here.
In PCM, a frame is a set of samples occurring at the same time, one per channel. If your audio were 22.05 kHz and you had 313 PCM frames, its length in time would be about 14 milliseconds, as you expect.
However, your audio isn't PCM... it's MP3. An MP3 frame is about 26 milliseconds long. 313 of them add up to about 8 seconds. The frame here is a block of audio that cannot be decoded independently. (In fact, some frames actually depend on other frames via the bit reservoir!)
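The arithmetic checks out, as a short sketch shows (assuming a 44.1 kHz MPEG-1 Layer III input, i.e. 1152 samples per frame; the exact frame rate ffmpeg reports may differ slightly depending on the input):

```python
SAMPLES_PER_MP3_FRAME = 1152  # MPEG-1 Layer III
SAMPLE_RATE = 44100

# Each MP3 frame covers 1152 samples: ~26 ms at 44.1 kHz.
frame_duration = SAMPLES_PER_MP3_FRAME / SAMPLE_RATE

print(313 * frame_duration)                 # ~8.2 seconds
print(SAMPLE_RATE / SAMPLES_PER_MP3_FRAME)  # frame rate: ~38.3 frames/s
```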
I am trying to change the sampling rate of an M4A file from 44100 Hz to a customized value, let's say 51200 Hz. I used the following command, which worked fine for WAV sampling-rate conversion:
ffmpeg -i audio.m4a -ar 51200 audio_51200.m4a
Unfortunately, it generates a file with a 48000 Hz sampling rate. Any ideas?
There is a limited set of frequencies for AAC profiles. For example for HE AAC:
http://www.atsc.org/wp-content/uploads/2015/03/A153-Part-8-2012.pdf
So ffmpeg adjusts any non-standard frequency to the nearest available one.
Update: The set of available sampling frequencies is limited by AAC ADIF (Audio Data Interchange Format) and ADTS (Audio Data Transport Stream), so other rates simply can't be encoded. Here are the values for the sampling_frequency_index field from subclause 8.1.1.2 of the ISO/IEC 13818-7 standard:
0: 96000 Hz
1: 88200 Hz
2: 64000 Hz
3: 48000 Hz
4: 44100 Hz
5: 32000 Hz
6: 24000 Hz
7: 22050 Hz
8: 16000 Hz
9: 12000 Hz
10: 11025 Hz
11: 8000 Hz
12-15: reserved
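The snapping behaviour can be illustrated with a short sketch (the rates come from the ADTS/ADIF sampling_frequency_index table; the selection logic here is an illustration, not ffmpeg's actual code):

```python
# Sampling rates representable by the 4-bit sampling_frequency_index
# field in ADTS/ADIF headers (ISO/IEC 13818-7).
AAC_RATES = [96000, 88200, 64000, 48000, 44100, 32000,
             24000, 22050, 16000, 12000, 11025, 8000]

def nearest_aac_rate(requested: int) -> int:
    """Pick the representable rate closest to the requested one."""
    return min(AAC_RATES, key=lambda r: abs(r - requested))

print(nearest_aac_rate(51200))  # 48000
```

This matches the observed behaviour: requesting 51200 Hz yields a 48000 Hz output.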
When extracting Audio streams using ffmpeg from containers such as MP4, how does ffmpeg increase bitrate, if it is higher than the source bitrate?
An example might be ffmpeg -i video.mp4 -f mp3 -ab 256000 -vn music.mp3. What does ffmpeg do if the incoming bitrate is 128k? Does it interpolate, or default to 128k on the output music.mp3? I know this doesn't seem like a so-called "programming question", but the ffmpeg forum says it is going out of business and no one will reply to posts there.
During transcoding, ffmpeg (or any transcoder) decodes the input into an uncompressed version; for audio, that's PCM. The encoder compresses this PCM data. It has no idea of, or interaction with, the original source representation.
If no bitrate is specified, ffmpeg will use the encoder's default rate-control mode and bitrate. For MP3 or AAC, that's typically 128 kbps for a stereo output, though it can be lower, e.g. 96 kbps for Opus. Encoders typically scale this with the number of output channels, so for a 6-channel output it may be 320 kbps. If a bitrate is specified, it is used unless the value is invalid (beyond the encoder's range), in which case the encoder falls back on its default bitrate selection.
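The selection logic described above can be sketched as follows (purely illustrative; the range and default here are example values, not any specific encoder's):

```python
def select_bitrate(requested, min_rate, max_rate, default):
    """Use the requested bitrate if it falls within the encoder's
    supported range; otherwise fall back to the encoder default."""
    if requested is not None and min_rate <= requested <= max_rate:
        return requested
    return default

# e.g. an encoder supporting 8-320 kbps with a 128 kbps default:
print(select_bitrate(256_000, 8_000, 320_000, 128_000))  # 256000
print(select_bitrate(512_000, 8_000, 320_000, 128_000))  # out of range: 128000
```

Note that the requested bitrate is applied regardless of the source bitrate: encoding 128 kbps input at 256 kbps simply spends more bits on the same decoded PCM, without recovering any quality.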
I need to convert MP4 to WebM with ffmpeg.
So, I use:
ffmpeg -i input.mp4 -c:v libvpx -crf 10 -b:v 1M -c:a libvorbis output.webm
But it's very slow.
Is there a faster way?
libvpx is a relatively slow encoder. According to the VP8 Encode Parameter Guide: Encode Quality vs. Speed, you can use the -cpu-used option to increase encoding speed. A higher value results in faster encoding but lower quality:
Setting a value of 0 will give the best quality output but is
extremely slow. Using 1 (default) or 2 will give further significant
boosts to encode speed, but will start to have a more noticeable
impact on quality and may also start to effect the accuracy of the
data rate control. Setting a value of 4 or 5 will turn off "rate
distortion optimisation" which has a big impact on quality, but also
greatly speeds up the encoder.
Alternatively, it appears that VA-API can be utilized for hardware accelerated VP8 encoding, but I have no experience with this.