Changing M4a sampling rate to a customized value - ffmpeg

I am trying to change the sampling rate of an M4A file from 44100 Hz to a customized value, say 51200 Hz. I used the following command, which worked fine for WAV sampling rate conversion:
ffmpeg -i audio.m4a -ar 51200 audio_51200.m4a
Unfortunately, it generates a file with a 48000 Hz sampling rate. Any ideas?

There is a limited set of frequencies for AAC profiles. For example for HE AAC:
http://www.atsc.org/wp-content/uploads/2015/03/A153-Part-8-2012.pdf
So ffmpeg adjusts any non-standard frequency to the nearest available one.
Update: The set of available sampling frequencies is limited by AAC ADIF (Audio Data Interchange Format) and ADTS (Audio Data Transport Stream). Other rates simply can't be signalled in the bitstream. Here are the values for the sampling_frequency_index field from subclause 8.1.1.2 of ISO/IEC 13818-7:
0: 96000 Hz
1: 88200 Hz
2: 64000 Hz
3: 48000 Hz
4: 44100 Hz
5: 32000 Hz
6: 24000 Hz
7: 22050 Hz
8: 16000 Hz
9: 12000 Hz
10: 11025 Hz
11: 8000 Hz
12: 7350 Hz
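A quick way to predict which rate ffmpeg will snap to is to pick the nearest entry from the ADTS sampling_frequency_index table. This is a sketch assuming simple nearest-value rounding; ffmpeg's actual selection logic lives inside the AAC encoder:

```python
# Sampling frequencies representable by the 4-bit sampling_frequency_index
# field in AAC ADTS/ADIF headers (ISO/IEC 13818-7).
AAC_RATES = [96000, 88200, 64000, 48000, 44100, 32000,
             24000, 22050, 16000, 12000, 11025, 8000, 7350]

def nearest_aac_rate(requested):
    """Pick the representable rate closest to the requested one."""
    return min(AAC_RATES, key=lambda r: abs(r - requested))

print(nearest_aac_rate(51200))  # 48000, matching what ffmpeg produced
```

For 51200 Hz the nearest standard rate is 48000 Hz (3200 Hz away, versus 7100 Hz to 44100), which explains the output file's rate.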


ffmpeg output file smaller than input file

I am using ffmpeg to rotate videos 90 or 180 degrees in a Python script. It works great. But I am curious as to why the output file is smaller (fewer bytes) than the input file.
Here are the commands I use:
180 degrees:
ffmpeg -i ./input.mp4 -preset veryslow -vf "transpose=2,transpose=2,format=yuv420p" -metadata:s:v rotate=0 -codec:v libx264 -codec:a copy ./output.mp4
90 degrees:
ffmpeg -i ./input.mp4 -vf "transpose=2" ./output.mp4
For example, a GoPro Hero 3 MP4 file was originally 2.0 GB, and the resulting output file was 480.9 MB. Another GoPro file was 2.0 GB and its resulting file was 671.5 MB. Is this maybe because the GoPro files were 2.0 GB but contain empty space, sort of like how NTFS filesystems allocate a minimal 4 KB for a file even when it holds fewer bytes?
If this isn't specific to the GoPro Hero 3, how do I rotate the files 90 or 180 degrees but ensure the output file size stays the same? Or is data loss expected? Does the data loss have to do with the format?
Note that the quality of the video doesn't appear to be damaged, which is good. So, I am interested in learning more about why this is happening, then I can read the section of ffmpeg documentation that is relevant to this.
Thank you!
Bitrate is ignored from the start
ffmpeg fully decodes the input into uncompressed raw video and audio (except when stream copying; more about that below). The input format or bitrate does not matter: it does this for all formats. The encoder then works from these raw, decoded frames.
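To get a feel for why the input bitrate is irrelevant, consider the size of the raw yuv420p frames the encoder actually sees. This is illustrative arithmetic, not an ffmpeg API:

```python
def raw_yuv420p_bytes(width, height, fps, seconds):
    # yuv420p stores 1.5 bytes per pixel: a full-resolution Y plane
    # plus U and V planes subsampled 2x2.
    bytes_per_frame = width * height * 3 // 2
    return bytes_per_frame * fps * seconds

# One minute of 1080p30 decodes to ~5.6 GB of raw video, regardless of
# whether the compressed input file was 50 MB or 2 GB.
print(raw_yuv420p_bytes(1920, 1080, 30, 60))  # 5598720000
```

The encoder starts from those ~5.6 GB of raw frames in either case, so the output size depends only on how well it compresses them.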
H.264 vs H.264
Your input and output are both H.264. An H.264 stream is produced by an encoder, and anyone can make one. However, not all encoders are equal. Given the same input, the output from one H.264 encoder may match the quality of another's while its bitrate is several times smaller.
The GoPro H.264 encoder was made to work on a platform with limited hardware. That means bitrate (file size) is sacrificed for speed and quality. x264 is the ultimate H.264 encoder: nothing can beat its quality-to-bitrate ratio.
Rotate without re-encoding
You can stream copy (re-mux) and rotate at the same time. The rotation is handled by the metadata/sidedata:
ffmpeg -i input.mp4 -metadata:s:v rotate=90 -c copy output.mp4
The downside is that your player or device may ignore the rotation metadata, so you may have to physically rotate with filters, which requires re-encoding, and therefore stream copy can't be used.
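Since the question mentions driving this from a Python script, the choice between the two strategies can be wrapped in a small helper. `rotate_args` is a hypothetical name, and only the 90°/180° cases from the question are handled:

```python
def rotate_args(src, dst, degrees, reencode=False):
    """Build an ffmpeg argv list: metadata-only rotation (lossless,
    player-dependent) or physical rotation via transpose (re-encodes)."""
    if not reencode:
        # Lossless: only the rotation flag changes, streams are copied.
        return ["ffmpeg", "-i", src,
                "-metadata:s:v", f"rotate={degrees}",
                "-c", "copy", dst]
    # transpose=2 is 90 degrees counter-clockwise; applying it twice
    # gives 180 degrees, matching the commands in the question.
    vf = "transpose=2" if degrees == 90 else "transpose=2,transpose=2"
    return ["ffmpeg", "-i", src, "-vf", vf, "-c:a", "copy", dst]

print(rotate_args("input.mp4", "output.mp4", 90))
```

Pass the list to `subprocess.run` to execute it; the metadata-only variant is byte-for-byte lossless for the video stream.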
I had the same rotation issue once...
I fixed it by "resetting" the rotation instead...
ffmpeg ...... -metadata:s:v rotate="0" ......

What does ffmpeg think is the difference between an audio frame and audio sample?

Here's a curious option listed in the man pages of ffmpeg:
-aframes number (output)
Set the number of audio frames to output. This is an obsolete alias for "-frames:a", which you should use instead.
What counts as an 'audio frame' seems ambiguous to me. This SO answer says that a frame is synonymous with a sample, but that can't be what ffmpeg means by a frame. Just look at this example, where I resample some audio to 22.05 kHz and limit it to exactly 313 frames:
$ ffmpeg -i input.mp3 -frames:a 313 -ar:a 22.05K output.wav
If 'frame' and 'sample' were synonymous, we would expect an audio duration of about 0.014 seconds (313 / 22050), but the actual duration is 8 seconds. ffmpeg thinks the frame rate of my input is 39.125.
What's going on here? What does ffmpeg think an audio frame really is? How do I go about finding this frame rate of my input audio?
FFmpeg uses an AVFrame structure internally to convey and process all media data in chunks. The number of samples per frame depends on the decoder. For video, a frame consists of all pixel data for one picture, which is a logical grouping, although it can also contain pixel data for two half-pictures of an interlaced video stream.
For audio, decoders of DCT-based codecs typically fill a frame with the number of samples used in the DCT window - that's 1024 for AAC and 576/1152 for MP3, as Brad mentioned, depending on sampling rate. PCM samples are independent so there is no inherent concept of framing and thus frame size. However the samples still need to be accommodated within AVFrames, and ffmpeg defaults to 1024 samples per frame for planar PCM in each buffer (one for each channel).
You can use the ashowinfo filter to display the frame size. You can also use the asetnsamples filter to regroup the data in a custom frame size.
A "frame" is a bit of an overloaded term here.
In PCM, a frame is a set of samples occurring at the same time. If your audio were 22.05 kHz and you had 313 PCM frames, its length in time would be about 14 milliseconds, as you expect.
However, your audio isn't PCM... it's MP3. An MP3 frame is about 26 milliseconds long. 313 of them add up to about 8 seconds. The frame here is a block of audio that cannot be decoded independently. (In fact, some frames actually depend on other frames via the bit reservoir!)
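The arithmetic behind both readings of 'frame', assuming the 1152 samples per MPEG-1 Layer III frame described above and a 44.1 kHz source:

```python
MP3_FRAME_SAMPLES = 1152  # samples per MPEG-1 Layer III frame

# If 'frame' meant 'sample', 313 frames at 22.05 kHz would be tiny:
pcm_seconds = 313 / 22050
print(round(pcm_seconds, 3))  # 0.014

# Read as MP3 frames of a 44.1 kHz input, the same count covers ~8 s:
mp3_seconds = 313 * MP3_FRAME_SAMPLES / 44100
print(round(mp3_seconds, 2))  # 8.18
```

Each MP3 frame is 1152 / 44100 ≈ 26 ms, so 313 of them land right on the ~8 seconds the question observed.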

Use ffmpeg to time-dilate and resample audio without changing frequencies

I have some audio (wave file) that is sampled at a rate of 48000 samples per second.
This audio was created to match a 30 FPS video. However, the video actually plays back on the target at the NTSC framerate of 29.97 (30 X 1000/1001).
This means that I need to time-dilate the audio so that there are 48048 samples where there were previously 48000 samples (it plays back 1.001 times slower) but still maintains that the final audio file's rate is 48000 samples per second.
Ideally, also, I'd like to do this resample using the sox library option for FFMPEG since I hear it has much higher quality.
Can anyone help me with the command line necessary to process a file in this manner?
Basic command is
ffmpeg -i in.wav -af asetrate=47952,aresample=48000:resampler=soxr out.wav
This assumes that libsoxr is linked.
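A quick sanity check on where the 47952 value comes from and what it does to the sample count. This is plain arithmetic, not an ffmpeg call:

```python
# NTSC pulldown: 30 fps material plays at 30 * 1000/1001 = 29.97 fps,
# i.e. 1.001 times slower.
slowdown = 1001 / 1000

# asetrate relabels the stream's rate without touching the samples:
asetrate = round(48000 / slowdown)
print(asetrate)  # 47952

# Resampling the relabelled 47952 Hz stream back to 48000 Hz stretches
# every 48000 input samples to ~48048 output samples:
stretched = round(48000 * 48000 / asetrate)
print(stretched)  # 48048
```

So the filter chain produces the requested 48048-samples-per-original-second dilation while the file stays labelled 48000 Hz.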

ffmpeg conversion increase bitrate

When extracting audio streams using ffmpeg from containers such as MP4, how does ffmpeg increase the bitrate if the requested value is higher than the source bitrate?
An example might be ffmpeg -i video.mp4 -f mp3 -ab 256000 -vn music.mp3. What does ffmpeg do if the incoming bitrate is 128k? Does it interpolate, or default to 128k on the output music.mp3? I know this doesn't seem like a so-called "programming question", but the ffmpeg forum says it is going out of business and no one will reply to posts there.
During transcoding, ffmpeg (or any transcoder) decodes the input into an uncompressed version; for audio, that's PCM. The encoder compresses this PCM data. It has no idea of, or interaction with, the original source representation.
If no bitrate is specified, ffmpeg will use the default rate control mode and bitrate of the encoder. For MP3 or AAC, that's typically 128 kbps for a stereo output, although it can be lower, e.g. 96 kbps for Opus. Encoders typically adjust based on the number of output channels, so for a 6-channel output it may be 320 kbps. If a bitrate is specified, it is used unless the value is invalid (beyond the encoder's range), in which case the encoder falls back on its default bitrate selection.
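Since the encoder works only from decoded PCM, the output size is governed by the bitrate you ask for, not by the source's. A back-of-the-envelope estimate, ignoring container overhead and tags:

```python
def cbr_size_bytes(bitrate_bps, duration_s):
    # Constant-bitrate estimate: bits per second * seconds / 8 bits per byte.
    return bitrate_bps * duration_s // 8

# Re-encoding a 4-minute track at 256 kbps costs ~7.7 MB even if the
# stream inside the MP4 was only 128 kbps; no extra detail is created,
# the encoder just spends more bits describing the same decoded audio.
print(cbr_size_bytes(256_000, 240))  # 7680000
```

So asking for 256k from a 128k source roughly doubles the file size without improving quality.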

ffmpeg how do i know what audio rate to use?

Say I have something like this
ffmpeg -i video.avi -ar 22050 -ab 32 -f flv -s 320x240 video.flv
-ar (Audio sampling rate in Hz)
-ab (Audio bit rate in kbit/s)
regarding the -ar and the -ab how do I know what rate to use? I got this ffmpeg command from a site somewhere and I was wondering how the person knew what values to put for the rates? Do I need to understand audio in order to figure that out?
Probably 44100 for audio sampling rate and 128 for bit rate should be sufficient.
Check Wikipedia's sampling rate and audio bit rate articles for examples to see if those values are too high or too low for what you're trying to do.
You have to use "ffmpeg -i video.avi" to know the sampling rate and the bitrate of the audio stream in the source video.avi.
The audio stream can be extracted with the same sampling rate and bitrate without losing quality.
You can decide to reduce either of them for size reasons, but don't increase them hoping to gain quality: you can never exceed the quality of the original.
I'm using -ar 22050 and -ab 48 for Avi and Mpeg video files. It works normally.
