Why does the standard say that PCM in MP4 is not supported, when Sony cameras (for example the A7s) somehow manage to do it? Does this violate the standard? How did they manage it?
MP4 is a variation of the MOV container, which itself can have PCM audio specified in its metadata. For example, when processing MP4 bytes I refer to the MOV specifications.
(1) It's possible that your MP4 is really just a MOV file renamed as a .mp4.
(2) If the playing side knows to expect PCM (e.g. it doesn't assume the incoming numbers are AAC data), then there is no problem with using PCM in the MP4 container. You could safely replace all AAC bytes with PCM bytes.
(3) It doesn't break the standard as far as MOV is concerned, but MPEG decoders usually only want to see MPEG codecs (H.264 / H.265 / MP3 / AAC) in an MPEG-4 file. Some will accept a file containing PCM, and some may refuse to play it. PS: Some coders are even putting the VP8/VP9 video codecs inside MP4 these days.
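Point (1) can be checked by looking at the major brand stored in the file's leading `ftyp` box. The following is only a minimal sketch in Python. It assumes the file begins with an `ftyp` box (some older QuickTime files have none) and that the brand `qt  ` marks a QuickTime/MOV file. The function names `major_brand` and `looks_like_mov` are hypothetical.

```python
import struct

def major_brand(data: bytes):
    """Return the major brand from a leading 'ftyp' box, or None if absent."""
    if len(data) < 12:
        return None
    # Each box starts with a 32-bit big-endian size followed by a 4-byte type.
    size, box_type = struct.unpack(">I4s", data[:8])
    if box_type != b"ftyp":
        return None
    # The first 4 bytes of the ftyp payload are the major brand.
    return data[8:12].decode("ascii", errors="replace")

def looks_like_mov(data: bytes) -> bool:
    """Heuristic only: the QuickTime brand is 'qt  ' (two trailing spaces)."""
    return major_brand(data) == "qt  "
```

Reading the first 12 bytes of the file and passing them to `major_brand` is enough for this check. It says nothing about the audio codec itself, which is declared further inside the file.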
Related
When extracting Audio streams using ffmpeg from containers such as MP4, how does ffmpeg increase bitrate, if it is higher than the source bitrate?
An example might be ffmpeg -i video.mp4 -f mp3 -ab 256000 -vn music.mp3. What does ffmpeg do if the incoming bitrate is 128k? Does it interpolate, or default to 128k on the output music.mp3? I know this may not seem like a so-called "programming question", but the ffmpeg forum says it is going out of business and no one will reply to posts there.
During transcoding, ffmpeg (or any transcoder) decodes the input into an uncompressed version; for audio, that's PCM. The encoder compresses this PCM data. It has no idea of, or interaction with, the original source representation.
If no bitrate is specified, ffmpeg will use the default rate control mode and bitrate of the encoder. For MP3 or AAC, that's typically 128 kbps for a stereo output, although it can be lower, like 96 kbps for Opus. Encoders typically adjust based on the number of output channels, so for a 6-channel output it may be 320 kbps. If a bitrate is specified, that value is used unless it is invalid (beyond the encoder's range), in which case the encoder will fall back to its default bitrate selection.
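The selection rule described above can be sketched as follows. This is an illustration only, not ffmpeg's actual code: the function name `pick_bitrate`, the default value and the valid range are assumptions loosely based on the figures mentioned in the answer.

```python
def pick_bitrate(requested, default_bps, valid_range):
    """Illustrative rule: use the requested bitrate if valid, else the default.

    The source bitrate is deliberately not a parameter: the encoder only
    sees decoded PCM and never learns how the input was compressed.
    """
    low, high = valid_range
    if requested is None:
        return default_bps   # nothing specified: encoder default
    if low <= requested <= high:
        return requested     # a valid request is honoured as-is
    return default_bps       # out of range: fall back to the default

# Hypothetical figures: 128 kbps stereo default, 8-320 kbps valid range.
MP3_DEFAULT = 128_000
MP3_RANGE = (8_000, 320_000)
```

With these assumed values, a request for 256000 is honoured even when the source was 128k. The output file is larger, but the extra bits cannot restore detail that the source encoder already discarded.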
A video file needs to be transferred for further video processing. Sharing raw video (y4m) seems impossible. I have two options:
(1) Encoding the video file to H.264 with crf 0 (lossless): the file size is huge.
(2) Encoding the video file to H.264 with crf 17/18 (virtually lossless): the file size is manageable.
After the video is shared, it will be re-encoded only once with crf 22/23 with client info added.
Option 2 seems okay, but the quality should not be degraded on the re-encoding.
Is going with Option 1 and managing a huge file a better option than Option 2?
My video stream is encoded with H.264 and my audio stream with AAC. I get these streams by reading an FLV file. I decode only the video stream to get all the video frames, then modify them with ffmpeg (for example, changing some pixels) before encoding them again. Finally, I push the video and audio streams to Crtmpserver. When I pull the live stream from this server, the video is not smooth but the audio is normal. When I change gop_size from 12 to 3, everything is OK. What causes this problem? Can anyone explain it to me?
Either the CPU or the bandwidth is not sufficient for your usage. RTMP will always process audio before video. If ffmpeg or the network is not able to keep up with the live stream, video frames will be dropped. Because audio is so much smaller and cheaper to encode, even a very slow CPU or congested network will usually have no problem keeping up with it.
I am new to video encoding and am trying to encode a music video for the Apple iTunes video store.
I am currently using FFmpeg for encoding.
My source file is an MP4 file of 650 MB.
I encode the file using the Apple ProRes 422 (HQ) codec and output a mov file.
ffmpeg -y -i busy1.mp4 -vcodec prores -profile:v 3 -r "29.97" -c:a mp2 busy2.mov
I am trying to encode the video according to the following specs:
● Apple ProRes 422 (HQ)
● VBR expected at ~220 Mbps
Encoded       PASP   Converted to ProRes From
1920 x 1080   1:1    HDCAM SR, D5, ATSC
1280 x 720    1:1    ATSC progressive
29.97 interlaced frames per second for video sourced
Music Video Audio Source Profile
● MPEG-2 layer II stereo
● 384 kbps
● 48 kHz
The file is encoded perfectly fine; however, the output is 6 GB in size.
Why would the file be so large after encoding?
Am I doing something wrong here?
Apple ProRes is not intended for high compression. It is an intermediate codec used in post-production that reduces storage requirements compared with keeping the video uncompressed, while retaining high image quality.
You are supposed to use your uncompressed source file as input to retain maximum quality, not an already lossy-compressed video.
You only mentioned the container format of your input file (MP4), but not the codecs, which are the important information here.
Since the HQ flavor of ProRes uses about 220 Mbps, the file size can actually increase, but you don't gain anything in quality if the source is lossy.
See more here: Apple ProRes
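The size increase follows directly from the target bitrate. A rough sketch of the arithmetic in Python, assuming a constant 220 Mbps video stream and ignoring audio and container overhead (the function names are hypothetical, and real ProRes output is variable bitrate):

```python
def estimated_size_gb(bitrate_mbps: float, duration_s: float) -> float:
    """Approximate stream size in gigabytes (10**9 bytes) at a given bitrate."""
    bits = bitrate_mbps * 1_000_000 * duration_s
    return bits / 8 / 1_000_000_000

def duration_for_size_s(bitrate_mbps: float, size_gb: float) -> float:
    """Approximate duration in seconds that fills size_gb at the given bitrate."""
    return size_gb * 1_000_000_000 * 8 / (bitrate_mbps * 1_000_000)
```

Under these assumptions, four minutes at 220 Mbps comes to about 6.6 GB, and a 6 GB file corresponds to a little over three and a half minutes, so an output of that size is in the expected range for a clip of a few minutes.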
Though you don't gain much by decompressing a source clip that's lossy, you do gain in some ways. Compressed video uses a compressed color palette, which can be detrimental when making color corrections or corrections to detail level, especially when you're given interlaced footage to clean up. If you put in the time on detail, microcontrast, and color, you know the benefit of expanded color detail when compressing back down. It also encodes much faster on the back end of your edits: simply compressing the data down is faster than expanding and then compressing.
However, if you recompress all your video down to the same size and codec as what went in, most encoders and editor apps now test the data rate of each GOP and rework only those GOPs that need to be redone to fit the new settings.
I have a FLAC file and I have to do some analysis on the waveform, looking for a particular sample. I decompressed it to PCM data, but now I need to know where that particular sample is in the FLAC file.
So: I know the byte offset in the PCM data, or in a WAV file, and I want to know the byte offset of the compressed sample in the FLAC file.
How can I do this?
You can probably trace the sample back to its frame in the FLAC file. Locating it within that frame would be more difficult, since the audio may have multiple channels and is compressed. If you look at the FLAC spec, I think it should be pretty easy to parse the file yourself:
http://flac.sourceforge.net/format.html
And you probably have to decode each frame in order to know what the frame length is...
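The first step, going from a PCM byte offset to a frame, can be sketched as follows. This is a minimal Python sketch under stated assumptions: interleaved PCM with a whole number of bytes per sample, a fixed-blocksize FLAC stream, and a known header length for WAV input (44 bytes is common for simple WAV files but is not guaranteed). The function names and the block size of 4096 are example choices, not values read from any file.

```python
def pcm_offset_to_sample(byte_offset, channels, bits_per_sample, header_bytes=0):
    """Convert a byte offset in interleaved PCM to an inter-channel sample index."""
    bytes_per_sample_frame = channels * (bits_per_sample // 8)
    return (byte_offset - header_bytes) // bytes_per_sample_frame

def sample_to_flac_frame(sample_index, block_size):
    """For a fixed-blocksize stream: (frame number, sample position in frame)."""
    return divmod(sample_index, block_size)

# Example: 16-bit stereo WAV with an assumed 44-byte header.
sample = pcm_offset_to_sample(44 + 4 * 10000, channels=2, bits_per_sample=16,
                              header_bytes=44)
frame_number, position = sample_to_flac_frame(sample, block_size=4096)
```

This gives only the frame number. Finding that frame's byte offset in the FLAC file still requires walking the frame headers or consulting a SEEKTABLE block if the file has one, as the answer above suggests.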