Timing Issues When Muxing Audio and Video with libav - ffmpeg

I have a series of encoded packets: H.264 video and AAC audio. As they come in, I write them to a video file using av_write_frame.
Consider the following sequence:
10 seconds of video, then
10 seconds of video and audio, then
10 seconds of video.
Everything muxes fine, and playback in VLC or QuickTime looks good. In Windows Media Player, however, the audio plays immediately instead of starting at the 10-second mark.
It seems I'm doing something wrong, but when I check the PTS of the audio stream packets, they start at 10 seconds in the audio stream's time base.

The best fix seems to be injecting empty audio packets at the beginning of the stream so that the audio covers the initial gap. This was the only way I could get playback to work in WMP. Every player handles the streams differently, and padding the audio is the best way to ensure compatibility across players.
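As a rough illustration of the padding idea, the sketch below computes how many silent audio frames would be needed to cover the gap and which PTS each would carry. It assumes AAC-LC with 1024 samples per frame and an audio time base of 1/sample_rate; the function name `silent_frame_pts` is hypothetical, and the actual silent payloads would still have to be produced by an encoder.

```python
from fractions import Fraction

# Assumption: AAC-LC, which carries 1024 samples per frame.
AAC_SAMPLES_PER_FRAME = 1024


def silent_frame_pts(gap_seconds, sample_rate):
    """Return the PTS (in 1/sample_rate units) of each silent frame that
    fits entirely inside the first `gap_seconds` of the audio stream."""
    gap_samples = int(Fraction(gap_seconds) * sample_rate)
    # Floor division: the padding never overlaps the first real audio packet,
    # at the cost of leaving a gap shorter than one frame.
    frame_count = gap_samples // AAC_SAMPLES_PER_FRAME
    return [i * AAC_SAMPLES_PER_FRAME for i in range(frame_count)]


# Example: 10 seconds of missing audio at 48 kHz.
pts = silent_frame_pts(10, 48000)
print(len(pts), pts[0], pts[-1])
```

Each returned value would be assigned to the PTS (and DTS) of one silent packet before it is written, so the first real audio packet keeps its original 10-second timestamp.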

Related

Why does it take forever just to add audio to an mp4?

I am currently using Kdenlive, but I have also used ffmpeg for the simple task of adding audio to a video that does not yet have any. Since it is just a matter of combining the video file with the audio, it seems like it ought to be simple. Is there something about encoding MP4s that requires a lot of processing?
I have good hardware (i7-6700K and GTX 1080), but Kdenlive currently estimates 2.5 hours to add audio to a 10-minute video.
Without more info (encoder, settings, video width x height, instructions to duplicate the behavior, etc.) we can only guess. It's probably re-encoding the video instead of only muxing it. Encoding is CPU intensive and takes a long time. Even so, 2.5 hours for 10 minutes seems excessive, and there is not enough info in the question to say why it takes this long.
If you want to add audio with ffmpeg, see "How to add a new audio into a video using ffmpeg?" This will allow you to mux the video (and optionally the audio) without encoding it, like a copy and paste.
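A minimal sketch of such a mux-only invocation, built as an argument list in Python. The file names are placeholders and the helper names `build_mux_command` and `mux` are hypothetical; `-c copy` only works if the audio codec is one the MP4 container accepts, otherwise the audio has to be encoded.

```python
import subprocess


def build_mux_command(video_path, audio_path, output_path):
    """Build an ffmpeg command that copies both streams without re-encoding."""
    return [
        "ffmpeg",
        "-i", video_path,   # input 0: the video without audio
        "-i", audio_path,   # input 1: the audio to add
        "-map", "0:v:0",    # take the first video stream from input 0
        "-map", "1:a:0",    # take the first audio stream from input 1
        "-c", "copy",       # stream copy: mux only, no encoding
        output_path,
    ]


def mux(video_path, audio_path, output_path):
    """Run the command; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_mux_command(video_path, audio_path, output_path), check=True)
```

Because nothing is decoded or encoded, a command like this should finish in roughly the time it takes to read and write the files.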

Adjusting PTS when switching between streams

My application needs to switch between two (or more) input streams while there is only one output (you could think of it as a stream multiplexer). The input frames are decoded and then re-encoded because an overlay is applied.
To set the AVFrame PTS, I calculate an interval before encoding the frames. However, when I switch between an RTMP stream and an MP4 file, the video is delayed a bit on every switch, so by the third switch the resulting stream is out of sync.
I don't know if I'm missing something that I have to modify on the frame before encoding. I also thought about creating an independent PTS for the output frames, but I don't know how to create it.
The input streams could have different frame rates, time bases, or codecs, and the application must be able to deal with all of them.
I discovered the root cause.
The problem was the MP4 file. With this type of file (for some reason) the video and audio packets are read in big bunches (e.g. 20 video packets and then 20 audio packets), whereas an RTMP stream is more finely interleaved (e.g. 2 video packets and then 2 audio packets).
The switch was being applied before a whole bunch had been read (e.g. after 20 video packets but only 10 audio packets), and from that point on the resulting stream was out of sync no matter what was done afterwards.
The solution I implemented waits until a decoded frame's type differs from that of the previous one, and only then performs the switch.
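The deferral logic can be sketched as follows. This is an illustration of the idea only, not the author's code: frames are represented by a plain list of type labels, and `find_switch_index` is a hypothetical helper.

```python
def find_switch_index(frame_types, requested_index):
    """Return the first index at or after `requested_index` where the frame
    type differs from the previous frame's type, or None if there is none.

    Switching at such a boundary means a bunch of same-type packets is never
    cut in the middle.
    """
    for i in range(max(requested_index, 1), len(frame_types)):
        if frame_types[i] != frame_types[i - 1]:
            return i
    return None


# An MP4-like read pattern: 20 video frames, 20 audio frames, 20 video frames.
types = ["video"] * 20 + ["audio"] * 20 + ["video"] * 20

# A switch requested in the middle of the audio bunch is deferred to its end.
print(find_switch_index(types, 30))
```

In a live pipeline the same check would run frame by frame: remember the previous frame's type and, once a switch has been requested, apply it on the first frame whose type is different.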

Why is live video stream not fluent while audio stream is normal when they are played by Flash RTMP Player after being encoded

My video stream is encoded with H.264 and my audio stream with AAC. I get these streams by reading an FLV file. I decode only the video stream to get all the video frames, modify them with ffmpeg (for example, changing some pixels), and then re-encode them. Finally, I push the video and audio streams to Crtmpserver. When I pull the live stream from this server, the video is not smooth but the audio is normal. When I change gop_size from 12 to 3, however, everything is OK. What causes this problem? Can anyone explain it to me?
Either the CPU or the bandwidth is not sufficient for your usage. RTMP will always process audio before video. If ffmpeg or the network cannot keep up with the live stream, video frames will be dropped. Because audio is so much smaller and cheaper to encode, even a very slow CPU or a congested network will usually have no problem keeping up with it.

Grabbing a series of frames from an RTSP stream

I'm looking for a way to continuously grab frames, as JPEGs, from an RTSP stream. I've stumbled upon ffmpeg, but the delay between starting it and grabbing the first frame seems quite high. Is there a good tool for doing this?
I've used the GStreamer libraries in the past to extract frames from mobile video.

How to interleave audio video in ffmpeg library

I am using ffmpeg to write an .avi file. I have two streams, one for audio and one for video. The video frame rate varies from 5 to 15 fps, and the audio rate varies from 60 to 80 samples per second.
I am using av_interleaved_write_frame() to write the file, but it is too time consuming, so I want to replace it with av_write_frame().
For that, I will have to manage the audio/video interleaving in my own code, and I don't understand what ordering I should use to interleave audio and video properly.
Can anyone please help or suggest some sample code?
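One common approach is to always write whichever pending packet has the earliest timestamp once both are converted to seconds, which is the comparison libav's av_compare_ts() performs across time bases. The sketch below shows that ordering in Python under simplifying assumptions: all packets are available up front, each list is already sorted by DTS, and packets are plain (dts, payload) tuples. The `interleave` function is hypothetical, not a libav API.

```python
import heapq
from fractions import Fraction


def interleave(video_packets, audio_packets, video_tb, audio_tb):
    """Yield (kind, dts, payload) in increasing presentation time.

    Each packet is a (dts, payload) tuple in its own stream's time base;
    time bases are Fractions giving seconds per tick.
    """
    # Convert each DTS to seconds so the two streams can be compared.
    # The second tuple field breaks ties (video first) so payloads are never compared.
    video = ((dts * video_tb, 0, "video", dts, p) for dts, p in video_packets)
    audio = ((dts * audio_tb, 1, "audio", dts, p) for dts, p in audio_packets)
    for _seconds, _tie, kind, dts, payload in heapq.merge(video, audio):
        yield kind, dts, payload


# Video at 10 fps (time base 1/10), audio in milliseconds (time base 1/1000).
video = [(0, b"v0"), (1, b"v1"), (2, b"v2")]
audio = [(0, b"a0"), (50, b"a1"), (150, b"a2")]
for kind, dts, _ in interleave(video, audio, Fraction(1, 10), Fraction(1, 1000)):
    print(kind, dts)
```

In a live muxer the packets arrive incrementally, so the same comparison would be applied to the head of a small per-stream queue before each av_write_frame() call.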

Resources