I have the following doubts about ffmpeg. Please clarify.
1. I am reading an mp4 file using ffmpeg, and after calling av_read_frame
I am getting video (stream_index = 0) and audio (stream_index = 1) packets.
First, the video and audio packets do not arrive in any fixed order. Is that the standard case?
2. Video packets come in various sizes, from a minimum of 14 to a maximum of 21824 bytes. Please point out why the video packet size varies. Somewhere it
is written that for video one packet means one frame, so does pkt->size = 14 also equal one frame of video? (pkt is of type AVPacket.)
3. Is the incoming mp4 stream demuxed in the probe function while it is parsed, with the info stored in some buffer, or does calling av_read_frame demux it?
4. Is it possible in ffmpeg to demux the mp4 file and then put both audio and video into one stream of packets, where the video packets have stream_index = 1 and the audio packets have stream_index = 0,
or do they have to be in separate streams?
5. What is the difference between ffmpeg's processing of a transport stream and of an mp4 file?
Are both demuxed and decoded in the same way, or is it different?
Regards
Mayank
A media file is created from multiple streams. A stream can be of many types: audio, video, captions, metadata, etc. But a stream can NOT be of multiple types.
1) av_read_frame will (usually) return the packets in the order they were written to the file. If the software that created the file did not mux them monotonically, you cannot read them monotonically.
2) This is precisely how video compression works. The codec stores only the changes between frames. If there is very little motion, then one frame may be very similar to the previous frame, so the delta is very small. (The sketch after this list illustrates both of these points.)
3) This is not a question.
4) No.
5) Largely, no. But there is some difference in the file types. mp4 requires random access, while TS does not.
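As a minimal sketch of points 1 and 2 (assuming a recent FFmpeg, with the file path passed as argv[1] and error handling trimmed to the essentials), reading packets with av_read_frame shows video and audio interleaved in mux order, with pkt.size varying from packet to packet:

#include <stdio.h>
#include <inttypes.h>
#include <libavformat/avformat.h>

int main(int argc, char **argv)
{
    AVFormatContext *fmt = NULL;
    AVPacket pkt;

    if (argc < 2 || avformat_open_input(&fmt, argv[1], NULL, NULL) < 0)
        return 1;
    if (avformat_find_stream_info(fmt, NULL) < 0)
        return 1;

    /* Packets come back in the order they were muxed, so video and audio
       are interleaved, and pkt.size changes from packet to packet. */
    while (av_read_frame(fmt, &pkt) >= 0) {
        printf("stream %d, size %d, pts %" PRId64 "\n",
               pkt.stream_index, pkt.size, pkt.pts);
        av_packet_unref(&pkt);
    }

    avformat_close_input(&fmt);
    return 0;
}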
Related
I'm new to FFmpeg. While learning from this nice repo (https://github.com/leandromoreira/ffmpeg-libav-tutorial), in the hello_world example I find that avcodec_receive_frame doesn't return the first I-frame until it has been given the third packet, as the following screenshot shows:
I'm wondering why additional packets are needed before an I-frame is received.
Most modern video codecs use I/P/B frames, which is what brings in the decoding time stamp (DTS) and presentation time stamp (PTS). So, what hello_world does with ffmpeg's libraries is the following (a short sketch follows the list):
Demuxing (av_read_frame)
Demuxes packets based on the file format (mp4/avi/mkv etc.) until you have a packet for the stream that you want (e.g. video). (We might say NAL units as an example here, but I'm not sure.)
Feeds the decoder with the packet (avcodec_send_packet)
Starts the decoding process; the decoder may need several packets before it can give you the first frame (it decodes based on DTS).
Checks whether a frame is ready to be presented (avcodec_receive_frame)
Asks the decoder if it has a frame to be presented after feeding it. It might not be ready and you need to re-feed it, or it might even give you more than one frame at once. (Frames come out based on PTS.)
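A rough sketch of that demux/decode loop (the format context, decoder context and video stream index are assumed to have been set up already, as in the tutorial):

#include <stdio.h>
#include <inttypes.h>
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>

static void decode_video(AVFormatContext *fmt, AVCodecContext *dec, int video_stream_index)
{
    AVPacket *pkt = av_packet_alloc();
    AVFrame *frame = av_frame_alloc();

    while (av_read_frame(fmt, pkt) >= 0) {
        if (pkt->stream_index == video_stream_index) {
            /* Feed one packet, in decode (DTS) order. */
            avcodec_send_packet(dec, pkt);

            /* Drain whatever frames are ready, in presentation (PTS) order.
               AVERROR(EAGAIN) simply means "send more packets first", which
               is why the first I-frame can appear only after a few packets. */
            while (avcodec_receive_frame(dec, frame) >= 0) {
                printf("got frame with pts %" PRId64 "\n", frame->pts);
                av_frame_unref(frame);
            }
        }
        av_packet_unref(pkt);
    }

    av_frame_free(&frame);
    av_packet_free(&pkt);
}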
I am trying to use ffmpeg, and have been doing a lot of experimenting for the last month.
I have not been able to get through. Is it really difficult to use FFmpeg?
My requirement is simple, as below.
Can you please guide me on whether ffmpeg is suitable, or whether I have to implement this on my own (using the codec libraries available)?
I have a webm file (containing VP8 and OPUS frames).
I will read the encoded data and send it to a remote guy.
The remote guy will read the encoded data from the socket.
The remote guy will write it to a file (can we avoid decoding?).
Then the remote guy should be able to play the file using ffplay or any other player.
Now I will take a specific example.
Say I have a file small.webm, containing VP8 and OPUS frames.
I am reading only audio frames (OPUS) using the av_read_frame api (then I check the stream index and filter audio frames only).
So now I have the encoded data buffer as packet.data and the encoded data size as packet.size (please correct me if I am wrong).
Here is my first doubt: the audio packet size is not the same every time. Why the difference? Sometimes the packet size is as low as 54 bytes and sometimes it is 420 bytes. For OPUS, will the frame size vary from time to time?
Next, say I somehow extract a single frame (I really do not know how to extract a single frame) from the packet and send it to the remote guy.
Now the remote guy needs to write the buffer to a file. To write the file we can use the av_interleaved_write_frame or av_write_frame api. Both of them take an AVPacket as an argument. Now I can have an AVPacket, set its data and size members, and then call the av_write_frame api. But that does not work. The reason may be that one should set other members in the packet, like dts, pts, etc. But I do not have such information to set.
Can somebody help me figure out whether FFmpeg is the right choice, or should I write custom logic like parsing an opus file and getting it frame by frame?
Now the remote guy needs to write the buffer to a file. To write the file we can use the av_interleaved_write_frame or av_write_frame api. Both of them take an AVPacket as an argument. Now I can have an AVPacket, set its data and size members, and then call the av_write_frame api. But that does not work. The reason may be that one should set other members in the packet, like dts, pts, etc. But I do not have such information to set.
Yes, you do. They were in the original packet you received from the demuxer on the sender's side. You need to serialize all of the information in this packet and set each value accordingly on the receiver.
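A minimal sketch of what "serialize and restore" could look like, assuming hypothetical send_all()/recv_all() socket helpers, a hypothetical wire format, and a receiver-side muxer (ofmt) that already has a single OPUS audio stream set up; the sender's stream time base also has to be transmitted somehow and is passed in here as in_tb:

#include <libavformat/avformat.h>

/* What travels over the socket for each packet (hypothetical wire format). */
struct wire_header {
    int64_t pts, dts, duration;
    int32_t size, flags;
};

/* Sender: called for every audio packet returned by av_read_frame(). */
static void send_packet(int sock, const AVPacket *pkt)
{
    struct wire_header h = {
        .pts = pkt->pts, .dts = pkt->dts, .duration = pkt->duration,
        .size = pkt->size, .flags = pkt->flags,
    };
    send_all(sock, &h, sizeof(h));        /* hypothetical socket helper */
    send_all(sock, pkt->data, pkt->size);
}

/* Receiver: rebuild an AVPacket and hand it to the muxer. */
static int receive_and_write(int sock, AVFormatContext *ofmt, AVRational in_tb)
{
    struct wire_header h;
    recv_all(sock, &h, sizeof(h));        /* hypothetical socket helper */

    AVPacket *pkt = av_packet_alloc();
    av_new_packet(pkt, h.size);
    recv_all(sock, pkt->data, h.size);

    pkt->pts = h.pts;
    pkt->dts = h.dts;
    pkt->duration = h.duration;
    pkt->flags = h.flags;
    pkt->stream_index = 0;                /* the single audio stream */

    /* Timestamps must be rescaled from the sender's time base to the
       output stream's time base before writing. */
    av_packet_rescale_ts(pkt, in_tb, ofmt->streams[0]->time_base);

    int ret = av_interleaved_write_frame(ofmt, pkt);
    av_packet_free(&pkt);
    return ret;
}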
I am developing a player based on ffmpeg.
Now I am trying to decode HLS video. The stream has several programs (AVProgram), separated by quality. I want to select one specific program with the desired quality, but ffmpeg reads packets from all programs (all streams).
How can I tell ffmpeg which streams to read?
Solved by using the discard field in the AVStream structure:
_stream->discard = AVDISCARD_ALL;
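For example, a rough sketch (the format context fmt and the wanted program index are assumptions here) would mark every stream as discarded and then re-enable only the streams belonging to the chosen program:

/* Discard everything by default... */
for (unsigned i = 0; i < fmt->nb_streams; i++)
    fmt->streams[i]->discard = AVDISCARD_ALL;

/* ...then keep only the streams of the program we want. */
AVProgram *prog = fmt->programs[wanted_program];
for (unsigned i = 0; i < prog->nb_stream_indexes; i++)
    fmt->streams[prog->stream_index[i]]->discard = AVDISCARD_DEFAULT;

/* Packets for discarded streams are now skipped by av_read_frame(). */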
I want to stream video (no audio) from a server to a client. I will encode the video using libx264 and decode it with ffmpeg. I plan to use fixed settings (at the very least they will be known in advance by both the client and the server). I was wondering if I can avoid wrapping the compressed video in a container format (like mp4 or mkv).
Right now I am able to encode my frames using x264_encoder_encode. I get a compressed frame back, and I can do that for every frame. What extra information (if anything at all) do I need to send to the client so that ffmpeg can decode the compressed frames, and more importantly how can I obtain it with libx264? I assume I may need to generate NAL information (x264_nal_encode?). Having an idea of what is the minimum necessary to get the video across, and how to put the pieces together, would be really helpful.
I found out that the minimum amount of information needed is the NAL units from each frame; this gives me a raw H.264 stream. If I write this to a file, I can watch it using VLC after adding a .h264 extension.
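Roughly, that dumping step looks like the sketch below (the encoder handle and input picture are assumed to be configured already; error handling omitted):

#include <stdio.h>
#include <x264.h>

/* Encode one picture and append the resulting NAL units to a raw
   Annex-B .h264 file. x264 already puts start codes in p_payload, so
   concatenating the payloads yields a stream VLC/ffplay can open. */
static void encode_and_dump(x264_t *enc, x264_picture_t *pic_in, FILE *out)
{
    x264_nal_t *nals;
    int n_nals;
    x264_picture_t pic_out;

    int frame_size = x264_encoder_encode(enc, &nals, &n_nals, pic_in, &pic_out);
    if (frame_size > 0) {
        for (int i = 0; i < n_nals; i++)
            fwrite(nals[i].p_payload, 1, nals[i].i_payload, out);
    }
    /* If b_repeat_headers is enabled, SPS/PPS are re-emitted with each
       keyframe; otherwise, write the x264_encoder_headers() output at the
       start of the file so the stream is decodable on its own. */
}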
I can also open such a file using ffmpeg, but if I want to stream it, then it makes more sense to use RTSP, and a good open source library for that is Live555: http://www.live555.com/liveMedia/
In their FAQ they mention how to send the output from your encoder to live555, and there is source code for both a client and a server. I have yet to finish coding this, but it seems like a reasonable solution.
Problem:
I have to save live video stream data which comes as RTP packets from an RTSP server.
The data comes in two formats: MPEG4 and H264.
I do not want to encode/decode the input stream.
I just want to write it to a file which is playable with the proper codecs.
Any advice?
Best Wishes
History:
My Solutions and their problems:
First attempt: FFmpeg
I used the FFmpeg library to get the audio and video RTP packets.
But in order to write the packets I have to use av_write_frame,
which seems to mean that decoding/encoding takes place.
Also, when I give the output format as mp4 (av_guess_format("mp4", NULL, NULL)),
the output file is unplayable.
[Anyway, ffmpeg has bad documentation; it is hard to find out what is wrong.]
Second attempt: DirectShow
Then I decided to use DirectShow. I found an RTSP source filter,
then a mux and a file writer.
I created a single graph:
RTSP Source --> MPEG MUX ---> File Writer
It worked, but the problem is that the output file is not playable
if the graph is not stopped. If something happens and the graph crashes, for example,
the output file is not playable.
Also, I am able to write H264 data, but the video is completely unplayable.
The MP4 file format has an index that is required for correct playback, and the index can only be created once you've finished recording. So any solution using MP4 container files (and other indexed files) is going to suffer from the same problem. You need to stop the recording to finalise the file, or it will not be playable.
One solution that might help is to break the graph up into two parts, so that you can keep recording to a new file while stopping the current one. There's an example of this at www.gdcl.co.uk/gmfbridge.
If you try the GDCL MP4 multiplexor, and you are having problems with H264 streams, see the related question GDCL Mpeg-4 Multiplexor Problem
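On the FFmpeg side of the original attempt the same point applies: the packets can be copied into the MP4 without any decoding or encoding, but the file only becomes playable once the trailer (the moov index) has been written. A condensed sketch, assuming input and output format contexts ifmt and ofmt whose streams have already been set up to mirror each other:

/* Stream copy: av_interleaved_write_frame() does not decode or encode. */
avformat_write_header(ofmt, NULL);

AVPacket pkt;
while (av_read_frame(ifmt, &pkt) >= 0) {
    AVStream *ist = ifmt->streams[pkt.stream_index];
    AVStream *ost = ofmt->streams[pkt.stream_index];

    /* Rescale timestamps into the output stream's time base. */
    av_packet_rescale_ts(&pkt, ist->time_base, ost->time_base);
    av_interleaved_write_frame(ofmt, &pkt);
    av_packet_unref(&pkt);
}

/* Without this call the moov atom (the index) is never written and the
   resulting .mp4 is exactly the unplayable file described above. */
av_write_trailer(ofmt);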