Extract frames as images from an RTMP stream in real-time

I am streaming short videos (4 or 5 seconds) encoded in H.264 at 15 fps in VGA quality from different clients to a server using RTMP, which produces an FLV file. I need to analyse the frames from the video as images as soon as possible, so I need the frames to be written as PNG images as they are received.
Currently I use Wowza to receive the streams, and I have tried using the transcoder API to access the individual frames and write them to PNGs. This partially works, but there is about a one-second delay before the transcoder starts processing, and when the stream ends Wowza flushes its buffers, so the last second is not transcoded and I can lose the last 25% of the video frames. I have tried to find a workaround, but Wowza say that it is not possible to prevent the buffers from being flushed. It is also not an ideal solution because I have to re-encode the video when using the transcoder, which is computationally expensive and unnecessary for my needs.
I have also tried piping a video in real-time to FFmpeg and getting it to produce the PNG images but unfortunately it waits until it receives the entire video before producing the PNG frames.
How can I extract all of the frames from the stream as close to real-time as possible? I don’t mind what language or technology is used as long as it can run on a Linux server. I would be happy to use FFmpeg if I can find a way to get it to write the images while it is still receiving the video or even Wowza if I can find a way not to lose frames and not to re-encode.
Thanks for any help or suggestions.

Since you linked this question from the red5 user list, I'll add my two cents. You can certainly grab the video frames on the server side, but the issue you'll run into is transcoding from H.264 into PNG. The easiest way would be to use ffmpeg / avconv after getting the VideoData object. Here is a post that gives some details about getting the VideoData: http://red5.5842.n7.nabble.com/Snapshot-Image-from-VideoData-td44603.html
Another option is on the player side using one of Dan Rossi's FlowPlayer plugins: http://flowplayer.electroteque.org/snapshot

I finally found a way to do this with FFmpeg. The trick was to disable audio, use a different flv meta data analyser and to reduce the duration that FFmpeg waits for before processing. My FFmpeg command now starts like this:
ffmpeg -an -flv_metadata 1 -analyzeduration 1 ...
This starts producing frames within a second of receiving input from a pipe, so it writes the streamed frames pretty close to real-time.
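As a rough illustration of how the flags above might fit into a complete pipeline, the Python sketch below assembles an FFmpeg argument list and can start the process with its standard input acting as the pipe. Only `-an`, `-flv_metadata 1` and `-analyzeduration 1` come from the command above. The `-f flv -i pipe:0` input, the `frames/frame_%05d.png` output pattern and both helper names are assumptions made for the example, since the rest of the original command is not shown.

```python
import subprocess


def build_png_extractor_args(output_pattern="frames/frame_%05d.png"):
    """Return an FFmpeg argument list that reads FLV from stdin and writes PNGs.

    The first three options are the ones quoted in the answer; everything
    after them is an assumed completion, not the author's actual command.
    """
    return [
        "ffmpeg",
        "-an",                    # ignore audio
        "-flv_metadata", "1",     # option quoted in the answer
        "-analyzeduration", "1",  # probe the input for as short a time as possible
        "-f", "flv",              # assumption: the piped data is FLV
        "-i", "pipe:0",           # assumption: read the stream from stdin
        output_pattern,           # assumption: numbered PNG files
    ]


def start_png_extractor(output_pattern="frames/frame_%05d.png"):
    """Start FFmpeg; the caller writes incoming FLV bytes to the returned .stdin."""
    return subprocess.Popen(
        build_png_extractor_args(output_pattern),
        stdin=subprocess.PIPE,
    )
```

With this layout the receiving code would write each chunk of the stream to `process.stdin` as it arrives and close it when the stream ends. The output directory has to exist before FFmpeg starts.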

Related

Concatenating Smooth Streaming output to a single MP4 file - problems with A/V sync. What is CodecPrivateData?

I have a video in fragmented form which is an output of an Azure Media Services Live Event (Smooth Streaming).
I'm trying to concatenate the segments to get a single MP4 file, but I've run into an A/V sync problem: no matter what I do (time-shifting, speeding up, slowing down, using FFmpeg filters), the audio delay keeps drifting. To get the output MP4 file, I tried concatenating the segments for the video and audio streams (both at OS file level and with FFmpeg) and then muxing with FFmpeg.
I've tried everything I found on the web and I always end up with exactly the same result. Importantly, when I play the source from the manifest file, everything is fine. That made me skim through the manifest once again, and I realized there's a CodecPrivateData value which I'm not using anywhere in the process. What is it? Could it somehow help solve my problem?

Mystery solved: the manifest file contains the list of stream discontinuities, which need to be taken into account when concatenating the streams.
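One way to locate such discontinuities is sketched below in Python. It assumes the usual Smooth Streaming client manifest layout, in which each `StreamIndex` element holds `c` chunk elements with a `d` (duration) attribute and an optional `t` (start time) attribute; a chunk whose explicit `t` does not equal the previous chunk's start plus duration marks a gap. The function name and the sample manifest are made up for illustration, and the sketch assumes every chunk carries `d`.

```python
import xml.etree.ElementTree as ET


def find_discontinuities(manifest_xml, stream_type):
    """Return (expected_start, actual_start) pairs where the timeline jumps.

    Assumes every <c> element has a d attribute; t is optional and, when
    absent, the chunk is taken to start where the previous one ended.
    """
    root = ET.fromstring(manifest_xml)
    gaps = []
    for stream in root.findall("StreamIndex"):
        if stream.get("Type") != stream_type:
            continue
        expected = None  # start time the next chunk should have
        for chunk in stream.findall("c"):
            t = chunk.get("t")
            if t is not None:
                start = int(t)
                if expected is not None and start != expected:
                    gaps.append((expected, start))
            else:
                start = expected if expected is not None else 0
            expected = start + int(chunk.get("d"))
    return gaps


# Hypothetical manifest fragment: the third video chunk starts 20 ticks late.
SAMPLE = """
<SmoothStreamingMedia TimeScale="10000000">
  <StreamIndex Type="video">
    <c t="0" d="20"/>
    <c d="20"/>
    <c t="60" d="20"/>
  </StreamIndex>
  <StreamIndex Type="audio">
    <c t="0" d="20"/>
    <c d="20"/>
  </StreamIndex>
</SmoothStreamingMedia>
"""

print(find_discontinuities(SAMPLE, "video"))
```

Each reported gap, divided by the manifest's time scale, is the amount by which the corresponding stream would need to be padded or shifted at that point when concatenating.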

Transcoding a Fast Video (think Snapchat, Instagram)

I am very new to the video world, but I have noticed that social media services, particularly Snapchat and Instagram, do a great job of getting videos to load fast even on poorer connections. I know some of this is down to how the videos are transcoded.
I have gathered some presets I think I should be using when transcoding with ffmpeg, but I am not sure about the formats or the other settings. I would love to hear what people think!
ffmpeg()
.input(remoteReadStream)
.outputOptions('-preset fast')
.outputOptions('-movflags +faststart')
Other than that, I am not entirely sure what else to add.

If you want the video to start fast, you must ensure that the first frame is a key frame. Use ffmpeg's -force_key_frames '00:00:00.000' parameter for this.
But in fact the main method for fast video response on poor connections is adaptive bitrate streaming (https://en.m.wikipedia.org/wiki/Adaptive_bitrate_streaming). It selects the video source with a bitrate appropriate for the user's bandwidth. So you need to encode your video in different sizes with different qualities and bitrates and assemble them in a special playlist for adaptive streaming.
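The selection step can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular player implements it: the function name, the example bitrate ladder and the 0.8 safety factor are all assumptions.

```python
def select_rendition(bitrates_bps, bandwidth_bps, safety=0.8):
    """Pick the highest bitrate that fits within a fraction of the bandwidth.

    Falls back to the lowest rendition when nothing fits, so playback can
    still start on a very poor connection.
    """
    budget = bandwidth_bps * safety
    affordable = [b for b in bitrates_bps if b <= budget]
    return max(affordable) if affordable else min(bitrates_bps)


# Hypothetical ladder: the same video encoded at three bitrates.
LADDER = [400_000, 1_200_000, 3_000_000]

print(select_rendition(LADDER, 2_000_000))
```

A real player repeats this decision throughout playback as its bandwidth estimate changes, which is why the renditions are split into short segments listed in a playlist.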

Convert m3u8 (HLS) to mpd (MPEG-DASH)

I have a live HLS stream [https://82-80-192-30.vidnt.com/ipbc_IPBCchannel11LVMRepeat/_definst_/IPBCchannel11LVM_3.stream/playlist.m3u8] and I want to convert it to MPEG-DASH.
What is the best practice?
The stream is already H.264/AAC, so I understand that I do not need to re-encode; I just need to transmux.
What should I use?
ffmpeg? mp4box?
Notes:
I used nginx-rtmp-module (https://github.com/ut0mt8/nginx-rtmp-module/) in order to create DASH from RTMP stream according to this tutorial: https://isrv.pw/html5-live-streaming-with-mpeg-dash
But nginx-rtmp-module can get as input just rtmp streams and it did not work for me with HLS stream.
I used ffmpeg in order to create dash from m3u8 as following:
ffmpeg -i https://82-80-192-30.vidnt.com/ipbc_IPBCchannel11LVMRepeat/_definst_/IPBCchannel11LVM_3.stream/playlist.m3u8 -strict -2 -min_seg_duration 2000 -window_size 5 -extra_window_size 5 -use_template 1 -use_timeline 1 -f dash out.mpd
But this is very limited. I can't control the segment duration.
The min_seg_duration parameter of ffmpeg does not work very well for me, and also it can set the minimum duration while I want to limit the maximum duration of each segment (the segment comes out with ~10 seconds, while I need it to be ~2-4 seconds as I'm playing live).

Firstly it is worth saying that if you can avoid doing this you will be saving yourself a whole lot of work!
Most devices and clients these days can play both HLS and DASH streams, so the usual approach is to add any extra functionality needed in your app or client.
If you do have to convert server side, then it's worth being aware that while HLS streams typically used TS segments in the past, support for fragmented MP4 has more recently become available within the HLS ecosystem.
If you have TS video streams then you will need to do a conversion along the lines you outline above with ffmpeg.
If you have fragmented MP4 then you should actually have the correct format already and may find you just have to create the manifest file so DASH can access the fragmented mp4 streams.
All the above assumes that your content is not encrypted or that you don't have to support encryption. If it is, then you may not be able to convert the media, or you may have to encrypt the media differently for some streams than others, as currently most deployed Windows and Chrome devices and browsers use a slightly different encryption approach (a different AES mode) than Apple devices.
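For the TS case, a transmux without re-encoding might look like the Python sketch below, which only builds the FFmpeg argument list. It rests on a few assumptions: the URL and helper name are placeholders, `-c copy` is used to avoid re-encoding, and the FFmpeg build is recent enough to offer the DASH muxer's `-seg_duration` option (in seconds), which replaces the deprecated `-min_seg_duration` (in microseconds). With stream copy, a segment can only begin at a key frame, so the key frame spacing of the source limits how short the segments can actually get.

```python
def build_hls_to_dash_args(hls_url, seg_seconds=2, out_path="out.mpd"):
    """Return an FFmpeg argument list that repackages HLS as DASH.

    A sketch under the assumptions stated in the text above; the option
    names are those of FFmpeg's dash muxer.
    """
    return [
        "ffmpeg",
        "-i", hls_url,
        "-c", "copy",                      # transmux only, no re-encode
        "-seg_duration", str(seg_seconds),  # requested segment length in seconds
        "-window_size", "5",
        "-extra_window_size", "5",
        "-use_template", "1",
        "-use_timeline", "1",
        "-f", "dash",
        out_path,
    ]


# Placeholder URL, not a real stream.
print(" ".join(build_hls_to_dash_args("https://example.com/live/playlist.m3u8")))
```

If the source has a key frame only every ten seconds or so, getting 2 to 4 second segments would require re-encoding with a shorter key frame interval rather than stream copy.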

What is the minimum amount of metadata needed to stream only video using libx264 to encode at the server and libffmpeg to decode at the client?

I want to stream video (no audio) from a server to a client. I will encode the video using libx264 and decode it with ffmpeg. I plan to use fixed settings (at the very least they will be known in advance by both the client and the server). I was wondering if I can avoid wrapping the compressed video in a container format (like mp4 or mkv).
Right now I am able to encode my frames using x264_encoder_encode. I get a compressed frame back, and I can do that for every frame. What extra information (if anything at all) do I need to send to the client so that ffmpeg can decode the compressed frames, and more importantly how can I obtain it with libx264. I assume I may need to generate NAL information (x264_nal_encode?). Having an idea of what is the minimum necessary to get the video across, and how to put the pieces together would be really helpful.

I found out that the minimum amount of information is the NAL units from each frame; this gives me a raw H.264 stream. If I write this to a file, I can watch it using VLC as long as the file has a .h264 extension.
I can also open such a file using ffmpeg, but if I want to stream it, then it makes more sense to use RTSP, and a good open source library for that is Live555: http://www.live555.com/liveMedia/
In their FAQ they mention how to send the output from your encoder to live555, and there is source for both a client and a server. I have yet to finish coding this, but it seems like a reasonable solution.
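To make the idea of a raw H.264 stream a little more concrete, the Python sketch below splits a byte stream into NAL units and reads each unit's type. It assumes the stream is in Annex B form, where every NAL unit is preceded by a 3- or 4-byte start code (00 00 01 or 00 00 00 01), and it uses the standard type numbers 7 (SPS), 8 (PPS) and 5 (IDR slice). The function names and the sample bytes are invented for the example.

```python
START_CODE = b"\x00\x00\x01"


def split_annexb(data):
    """Split an Annex B byte stream into NAL units (start codes removed)."""
    nals = []
    pos = data.find(START_CODE)
    while pos != -1:
        start = pos + len(START_CODE)
        nxt = data.find(START_CODE, start)
        end = len(data) if nxt == -1 else nxt
        # A NAL unit never ends in a zero byte, so trailing zeros belong to
        # the next 4-byte start code or to padding.
        nal = data[start:end].rstrip(b"\x00")
        if nal:
            nals.append(nal)
        pos = nxt
    return nals


def nal_types(data):
    """Return the nal_unit_type (low 5 bits of the first byte) of each unit."""
    return [nal[0] & 0x1F for nal in split_annexb(data)]


# Made-up stream: SPS (7), PPS (8) and an IDR slice (5) with dummy payloads.
SAMPLE_STREAM = (
    b"\x00\x00\x00\x01\x67\x42"
    b"\x00\x00\x01\x68\xce"
    b"\x00\x00\x00\x01\x65\x88"
)

print(nal_types(SAMPLE_STREAM))
```

A decoder needs the SPS and PPS units before it can decode any slice, so a client that joins mid-stream has to wait for (or be sent) those units followed by an IDR frame.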

Save Live Video Stream To Local Storage

Problem:
I need to save live video stream data that arrives as RTP packets from an RTSP server.
The data comes in two formats: MPEG-4 and H.264.
I do not want to decode or re-encode the input stream.
I just want to write it to a file that is playable with the proper codecs.
Any advice?
Best wishes
History:
My solutions and their problems:
First attempt: FFmpeg
I use the FFmpeg library to get the audio and video RTP packets.
But in order to write the packets I have to use av_write_frame, which seems to mean that decoding/encoding takes place.
Also, when I set the output format to mp4 (av_guess_format("mp4", NULL, NULL)), the output file is unplayable.
[Anyway, FFmpeg's documentation is poor, so it is hard to find out what is wrong.]
Second attempt: DirectShow
Then I decided to use DirectShow. I found an RTSP source filter,
then a mux and a file writer.
I created a single graph:
RTSP Source --> MPEG MUX ---> File Writer
It worked, but the problem is that the output file is not playable if the graph is not stopped. If something goes wrong, for example if the graph crashes, the output file is not playable.
I am also able to write H.264 data, but the video is completely unplayable.

The MP4 file format has an index that is required for correct playback, and the index can only be created once you've finished recording. So any solution using MP4 container files (and other indexed files) is going to suffer from the same problem. You need to stop the recording to finalise the file, or it will not be playable.
One solution that might help is to break the graph up into two parts, so that you can keep recording to a new file while stopping the current one. There's an example of this at www.gdcl.co.uk/gmfbridge.
If you try the GDCL MP4 multiplexor, and you are having problems with H264 streams, see the related question GDCL Mpeg-4 Multiplexor Problem
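Whether a recording was finalised can be checked by looking for the index itself. In an MP4 file the index lives in the top-level `moov` box, and every box starts with a 4-byte big-endian size followed by a 4-byte type. The Python sketch below walks the top-level boxes of a file held in memory; the function names and the synthetic test data are assumptions for illustration, and real files would be read from disk rather than built by hand.

```python
import struct


def top_level_boxes(data):
    """Return the types of the top-level boxes in an MP4 byte string."""
    boxes = []
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack(">I4s", data[offset:offset + 8])
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if offset + 16 > len(data):
                break
            size = struct.unpack(">Q", data[offset + 8:offset + 16])[0]
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the file.
            size = len(data) - offset
        if size < header:
            break  # corrupt header; stop rather than loop forever
        boxes.append(box_type.decode("ascii", errors="replace"))
        offset += size
    return boxes


def has_moov(data):
    """True if the file contains the moov box that holds the index."""
    return "moov" in top_level_boxes(data)


def make_box(box_type, payload=b""):
    """Build a minimal box for demonstration purposes."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic recording that was interrupted before the index was written.
UNFINISHED = make_box(b"ftyp", b"isom\x00\x00\x02\x00") + make_box(b"mdat", b"\x00" * 32)

print(top_level_boxes(UNFINISHED), has_moov(UNFINISHED))
```

A file that stops after `mdat` with no `moov` box is the unplayable case described above: the media data is present, but nothing tells the player where each sample is.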
