How to make Gstreamer return only keyframes? - ffmpeg

In gstreamer pipeline, I'm trying to figure out if there's a way to specify that I only want key frames from a RTSP stream.
In ffmpeg you can do this with the -skip_frame nokey flag, e.g.:
ffmpeg -skip_frame nokey -i rtsp://184.72.239.149/vod/mp4:BigBuckBunny_175k.mov -qscale 0 -r 1/1 frame%03d.jpg
The corresponding gstreamer command to read the RTSP feed looks like this:
gst-launch-1.0 rtspsrc location=rtsp://184.72.239.149/vod/mp4:BigBuckBunny_175k.mov ! decodebin ! videorate ! "video/x-raw,framerate=1/1" ! videoconvert ! autovideosink
Does anyone know if it is possible to ask gstreamer to only return keyframes?

I think you could try adding a GST_PAD_PROBE_TYPE_BUFFER pad probe and returning GST_PAD_PROBE_DROP for buffers that have the GST_BUFFER_FLAG_DELTA_UNIT flag set.
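For example, a rough sketch of that idea in Python (not the poster's code; it assumes an H264 RTSP stream so the probe can sit on the parser's src pad, before the decoder):
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst, GLib

Gst.init(None)

def drop_delta_units(pad, info):
    # Drop every non-key frame (delta unit); only keyframes reach the decoder.
    buf = info.get_buffer()
    if buf.has_flags(Gst.BufferFlags.DELTA_UNIT):
        return Gst.PadProbeReturn.DROP
    return Gst.PadProbeReturn.OK

pipeline = Gst.parse_launch(
    'rtspsrc location=rtsp://184.72.239.149/vod/mp4:BigBuckBunny_175k.mov '
    '! rtph264depay ! h264parse name=parser ! avdec_h264 '
    '! videoconvert ! autovideosink')
parser = pipeline.get_by_name('parser')
parser.get_static_pad('src').add_probe(Gst.PadProbeType.BUFFER, drop_delta_units)
pipeline.set_state(Gst.State.PLAYING)
GLib.MainLoop().run()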

After spending days looking for a complete answer to this question, I eventually ended up with a solution that gave me the rtsp processing boost I was looking for.
Here is the diff of a pipeline in Python that transitioned from processing every RTSP frame to only processing key frames.
https://github.com/ambianic/ambianic-edge/pull/171/files#diff-f89415777c559bba294250e788230c5e
First register for the stream start bus event:
Gst.MessageType.STREAM_START
This is triggered when the stream processing starts. When this event occurs, request seek to the next keyframe.
When the request completes, the pipeline triggers the next bus event we need to listen for:
Gst.MessageType.ASYNC_DONE
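A minimal sketch of that bus wiring (the handler name _on_gst_bus_message is made up here; the seek call is the method shown below):
def _on_gst_bus_message(self, bus, message):
    # connected during setup with:
    #   bus = self.gst_pipeline.get_bus()
    #   bus.add_signal_watch()
    #   bus.connect('message', self._on_gst_bus_message)
    if message.type == Gst.MessageType.STREAM_START:
        # stream processing has started: jump to the next keyframe
        self._gst_seek_next_keyframe()
    elif message.type == Gst.MessageType.ASYNC_DONE:
        # the previous keyframe seek finished: request the next one
        self._gst_seek_next_keyframe()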
Finally, here is the keyframe seek request itself:
def _gst_seek_next_keyframe(self):
    found, pos_int = self.gst_pipeline.query_position(Gst.Format.TIME)
    if not found:
        log.warning('Gst current pipeline position not found.')
        return
    rate = 1.0  # keep rate close to real time
    flags = \
        Gst.SeekFlags.FLUSH | Gst.SeekFlags.KEY_UNIT | \
        Gst.SeekFlags.TRICKMODE | Gst.SeekFlags.SNAP_AFTER | \
        Gst.SeekFlags.TRICKMODE_KEY_UNITS | \
        Gst.SeekFlags.TRICKMODE_NO_AUDIO
    is_event_handled = self.gst_pipeline.seek(
        rate,
        Gst.Format.TIME,
        flags,
        Gst.SeekType.SET, pos_int,
        Gst.SeekType.END, 0)

You can send a new seek event (gst_event_new_seek) with the GstSeekFlags for trick mode (GST_SEEK_FLAG_TRICKMODE), frame skipping (GST_SEEK_FLAG_SKIP) and keyframes only (GST_SEEK_FLAG_TRICKMODE_KEY_UNITS).
You can also use identity and its drop-buffer-flags property to filter out buffers with GST_BUFFER_FLAG_DELTA_UNIT and maybe GST_BUFFER_FLAG_DROPPABLE set.
See trick modes, seeking and GstSeekFlags in the documentation for the seek approach, and identity:drop-buffer-flags and GstBufferFlags for the identity approach.
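For example, a sketch of the identity approach (the drop-buffer-flags property needs a fairly recent GStreamer release, and the H264 elements here are just an assumption about the stream):
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst, GLib

Gst.init(None)

# identity drops any buffer carrying the listed flags, so only
# keyframes (buffers without the delta-unit flag) reach the decoder.
pipeline = Gst.parse_launch(
    'rtspsrc location=rtsp://184.72.239.149/vod/mp4:BigBuckBunny_175k.mov '
    '! rtph264depay ! h264parse '
    '! identity drop-buffer-flags=delta-unit '
    '! avdec_h264 ! videoconvert ! autovideosink')
pipeline.set_state(Gst.State.PLAYING)
GLib.MainLoop().run()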

Related

How to live stream from multiple webcams

I have five webcams and I want to live stream their content to an m3u8 playlist (HLS stream), so I can use an HTML web player to play it.
My current setup:
I have five systems each has a webcam connected to it, so I am using RTSP to stream data from the system to AWS.
./ffmpeg -f avfoundation -s 640x480 -r 30 -i "0" -f rtsp rtsp://awsurl.com:10000/cam1
./ffmpeg -f avfoundation -s 640x480 -r 30 -i "0" -f rtsp rtsp://awsurl.com:10000/cam2
....
./ffmpeg -f avfoundation -s 640x480 -r 30 -i "0" -f rtsp rtsp://awsurl.com:10000/cam5
On the cloud, I want to set up a server. I Googled and learned about GStreamer, with which I can set up an RTSP server. The command below has an error. (I can't figure out how to set up one server for multiple webcam streams)
gst-launch-1.0 udpsrc port=10000 ! rtph264depay ! h264parse ! video/x-h264,stream-format=avc ! \
mpegtsmux ! hlssink target-duration=2 location="output_%05d.ts" \
playlist-root=http://localhost:8080/hls/stream/ playlists-max=3
My question is how I can set up the RTSP server to differentiate between multiple webcam streams using one server (or do I have to create a server for each webcam stream)?
This might not be a canonical answer, as there are no details about the camera streams, the OS and your programming language, but you may try the following:
1. Install prerequisites
You would need the gstrtspserver library (and maybe the gstreamer dev packages as well if you want to try from C++).
Assuming a Linux Ubuntu host, you would use:
sudo apt-get install libgstrtspserver-1.0 libgstreamer1.0-dev
2. Get information about the received streams
You may use various tools for that, with gstreamer you may use:
gst-discoverer-1.0 rtsp://awsurl.com:10000/cam1
For example, if you see:
Topology:
unknown: application/x-rtp
video: H.264 (Constrained Baseline Profile)
Then it is H264-encoded video sent over RTP (hence RTPH264).
You would get more details by adding the verbose flag (-v).
If you want your RTSP server to stream with H264 encoding and the incoming stream is also H264, then you would just forward it.
If the received stream has a different encoding than what you want to serve, then you would have to decode the video and re-encode it.
3. Run the server:
This Python script would run an RTSP server, streaming 2 cams with H264 encoding (expanding to 5 should be straightforward).
Assuming the first cam is H264-encoded, its stream is just forwarded. For the second camera, the stream is decoded and re-encoded into H264 video.
In the latter case, it is difficult to give a canonical answer, because the decoder and encoder plugins depend on your platform. Some also use a special memory space (NVMM for Nvidia, D3D11 for Windows, ...); in such a case you may have to copy to system memory for encoding with x264enc, or better, use another encoder that uses the same memory space as its input.
import gi
gi.require_version('Gst','1.0')
gi.require_version('GstVideo','1.0')
gi.require_version('GstRtspServer','1.0')
from gi.repository import GObject, GLib, Gst, GstVideo, GstRtspServer
Gst.init(None)
mainloop = GLib.MainLoop()
server = GstRtspServer.RTSPServer()
mounts = server.get_mount_points()
factory1 = GstRtspServer.RTSPMediaFactory()
factory1.set_launch('( rtspsrc location=rtsp://awsurl.com:10000/cam1 latency=500 ! rtph264depay ! h264parse ! rtph264pay name=pay0 pt=96 )')
mounts.add_factory("/cam1", factory1)
factory2 = GstRtspServer.RTSPMediaFactory()
factory2.set_launch('( uridecodebin uri=rtsp://awsurl.com:10000/cam2 source::latency=500 ! queue ! x264enc key-int-max=15 insert-vui=1 ! h264parse ! rtph264pay name=pay0 pt=96 )')
mounts.add_factory("/cam2", factory2)
server.attach(None)
print ("stream ready at rtsp://127.0.0.1:8554/{cam1,cam2,...}")
mainloop.run()
If you want to use C++ instead of Python, check out the test-launch sample for your GStreamer version (you can get the version with gst-launch-1.0 --version), which is similar to this script, and adapt it.
4. Test
Note that it may take a few seconds to start before displaying.
gst-play-1.0 rtsp://[Your AWS IP]:8554/cam1
gst-play-1.0 rtsp://[Your AWS IP]:8554/cam2
I have no experience with AWS; be sure that no firewall blocks UDP/8554.
rtsp-simple-server might be a good choice for presenting and broadcasting live streams through various formats/protocols such as HLS over HTTP.
Even on a meager, old configuration, it has still provided me with decent latency.
If you are looking for reduced latency, you might be surprised by cam2ip. Unfortunately this isn't HLS, it's actually MJPEG, and thus without sound, but with far better latency.

How to get the centroid value of an audio file using FFmpeg aspectralstats

I'm new to FFMPEG and I'm having a really hard time understanding the documentation: https://ffmpeg.org/ffmpeg-all.html#aspectralstats
I want to get the centroid value for an audio file using command line.
ffmpeg -i file.wav -af aspectralstats=measure=centroid -f null -
I get the following errors
[Parsed_aspectralstats_0 # 000002a19b1b9380] Option 'measure' not found
[AVFilterGraph # 000002a19b1c99c0] Error initializing filter 'aspectralstats' with args 'measure=centroid'
Error reinitializing filters!
Failed to inject frame into filter network: Option not found
Error while processing the decoded data for stream #0:0
Conversion failed!
What am I doing wrong?
The measure option was added a mere 4 weeks ago, so, yeah, you probably missed it by a couple of days. Grab the latest snapshot if you want to retrieve only the centroids. The snapshot you have should get you the centroids along with the other parameters if you just call aspectralstats (no options).
Also, the aspectralstats output only goes to the frame metadata and is not printed to stdout by default, so you need to append ametadata=print:file=- to your -af.
ffmpeg -i file.wav -af aspectralstats=measure=centroid,ametadata=print:file=- -f null -
<Shameless plug> FYI, if you're calling it from Python, I have implemented an interface for this in ffmpegio, if you're interested.

Realtime Muxing of videos

My problem basically comes from having 2 different streams for video playback that I have to mux in real time in memory: one for video, and another for audio.
My goal is to create a proxy which can mux 2 different webm streams from their URLs, while supporting range requests (requires knowing the encoded file size). Would this be possible?
This is how I mux the audio and video streams manually using ffmpeg:
ffmpeg -i video.webm -i audio.webm -c copy output.webm
But this requires me to fully download the videos before processing them, which unfortunately I don't want to do.
Thanks in advance!
If you are looking for this to work in Go, you can look into
github.com/at-wat/ebml-go/webm
This provides a BlockWriter interface for writing to a webm file using buffers; you can look at the test file to see how to use it:
https://github.com/at-wat/ebml-go
Check out ffmpeg pipes.
Also, since you have tagged Go, I'm assuming you will use os/exec, in which case also check out Cmd.ExtraFiles. This lets you use additional pipes (files) beyond just the standard 0, 1 and 2.
So let's say you have a stream for video and one for audio, piped to 3 and 4 respectively. The ffmpeg bit of your command becomes:
ffmpeg -i pipe:3 -i pipe:4 -c copy output.webm
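The answer above is Go-oriented, but the same extra-pipe idea can be sketched in Python with subprocess and pass_fds (the local file names here just stand in for your two stream sources):
import os
import subprocess
import threading

def feed(path, write_fd):
    # Copy bytes into the pipe; in a real proxy this would read from the stream URL.
    with open(path, 'rb') as src, os.fdopen(write_fd, 'wb') as dst:
        for chunk in iter(lambda: src.read(64 * 1024), b''):
            dst.write(chunk)

video_r, video_w = os.pipe()
audio_r, audio_w = os.pipe()

# The child inherits the read ends; ffmpeg addresses them as pipe:<fd>.
proc = subprocess.Popen(
    ['ffmpeg', '-i', f'pipe:{video_r}', '-i', f'pipe:{audio_r}',
     '-c', 'copy', 'output.webm'],
    pass_fds=(video_r, audio_r))
os.close(video_r)
os.close(audio_r)

threading.Thread(target=feed, args=('video.webm', video_w)).start()
threading.Thread(target=feed, args=('audio.webm', audio_w)).start()
proc.wait()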

How to extract a single frame from a video stream using gstreamer / closing the stream

I need to take one frame from a video stream from a web camera and write it to a file.
In ffmpeg I could do it this way:
ffmpeg -i rtsp://10.6.101.40:554/video.3gp -t 1 img.png
My GStreamer command:
gst-launch-1.0 rtspsrc location="rtsp://10.6.101.40:554/video.3gp" is_live=true ! decodebin ! jpegenc ! filesink location=img.jpg
The problem is that the gstreamer process keeps running and does not end. How can I take only one frame and force the stream to close after the file is written?
Is it possible to do this from the command line, or should I code it in C/Python etc.?
Thanks a lot.
I was able to do this with:
! jpegenc snapshot=TRUE
See jpegenc - snapshot.
but my source is different so your mileage may vary.
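If you do end up scripting it in Python rather than the command line, a minimal sketch built around that snapshot property could look like this (using the RTSP URL from the question):
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

Gst.init(None)

pipeline = Gst.parse_launch(
    'rtspsrc location=rtsp://10.6.101.40:554/video.3gp '
    '! decodebin ! videoconvert ! jpegenc snapshot=true '
    '! filesink location=img.jpg')
pipeline.set_state(Gst.State.PLAYING)

# jpegenc with snapshot=true sends EOS after the first encoded frame,
# so block until EOS (or an error) and then shut the pipeline down.
bus = pipeline.get_bus()
bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE,
                       Gst.MessageType.EOS | Gst.MessageType.ERROR)
pipeline.set_state(Gst.State.NULL)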
Try using the "number of buffers" property for the queue element and restrict it to 1. This will hopefully give you a single frame.

DirectShow Capture Source and FFMPEG

I have an AJA Capture card. The drivers installed with the card include some DirectShow filter. If I pop the filter into GraphEdit I see this:
and if I run the ffmpeg command
ffmpeg -f dshow -list_options true -i video="AJA Capture Source"
I see
[dshow # 0034eec0] DirectShow video device options
[dshow # 0034eec0] Pin "Video"
[dshow # 0034eec0] pixel_format=yuyv422 min s=720x486 fps=27.2604 max s=1024x486 fps=29.985
...
[dshow # 0034eec0] Pin "Audio 1-2"
[dshow # 0034eec0] Pin "Line21"
video=AJA Capture Source: Immediate exit requested
So I see the Video and Audio pins I need. But when I try to run an ffmpeg command to capture both, I can only figure out how to do the video part. How do I hook into that audio pin? It seems all the examples and documentation point to using a separate audio device, and say nothing about hooking into the pins. I'm running it out of a batch file for now, like this, using ^ to break the lines:
ffmpeg.exe ^
-y ^
-rtbufsize 100M ^
-f dshow ^
-i video="AJA Capture Source" ^
-t 00:00:10 ^
-aspect 16:9 ^
-c:v libx264 ^
"C:\VCS_AUD_SAMPLE.mp4"
Again, the command above will get me some beautiful video, but I can't figure out the audio part. Is this even supported in ffmpeg or am I going to have to modify the ffmpeg dshow code?
I am the developer of this filter.
Actually the same device is used for both the audio and video streams. Moreover, the data for both streams is the result of one function call. Splitting into separate audio and video filters, as other cards do (for example, DeckLink), is artificial (they must be internally connected). A possible reason for the split is an attempt to simplify the graph; however, this can lead to other problems (using streams from different devices).
Why ffmpeg can't work with pins of the same filter is not clear to me. That is a problem for the ffmpeg developers.
Regarding the single-instance access: a very old version of the AJA Capture Source filter is being used. A more recent version of the filter allows you to create multiple instances simultaneously (but only one instance may be in the "Play" state). Please check the AJA site to download the latest versions of the filters. If you would like to try the latest beta versions of the AJA filters, please write to me at support#avobjects.com
So after tracing through the FFmpeg source code, it was determined that it could not hook up to multiple pins on a dshow source, so instead of modifying the FFmpeg source, we piped the AJA source pins through two virtual capture sources to achieve the desired result.
OK, support for this was (hopefully) added recently in FFmpeg dshow: you can specify ffmpeg -f dshow -i video="AJA Capture Source":audio="AJA Capture Source" now and it works.
There are even new parameters for selecting which pin you want to use, if you need them: https://www.ffmpeg.org/ffmpeg-devices.html#dshow
If it doesn't work for somebody/anybody, please let me know at rogerdpack#gmail.com or comment here.
From http://ffmpeg.org/trac/ffmpeg/wiki/DirectShow:
Also note that "The input string is in the format video=<video device name>:audio=<audio device name>".
So try
ffmpeg.exe -f dshow -i "video=AJA Capture Source:audio=audio source name"
