I have a 25 FPS RTSP stream coming from an IP camera. I can successfully display the video stream, but when analyzing it with ffmpeg (ffprobe actually), I observe a lower frame rate:
$ ffprobe -rtsp_transport tcp -i rtsp://camera_ip:554/stream -select_streams v:0 -show_frames -show_entries frame=coded_picture_number,pkt_pts_time -of csv=p=0
Stream #0:0: Video: h264 (Main), yuvj420p(pc, bt709, progressive), 640x480, 25 fps, 25 tbr, 90k tbn, 50 tbc
0.400000,0
0.080000,1
0.120000,2
0.200000,3
0.240000,4
0.320000,5
0.360000,6
0.440000,7
0.480000,8
0.560000,9
0.600000,10
0.680000,11
0.720000,12
0.800000,13
0.840000,14
0.920000,15
0.960000,16
1.040000,17
1.080000,18
1.160000,19
1.200000,20
1.280000,21
1.320000,22
1.400000,23
1.440000,24
1.520000,25
1.560000,26
1.640000,27
1.680000,28
1.760000,29
1.800000,30
1.880000,31
1.920000,32
2.000000,33
We can clearly see the 80 ms gap between some of the frames, resulting in a ~16 fps stream.
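The effective rate can be checked numerically from the pkt_pts_time values; a small Python sketch (illustrative only, using an excerpt of the timestamps listed above):

```python
# Measure inter-frame gaps from the pkt_pts_time values reported by
# ffprobe (excerpt of the timestamps listed above).
pts = [0.080, 0.120, 0.200, 0.240, 0.320, 0.360, 0.440, 0.480]

gaps = [round(b - a, 3) for a, b in zip(pts, pts[1:])]
print(gaps)  # alternating 0.04 and 0.08 second gaps

effective_fps = (len(pts) - 1) / (pts[-1] - pts[0])
print(round(effective_fps, 1))  # 17.5 for this excerpt
```

The alternating 40 ms / 80 ms gaps match what rtpjitterbuffer reported, and averaging them gives the reduced rate observed.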
I have observed the same framerate issue with GStreamer (printing information in rtpjitterbuffer indicates the frame gap is sometimes 80 ms and sometimes 40 ms). But the weird thing is that I encountered the same issue with an HDMI-to-RJ45 decoder, and I doubt the same issue comes from two different devices.
I didn't get much more information using -loglevel debug or trace.
Does anybody have an idea about what is going wrong in the stream?
(I used ffprobe 4.2.3 and the latest "2021-05-09-git-8649f5dca6-full_build-www.gyan.dev" build with the same results, and GStreamer 1.16.2 with a pipeline like "urisourcebin ! h264depay ! h264parse ! fakesink".)
EDIT: The camera's frame skipping was caused by the activation of a third stream in its options. I find it really weird that it skips exactly the same frames every second. However, I still haven't found the cause of the reduced rate on my RTSP encoder.
Anyway, this was actually hardware related and not software related.
Cameras, especially cheap ones, skip frames sometimes when they can't keep up with the encoding. The camera, not GStreamer, ffmpeg, or other software, is responsible for the skipped frames. It's more or less normal.
Given this stream from an RTSP camera which produces an H264 stream:
Input #0, rtsp, from 'rtsp://admin:admin#192.168.0.15:554':
Metadata:
title : LIVE555 Streaming Media v2017.10.28
comment : LIVE555 Streaming Media v2017.10.28
Duration: N/A, start: 0.881956, bitrate: N/A
Stream #0:0: Video: h264 (Main), yuv420p(progressive), 1600x900, 25 fps, 25 tbr, 90k tbn, 50 tbc
I want to run ffmpeg and pipe its output to a HTML5 video component with MSE.
Everything is fine and smooth as long I run this ffmpeg command (piping is removed!):
$ ffmpeg -i 'rtsp://admin:admin#192.168.0.15:554' -c:v copy -an -movflags frag_keyframe+empty_moov -f mp4
However, it takes a bit of time to start.
I realized that the function avformat_find_stream_info introduces about 15-20 seconds of delay on my system. Here are the docs.
Now I have also realized that if I add -probesize 32, avformat_find_stream_info returns almost immediately, but it causes some warnings:
$ ffmpeg -probesize 32 -i 'rtsp://admin:admin#192.168.0.15:554' -c:v copy -an -movflags frag_keyframe+empty_moov -f mp4
[rtsp # 0x1b2b300] Stream #0: not enough frames to estimate rate; consider increasing probesize
[rtsp # 0x1b2b300] decoding for stream 0 failed
Input #0, rtsp, from 'rtsp://admin:admin#192.168.0.15:554':
Metadata:
title : LIVE555 Streaming Media v2017.10.28
comment : LIVE555 Streaming Media v2017.10.28
Duration: N/A, start: 0.000000, bitrate: N/A
Stream #0:0: Video: h264 (Main), yuv420p(progressive), 1600x900, 25 tbr, 90k tbn, 50 tbc
If I dump out this stream (into a file, test.mp4), all mediaplayers can play it perfectly.
However, if I pipe this output into the HTML5 video with MSE, the stream is sometimes displayed correctly and sometimes it just isn't. No warnings or error messages are printed in the browser console.
From the second output I can see the fps is missing. I tried to set it manually, but did not succeed (it seems I could not change it manually).
How can I avoid avformat_find_stream_info and have the HTML5 MSE playback if I know everything of the stream beforehand?
Update
Following @szatmary's comments and answer, I searched for an H264 bitstream parser.
This is what I found. I also saved the mp4 file, which is not playable by HTML5 video but does play in VLC, and dropped it into this analyser.
Here is a screenshot of my analysis:
Some facts here:
until #66 there is no type 7 (SPS) unit in the stream.
#62 is the last PPS before the first SPS arrives.
there are a lot of PPS units even before #62.
the bitstream ends at #103.
played in VLC, the stream is 20 seconds long.
I have several things to clear:
do the #62 and #66 SPS/PPS units hold metadata only for the frames that follow, or can they also refer to previous frames?
VLC plays 20 seconds; is it possible that it scans the whole file first, then plays the frames from #1 based on the #62 and #66 info? - if VLC received the file as a stream, it might play only a few seconds (#66 - #103).
most important: what shall I do with the bitstream parser to make the HTML5 video play this data? Shall I drop all the units before #62? Or before #66?
Now I'm really lost in this topic. I have created a video with FFMPEG, but this time I allowed it to finish its avformat_find_stream_info function.
I saved the video with the same method as previously. VLC now plays 18 seconds (this is okay; I have a 1000-frame limit in the ffmpeg command).
However, let's now look at the bitstream information:
Now the PPS and SPS are at #130 and #133 respectively. This resulted in a stream which is 2 seconds shorter than before (I guess).
Now I have learned that even in a correctly parsed h264 stream there can be a lot of units before the first SPS/PPS.
So I would fine-tune my question above: what shall I do with the bitstream parser to make the HTML5 video play this data?
Also, the bitstream parser I found is not good, because it uses a binary wrapper, so it cannot run purely on the client side.
I'm looking at mp4box now.
How can I avoid avformat_find_stream_info and have the HTML5 MSE playback if I know everything of the stream beforehand?
You don't know everything about the stream beforehand. You don't know the resolution, the bitrate, the level, the profile, or the constraint flags. You don't know the scaling list values, the VUI data, or whether CABAC is used.
The player needs all of these things to play the video, and they are not known until the player, or ffmpeg, sees the first SPS/PPS in the stream. By limiting the analyze duration you are telling ffmpeg to give up looking for them, so it can't be guaranteed to produce a valid stream. It may work sometimes and not others, and it largely depends on which frame of the RTSP stream you start on.
A possible solution is to add more keyframes to the source video if you can; this will send the SPS/PPS more frequently. If you don't control the source stream, you must simply wait until an SPS/PPS shows up in the stream.
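The "wait until an SPS/PPS shows up" step can be sketched in Python: scan an Annex B byte stream for start codes and test the NAL unit type. This is a simplified illustration on synthetic data; it checks only 3-byte start codes and ignores emulation-prevention bytes:

```python
def find_sps_pps(stream: bytes):
    """Return byte offsets of the first SPS (type 7) and PPS (type 8)
    NAL units in an Annex B H.264 byte stream."""
    found = {}
    i = 0
    while True:
        i = stream.find(b"\x00\x00\x01", i)
        if i < 0 or i + 3 >= len(stream):
            break
        nal_type = stream[i + 3] & 0x1F   # low 5 bits of the NAL header
        if nal_type in (7, 8) and nal_type not in found:
            found[nal_type] = i
        i += 3
    return found

# synthetic stream: a non-IDR slice (0x41), then SPS (0x67) and PPS (0x68)
data = (b"\x00\x00\x01\x41" + b"\xAA" * 4 +
        b"\x00\x00\x01\x67" + b"\xBB" * 3 +
        b"\x00\x00\x01\x68")
print(find_sps_pps(data))  # {7: 8, 8: 15}
```

Until this scan finds both types, a muxer cannot safely emit an MSE-compatible fragment, which is essentially what ffmpeg's probing is waiting for.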
Edit: I found the cause. The stream always begins with something that is not a JPEG; only after it does a normal MJPEG stream follow. Interestingly, not all of the small example V4L2/MJPEG decoders can properly divide what the camera produces into frames. Something called capturev4l2.c is a rare example of doing it properly. Possibly there is some detail which decides whether the camera's bugginess is worked around or not.
I have a noname, almost-UVC-compliant camera (it fails several compatibility tests). It is a relatively cheap global-shutter camera, and thus I would like to use it instead of something properly documented. It outputs what is reported (and properly played) by mplayer as
Opening video decoder: [ffmpeg] FFmpeg's libavcodec codec family
libavcodec version 57.107.100 (external)
Selected video codec: [ffmjpeg] vfm: ffmpeg (FFmpeg MJPEG)
ffprobe shows the following:
[mjpeg # 0x55c086dcc080] Format mjpeg detected only with low score of 25, misdetection possible!
Input #0, mjpeg, from '/home/sc/Desktop/a.raw':
Duration: N/A, bitrate: N/A
Stream #0:0: Video: mjpeg, yuvj422p(pc, bt470bg/unknown/unknown), 640x480, 25 tbr, 1200k tbn, 25 tbc
But as opposed to mplayer, ffmpeg is unable to play it.
I tried decode_jpeg_raw from mjpegtools; it complains about the header, which seems to change with each captured stream. So it does not look like an unwrapped stream of JPEG images.
I thus tried 0_hello_world.c from libavcodec/libavformat, but it stops at avformat_open_input() with the error Invalid data found when processing input. A 100-frame sample file is sitting here: a.raw. Do you have any idea how to determine a method of decoding it in C into a plain bitmap?
The file is grayscale and does not begin with a constant value; guvcview and mplayer are the only players I know of that can decode it without artifacts.
Since you have a raw stream, I think what you need is a decoder with a parser.
Check this decode_video.c example on ffmpeg:
https://github.com/FFmpeg/FFmpeg/blob/master/doc/examples/decode_video.c
Change the necessary parts accordingly, like the avcodec_find_decoder(...) call.
Hope that helps.
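If you want to split the capture yourself instead of relying on a parser, a naive approach for MJPEG is to cut on JPEG SOI/EOI markers. A Python sketch on synthetic data (illustrative only; marker bytes can also occur inside entropy-coded data, so the parser-based approach above is more robust):

```python
def split_mjpeg(data: bytes):
    """Naively split a raw MJPEG capture into individual JPEG frames by
    scanning for SOI (FF D8) / EOI (FF D9) marker pairs. Anything before
    the first SOI (e.g. a camera's non-JPEG preamble) is discarded."""
    frames = []
    start = data.find(b"\xff\xd8")
    while start >= 0:
        end = data.find(b"\xff\xd9", start + 2)
        if end < 0:
            break  # truncated final frame
        frames.append(data[start:end + 2])
        start = data.find(b"\xff\xd8", end + 2)
    return frames

# two tiny fake "JPEGs" preceded by non-JPEG junk, as this camera produces
blob = b"JUNK" + b"\xff\xd8AAA\xff\xd9" + b"\xff\xd8BB\xff\xd9"
print([len(f) for f in split_mjpeg(blob)])  # [7, 6]
```

Discarding everything before the first SOI is exactly the workaround needed for a camera that prepends non-JPEG data to its stream.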
I am trying to receive H264 frames from a USB webcamera connected to my Raspberry PI
Using the RPi Camera Module, I can run the following command to get H264 data written to stdout with close to zero latency: raspivid -t 0 -w 640 -h 320 -fps 15 -o -
Is there an equivalent function to do this with a USB camera? I have two USB cameras I would like to do this with.
Running ffprobe /dev/videoX, I get the following output (shortened to the important details):
$ ffprobe /dev/video0
...
Input #0, video4linux2,v4l2, from '/dev/video0':
Duration: N/A, start: 18876.273861, bitrate: 147456 kb/s
Stream #0:0: Video: rawvideo (YUY2 / 0x32595559), yuyv422, 1280x720, 147456 kb/s, 10 fps, 10 tbr, 1000k tbn, 1000k tbc
$ ffprobe /dev/video1
...
Input #0, video4linux2,v4l2, from '/dev/video1':
Duration: N/A, start: 18980.783228, bitrate: 115200 kb/s
Stream #0:0: Video: rawvideo (YUY2 / 0x32595559), yuyv422, 800x600, 115200 kb/s, 15 fps, 15 tbr, 1000k tbn, 1000k tbc
$ ffprobe /dev/video2
...
Input #0, video4linux2,v4l2, from '/dev/video2':
Duration: N/A, start: 18998.984143, bitrate: N/A
Stream #0:0: Video: h264 (Main), yuv420p(progressive), 1920x1080, -5 kb/s, 30 fps, 30 tbr, 1000k tbn, 2000k tbc
As far as I can tell, two of them are not H264 and would need to be encoded to H264, so I understand that adds a bit of latency. But the third one (video2) is H264, so I should be able to get data from it? I've tried to just pipe it out with cat, but it says I have invalid arguments.
I've come to the conclusion that using FFMPEG might be the only option here. I would like to use software easily available for all RPis (apt install).
Bonus question regarding H264 packets: When I stream the data from the raspivid command to my decoder, it works perfectly. But if I drop the first 10 packets, it never initializes the decoding process and just shows a black background. Does anyone know what might be in the first packets that I could recreate in my software, so I don't have to restart the stream for every newly connected user?
EDIT: Bonus Question Answer: After googling around, I saw that the first two NAL units raspivid sends are the SPS and PPS. By ignoring the first two, my decoder won't decode properly. So if I save those units and send them first to every new user, it works perfectly. They are used in some kind of initialization process:
0x27 = 01 00111 = nal_ref_idc 1, type 7: Sequence Parameter Set (SPS)
0x28 = 01 01000 = nal_ref_idc 1, type 8: Picture Parameter Set (PPS)
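The header decoding above can be reproduced in a few lines of Python (illustrative only; the low five bits of the NAL header byte are nal_unit_type, the two bits above them nal_ref_idc):

```python
def parse_nal_header(byte: int):
    """Decode a single H.264 NAL unit header byte into its fields."""
    forbidden = (byte >> 7) & 0x1     # must be 0 in a valid stream
    nal_ref_idc = (byte >> 5) & 0x3   # reference importance
    nal_type = byte & 0x1F            # 7 = SPS, 8 = PPS, 5 = IDR slice
    return forbidden, nal_ref_idc, nal_type

print(parse_nal_header(0x27))  # (0, 1, 7) -> SPS
print(parse_nal_header(0x28))  # (0, 1, 8) -> PPS
```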
First, let us get the data flow right. For the Raspi cam:
The Raspi camera is connected by CSI (Camera Serial Interface) to the Raspi. This link carries uncompressed, raw image data.
raspivid talks to the embedded GPU of the Raspi to access the image data and also asks the GPU to perform H.264 encoding, which always adds some latency (you could use raspiyuv to get the raw uncompressed data, possibly with less latency).
USB webcams typically transfer uncompressed, raw image data. But some also transfer H.264 or jpeg encoded data.
Next, the Video for Linux API version 2 was not made for shell pipes, so you can't get data out of /dev/videoX with cat. You need some code that performs IOCTL calls to negotiate what and how to read from the device. ffmpeg does exactly that.
Regarding your bonus question, you might try the --inline option of raspivid, which forces the stream to include PPS and SPS headers on every I-frame.
Next, outputting H.264 data from ffmpeg using -f rawvideo looks wrong to me, since rawvideo means uncompressed video. You could instead try -f h264 to force the raw H.264 video output format:
ffmpeg -i /dev/video2 -c copy -f h264 pipe:1
Finally, you actually want to get a H.264 stream from your USB webcam. Since the image data comes uncompressed from the camera, it first has to be encoded to H.264. The sensible option on the Raspi is to use the hardware encoder, since using a software encoder like x264 would consume too much CPU resources.
If you have an ffmpeg that was configured using --enable-mmal and/or --enable-omx-rpi, you can use ffmpeg to talk to the hardware H.264 encoder.
Otherwise, take a look at gstreamer and its omxh264enc element, eg. here. gstreamer can also talk to v4l2 devices.
Is there any way to detect duplicate frames within the video using ffmpeg?
I tried the -vf flag with select=gt(scene\,0.xxx) for scene-change detection, but it did not work for my case.
Use the mpdecimate filter, whose purpose is to "Drop frames that do not differ greatly from the previous frame in order to reduce frame rate."
This will generate a console readout showing which frames the filter thinks are duplicates.
ffmpeg -i input.mp4 -vf mpdecimate -loglevel debug -f null -
To generate a video with the duplicates removed
ffmpeg -i input.mp4 -vf mpdecimate,setpts=N/FRAME_RATE/TB out.mp4
The setpts filter expression generates smooth timestamps for a video at FRAME_RATE FPS. See an explanation for timestamps at What is video timescale, timebase, or timestamp in ffmpeg?
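The setpts expression simply assigns frame N the time N/FRAME_RATE seconds; a quick Python illustration of the timestamps it produces for a 25 fps target:

```python
# setpts=N/FRAME_RATE/TB assigns frame N the timestamp N/FRAME_RATE
# seconds, i.e. evenly spaced frames. For a 25 fps target:
frame_rate = 25
timestamps = [n / frame_rate for n in range(5)]
print(timestamps)  # [0.0, 0.04, 0.08, 0.12, 0.16]
```

Dividing by TB in the filter expression just converts those seconds into the stream's timebase units.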
I also had this problem, and Gyan's excellent answer above got me started, but its result was desynchronized audio, so I had to explore more options:
mpdecimate vs decimate filters
mpdecimate is the standard recommendation I found all over SO and the internet, but I don't think it should be the first pick
it uses heuristics, so it may miss some duplicate frames
you can tweak the detection with the frac parameter, but that's extra work you may want to avoid if you can
it is not really supposed to work with the mp4 container (source); I was using mkv, so this limitation didn't apply in my case, but it is good to be aware of it
decimate removes frames precisely, but it is useful only for periodically occurring duplicates
detected vs actual frame rate
if you have a multimedia file with duplicate frames, it is a good idea to make sure that the detected frame rate matches the actual one
ffprobe in.mkv will output the detected FPS; it may look like this
Stream #0:0: Video: h264 (Main), yuvj420p(pc, bt709, progressive), 1920x1080, SAR 1:1 DAR 16:9, 25 fps, 25 tbr, 1k tbn, 50 tbc (default)
the actual frame rate can be found by opening the media in.mkv in a media player that lets you step one frame at a time, then counting the steps needed to advance the playback time by 1 second; in my case it was 30 fps
not a big surprise for me, because every 6th frame was a duplicate (5 good frames and 1 duplicate), so for every 25 good frames there were also 5 duplicates
what is N/FRAME_RATE/TB
except for the use of the FRAME_RATE variable, N/FRAME_RATE/TB is equivalent to the example below from the ffmpeg documentation (source)
Set fixed rate of 25 frames per second:
setpts=N/(25*TB)
the math behind it perfectly explained in What is video timescale, timebase, or timestamp in ffmpeg?
it basically calculates each frame's timestamp in seconds (N/FRAME_RATE) and divides by the timebase TB to convert it into timestamp units
FRAME_RATE variable vs literal FPS value (e.g. 25)
this is why it is important to know your detected and actual FPS
if the detected FPS matches your actual FPS (e.g. both are 30 fps) you can happily use FRAME_RATE variable in N/FRAME_RATE/TB
but if the detected FPS differs, then you have to calculate the frame rate on your own
in my case my actual FPS was 30 frames per second and I removed every 6th frame, so the target FPS is 25 which leads to N/25/TB
if I used FRAME_RATE (and I actually tried that), it would take the wrongly detected fps of 25, i.e. FRAME_RATE=25, run it through the mpdecimate filter (which removes every 6th frame), and update it to FRAME_RATE=20.833, so N/FRAME_RATE/TB would actually be N/20.833/TB, which is completely wrong
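The arithmetic in that bullet checks out; a one-line Python illustration with the numbers from above (25 detected fps, every 6th frame removed):

```python
detected_fps = 25
cycle = 6  # every 6th frame is a duplicate

# Decimation drops 1 frame out of every `cycle` frames,
# so the frame rate seen after the filter becomes:
fps_after = detected_fps * (cycle - 1) / cycle
print(round(fps_after, 3))  # 20.833
```

This is why FRAME_RATE drifts away from the value you actually want in the setpts expression.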
to use or not to use setpts
so the setpts filter has already become pretty complicated, especially because of the FPS mess that duplicate frames may create
the good news is you actually may not need the setpts filter at all
here is what I used with good results
ffmpeg -i in.mkv -vf mpdecimate out.mkv
ffmpeg -i in.mkv -vf decimate=cycle=6,setpts=N/25/TB out.mkv
but the following gave me desynchronized audio
ffmpeg -i in.mkv -vf mpdecimate,setpts=N/FRAME_RATE/TB out.mkv
ffmpeg -i in.mkv -vf mpdecimate,setpts=N/25/TB out.mkv
ffmpeg -i in.mkv -vf decimate=cycle=6 out.mkv
as you see
mpdecimate and decimate do not work the same way
mpdecimate worked better for me without the setpts filter
while decimate needed the setpts filter, and furthermore I needed to avoid the FRAME_RATE variable and use N/25/TB instead, because the actual FPS was not detected properly
note on asetpts
it does the same job as setpts, but for audio
it didn't really fix the desynchronized audio for me, but if you want to use it, it looks something like this: -af asetpts=N/SAMPLE_RATE/TB
maybe you are supposed to adjust SAMPLE_RATE according to the ratio of duplicate frames removed, but that seems like unnecessary extra work, especially since my video had the audio in sync at the beginning; it is better to use commands that keep it that way than to fix it afterwards
tl;dr
If the usually recommended command ffmpeg -i in.mkv -vf mpdecimate,setpts=N/FRAME_RATE/TB out.mkv does not work for you try this:
ffmpeg -i in.mkv -vf mpdecimate out.mkv
or
ffmpeg -i in.mkv -vf decimate=cycle=6,setpts=N/25/TB out.mkv
(cycle=6 because every 6th frame is a duplicate, and N/25/TB because after removing the duplicates the video will be 25 fps (avoid the FRAME_RATE variable); adjust for your use case)
I tried the solutions here, and none of them seem to work when you need to trim the video and keep the audio. There is an mpdecimate_trim repo that does a good job. It basically lists all the frames to be dropped (using mpdecimate) and then creates a complex filter that trims all those frames (and the corresponding audio) from the video, by splitting the video and only including the portions without duplicate frames.
I did have to tweak a few options in the code though. For instance, in mpdecimate_trim.py, I had to change this line:
dframes2 = get_dframes(ffmpeg(True, "-vf", "mpdecimate=hi=576", "-loglevel", "debug", "-f", "null", "-").stderr)
I had to detect duplicates a bit more aggressively, so I changed the mpdecimate option to mpdecimate=hi=64*32:lo=64*24:frac=0.05
If you're getting duplicate frozen frames after cropping, you could be entering your command line incorrectly. I was ordering the parameters incorrectly, which caused the first few seconds of my video to freeze. Here is the fix if that is the case. Hopefully this helps you avoid frozen frames altogether.
ffmpeg -ss {start_time} -i {input_path} -vcodec copy -acodec copy -to {end_time} {output_path}
I am trying to make Quicktime files stream from a HTML document using the <video> tag.
The video format is:
Video: h264 (Main), yuv420p, 640x360, 2175 kb/s, 24 fps, 24 tbr, 24 tbn, 48 tbc
Audio: aac, 48000 Hz, stereo, s16, 104 kb/s
However, the browser does not start playing the video until after downloading the whole file (which takes minutes). I, of course, want the video to start streaming as soon as possible.
The guy who gave me the video files recalls a "streaming friendly" output option in his video-editing software that resolved a similar problem earlier. However, many of the video files no longer have their original project files and thus cannot be regenerated.
So my question is: How can I make the existing video files stream immediately? (I have FFMPEG on my machine)? Or does the solution lie elsewhere?
Another way to move the moov atom to the start to make it play immediately is qt-faststart.
qt-faststart input_file output_file will do the trick.
You need to hint the files so that the moov atom is at the beginning of the file. A good tool for this is MP4Box.
MP4Box -hint input_file.mp4
and then it should be playable immediately.
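Whether a given file already has its moov atom in front can be checked by walking the top-level MP4 boxes; a Python sketch on a synthetic file (illustrative only; it handles 32-bit box sizes and skips 64-bit largesize handling):

```python
import struct

def top_level_boxes(data: bytes):
    """Yield (type, offset) for each top-level MP4 box (32-bit sizes only)."""
    offset = 0
    while offset + 8 <= len(data):
        size, = struct.unpack(">I", data[offset:offset + 4])
        box_type = data[offset + 4:offset + 8].decode("ascii")
        yield box_type, offset
        if size < 8:
            break  # largesize / malformed; bail out in this sketch
        offset += size

def is_faststart(data: bytes):
    """True if moov precedes mdat, i.e. the file can start playing early."""
    order = [t for t, _ in top_level_boxes(data)]
    return order.index("moov") < order.index("mdat")

# synthetic layout ftyp, mdat, moov: not streamable until moov is moved
fake = (struct.pack(">I", 16) + b"ftyp" + b"\x00" * 8 +
        struct.pack(">I", 12) + b"mdat" + b"\x00" * 4 +
        struct.pack(">I", 8) + b"moov")
print(is_faststart(fake))  # False
```

Running such a check before and after qt-faststart or MP4Box -hint confirms the atom was actually relocated.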