gstreamer webrtc h264 playback stops after a few frames in browser - windows

I need help debugging an intermittent issue. I built a GStreamer pipeline to stream NVENC-encoded H.264 bitstreams (video only) to a browser. The browser seldom plays properly: in most cases only a few frames are rendered, then the picture freezes.
The NVENC settings follow "https://cloud.google.com/solutions/gpu-accelerated-streaming-using-webrtc": H.264 high profile, low-latency high-quality preset, and NVENC_INFINITE_GOPLENGTH (several settings have been tried, like rateControlMode/enableVFR/sliceMode/repeatSPSPPS/outputAUD, but none helped). At runtime, NVENC encodes a real-time rendered OpenGL FBO texture to H.264 bitstreams and pushes them into GStreamer via appsrc. Currently the texture size is 512x512, fed at 10/20/30 fps.
I use GStreamer 1.18.2; the pipeline is defined as "appsrc name=nvenc_src do-timestamp=1 ! video/x-h264, stream-format=byte-stream, alignment=au ! rtph264pay aggregate-mode=zero-latency ! queue ! application/x-rtp,media=video,encoding-name=H264,payload=123 ! webrtcbin bundle-policy=max-compat name=backend_webrtc".
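For reference, here is a minimal sketch of how one encoded access unit is pushed into the pipeline above (the function name push_encoded_frame is illustrative, not from my actual code, and the appsrc handle is assumed to have been fetched with gst_bin_get_by_name using the name "nvenc_src"):

#include <gst/gst.h>
#include <gst/app/gstappsrc.h>

static GstFlowReturn
push_encoded_frame (GstElement *appsrc, const guint8 *data, gsize size,
                    GstClockTime pts, GstClockTime duration)
{
  GstBuffer *buf = gst_buffer_new_allocate (NULL, size, NULL);
  gst_buffer_fill (buf, 0, data, size);

  /* With do-timestamp=1, appsrc stamps buffers as they arrive; set
     PTS/DTS yourself only if you manage timestamps manually. */
  GST_BUFFER_PTS (buf) = pts;
  GST_BUFFER_DTS (buf) = pts;        /* no B-frames, so DTS == PTS */
  GST_BUFFER_DURATION (buf) = duration;

  /* gst_app_src_push_buffer takes ownership of buf. */
  return gst_app_src_push_buffer (GST_APP_SRC (appsrc), buf);
}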
The GStreamer code follows the sendrecv example (replacing libsoup with websocketpp and removing the receive logic).
The application is built with MSVC 2019, 32-bit. The browser decoder is NVDEC. The exe and the JS code run on the same PC (Windows 10, GTX 1060, driver version 460.89). I've tried Chrome (87.0.4280.88) and Edge (87.0.664.66). I also tried running the JS code on Android (Chrome) and iOS (Safari) and got the same results.
NVENC appears to generate 'correct' H.264 bitstreams: I dumped the raw bitstream to a file, and the file plays properly in VLC. I also tried pushing the dumped bitstream back into GStreamer; the freeze still happens.
After the picture freezes, playback never recovers. The browser's 'webrtc-internals' shows that bytesReceived/headerBytesReceived/packetsReceived keep growing, while framesReceived/framesDecoded/framesDropped stay unchanged.
Since bitwise-identical H.264 frames behave differently across runs, I guessed RTP timestamps might be the cause. I've tried setting appsrc's do-timestamp to 0 and manually setting the GstBuffer PTS, but it does not help.

Here are a few things that you need to pay attention to (a configuration sketch follows the list):
Infinite GOP will not work - you must configure NVENC to send a key frame every 30-60 frames.
Of course, SPS/PPS NALs must come before each key frame.
Prohibit B-frames: WebRTC doesn't support them because they increase latency.
Start codes between NALs must be 3-byte start codes: WebRTC doesn't respect 2-byte start codes. We ran into this issue before and had to correct the start codes manually.
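As a hedged sketch of the first three points against the NVENC API (only the GOP-related fields are shown; session setup and the rest of NV_ENC_INITIALIZE_PARAMS are omitted):

NV_ENC_CONFIG enc_cfg = { 0 };
enc_cfg.version = NV_ENC_CONFIG_VER;

enc_cfg.gopLength      = 30;  /* key frame every 30 frames, not NVENC_INFINITE_GOPLENGTH */
enc_cfg.frameIntervalP = 1;   /* I and P only: no B-frames */

/* Re-emit SPS/PPS in front of every IDR and keep the IDR period in
   sync with the GOP length. */
enc_cfg.encodeCodecConfig.h264Config.repeatSPSPPS = 1;
enc_cfg.encodeCodecConfig.h264Config.idrPeriod    = enc_cfg.gopLength;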

Thanks to user1390208's kind reminder, I used an H.264 analyzer tool to check the dumped bitstreams and found the culprits.
The browser does support infinite GOP, but it needs a keyframe & SPS/PPS to recover from errors, and the need for a resend arises with high probability during the launch procedure. So a quick solution is to resend a keyframe & SPS/PPS after the JS side detects that fps is 0 and sends a resend request to GStreamer via the WebRTC data channel.
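A minimal sketch of the GStreamer side of that quick solution, assuming the JS side sends the string "resend-keyframe" over the data channel (the message text and the g_force_idr flag are illustrative, not from my actual code):

static volatile gint g_force_idr = 0;   /* polled by the encode loop */

static void
on_message_string (GObject *channel, gchar *msg, gpointer user_data)
{
  if (g_strcmp0 (msg, "resend-keyframe") == 0)
    g_atomic_int_set (&g_force_idr, 1);
}

static void
on_data_channel (GstElement *webrtcbin, GObject *channel, gpointer user_data)
{
  g_signal_connect (channel, "on-message-string",
                    G_CALLBACK (on_message_string), NULL);
}

/* during setup:
   g_signal_connect (webrtc, "on-data-channel",
                     G_CALLBACK (on_data_channel), NULL); */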
There are two reasons why I failed to find the answer earlier:
I didn't set encodePicFlags before calling nvEncEncodePicture. NVENC always generated an infinite GOP, whether gopLength & frameIntervalP were set for all-I or I&P. There are many GOP-related parameters, and they still look confusing to me. In my current code, the only way to get the desired GOP control is to set NV_ENC_PIC_PARAMS::encodePicFlags before calling nvEncEncodePicture (see the sketch at the end of this answer). Note that I use NV_ENC_PRESET_LOW_LATENCY_HQ_GUID & NV_ENC_H264_PROFILE_HIGH_GUID, which might be why an infinite GOP is always generated when encodePicFlags is not set. NVENC reports no error when setting gopLength & frameIntervalP & repeatSPSPPS, so I assumed the issue also happened when the GOP was all I-frames and that SPS/PPS did not help.
An infinite GOP does not always trigger the issue in the browser during the launch procedure.
So before I checked the H.264 bitstreams with the analyzer tool, it looked to me as if even all-keyframe + SPS/PPS bitstreams had this intermittent issue.
BTW, NVENC generates 4-byte start codes.
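For completeness, the encodePicFlags usage looks roughly like this (a sketch; nvenc_api, encoder, and the input/output buffer setup are placeholders for my actual session code, and g_force_idr is the flag set by the data-channel handler above):

NV_ENC_PIC_PARAMS pic_params = { 0 };
pic_params.version = NV_ENC_PIC_PARAMS_VER;
/* ... input/output buffer fields filled in as usual ... */

if (g_atomic_int_compare_and_exchange (&g_force_idr, 1, 0)) {
  /* Force the next frame to be an IDR with SPS/PPS attached. */
  pic_params.encodePicFlags =
      NV_ENC_PIC_FLAG_FORCEIDR | NV_ENC_PIC_FLAG_OUTPUT_SPSPPS;
}

NVENCSTATUS st = nvenc_api.nvEncEncodePicture (encoder, &pic_params);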

Related

In what way does this HEVC video not comply with Apple's requirements document?

My goal is to work out why a given video file does not play on Macos/Safari/Quicktime.
The background to this question is that it is possible to play HEVC videos with a transparent background/alpha channel on Safari/MacOS. To be playable, a video must meet the specific requirements set out by Apple in this document:
https://developer.apple.com/av-foundation/HEVC-Video-with-Alpha-Interoperability-Profile.pdf
The video that does not play on Apple/Safari/Quicktime is an HEVC video with an alpha transparency channel. Note that VLC for MacOS DOES play this file. Here it is:
https://drive.google.com/file/d/1ZnXjcDbk-_YxTgRuH_D7RSR9SXdY_XTv/view?usp=share_link
I have two example HEVC video files with a transparent background/alpha channel, and they both play fine using either Quicktime player or Safari:
Working video #1:
https://drive.google.com/file/d/1PJAyg_sVKVvb-Py8PAu42c1qm8l2qCbh/view?usp=share_link
Working video #2:
https://drive.google.com/file/d/1kk8ssUyT7qAaK15afp8VPR6mIWPFX8vQ/view?usp=sharing
The first step is to work out in what way my non-working video ( https://drive.google.com/file/d/1ZnXjcDbk-_YxTgRuH_D7RSR9SXdY_XTv/view?usp=share_link ) does not comply with the specification.
Once it is clear which requirements are not met by the non-working video then I can move onto the next phase, which is to try to formulate an ffmpeg command that will output a video meeting the requirements.
I have read Apple's requirements document, and I am out of my depth trying to analyse the non-working video against the requirements - I don't know how to do it.
Can anyone suggest a way to identify what is wrong with the video?
Additional context is that I am trying to find a way to create Apple/MacOS compatible alpha channel / transparent videos using ffmpeg with hevc_nvenc running on an Intel machine. I am aware that Apple hardware can create such videos, but for a wide variety of reasons it is not practical for me to use Apple hardware to do the job. I have spent many hours trying all sorts of ffmpeg and ffprobe commands to try to work out what is wrong and modify the video to fix it, but to be honest most of my attempts are guesswork.
The Apple specification for an alpha layer in HEVC requires that the encoder process and store the alpha in a certain manner. It also requires the stream configuration syntax to be formed in a specific manner. At the time of writing, I'm aware of only the VideoToolbox HEVC encoder being capable of emitting such a stream.
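One way to start narrowing it down (a suggestion, not a guaranteed diagnosis): run ffprobe -show_streams -show_format on one of the working files and on the non-working file, and diff the two outputs. Mismatches in fields such as codec_tag_string, pix_fmt, or the reported number of streams are good first suspects, since several of Apple's requirements surface there.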

How to make mpv more compatible with ffmpeg filters like minterpolate?

The ffmpeg filter minterpolate (motion interpolation) does not work in mpv.
(Nevertheless, the file then plays normally, just without minterpolate.)
(I researched using search engines and the documentation, troubleshot trying to make use of OpenGL, and generally tried everything short of asking for help or digging into the source code, which I can't do as I'm not a programmer…)
--gpu-context=angle --gpu-api=opengl also does not make OpenGL work. (I'm guessing OpenGL could help, from seeing its use in the documentation.)
Note
To get a full list of available video filters, see --vf=help and
http://ffmpeg.org/ffmpeg-filters.html .
Also, keep in mind that most actual filters are available via the
lavfi wrapper, which gives you access to most of libavfilter's
filters. This includes all filters that have been ported from MPlayer
to libavfilter.
Most builtin filters are deprecated in some ways, unless they're only
available in mpv (such as filters which deal with mpv specifics, or
which are implemented in mpv only).
If a filter is not builtin, the lavfi-bridge will be automatically
tried. This bridge does not support help output, and does not verify
parameters before the filter is actually used. Although the mpv syntax
is rather similar to libavfilter's, it's not the same. (Which means
not everything accepted by vf_lavfi's graph option will be accepted by
--vf.)
You can also prefix the filter name with lavfi- to force the wrapper.
This is helpful if the filter name collides with a deprecated mpv
builtin filter. For example --vf=lavfi-scale=args would use
libavfilter's scale filter over mpv's deprecated builtin one.
I expect MPV to play with minterpolate (one of several filters that MPV can use, listed in http://ffmpeg.org/ffmpeg-filters.html) enabled. But this is what happens:
Input: "--vf=lavfi=[minterpolate=fps=60000/1001:mi_mode=mci]"
Output:
cplayer: (+) Video --vid=1 (*) (h264 1280x720 29.970fps)
cplayer: (+) Audio --aid=1 (*) (aac 2ch 44100Hz)
vd: Using hardware decoding (d3d11va).
ffmpeg: Impossible to convert between the formats supported by the filter 'mpv_src_in0' and the filter 'auto_scaler_0'
lavfi: failed to configure the filter graph
vf: Disabling filter lavfi.00 because it has failed.
(It is also interesting that --gpu-api=opengl does not work, despite the specification saying that my—not to brag—HD Graphics 400 (Braswell) supports OpenGL 4.2… Also, aresample seems to have no effect, and with a few of the audio filters selected, playback often neither starts nor outputs errors.)
The problem is that you're using hardware decoding WITHOUT copying the decoded video back to system memory, which means your video filter can't access it. The fix is simple, but that error message makes it very hard to figure out.
To fix this, just pass --hwdec=no. --hwdec=auto-copy also fixes it, but minterpolate in mci mode is so CPU-intensive that there's not much point in also using hardware decoding (for most video sources).
All together:
mpv input.mkv --hwdec=no --vf=lavfi="[minterpolate=fps=60000/1001:mi_mode=mci]"
Explanation: The most efficient hardware decoding doesn't copy the video data back to system memory after decoding, but you need it in memory to run CPU-based filtering on it. You were asking mpv to do video filtering without it having access to the decoded video data.
More details from the mpv docs:
auto-copy selects only modes that copy the video data back to system memory after decoding. This selects modes like vaapi-copy (and so on). If none of these work, hardware decoding is disabled. This mode is usually guaranteed to incur no additional quality loss compared to software decoding (assuming modern codecs and an error free video stream), and will allow CPU processing with video filters. This mode works with all video filters and VOs.
Because these copy the decoded video back to system RAM, they're often less efficient than the direct modes, and may not help too much over software decoding.

ffmpeg and 7160 HD Capture card error, already set rtbufsize 2000M, still real time buffer too full

The 7160 capture card's video displays fine in the Honestech HD DVR software that is included with it.
However, when capturing from the card using ffmpeg and publishing the stream, this error occurred after ffmpeg had been running for a while:
real-time buffer [7160 HD Capture] video input too full or near too full ...
I have already set -rtbufsize 2000M, which is nearly the maximum allowed and cannot be increased further.
Please tell me how to resolve this, or give me an example that works without producing this error. Thank you very much. You do not need the code that I used, because almost any command, even the simplest I tried, produced this error after running for a while. The published video also lags and drops frames.

Firefox audio tag doubles length of OGG Vorbis

So, here I have a demo file from my website
http://members.shaw.ca/darolynk/breakup/html5game/snd_music.ogg
I am running Firefox Beta 30.0, and this issue does not occur in Google Chrome. In Firefox, when I play the file back, the length is displayed at around twice the actual length: 32:13, when it is only 12:52 long. Even worse, the audio stutters, playing one second of noise, then one second of silence. This issue does not occur in Chrome or Opera. It is not a streaming issue (in fact, the song is fully loaded by the time it is played back).
I am wondering whether this is an issue with the codec or with Firefox's interpretation of the codec, but more importantly, I want to know how to fix it. Some information about the file: it is in OGG Vorbis format, 44100 Hz, 32 kbps mono (yes, I am running out of storage space). It was encoded with SUPER, which in turn uses FFmpeg and MEncoder as necessary.
This does not apply to all files of this format, making the issue even stranger. Are OGG Vorbis files over a certain length not allowed, or interpreted differently by Firefox? Surely this has happened to someone else and not just me...
Any help is appreciated, thanks in advance!
The problem has nothing to do with Firefox; it's an issue with your file. Also, I've reproduced the issue in VLC... that's a bad sign, as VLC can usually play any corrupt file you throw at it, but I wouldn't be surprised if Firefox used the same libvorbis (or whatever) for the codec.
Some observations:
What we have here is sort of a codec issue. However, the audio being played back is listenable, sort of, meaning it's likely just an issue with some flags.
The sample rate is correct as all the pitches sound correct.
The gaps in the audio are at regular intervals, so it isn't likely you have a plainly corrupt file.
The time on/off in audio is exactly the same length.
Your file is in mono.
It seems to me like the decoder is looking for stereo interleaved channels, but your file is in mono so it cannot decode the bitstream properly. VLC tells me the audio is in mono, but if I remember correctly, Ogg and Vorbis can disagree which might be happening here.
I would recommend simply using FFmpeg to do the encoding. If you still have the problem, at least then we know what version of FFmpeg you have and what the command line was.
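For example, something along these lines (a suggestion; adjust the input file and bitrate to your case): ffmpeg -i snd_music_source.wav -c:a libvorbis -ac 1 -b:a 32k snd_music.ogg. If the re-encoded file plays correctly in Firefox, the problem was in SUPER's encoding/muxing rather than in the format itself.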

What is the minimum amount of metadata needed to stream only video, using libx264 to encode at the server and libffmpeg to decode at the client?

I want to stream video (no audio) from a server to a client. I will encode the video using libx264 and decode it with ffmpeg. I plan to use fixed settings (at the very least they will be known in advance by both the client and the server). I was wondering if I can avoid wrapping the compressed video in a container format (like mp4 or mkv).
Right now I am able to encode my frames using x264_encoder_encode. I get a compressed frame back, and I can do that for every frame. What extra information (if anything at all) do I need to send to the client so that ffmpeg can decode the compressed frames, and, more importantly, how can I obtain it with libx264? I assume I may need to generate NAL information (x264_nal_encode?). Having an idea of what is the minimum necessary to get the video across, and how to put the pieces together, would be really helpful.
I found out that the minimum amount of information is the NAL units from each frame; this gives me a raw H.264 stream. If I were to write it to a file, I could watch it using VLC by giving the file a .h264 extension.
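A minimal sketch of that setup with libx264 (error handling and real frame input omitted; the fixed settings here are examples): b_annexb emits start-code-delimited NALs, and b_repeat_headers makes x264 re-emit SPS/PPS before every keyframe, so the client can pick up the stream without any out-of-band metadata:

#include <stdio.h>
#include <x264.h>

int encode_raw_stream (FILE *out, int width, int height, int fps, int n_frames)
{
  x264_param_t param;
  x264_param_default_preset (&param, "veryfast", "zerolatency");
  param.i_width  = width;
  param.i_height = height;
  param.i_fps_num = fps;
  param.i_fps_den = 1;
  param.i_csp = X264_CSP_I420;
  param.b_annexb = 1;          /* raw Annex-B byte stream with start codes */
  param.b_repeat_headers = 1;  /* SPS/PPS in front of every keyframe */
  x264_param_apply_profile (&param, "baseline");

  x264_t *enc = x264_encoder_open (&param);
  x264_picture_t pic, pic_out;
  x264_picture_alloc (&pic, X264_CSP_I420, width, height);

  for (int i = 0; i < n_frames; i++) {
    /* ... fill pic.img.plane[0..2] with the next raw frame here ... */
    pic.i_pts = i;
    x264_nal_t *nals;
    int n_nals;
    int size = x264_encoder_encode (enc, &nals, &n_nals, &pic, &pic_out);
    if (size > 0)
      fwrite (nals[0].p_payload, 1, size, out);  /* NAL payloads are contiguous */
  }

  x264_picture_clean (&pic);
  x264_encoder_close (enc);
  return 0;
}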
I can also open such a file using ffmpeg, but if I want to stream it, it makes more sense to use RTSP, and a good open-source library for that is Live555: http://www.live555.com/liveMedia/
In their FAQ they mention how to send the output from your encoder to Live555, and there is source code for both a client and a server. I have yet to finish coding this, but it seems like a reasonable solution.
