Look for fastest video encoder with least lag to stream webcam streaming to ipad - ffmpeg

I'm looking for the fastest way to encode a webcam stream that will be viewable in a html5 video tag. I'm using a Pandaboard: http://www.digikey.com/product-highlights/us/en/texas-instruments-pandaboard/686#tabs-2 for the hardware. Can use gstreamer, cvlc, ffmpeg. I'll be using it to drive a robot, so need the least amount of lag in the video stream. Quality doesn't have to be great and it doesn't need audio. Also, this is only for one client so bandwidth isn't an issue. The best solution so far is using ffmpeg with a mpjpeg gives me around 1 sec delay. Anything better?

I have been asked this many times so I will try and answer this a bit generically and not just for mjpeg. Getting very low delays in a system requires a bit of system engineering effort and also understanding of the components.
Some simple top level tweaks I can think of are:
Ensure the codec is configured for the lowest delay. Codecs will have (especially embedded system codecs) a low delay configuration. Enable it. If you are using H.264 it's most useful. Most people don't realize that by standard requirements H.264 decoders need to buffer frames before displaying it. This can be upto 16 for Qcif and upto 5 frames for 720p. That is a lot of delay in getting the first frame out. If you do not use H.264 still ensure you do not have B pictures enabled. This adds delay to getting the first picture out.
Since you are using mjpeg, I don't think this is applicable to you much.
Encoders will also have a rate control delay. (Called init delay or vbv buf size). Set it to the smallest value that gives you acceptable quality. That will also reduce the delay. Think of this as the bitstream buffer between encoder and decoder. If you are using x264 that would be the vbv buffer size.
Some simple other configurations: Use as few I pictures as possible (large intra period).
I pictures are huge and add to the delay to send over the network. This may not be very visible in systems where end to end delay is in the range of 1 second or more but when you are designing systems that need end to end delay of 100ms or less, this and several other aspects come into play. Also ensure you are using a low latency audio codec aac-lc (and not heaac).
In your case to get to lower latencies I would suggest moving away from mjpeg and use at least mpeg4 without B pictures (Simple profile) or best is H.264 baseline profile (x264 gives a zerolatency option). The simple reason you will get lower latency is that you will get lower bitrate post encoding to send the data out and you can go to full framerate. If you must stick to mjpeg you have close to what you can get without more advanced features support from the codec and system using the open source components as is.
Another aspect is the transmission of the content to the display unit. If you can use udp it will reduce latency quite a lot compared to tcp, though it can be lossy at times depending on network conditions. You have mentioned html5 video. I am curious as to how you are doing live streaming to a html5 video tag.
There are other aspects that can also be tweaked which I would put in the advanced category and requires the system engineer to try various things out
What is the network buffering in the OS? The OS also buffers data before sending it out for performance reasons. Tweak this to get a good balance between performance and speed.
Are you using CR or VBR encoding? While CBR is great for low jitter you can also use capped vbr if the codec provides it.
Can your decoder start decoding partial frames? So you don't have to worry about framing the data before providing it to the decoder. Just keep pushing the data to the decoder as soon as possible.
Can you do field encoding? Halves the time from frame encoding before getting the first picture out.
Can you do sliced encoding with callbacks whenever a slice is available to send over the network immediately?
In sub 100 ms latency systems that I have worked in all of the above are used. Some of the features may not be available in open source components but if you really need it and are enthusiastic you could go ahead and implement them.
EDIT:
I realize you cannot do a lot of the above for a ipad streaming solution and there are limitations because of hls also to the latency you can achieve. But I hope it will prove useful in other cases when you need any low latency system.

We had a similar problem, in our case it was necessary to time external events and sync them with the video stream. We tried several solutions but the one described here solved the problem and is extremely low latency:
Github Link
It uses gstreamer transcode to mjpeg which is then sent to a small python streaming server. This has the advantage that it uses the tag instead of so it can be viewed by most modern browsers, including the iPhone.
As you want the <video> tag, a simple solution is to use http-launch. That
had the lowest latency of all the solutions we tried so it might work for you. Be warned that ogg/theora will not work on Safari or IE so those wishing to target the Mac or Windows will have to modify the pipe to use MP4 or WebM.
Another solution that looks promising, gst-streaming-server. We simply couldn't find enough documentation to make it worth pursuing. I'd grateful if somebody could ask a stackoverflow question about how it should be used!

Related

Performant AV1 encoding, does it exist?

I'm developing a VoD application as a white label product that runs in a SaaS context using K8s. To enable streaming, I take the input video and re-convert it into HLS segments in multiple version and codecs to reach maximum compatibility.
Yesterday I started implementing AV1 as codec, as it will in near future detach h264 as it's more efficient with the same level of compatibility across all the available browsers.
That was the point where things started to get strange, as I want to have this codec instead of h264 ^^.
If you take a look at the following doc pages from ffmpeg: https://trac.ffmpeg.org/wiki/Encode/AV1
You will notice that there are 3 main encoders available to handle encoding to av1. These are: libaom, SVT-AV1 and rav1e. No matter which one of these I try, the performance is slow, even slower than with HEVC. Recently I came along a news article about Netflix and that they are upgrading their library to AV1. If I take a look at the numbers of media elements Netflix offers, the amount is just huge, and I really don't understand how they did it. From what I know, SVT-AV1 is developed by Netflix in cooperation with Intel, So I assume they somehow rely on hardware encoding using an Intel CPU extension.
Does somebody maybe know more and how they did it? I really can't imagine that they just do CPU only encoding. A movie would take days to get encoded.
Thanks in advance
Encoding quality and quality differs heavily between all encoders. SVT-AV1 is the fastest but looks like garbage. For real-time encoding you should probably use GPU's. Intels GPU's don't really put out great quality AV1 encodes though, Nvidia's H265 is basically the same quality.
With Nvidia and AMD soon getting AV1 encoding hardware support (currently drivers are a bit lacking but it's already possible on Nvidia). AMD GPU's coming out for it soon.

How to stream the video from one PC to another with an acceptable quality and synchronization?

I have the following task: to organize the broadcast of several gamers on the director's computer, which will switch the image to, to put it simply, the one who currently has more interesting gameplay.
The obvious solution would be to raise an RTMP server and broadcast to it. We tried that. The image quality clearly correlates with the bitrate of the broadcast, but the streams aren't synchronized and there is no way to synchronize them. As far as I know, it's just not built into the RTMP protocol.
We also tried streaming via UDP, SRT and RTSP protocols. We got minimal delay but a very blurry image and artifacts from lost packets. It feels like all these formats are trying to achieve constant FPS and sacrifice the quality.
What we need:
A quality image.
Broken frames can be discarded (it's okay to have not constant FPS).
Latency isn't important.
The streams should be synchronized within a second or two.
There is an assumption that broadcasting on UDP should be a solution, but some kind of intermediate buffer is needed to provide the necessary broadcasting conditions. But I don't know how to do that. I assume that we need an intermediate ffmpeg instance, which will read the incoming stream, buffer it and publish the result to some local port, from which the picture will be already taken by the director's OBS.
Is there any solution to achieve our goals?
NDI is perfect for this, and in fact used a lot to broadcast games. Providing your network is in order, it offers great quality at very low latency and comes with a free utility to capture the screens and output them in NDI. There are several programmes supporting the NDI intake and the broadcasting (I developed one of them). With proper soft- and hardware you can quite easily handle a few dozen games. If you're limited to OBS then you'll have to check it supports NDI, but I'd find that very likely. Not sure which programmes support synchronisation between streams, but there's at least one ;). See also ndi.newtek.com.

Which parameters decide the performance of video online/watching websites?

I am new in web developer. I wanted to know about the performance of video in web. My question is Which parameters decide the performance of video online/watching websites? anybody can tell.
When you're streaming video over a network connection, there are two main reasons why a video might perform poorly: network and computing power. Either the network couldn't retrieve the data in time, or the computer the browser is running on couldn't decode and render it fast enough. The former is much more common.
The major properties of a video that would affect this:
Bitrate:
Expressed in Kbps or Mbps, most people think this is a measurement of quality, but it's not. Rather, bitrate is a measurement of how much data is used to represent a second of video. A larger bitrate means a bigger file for the same runtime, and assuming limited bandwidth, this is the single most important factor in determining how your video will perform.
Codec:
The codec refers to the specific algorithm used to encode and compress moving picture data into bits. The main features affected are file size and video quality, (which in turn affects the bitrate), but some codecs are also more challenging to render than others, leading to poor performance on an older or burdened system even when the network bandwidth isn't an issue. Again, note that a video requiring too much network is much more common than a video requiring too much computer.
For the end user who is watching the video, there are a few factors that are not part of the videos themselves that can impact performance:
The network:
Obviously, a user has to have a certain amount of bandwidth consistently available to stream video at a given quality level, so they won't be able to play much while downloading from a fast server or running Tor, but the server also needs to be able to deliver the bits to everyone who's asking for them. The quality level of the video that can play without stuttering can be drastically reduced by network congestion, disparity in geographical location between the client and the server, denial of service (i.e., things not responding), or any other factor that keeps all the viewers from retrieving bits consistently as the video plays. This is a tough challenge, and there's a whole industry of Content Delivery Networks (CDNs) devoted to the problem of how to deliver a large amount of data can get to a large number of people in many different places on the globe as fast as possible.
Their computer/device:
As codecs have gotten more advanced, they've been able to do better, more complex math to turn pictures into bits. This has made file sizes smaller and quality higher, but it's also made the videos more computationally expensive to decode. Turning bits back into video takes horsepower, and older computers, less powerful devices, and systems that are just doing too much at the moment may be unable to decode video delivered at a certain bitrate.
There are a few other video properties relevant to performance, but mostly these end up affecting the bitrate. Resolution is an example of this: a video encoded at a native resolution of 1600x900 will be harder to stream than a video encoded at 320x240, but since the higher resolution takes up more space (i.e., requires more bits) to store than the lower resolution does for the same length of video, the difference ends up being reflected in the bitrate.
The same is true of file size: it doesn't really matter how big the file is in total; the important number is the bitrate -- the amount of space/bandwidth one second of video takes up.
I think those are the major factors that determine whether a certain video will perform well for a particular user requesting from a specific computer at a given network location.

AVFoundation - macOS - 2 cameras simultaneous recording with compression

I'm programming video capturing app and need to have 2 input sources (USB cams) to record from at the same time.
When I record only the raw footage simultaneously without compression at is working quite well (Low CPU load, no video lags), but when the compression is turned on the CPU is very high and the footage is lagging.
How to solve it? Or how to tune-up the settings so that it can be accomplished?
Note: the Raw streams are to big and thus cannot be used, otherwise I would not bother with compression at all and just leave it as it is.
The AVFoundation framework in its current configuration is setup to provide HW acceleration only for one source at time. For multiple accelerated sources one need to go deeper to VideoToolbox framework and even deeper.

BackgroundAudioPlayer- Buffering & MediaStreamSource

I have created a MediaStreamSource to decode an live internet audio stream and pass it to the BackgroundAudioPlayer. This now works very well on the device. However I would now like to implement some form of buffering control. Currently all works well over WLAN - however i fear that in live situations over mobile operator networks that there will be a lot of cutting in an out in the stream.
What I would like to find out is if anybody has any advice on how best to implement buffering.
Does the background audio player itself build up some sort of buffer before it begings to play and if so can the size of this be increased if necessary?
Is there something I can set whilst sampling to help with buffering or do i simply need to implement a kind of storeage buffer as i retrieve the stream from the network and build up a substantial reserve in this before sampling.
What approach have others taken to this problem?
Thanks,
Brian
One approach to this that I've seen is to have two processes managing the stream. The first gets the stream and writes it a series of sequentially numbered files in Isolated Storage. The second reads the files and plays them.
Obviously that's a very simplified description but hopefully you get the idea.
I don't know how using a MediaStreamSource might affect this, but from experience with a simple Background Audio Player agent streaming direct from remote MP3 files or MP3 live radio streams:
The player does build up a buffer of data received from the server before it will start playing your track.
you can't control the size of this buffer or how long it takes to fill that buffer (I've seen it take over a minute of buffering in some cases).
once playback starts if you lose connection or bandwidth goes so low that your buffer is emptied after the stream has started then the player doesn't try and rebuffer the audio, so you can lose the audio completely or it can cut in or out.
you can't control that either.
Implementing the suggestion in Matt's answer solves this by allowing you to take control of the buffering and separates download and playback neatly.

Resources