Is it simply a question of adjusting the amount of prebuffered content depending on network speed? Do you adjust for this once at the beginning, every second...?
Or is it more complicated - sampling a history of recordings of your network speed and taking the mean / median and adjusting on that??
Your second paragraph sums it up pretty well.
The client looks at how fast the previous chunk of audio/video (usually just a second or two's worth) downloaded, then requests a bitrate of video it thinks it can handle downloading fast enough. It always buffers (downloads) at least several seconds into the future, to give itself leeway in case the next chunk of audio/video downloads slower than expected.
Note that every combination of bitrate and resolution needs to be encoded separately. They're usually pre-encoded and stored on the server. So how many bitrates there are to choose from, and what they are, completely depends on whoever encoded and/or is hosting the content.
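To make the selection step concrete, here is a minimal sketch of what a client might do after each chunk. The function name, the example bitrate ladder, and the 0.8 safety factor are my own assumptions for illustration; real players use more elaborate heuristics.

```python
def pick_bitrate(available_kbps, chunk_bytes, download_seconds, safety=0.8):
    """Return the highest available bitrate that fits the measured throughput."""
    # Throughput observed while downloading the previous chunk, in kbit/s.
    measured_kbps = chunk_bytes * 8 / 1000 / download_seconds
    # Leave some headroom in case the next chunk downloads slower (assumed factor).
    budget = measured_kbps * safety
    candidates = [b for b in sorted(available_kbps) if b <= budget]
    # Fall back to the lowest rendition when even that exceeds the budget.
    return candidates[-1] if candidates else min(available_kbps)


# Hypothetical set of pre-encoded bitrates offered by the server.
ladder = [400, 1200, 2500, 5000]
print(pick_bitrate(ladder, chunk_bytes=500_000, download_seconds=1.0))  # 2500
```

The client would call this once per chunk, so the choice is re-evaluated every second or two rather than only at startup.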
I'm trying to guarantee synchronization between multiple clients using DASH and/or HLS. Synchronization between each client must fall within 40 milliseconds.
Live streaming seems to be an obvious choice. However, the only way to really get within a small time frame of synchronization would be to lower the segment times. Is this the only viable solution? Are there any tags that would help me keep clients within 40 milliseconds to the live time?
Currently, I'm using FFMPEG to encode video and audio to live content.
There are a couple of separate issues here:
'Live time' - assuming this is the real time at which the event being broadcast actually happens, for example the actual time that a football is kicked in a game, then achieving full end-to-end delivery to an end screen within 40 milliseconds is pushing the boundaries of any possible delivery technology. Certainly HLS and DASH streams won't give you that.
Your target may instead be for each end user to be no more than 40ms apart from every other end user, e.g. every user receives the broadcast with a 10 second delay, but that delay is the same, plus or minus 40ms, for each user. This is still quite a tricky problem. Unless you have some common clock that all the devices are synced to, you will be relying on some mechanism to signal the position in the stream between each device and some central or distributed control mechanism. Again, 40ms is not a lot of time to allow for even small messages to travel back and forth, along with any processing required to calculate the time difference and adjust.
Synchronising internet delivered media streams is not an easy problem but there is at least some work you can look at to help you get some ideas - see here for some examples: https://stackoverflow.com/a/51819066/334402
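As a rough illustration of the common-clock approach, the sketch below assumes every device already has a clock synced to a shared reference (how you achieve that is the hard part and is not shown). All names and the 10 second delay are hypothetical; each client compares where it is in the stream with where it should be and nudges playback accordingly.

```python
def target_position(now_s, stream_epoch_s, delay_s):
    """Media position every client should be showing at shared-clock time now_s."""
    return now_s - stream_epoch_s - delay_s


def correction(actual_pos_s, now_s, stream_epoch_s, delay_s, tolerance_s=0.040):
    """Decide how a client should adjust playback to stay inside the tolerance."""
    error = actual_pos_s - target_position(now_s, stream_epoch_s, delay_s)
    if abs(error) <= tolerance_s:
        return "ok"
    # Ahead of the target: play slightly slower; behind: play slightly faster.
    return "slow_down" if error > 0 else "speed_up"


# Stream started at shared time 1000 s, agreed delay 10 s, shared clock now 1100 s.
print(correction(90.0, 1100.0, 1000.0, 10.0))  # ok
print(correction(90.1, 1100.0, 1000.0, 10.0))  # slow_down
```

Note that the result is only as good as the clock sync itself: any error in the shared clock eats directly into the 40ms budget.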
Can anyone offer me pointers or tips on finding/creating a data compression algorithm that has a guaranteed compression ratio? Obviously this couldn't be a loss-less algorithm.
My question is similar to the one here, but there was no suitable answer:
Shrink string encoding algorithm
What I am trying to do is stream live audio over a wireless network (my own spec, not WiFi) with a tight bandwidth restriction. Let's say I have packets 60 bytes in size. I need an algorithm to compress these to, say, 35 bytes every time without fail. Reliable guaranteed compression to a fixed size is key. Audio quality is less of a priority.
Any suggestions or pointers? I may end up creating my own algorithm from scratch, so even if you don't know of any libraries or standard algorithms, I would be grateful for brilliant ideas of any kind!
It is good that you mentioned your use case: live audio.
There are many audio CODECs (COder-DECoder) that work exactly this way (constant bit-rate). For example, take a look at Opus. You can select bitrates from 6 kb/s to 510 kb/s, and frame sizes from 2.5 ms to 60 ms.
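To see how a constant bit rate maps onto a fixed packet size, here is a small worked calculation. The 20 ms frame duration is my assumption (the question does not say how much audio each packet carries); the 35-byte budget is from the question.

```python
def bytes_per_frame(bitrate_bps, frame_ms):
    """Payload size of one frame at a constant bitrate."""
    return bitrate_bps * frame_ms / 1000 / 8


def bitrate_for_budget(frame_bytes, frame_ms):
    """Constant bitrate that exactly fills a fixed per-frame byte budget."""
    return frame_bytes * 8 * 1000 / frame_ms


# A 35-byte packet carrying 20 ms of audio (assumed) corresponds to 14 kb/s.
print(bitrate_for_budget(35, 20))   # 14000.0
print(bytes_per_frame(14000, 20))   # 35.0
```

14 kb/s sits inside the bitrate range quoted above, so a fixed 35-byte payload looks feasible under that assumption.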
I used it in the past to transfer audio over RF datalinks. You'll probably need to implement a de-jitter buffer as well (see more here). Also note that the internal clock of many sound cards is not accurate, and there may be a "drift" between the source and target rates (e.g. a 30 ms buffer may be played in 29.9 ms or 30.1 ms), so you may need to compensate for this as well.
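For reference, a de-jitter buffer can be as simple as the sketch below. This is an illustrative toy, not what I actually ran on the RF link: the class name, the sequence-number scheme, and the depth of 3 frames are all assumptions.

```python
class JitterBuffer:
    """Hold a few frames before playback so late or reordered packets can catch up."""

    def __init__(self, depth=3):
        self.depth = depth      # frames to accumulate before playback starts
        self.frames = {}        # sequence number -> payload
        self.next_seq = None    # sequence number due to be played next
        self.started = False

    def push(self, seq, payload):
        """Store a received frame; frames that arrive after their slot are dropped."""
        if self.next_seq is None:
            self.next_seq = seq
        if seq >= self.next_seq:
            self.frames[seq] = payload

    def pop(self):
        """Call once per frame period. Returns a payload, or None if nothing is due."""
        if not self.started:
            if len(self.frames) < self.depth:
                return None
            self.started = True
        # A missing frame yields None; the caller should play silence or conceal it.
        payload = self.frames.pop(self.next_seq, None)
        self.next_seq += 1
        return payload
```

A larger depth tolerates more jitter at the cost of more latency, so it is worth tuning against your actual link.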
I am new to web development and want to understand the performance of video on the web. My question is: which parameters decide the performance of video on online video-watching websites? Can anybody tell me?
When you're streaming video over a network connection, there are two main reasons why a video might perform poorly: network and computing power. Either the network couldn't retrieve the data in time, or the computer the browser is running on couldn't decode and render it fast enough. The former is much more common.
The major properties of a video that would affect this:
Bitrate:
Expressed in Kbps or Mbps, most people think this is a measurement of quality, but it's not. Rather, bitrate is a measurement of how much data is used to represent a second of video. A larger bitrate means a bigger file for the same runtime, and assuming limited bandwidth, this is the single most important factor in determining how your video will perform.
Codec:
The codec refers to the specific algorithm used to encode and compress moving picture data into bits. The main features affected are file size and video quality (which in turn affect the bitrate), but some codecs are also more challenging to render than others, leading to poor performance on an older or burdened system even when the network bandwidth isn't an issue. Again, note that a video requiring too much network is much more common than a video requiring too much computer.
For the end user who is watching the video, there are a few factors that are not part of the videos themselves that can impact performance:
The network:
Obviously, a user has to have a certain amount of bandwidth consistently available to stream video at a given quality level, so they won't be able to play much while downloading from a fast server or running Tor, but the server also needs to be able to deliver the bits to everyone who's asking for them. The quality level of the video that can play without stuttering can be drastically reduced by network congestion, disparity in geographical location between the client and the server, denial of service (i.e., things not responding), or any other factor that keeps all the viewers from retrieving bits consistently as the video plays. This is a tough challenge, and there's a whole industry of Content Delivery Networks (CDNs) devoted to the problem of how to deliver a large amount of data to a large number of people in many different places on the globe as fast as possible.
Their computer/device:
As codecs have gotten more advanced, they've been able to do better, more complex math to turn pictures into bits. This has made file sizes smaller and quality higher, but it's also made the videos more computationally expensive to decode. Turning bits back into video takes horsepower, and older computers, less powerful devices, and systems that are just doing too much at the moment may be unable to decode video delivered at a certain bitrate.
There are a few other video properties relevant to performance, but mostly these end up affecting the bitrate. Resolution is an example of this: a video encoded at a native resolution of 1600x900 will be harder to stream than a video encoded at 320x240, but since the higher resolution takes up more space (i.e., requires more bits) to store than the lower resolution does for the same length of video, the difference ends up being reflected in the bitrate.
The same is true of file size: it doesn't really matter how big the file is in total; the important number is the bitrate -- the amount of space/bandwidth one second of video takes up.
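As a quick worked example of that relationship, the average bitrate can be derived from file size and runtime and then compared with the bandwidth a viewer has available. The function names and the numbers are made up for illustration.

```python
def average_bitrate_kbps(file_bytes, duration_s):
    """Average bitrate: how much data represents one second of video, in kbit/s."""
    return file_bytes * 8 / 1000 / duration_s


def can_stream(file_bytes, duration_s, bandwidth_kbps):
    """True if the viewer's sustained bandwidth covers the video's average bitrate."""
    return average_bitrate_kbps(file_bytes, duration_s) <= bandwidth_kbps


# A 90 MB file that runs for 10 minutes averages 1200 kbit/s.
print(average_bitrate_kbps(90_000_000, 600))  # 1200.0
print(can_stream(90_000_000, 600, 1000))      # False
print(can_stream(90_000_000, 600, 2000))      # True
```

This uses the average only; a variable-bitrate encode can peak well above it, which is one reason players buffer ahead.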
I think those are the major factors that determine whether a certain video will perform well for a particular user requesting from a specific computer at a given network location.
I'm looking for the fastest way to encode a webcam stream that will be viewable in an html5 video tag. I'm using a Pandaboard: http://www.digikey.com/product-highlights/us/en/texas-instruments-pandaboard/686#tabs-2 for the hardware. I can use gstreamer, cvlc, or ffmpeg. I'll be using it to drive a robot, so I need the least amount of lag in the video stream. Quality doesn't have to be great and it doesn't need audio. Also, this is only for one client so bandwidth isn't an issue. The best solution so far is using ffmpeg with mpjpeg, which gives me around 1 sec delay. Anything better?
I have been asked this many times so I will try and answer this a bit generically and not just for mjpeg. Getting very low delays in a system requires a bit of system engineering effort and also understanding of the components.
Some simple top level tweaks I can think of are:
Ensure the codec is configured for the lowest delay. Codecs (especially embedded system codecs) will have a low delay configuration. Enable it. This is most useful if you are using H.264. Most people don't realize that, by standard requirements, H.264 decoders need to buffer frames before displaying them. This can be up to 16 frames for QCIF and up to 5 frames for 720p. That is a lot of delay in getting the first frame out. If you do not use H.264, still ensure you do not have B pictures enabled, as they add delay to getting the first picture out.
Since you are using mjpeg, I don't think this is applicable to you much.
Encoders will also have a rate control delay. (Called init delay or vbv buf size). Set it to the smallest value that gives you acceptable quality. That will also reduce the delay. Think of this as the bitstream buffer between encoder and decoder. If you are using x264 that would be the vbv buffer size.
Some simple other configurations: Use as few I pictures as possible (large intra period).
I pictures are huge and add to the delay to send over the network. This may not be very visible in systems where end to end delay is in the range of 1 second or more, but when you are designing systems that need end to end delay of 100ms or less, this and several other aspects come into play. Also ensure you are using a low latency audio codec such as AAC-LC (and not HE-AAC).
In your case, to get to lower latencies I would suggest moving away from mjpeg and using at least mpeg4 without B pictures (Simple profile) or, best of all, H.264 baseline profile (x264 provides a zerolatency option). The simple reason you will get lower latency is that you will have a lower bitrate after encoding to send the data out, and you can go to full framerate. If you must stick to mjpeg, you are already close to the best you can get using the open source components as is, without more advanced feature support from the codec and system.
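To show how the settings above might fit together, here is a sketch that builds an ffmpeg command line for x264. The device path, destination address, bitrate, GOP length and buffer size are placeholder values I picked for illustration and have not tuned on a Pandaboard. Getting the resulting stream into a browser is a separate problem that this does not address.

```python
def low_latency_x264_args(device="/dev/video0",
                          dest="udp://192.168.1.10:5000",  # hypothetical receiver
                          bitrate="500k", gop=250):
    """Build an ffmpeg argument list using the low-delay settings discussed above."""
    return [
        "ffmpeg",
        "-f", "v4l2", "-i", device,        # webcam capture on Linux
        "-an",                             # no audio
        "-c:v", "libx264",
        "-profile:v", "baseline",          # baseline profile has no B pictures
        "-preset", "ultrafast",
        "-tune", "zerolatency",            # x264 low-delay tuning
        "-bf", "0",                        # explicitly disable B pictures
        "-g", str(gop),                    # large intra period, few I pictures
        "-b:v", bitrate, "-maxrate", bitrate,
        "-bufsize", "100k",                # small vbv buffer (assumed value)
        "-f", "mpegts", dest,
    ]


print(" ".join(low_latency_x264_args()))
```

You could pass the list to `subprocess.run` to launch the encoder; expect to trade some quality for delay when shrinking the vbv buffer.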
Another aspect is the transmission of the content to the display unit. If you can use udp it will reduce latency quite a lot compared to tcp, though it can be lossy at times depending on network conditions. You have mentioned html5 video. I am curious as to how you are doing live streaming to a html5 video tag.
There are other aspects that can also be tweaked, which I would put in the advanced category and which require the system engineer to try various things out:
What is the network buffering in the OS? The OS also buffers data before sending it out for performance reasons. Tweak this to get a good balance between throughput and latency.
Are you using CBR or VBR encoding? While CBR is great for low jitter you can also use capped vbr if the codec provides it.
Can your decoder start decoding partial frames? So you don't have to worry about framing the data before providing it to the decoder. Just keep pushing the data to the decoder as soon as possible.
Can you do field encoding? Halves the time from frame encoding before getting the first picture out.
Can you do sliced encoding with callbacks whenever a slice is available to send over the network immediately?
In sub 100 ms latency systems that I have worked in all of the above are used. Some of the features may not be available in open source components but if you really need it and are enthusiastic you could go ahead and implement them.
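On the OS network buffering point, the sketch below shows the kind of socket options you might experiment with. The function names and the 16 KB figure are my own assumptions; the OS is free to round or clamp the requested buffer size, so treat the value as a starting point for measurement.

```python
import socket


def make_low_latency_udp_socket(send_buffer_bytes=16 * 1024):
    """UDP socket with a small send buffer so less data queues up in the OS."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Request a smaller kernel send buffer (the OS may adjust the actual size).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, send_buffer_bytes)
    return sock


def make_low_latency_tcp_socket():
    """TCP socket with Nagle's algorithm disabled so small writes go out at once."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

Read the value back with `getsockopt` to see what the OS actually granted before drawing conclusions from latency measurements.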
EDIT:
I realize you cannot do a lot of the above for an iPad streaming solution, and HLS also limits the latency you can achieve. But I hope it will prove useful in other cases when you need a low latency system.
We had a similar problem; in our case it was necessary to time external events and sync them with the video stream. We tried several solutions but the one described here solved the problem and is extremely low latency:
Github Link
It uses gstreamer to transcode to mjpeg, which is then sent to a small python streaming server. This has the advantage that it uses the <img> tag instead of <video>, so it can be viewed by most modern browsers, including the iPhone.
As you want the <video> tag, a simple solution is to use http-launch. That had the lowest latency of all the solutions we tried so it might work for you. Be warned that ogg/theora will not work on Safari or IE, so those wishing to target the Mac or Windows will have to modify the pipe to use MP4 or WebM.
Another solution that looks promising is gst-streaming-server. We simply couldn't find enough documentation to make it worth pursuing. I'd be grateful if somebody could ask a stackoverflow question about how it should be used!
I am writing a client/server app in which the server captures live audio samples from some external device (a mic, for example) and sends them to the client, which then plays them. My app will run on a local network, so I have no problem with bandwidth (my sound is 8 kHz, 8-bit stereo, while my network card is 1000 Mb/s). On the client I buffer the data for a short time and then start playback, and as data arrives from the server I send it to the sound card. This seems to work fine, but there is a problem:
When the buffer on the client side runs empty, I get gaps in the played sound.
I think this is because of the difference in sampling clocks between the server and the client; that is, 8 kHz on the server is not exactly the same as 8 kHz on the client.
I can solve this with pausing client's playback and buffer again, but my boss doesn't accept it, since I have proper bandwidth and I should be able to play sound with no gap or pause.
So I decided to dynamically change playback speed in the client but I don't know how.
I am programming in Windows( native ) and I currently use waveOutXXX to play the sound. I can use any other native library( DirectX/DirectSound, Jack or ... ) but they should provide a smooth playback in the client.
I have programmed with waveOutXXX many times without any problem and I know it well, but I can't solve my problem of dynamic resampling.
I would suggest that your problem isn't likely due to mis-matched sample rates, but something to do with your buffering. You should be continuously dumping data to the sound card, and continuously filling your buffer. Use a reasonable buffer size... 300ms should be enough for most applications.
Now, over long periods of time, it is possible for the clock on the recording side and the clock on the playback side to drift apart enough that the 300ms buffer is no longer sufficient. I would suggest that rather than resampling at such a small difference, which could introduce artifacts, simply add samples at the encoding end. You still record at 8kHz, but you might add a sample or two every second, to make that 8.001kHz or so. Simply doubling one of the existing samples for this (or even a simple average between one sample and the next) will not be audible. Adjust this as necessary for your application.
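A minimal sketch of the sample-doubling idea, written in Python for brevity rather than as Windows code. The function name is hypothetical, and each list element is treated as one audio frame (for stereo, that would be a left/right pair). It assumes the number of extra samples is much smaller than the block length.

```python
def stretch_by_duplication(samples, extra):
    """Return the block with `extra` samples doubled, spread evenly through it."""
    if extra <= 0 or not samples:
        return list(samples)
    step = len(samples) / extra
    # Indices whose sample is repeated, placed mid-way through each interval.
    dup_at = {int((i + 0.5) * step) for i in range(extra)}
    out = []
    for i, s in enumerate(samples):
        out.append(s)
        if i in dup_at:
            out.append(s)   # doubled sample: one second becomes 8001 samples
    return out


one_second = list(range(8000))                      # stand-in for 1 s at 8 kHz
print(len(stretch_by_duplication(one_second, 1)))   # 8001
```

How many samples to add per second would be driven by how the buffer level trends over time.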
I had a similar problem in an application I worked on. It did not involve network, but it did involve source data being captured in real-time at a certain fixed sampling rate, a large amount of signal processing, and finally output to the sound card at a fixed rate. Like you, I had gaps in the playback at buffer boundaries.
It seemed to me like the problem was that the processing being done caused audio data to make it to the sound card in a very jerky manner. That is, it would get a large chunk, then it would be a long time before it got another chunk. The overall throughput was correct, but this latency caused the sound card to often be starved for data. I suppose you may have the same situation with the network piece in your system.
The way I solved it was to first make the audio buffer longer. Then, every time a new chunk of audio was received, I checked how full the buffer was. If it was less than 20% full, I would write some silence to make it around 60% full.
You may think that this goes against reducing the gaps in playback since it is actually adding a gap, but it actually helps. The problem that I was having was that even though I had a significantly large audio buffer, I was always right at the verge of it being empty. With the other latencies in the system, this resulted in playback gaps on almost every buffer.
Writing the silence when the buffer started to get empty, but before it actually did, ensured that the buffer always had some data to spare if the processing fell behind a little. Also, just a single small gap in playback is very hard to notice compared to many periodic gaps.
I don't know if this will work for you, but it should be easy to implement and try out.
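In case it helps, the refill rule I described boils down to something like the following. This is a sketch in Python rather than my original code; the function name is made up, and the 20% and 60% thresholds are the ones mentioned above.

```python
def silence_to_write(buffered, capacity, low=0.20, target=0.60):
    """Number of silent samples to queue when the buffer is running low."""
    if buffered < low * capacity:
        # Top the buffer up to roughly the target fill level.
        return round(target * capacity) - buffered
    return 0


# Buffer of 1000 samples: at 10% full we pad with 500 samples of silence.
print(silence_to_write(100, 1000))  # 500
print(silence_to_write(300, 1000))  # 0
```

You would run this check each time a new chunk of audio arrives, before writing the chunk itself.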