Capture raw video byte stream for real time transcoding - ffmpeg

I would like to achieve the following:
Set up a proxy server to handle video requests by clients (for now, say all video requests from any Android video client) from a remote video server like YouTube, Vimeo, etc. I don't have access to the video files being requested, hence the need for a proxy server. I have settled on Squid. This proxy should process the video signal/stream being passed from the remote server before relaying it back to the requesting client.
To achieve the above, I would either
1. Need to figure out the precise location (URL) of the video resource being requested, download it really fast, and modify it as I want before HTTP streaming it back to the client as the transcoding continues (simultaneously, with some latency)
2. Access the raw byte stream, pipe it into a transcoder (I'm thinking ffmpeg) and proceed with the streaming to client (also with some expected latency).
Option #2 seems tricky to do but lends more flexibility to the kind of transcoding I would like to perform. I would have to actually handle raw data/packets, but I don't know if ffmpeg takes such input.
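On whether ffmpeg takes such input: it can read from standard input through its pipe protocol (`-i pipe:0`) and write to standard output (`pipe:1`). Below is a minimal Python sketch of option #2, assuming ffmpeg is installed and that the incoming container can be demuxed from a non-seekable stream. The function names, the MPEG-TS output and the bitrate are illustrative assumptions, not a tested proxy design.

```python
import subprocess
import threading
from typing import Iterable, Iterator, List


def build_transcode_cmd(video_bitrate: str = "500k") -> List[str]:
    """ffmpeg command that reads from stdin and writes MPEG-TS to stdout."""
    return [
        "ffmpeg", "-hide_banner", "-loglevel", "error",
        "-i", "pipe:0",                  # input: bytes arriving on stdin
        "-c:v", "libx264", "-b:v", video_bitrate,
        "-c:a", "aac",
        "-f", "mpegts",                  # streamable container, needs no seeking
        "pipe:1",                        # output: transcoded bytes on stdout
    ]


def pump(cmd: List[str], chunks: Iterable[bytes],
         read_size: int = 65536) -> Iterator[bytes]:
    """Feed chunks to the child's stdin and yield whatever it writes to stdout."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def feed() -> None:
        # Runs in a thread so that writing and reading cannot deadlock.
        try:
            for chunk in chunks:
                proc.stdin.write(chunk)
        except BrokenPipeError:
            pass  # the child exited before consuming all input
        finally:
            try:
                proc.stdin.close()
            except BrokenPipeError:
                pass

    threading.Thread(target=feed, daemon=True).start()
    while True:
        data = proc.stdout.read(read_size)
        if not data:
            break
        yield data
    proc.wait()
```

In a proxy, `chunks` would be the byte stream received from the remote server, and each block yielded by `pump(build_transcode_cmd(), chunks)` would be relayed to the client.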
In short, I'm looking for a solution to implement real-time transcoding of videos that I do not have direct access to from my proxy. Any suggestions on the tools or approaches I could use? I have also read about Gstreamer (but could not tell if it's applicable to my situation), and MPlayer/MEncoder.
And finally, a rather specific question: Are there any tools out there that, given a YouTube video URL, can download the byte stream for further processing? That is, something similar to the Chrome YouTube downloader but one that can be integrated with a server-side script?
Thanks for any pointers/suggestions!

You should ask single coding questions. What you asked is more like a general "how would I write my application". A few comments though:
Squid is an HTTP proxy, whereas video is usually streamed over a protocol such as RTSP.
Yes, there are tools that grab the RTSP URL from a YouTube URL; be sure to understand the terms of use of the video service before going that way, though.
GStreamer has a gst-rtsp-server module that contains an RTSP server, which can also be used as a proxy for a given RTSP stream.

Related

Can I read an encoded stream from a URL with WebRTC

I'm trying to stream the video of my C++ 3D application (similar to streaming a game).
I have encoded an H.264 video stream with the ffmpeg library (i.e. internally to my application) and can push it to a local address, e.g. rtp://127.0.0.1:6666, which can be played by VLC or other player (locally).
I'm not particularly wedded to h.264 at this point, or rtp. I could send as srtp if that would help.
I'd like to use WebRTC to set up a connection across different machines, but can't see in the examples how to make use of this pre-existing stream - the video and audio examples are understandably focused on getting data from devices like connected web cams, or the display.
Is what I'm thinking feasible? I.e. ideally I'd just point webRTC at my rtp://127.0.0.1:6666 address and that would be the video stream source.
I am writing out an sdp file as well which can be read by VLC, could I use this in a similar way?
As noted in the comment below there is an example out there using go to weave some magic that enables an rtp stream to be shown in a browser via webRTC.
I am trying to find a more "standard" way to be able to set the source of a video track in webRTC to be the URL of an encoded stream. If there isn't one, that is valuable information to me too, as I can change tack and use a webrtc library to send frames directly.
Unfortunately FFMPEG doesn't support WebRTC output. It lacks support for ICE and DTLS-SRTP.
You will need to use a RTP -> WebRTC bridge. I wrote rtp-to-webrtc that can do this. You can do this with lots of different WebRTC clients/servers!
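Whatever bridge is used, its first job is to receive the RTP packets that ffmpeg emits and hand their payloads to a WebRTC stack, which then adds the ICE and DTLS-SRTP layers. As a rough illustration of that first step only, the Python sketch below parses the fixed 12-byte RTP header defined in RFC 3550. The class and function names are made up for this example, and it does not reflect how rtp-to-webrtc is implemented.

```python
import struct
from dataclasses import dataclass


@dataclass
class RtpHeader:
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the fixed RTP header (first 12 bytes) of a datagram."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Network byte order: two flag bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )
```

A bridge would call this on every datagram read from a UDP socket bound to the port ffmpeg pushes to (6666 in the question) and use the payload type, sequence number and timestamp to feed the corresponding WebRTC video track.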
If you have a particular language/paradigm that you prefer, I'm happy to provide examples for those.

Stream microphone from client browser to remote server and pass audio in real time to ffmpeg to combine with a second video source

As a beginner at working with these kinds of real-time streaming services, I've spent hours trying to work out how this is possible, but can't seem to work out precisely how I'd go about it.
I'm prototyping a personal basic web app that does the following:
In a web browser, the web application has a button that says 'Stream Microphone' - when pressed it streams the audio from the user's microphone (the user obviously has to consent to give permission to send their microphone audio) through to the server which I was presuming would be running node.js (no specific reason at this point, just thought this is how I'd go about doing it).
The server receives the audio close enough to real-time somehow (not sure how I'd do this).
I can then run ffmpeg on the command line, take the audio coming in in real time, and add it as the sound to a video file (let's just say I'm going to play testmovie.mp4) that I want to play.
I've looked at various solutions - such as maybe using WebRTC, RTP/RTSP, Piping audio into ffmpeg, Gstreamer, Kurento, Flashphoner and/or Wowza - but somehow they look overly complicated and usually seem to focus on video along with audio. I just need to work with audio.
As you've found there are numerous different options to receive the audio from a WebRTC enabled browser. The options from easiest to more difficult are probably:
Use a WebRTC enabled server such as Janus, Kurento, Jitsi (not sure about Wowza) etc. These servers tend to have plugin systems and one of them may already have the audio mixing capability you need.
If you're comfortable with node you could use the werift library to receive the WebRTC audio stream and then forward it to FFmpeg.
If you want to take full control over the WebRTC pipeline and potentially do the audio mixing as well you could use gstreamer. From what you've described it should be capable of doing the complete task without having to involve a separate FFmpeg process.
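For the "forward it to FFmpeg" step in the second option, one possible shape is to decode the received audio to raw PCM and pipe it into FFmpeg's standard input as a second input next to the video file. The Python sketch below only builds the command line. It assumes 48 kHz mono signed 16-bit little-endian PCM on stdin; the function name, the output file name and the codec choices are illustrative assumptions rather than anything prescribed here.

```python
from typing import List


def build_mux_cmd(video_path: str, out_path: str,
                  sample_rate: int = 48000, channels: int = 1) -> List[str]:
    """ffmpeg command: video from a file, audio from raw PCM on stdin."""
    return [
        "ffmpeg", "-y",
        "-re", "-i", video_path,         # input 0: video file, read at native rate
        "-f", "s16le",                   # input 1 is headerless PCM, so describe it
        "-ar", str(sample_rate),
        "-ac", str(channels),
        "-i", "pipe:0",                  # input 1: audio bytes arriving on stdin
        "-map", "0:v:0",                 # take video from the file
        "-map", "1:a:0",                 # take audio from the microphone stream
        "-c:v", "copy", "-c:a", "aac",
        "-shortest",                     # stop when the shorter input ends
        out_path,
    ]
```

For example, `build_mux_cmd("testmovie.mp4", "out.mp4")` gives a command that can be started with `subprocess.Popen(..., stdin=subprocess.PIPE)`, after which the server writes each decoded audio buffer to the process's stdin.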
The way we did this is by creating a Wowza module in Java that would take the audio from the incoming stream, take the video from wherever you want it, and mix them together.
There's no reason to introduce a third party like ffmpeg into the mix.
There's even a sample from Wowza for this: https://github.com/WowzaMediaSystems/wse-plugin-avmix

How to stream multiple inputs to multiple outputs on Windows?

I'm used to using ffmpeg and stuff to broadcast/do testing.. but I don't understand how iptv servers succeed at having 50+ input streams, making 50+ output streams and sharing them, as I can't even run 3 ffmpeg commands with encoding without having the CPU crying for help...
I've tried to find information, but apart from Wowza, which seems to do what I'm trying to understand, I can't find any...
I hope that you can enlighten me on how this whole thing works. Also, I'd like to test it out, so if you have any recommendations on how to do this, I'll be thankful!
Most large streaming services will actually have multiple servers. This is partly because different functions are performed by different servers and partly because of performance, as you have noted.
There are many different ways you can stitch a service together but it will generally (for live streams) have the following elements:
some sort of live encoder which receives the external stream and converts it to a format the rest of the system understands
transcoders - these take the input video and create multiple bit rate versions of it to support Adaptive Bit Rate Streaming (see: https://stackoverflow.com/a/42365034/334402)
Packagers - these package the resulting video streams into the required video streaming protocol, usually HLS or MPEG DASH these days. This is typically done 'Just in Time' so only streams and bit rates required are actually packaged. If encryption is required it is typically applied at this point also.
Origin server and CDN - the video streams, which actually consist of packets of data making up the ABR video segments, are delivered to an Origin server which is the source for the CDN. The CDN, Content Delivery Network, is like a large dispersed video cache and it copies the video to the edge of the network to reduce latency when a user requests the video.
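To make the transcoder and packager steps a little more concrete: with HLS, the multiple bit rate versions are advertised to the player in a master playlist, with one `#EXT-X-STREAM-INF` entry per rendition, and the player switches between them as its bandwidth changes. The Python sketch below generates such a playlist. The rendition ladder and the URIs are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rendition:
    width: int
    height: int
    bandwidth: int   # peak bits per second
    uri: str         # media playlist for this rendition


def master_playlist(renditions: List[Rendition]) -> str:
    """Build an HLS master playlist listing renditions from lowest to highest bandwidth."""
    lines = ["#EXTM3U"]
    for r in sorted(renditions, key=lambda r: r.bandwidth):
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={r.bandwidth},"
            f"RESOLUTION={r.width}x{r.height}"
        )
        lines.append(r.uri)
    return "\n".join(lines) + "\n"
```

Each of the 50+ input channels would have its own ladder, which is why the transcoding stage dominates the CPU cost.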
You can also build this using cloud services rather than installing or spinning up the servers yourself - it might be useful to look at some of the documentation from providers like AWS Media Services or BitMovin.
Whichever way it is done, your initial thoughts are correct - it takes quite a bit of horsepower to serve large numbers of video streams.

Live stream multi-bitrate video

Preface
I have read this two-part tutorial (Part-1 and Part-2) by Streamroot on MPEG-DASH, and below is my understanding (please correct me if I am wrong):
The video needs to be encoded into multiple bit-rates using FFmpeg.
The encoded videos need to be transcoded (dashified) using MP4Box.
The dashified videos can be served using a web server.
Problem
I intend to live-stream an event and I need help to understand the following:
Can I club the FFmpeg and MP4Box commands into a single step? Maybe through a wrapper program so that I do not have to run them separately? Is there any other or better solution?
How do I send the dashified content to the web server? FTP? Would any vanilla web server do?
Lastly, a friend had hinted that I could also use GStreamer to achieve my objective. But, I could not find any good resource on the internet for the same. So, where (and how) does GStreamer fit in the above process?
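On the first question, a wrapper program is straightforward for the file-based workflow described in the preface. The Python sketch below builds one FFmpeg command per bit-rate and a single MP4Box command that dashifies the results. The ladder, the file names and the segment duration are illustrative assumptions, and both tools are assumed to be on the PATH. On its own it does not make the workflow suitable for a live event.

```python
import subprocess
from typing import List, Sequence, Tuple

# (height in pixels, video bitrate): an illustrative ladder, not a recommendation
LADDER: Sequence[Tuple[int, str]] = ((360, "800k"), (720, "2400k"))


def plan_commands(source: str, manifest: str = "manifest.mpd",
                  segment_ms: int = 4000) -> List[List[str]]:
    """Return the FFmpeg commands (one per bit-rate) followed by one MP4Box command."""
    commands: List[List[str]] = []
    outputs: List[str] = []
    for height, bitrate in LADDER:
        out = f"video_{height}p.mp4"
        outputs.append(out)
        commands.append([
            "ffmpeg", "-y", "-i", source,
            "-an",                           # video only, to keep the sketch short
            "-c:v", "libx264", "-b:v", bitrate,
            "-vf", f"scale=-2:{height}",     # keep aspect ratio, force an even width
            out,
        ])
    # MP4Box segments all encoded files and writes a single DASH manifest.
    commands.append(["MP4Box", "-dash", str(segment_ms), "-rap",
                     "-out", manifest, *outputs])
    return commands


def run_all(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling `run_all(plan_commands("event.mp4"))` would then perform both steps in one go.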
What is the format you will be getting out of your camera for your live event? There are a lot of solutions much better suited to live streaming (the tutorial I wrote is for VOD streams only). You can check out simple solutions like Wowza Streaming Server, Nimble Streamer (free), etc., that take an RTMP stream and transform it into other formats (HLS, DASH, etc.).
Most of the livestreaming platforms can even do that for you (livestream.com, youtube, twitch, or even facebook now)
The dashified content will be requested as HTTP resources by the browser or other players. In the case of a VoD stream, indeed you just need to make the DASH segments available through a web server. For live content, you need something smarter, that will encode, package the segments and make them available on the fly.
Gstreamer can transcode and transmux the original content, and can do it on the fly. You will be able to get different formats as outputs, like RTMP, HLS, and probably even mpeg-dash. Then you still need to make your content available via a webserver.
In conclusion, if you just want to transmit an occasional live event, it's probably a lot easier to use a platform that will ingest your RTMP stream and do all the complicated steps for you.

Can I use Kickflip with Parse to stream video TO server from Android device during recording

Parse is turning out to be convenient to use. However the service does not allow for streaming videos to the server. I need a service that allows my users to stream videos to the server while they are recording. I was able to find the KickFlip SDK. Does anyone know how I might be able to pair KickFlip with Parse for Video Streaming? Even if I were to use both services, as opposed to just the KickFlip SDK, how would I coordinate the two? Parse provides rich social database but storing a ParseFile video is limiting (no streaming and 10MB max).
It should work from a REST client with a normal POST (I would not attempt your scenario without adding something like Heroku/EC2 to the parse.com PaaS architecture, but it will work). I don't know Kickflip, but you must know the low-level protocol details before you include it in a solution.
In the POST request headers you would need to include the combination of header values consistent with a client posting a stream over HTTPS to Parse (more here), i.e. no Content-Length header, ask for chunked encoding, and use HTTP 1.1. On a file POST to Parse, IMO you don't need to worry about a timeout. You will have to append to the end of your posted byte stream the normal HTTP chunked-encoding byte sequence signifying END_OF_DATA.
Sample Curl interaction with headers etc.
Observe relevant parse.file.sz. limitations...
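The end-of-data byte sequence mentioned in this answer is the zero-length chunk that terminates an HTTP/1.1 chunked body. A short Python sketch of the framing follows, for illustration only; the function name is hypothetical, and a real client would normally let its HTTP library produce these bytes.

```python
from typing import Iterable, Iterator


def chunked_body(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Frame data as an HTTP/1.1 chunked body: hex size, CRLF, data, CRLF."""
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would end the body prematurely
        yield f"{len(chunk):x}".encode("ascii") + b"\r\n" + chunk + b"\r\n"
    yield b"0\r\n\r\n"  # last chunk with no trailers: signals end of data
```

While recording, each encoded buffer from the device would be passed through as one chunk, and the final `0\r\n\r\n` is sent only once recording stops.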

Resources