As a beginner at working with these kinds of real-time streaming services, I've spent hours trying to work out how this is possible, but can't seem to work out precisely how I'd go about it.
I'm prototyping a personal basic web app that does the following:
In a web browser, the web application has a button labelled 'Stream Microphone'. When pressed, it streams the audio from the user's microphone (the user obviously has to consent to share it) to the server, which I was presuming would run Node.js (no specific reason at this point; it just seemed like the natural choice).
The server receives the audio close enough to real-time somehow (not sure how I'd do this).
I can then run ffmpeg on the command line, take the incoming real-time audio, and add it as the soundtrack to a video file I want to play (let's say testmovie.mp4).
I've looked at various solutions, such as WebRTC, RTP/RTSP, piping audio into FFmpeg, GStreamer, Kurento, Flashphoner and/or Wowza, but they all look overly complicated and usually focus on video along with audio. I just need to work with audio.
As you've found, there are numerous options for receiving the audio from a WebRTC-enabled browser. From easiest to most difficult, they are probably:
Use a WebRTC-enabled server such as Janus, Kurento, Jitsi (not sure about Wowza), etc. These servers tend to have plugin systems, and one of them may already have the audio-mixing capability you need.
If you're comfortable with Node, you could use the werift library to receive the WebRTC audio stream and then forward it to FFmpeg.
If you want full control over the WebRTC pipeline, and potentially do the audio mixing as well, you could use GStreamer. From what you've described, it should be capable of the complete task without involving a separate FFmpeg process.
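For the FFmpeg hand-off in options 2 and 3, here is a minimal sketch of the muxing step. It assumes the server decodes the WebRTC audio track to signed 16-bit little-endian mono PCM at 48 kHz and pipes it to FFmpeg's stdin; the sample format and output file are assumptions, not requirements:

```shell
# Mux a live raw-PCM audio feed onto the video from testmovie.mp4.
# Assumes s16le mono PCM at 48 kHz arriving on stdin (pipe:0).
ffmpeg -re -i testmovie.mp4 \
       -f s16le -ar 48000 -ac 1 -i pipe:0 \
       -map 0:v -map 1:a \
       -c:v copy -c:a aac \
       out.mp4
```

For live playback you would swap the file output for something like an RTMP URL, but the `-map`/codec flags stay the same.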
The way we did this was by creating a Wowza module in Java that takes the audio from the incoming stream, takes the video from wherever you want, and mixes them together.
There's no reason to introduce a third party like FFmpeg into the mix.
There's even a sample from Wowza for this: https://github.com/WowzaMediaSystems/wse-plugin-avmix
Preface
I have read this two-part tutorial (Part-1 and Part-2) by Streamroot on MPEG-DASH, and below is my understanding (please correct me if I am wrong):
The video needs to be encoded into multiple bit-rates using FFmpeg.
The encoded videos need to be packaged ("dashified") using MP4Box.
The dashified videos can be served using a web server.
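The three steps above might look like this on the command line (filenames, bit rates, and segment length are illustrative examples, not the tutorial's exact values):

```shell
# 1. Encode the source into two bit rates with FFmpeg:
ffmpeg -i input.mp4 -c:v libx264 -b:v 2400k -c:a aac -b:a 128k out_2400k.mp4
ffmpeg -i input.mp4 -c:v libx264 -b:v 800k  -c:a aac -b:a 128k out_800k.mp4

# 2. "Dashify" both renditions with MP4Box (4-second segments):
MP4Box -dash 4000 -rap -profile dashavc264:live \
       out_2400k.mp4 out_800k.mp4 -out manifest.mpd

# 3. Serve manifest.mpd and the generated segments from any web server.
```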
Problem
I intend to live-stream an event and I need help to understand the following:
Can I combine the FFmpeg and MP4Box commands into a single step, maybe through a wrapper program, so that I do not have to run them separately? Is there any other or better solution?
How do I send the dashified content to the web server? FTP? Would any vanilla web server do?
Lastly, a friend had hinted that I could also use GStreamer to achieve my objective. But, I could not find any good resource on the internet for the same. So, where (and how) does GStreamer fit in the above process?
What format will you be getting out of your camera for your live event? There are many solutions much better adapted to live streaming (the tutorial I wrote is for VOD streams only). You can check out simple solutions like Wowza Streaming Server or Nimble Streamer (free), which take an RTMP stream and transform it into other formats (HLS, DASH, etc.).
Most livestreaming platforms can even do that for you (livestream.com, YouTube, Twitch, or even Facebook now).
The dashified content will be requested as HTTP resources by the browser or other players. For a VoD stream, you indeed just need to make the DASH segments available through a web server. For live content, you need something smarter that encodes, packages the segments, and makes them available on the fly.
GStreamer can transcode and transmux the original content, and can do it on the fly. You will be able to get different output formats, like RTMP, HLS, and probably even MPEG-DASH. You will still need to make your content available via a web server.
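As a rough illustration of that GStreamer role, here is a transmux-only pipeline that repackages an incoming RTMP feed (assumed to already carry H.264 video and AAC audio) into HLS segments on the fly. The element chain is a sketch and the URLs/paths are placeholders:

```shell
# Repackage an H.264/AAC RTMP feed as HLS without re-encoding.
gst-launch-1.0 -e \
  mpegtsmux name=mux ! hlssink \
      playlist-location=/var/www/live.m3u8 \
      location=/var/www/segment%05d.ts \
  rtmpsrc location="rtmp://localhost/live/stream" ! flvdemux name=demux \
  demux.video ! queue ! h264parse ! mux. \
  demux.audio ! queue ! aacparse ! mux.
```

If the source codecs differ, decode and re-encode stages (e.g. decodebin plus x264enc) would be needed between the demuxer and the muxer.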
In conclusion, if you just want to transmit an occasional live event, it's probably a lot easier to use a platform that will ingest your RTMP stream and do all the complicated steps for you.
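On question 1 specifically: recent FFmpeg builds also ship their own dash muxer, which can collapse the encode and package steps into a single command. A sketch (flag names vary between FFmpeg versions, so treat this as an outline):

```shell
# One-step DASH output using FFmpeg's built-in dash muxer.
ffmpeg -i input.mp4 \
       -map 0:v -map 0:a \
       -c:v libx264 -b:v 1500k -c:a aac \
       -f dash -seg_duration 4 manifest.mpd
```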
I would like to achieve the following:
Set up a proxy server to handle video requests by clients (for now, say all video requests from any Android video client) from a remote video server like YouTube, Vimeo, etc. I don't have access to the video files being requested, hence the need for a proxy server. I have settled for Squid. This proxy should process the video signal/stream being passed from the remote server before relaying it back to the requesting client.
To achieve the above, I would either
1. Need to figure out the precise location (URL) of the video resource being requested, download it really fast, and modify it as I want before HTTP streaming it back to the client as the transcoding continues (simultaneously, with some latency)
2. Access the raw byte stream, pipe it into a transcoder (I'm thinking ffmpeg) and proceed with the streaming to client (also with some expected latency).
Option #2 seems tricky to do but lends more flexibility to the kind of transcoding I would like to perform. I would have to actually handle raw data/packets, but I don't know if ffmpeg takes such input.
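On the "does ffmpeg take such input" point: FFmpeg does read raw bytes from stdin via pipe:0 and can write to stdout via pipe:1, so option #2 can be sketched as a plain pipeline. The URL and the final relay stage below are placeholders, not real endpoints:

```shell
# Fetch the remote byte stream, transcode it in flight, and emit MPEG-TS
# on stdout for whatever process streams it back to the client.
curl -sL "http://example.com/source-video" \
  | ffmpeg -i pipe:0 -c:v libx264 -preset veryfast -c:a aac -f mpegts pipe:1 \
  | your_streaming_relay   # placeholder for the component serving the client
```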
In short, I'm looking for a solution to implement real-time transcoding of videos that I do not have direct access to from my proxy. Any suggestions on the tools or approaches I could use? I have also read about Gstreamer (but could not tell if it's applicable to my situation), and MPlayer/MEncoder.
And finally, a rather specific question: Are there any tools out there that, given a YouTube video URL, can download the byte stream for further processing? That is, something similar to the Chrome YouTube downloader but one that can be integrated with a server-side script?
Thanks for any pointers/suggestions!
You should ask single coding questions. What you asked is more like a general "how would I write my application". A few comments, though:
Squid is an HTTP proxy; video is usually streamed over, e.g., RTSP.
Yes, there are tools that grab the RTSP URL from a YouTube URL; be sure you understand the terms of use of the video service before going that way, though.
GStreamer has a gst-rtsp-server module that contains an RTSP server, which can also be used as a proxy for a given RTSP stream.
I want to have a stress/performance testing for my content management site, especially for hosted streamed video part. I am using IIS to host the videos. More specifically, I am using the new Windows Server 2008 x64 and IIS 7.0.
My points of confusion are:
I plan to write code that starts many threads; each thread sends a web request to the video URL and reads the response stream from the server. But I'm not sure whether this behaves the same as a real user rendering the video in a player (in my code I just read the stream, without actually playing it or writing it anywhere). I want the test to resemble the real scenario as closely as possible.
I also plan to use a real media player to render the video (Media Player or whatever). My concern is that if I start multiple players on my test machine, each will use hardware or other resources (video-card memory?) to decode/render the video (not sure, needs guru help to confirm), so there may be contention between the players. If there is contention, it's also not a realistic end-user scenario: few users will start 100 players on their machine. :-)
Does anyone have any advice to me?
BTW: I prefer to use any .Net based solution, but not a must.
thanks in advance,
George
You could use MPlayer. It has a lot of command-line options. I don't know how many of these options are available under Windows, but under Linux something like this is possible:
mplayer some_url -dumpvideo -dumpfile some_file
It should behave the same as a "normal" player, I think, and your test machine won't need to handle hundreds of decompression threads, so it fits your needs 1 and 2.
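Scaled up, the load test could be a simple loop that spawns many dumping clients. N and the URL are examples; -dumpstream writes the whole raw stream without decoding it, so the test box only pays for network and disk I/O:

```shell
# Spawn N stream-dumping clients against the server to generate load.
N=50
URL="http://yourserver/videos/test.wmv"   # placeholder test URL
for i in $(seq 1 "$N"); do
  mplayer -dumpstream -dumpfile /dev/null "$URL" &
done
wait   # block until every simulated client finishes
```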
If you know the bit rate of your video stream, you can pace your download requests to simulate video-player clients. The bit rate can be calculated from information carried in the stream, though that's a little more complicated. There is also software for stress-testing video servers, such as this IP Video Monitor.
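Pacing each simulated client to the stream's bit rate makes it behave like a player buffering in real time. The bit rate below is an example (in practice you'd read it from the stream metadata, e.g. with ffprobe), and the curl target is a placeholder:

```shell
# Derive a download rate limit from the stream's bit rate.
BITRATE_KBPS=800
LIMIT_KBS=$((BITRATE_KBPS / 8))   # curl's --limit-rate wants bytes/sec: kbit/8 = KB/s
echo "pacing at ${LIMIT_KBS}K/s"
# Real run against your server (commented out; URL is a placeholder):
# curl --limit-rate "${LIMIT_KBS}K" -o /dev/null "http://yourserver/videos/test.wmv"
```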
I need to set up a simple IVR system for a friend's company that lets the caller navigate a menu by pressing phone keys. It's kind of like a bus schedule:
for today's schedule press '1', for tomorrow's schedule press '2' and
so on.
It is solely an information system, i.e. no navigation route will end up with a real person but only audio messages will be played.
Now, I've never set up anything like this before and did a little digging on Google. It seems like I'll be able to achieve this using Asterisk.
What else do I need hardware-wise?
Is a simple Linux server and a VOIP account with a provider in Germany sufficient?
Will a VPS handle the task?
How about multiple concurrent incoming calls?
Are those handled by Asterisk?
It's perfectly possible.
What you need to know:
Asterisk has some problems with H.323; if your provider offers SIP, ask for SIP instead.
You may build a whole IVR on dial plans in your extensions.conf, but for complex tasks it's better to use AGI. These are scripts, in Perl, Python, or whatever language, that implement your IVR logic. Each AGI session spawns a child process; use FastAGI and a network daemon if you expect frequent connections.
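A bare-bones illustration of the AGI conversation in shell (real deployments typically use a Perl or Python AGI library; the prompt file name is hypothetical):

```shell
#!/bin/sh
# Asterisk sends its variables on stdin as "agi_name: value" lines,
# terminated by a blank line; the script replies with commands on stdout.
while IFS= read -r line; do
  [ -z "$line" ] && break       # blank line ends the variable dump
done

agi() {
  printf '%s\n' "$1"            # send an AGI command to Asterisk
  IFS= read -r reply || true    # consume the "200 result=..." response
}

agi 'STREAM FILE schedule-today ""'   # hypothetical prompt file
agi 'HANGUP'
```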
Multiple concurrent calls are not a problem; my installation of Asterisk on a simple PC handles hundreds of simultaneous calls.
The only things that may really affect performance are sound conversion and tone detection.
To improve performance, you should:
Stick to one codec (µ-law is what I use), force all SIP connections to use that codec, and preconvert all your sound files to it using sox -t ul. Once you've done that, all Asterisk has to do is read the file bytes from disk and send them over the network with just basic wrapping. There is no math, nothing except simple read-wrap-send operations.
Ask your provider to detect tones on their side and send them to you out of band, per RFC 2833. Tone detection is a fairly CPU-consuming operation; let them do it themselves.
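The preconversion step above might look like this (8 kHz mono is what µ-law/G.711 telephony expects; the directory and file names are examples):

```shell
# Batch-convert WAV prompts to raw µ-law files Asterisk can stream directly.
for f in prompts/*.wav; do
  sox "$f" -t ul -r 8000 -c 1 "${f%.wav}.ulaw"
done
```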
I personally run Asterisk on a 2.66 GHz Celeron with 2048 MB of RAM, under Fedora 10 x86_64. 150 simultaneous connections work fine; there are no delays.
Overall traffic amounts to about 9.6 KB/s per connection. For a modern VPS, that should be no problem at all.
Asterisk rocks. For a few lines a simple P3 or better will do. Don't virtualise the PBX; Asterisk relies on pretty accurate timing.
FreePBX makes it really easy to set up an IVR: it has a decent web-based front end and supports some cool Asterisk tools out of the box.
EDIT: FreePBX isn't Asterisk - it is a pretty interface that generates the configs for you. Trixbox includes it by default if you want a simple point and shoot solution.
If your VoIP account supports multiple incoming lines then Asterisk will use them just fine. You also need sufficient Internet bandwidth and decent QoS. For more than one line on a business system I would insist on a dedicated connection so you don't experience dropouts when users are accessing the net.
The best way to build IVR applications is to use VoiceXML, a W3C standard (http://www.w3.org/TR/voicexml21/). Asterisk does not come with a VoiceXML browser, but there are companies that provide one for Asterisk, such as SoftSyl Technologies (http://www.softsyl.com).
Companies like Cisco and Avaya also provide VoiceXML browser but they are not for Asterisk.
If you're completely fresh, I would suggest studying FreeSWITCH instead of Asterisk. It's much better structured, and also comes with some pre-built examples, including an IVR menu, and the IVR syntax is pretty simple: http://wiki.freeswitch.org/wiki/IVR_Menu
I'm running a FreeSWITCH instance on a Xen virtual server, and it runs perfectly with multiple simultaneous calls.
IVR design in Asterisk is not difficult, but there is a bit of a "learning cliff" associated with getting your first Asterisk server up and running.
As someone else stated, call quality is everything. Pay to have professional-grade recordings made for your IVR prompts and announcements. Ensure you're using 64k codecs such as µ-law and A-law; GSM (cellphone) may be cheap on bandwidth, but it breaks your callers' expectations about quality.
I strongly suggest you put the IVR into its own dial-plan context and then direct calls into it. That makes managing things like menu choices much easier. For each subset of options, use a different dial-plan context.
Try to keep your menu "shallow". If it takes more than three menu choices to get the information your customer is looking for, they are very likely to hang up, or just press "0" to talk to a human. That defeats the point of your IVR.
If you are going to do something fairly cool with database look-ups, or account authentication or the like, I'd recommend using an "AGI" - Asterisk Gateway Interface - application. My personal favorite is "Adhearsion", which blends well with Ruby/Rails on the DB/Web side.
If you need help or more info, let me know.
For more complex IVRs you can try the Astive Toolkit, especially if you need database or web-service integration.
I've worked with IVR in the past, but mainly with large systems, and have never used Asterisk. I took a quick look at their website (http://www.asterisk.org/), though, and it seems very informative; have you checked there?
It's not programming related, but...
Take a look at trixbox.org. It supports configuration of everything from Cisco to Snom phones.
It's an Asterisk/FreePBX mod with everything under a nice UI!
I have a provider in Australia; I added them as a GSM trunk, and it took 3 hours to set up 4 phones. IVR is supported.
The only possible problem would be the voice quality of the recordings.
It's quite simple. I'm using sipgate.de as the provider for my Asterisk.
You need to set up a dial plan, which is also quite simple; take a look here.
You should also look into extensions.conf; there are some samples inside, including one that fits your problem.
To connect to sipgate, look in their knowledge base; there are some samples for Asterisk configuration.
sipgate is free, except for outgoing calls.
You can do this in the dial plan...
[menu-main]
exten => s,1,Noop()
exten => s,n(msg),Background(ForTodayPress1TomorrowPress2)
exten => 1,1,Goto(menu-today,s,1)
exten => 2,1,Goto(menu-tomorrow,s,1)
exten => i,1,Playback(invalid)
exten => i,n,Goto(s,msg)
exten => t,1,Goto(s,msg)
[menu-today]
etc...
[menu-tomorrow]
etc...
Or, as someone else suggested, you can do it in any language that can read from stdin and write to stdout. The phpagi implementation is my particular favorite flavor. It might fit into this example like so, with the PHP running on a separate box so it doesn't affect the PBX under load.
[menu-main]
exten => s,1,Noop()
exten => s,n(msg),Background(ForTodayPress1TomorrowPress2)
exten => 1,1,Goto(menu-today,s,1)
exten => 2,1,Goto(menu-tomorrow,s,1)
exten => i,1,Playback(invalid)
exten => i,n,Goto(s,msg)
exten => t,1,Goto(s,msg)
[menu-today]
exten => s,1,Noop()
exten => s,n,AGI(agi://myapache/readschedule.php)
exten => s,n,Hangup()
If you want to set up an Asterisk IVR, you can also use a drag-and-drop web-based tool to build a simple auto-attendant (like in your example) or a complex IVR (with scripting or database-driven logic).
One option is Cally Square. Have a look here:
http://www.callysquare.com/