SIP communication with WebSocket (WebRTC)

SIP (Session Initiation Protocol) does not understand WebSocket, so we need a SIP proxy, which is basically a translator between SIP and WebSocket.
I am following this architecture for SIP handshaking over WebSocket. I have a few questions.
Which SIP proxy should be used to make audio and video calls? In the Gateway to SIP module I am using Asterisk. How can Asterisk be used for video calls, and is there any codec available for video? Please share some useful links.
Your kind answers will be highly appreciated.

Check out They provide a JavaScript API that uses SIP over WebSocket on the client side, and they also have a SIP proxy and server (which also works with Asterisk and Kamailio). They are the authors of RFC 7118, "The WebSocket Protocol as a Transport for the Session Initiation Protocol (SIP)".

That's only one way to do it. There are many ways.
You have to distinguish between the signaling path and the media path.
On the signaling path, you have to choose a signaling protocol and a corresponding transport protocol. A browser can use WebSocket as the transport and SIP as the protocol as far as signaling is concerned. On the legacy SIP side, you need SIP over UDP, so only the transport of the signaling has to change, not the signaling protocol itself.
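To illustrate "change the transport, not the protocol", here is a rough Python sketch of what a proxy does when it forwards a request that arrived over WebSocket to a UDP next hop: it stacks its own Via header on top and leaves the rest of the SIP message alone. The host names and branch values are made-up placeholders, and a real proxy does more than this (decrementing Max-Forwards, Record-Route handling and so on).

```python
# Sketch only: forwarding a SIP request from a WebSocket leg to a UDP leg.
# All host names and branch values below are hypothetical placeholders.

def add_via(message: str, transport: str, host: str, branch: str) -> str:
    """Return the SIP request with a new topmost Via header for the next hop."""
    head, sep, body = message.partition("\r\n\r\n")
    lines = head.split("\r\n")
    new_via = f"Via: SIP/2.0/{transport} {host};branch={branch}"
    # The request line stays first; the new Via goes directly below it,
    # so it sits above the Via that the browser added.
    lines.insert(1, new_via)
    return "\r\n".join(lines) + sep + body


# What a browser client might send over a secure WebSocket (transport "WSS").
browser_request = "\r\n".join([
    "OPTIONS sip:bob@example.com SIP/2.0",
    "Via: SIP/2.0/WSS client.invalid;branch=z9hG4bKbrowser1",
    "Max-Forwards: 70",
    "From: <sip:alice@example.com>;tag=a1",
    "To: <sip:bob@example.com>",
    "Call-ID: call-1@client.invalid",
    "CSeq: 1 OPTIONS",
    "Content-Length: 0",
    "", "",
])

# The proxy forwards the same request over UDP to the legacy side.
forwarded = add_via(browser_request, "UDP", "proxy.example.com:5060", "z9hG4bKproxy1")
print(forwarded)
```

The method, request URI and the remaining headers are identical on both legs; only the Via stack records that the hop transport changed.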
On the media path, you have two problems: the encryption and the codec. Encryption is mandatory in WebRTC and not in SIP. You need a B2BUA to make the transition between both worlds.
On the codec side, you either choose a codec supported on both sides, or you have to transcode. The use of a media server seems mandatory here. If you have multiple parties in a conference, you will need to mix the audio and compose the video to send it to legacy SIP, in which case your media server should be an MCU.
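The "overlapping codec or transcode" decision can be sketched in a few lines of Python. The codec lists below are illustrative assumptions, not taken from any particular browser or PBX.

```python
from typing import Optional, Sequence


def pick_common_codec(browser: Sequence[str], legacy: Sequence[str]) -> Optional[str]:
    """Return the first browser-preferred codec that the legacy side also supports."""
    legacy_names = {name.lower() for name in legacy}
    for codec in browser:
        if codec.lower() in legacy_names:
            return codec
    return None  # no overlap: a media server would have to transcode


# Illustrative codec lists (assumptions for this example only).
webrtc_audio = ["opus", "G722", "PCMU", "PCMA"]
legacy_audio = ["PCMA", "PCMU", "G729"]
webrtc_video = ["VP8"]
legacy_video = ["H263"]

audio_choice = pick_common_codec(webrtc_audio, legacy_audio)
video_choice = pick_common_codec(webrtc_video, legacy_video)
print(audio_choice)  # PCMU
print(video_choice)  # None, so transcoding would be needed
```

In practice this negotiation happens through the SDP offer/answer exchange; the function only shows the decision itself.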
Finally, you also have a discovery and identity problem. During the original handshake, SIP expects a user ID and a domain (which is either a DNS entry or a fixed IP), while WebRTC uses ICE. Here again, it is very likely that you need a B2BUA to bridge both worlds.
Asterisk, Kamailio and FreeSWITCH are likely to handle most of the above for the simple cases (one-to-one, audio). For anything more complicated, you're on your own. You might want to look at that was made by Digium, the company behind Asterisk.


Is there an application-agnostic signaling protocol?

The use case is this. We have an open-source library for a multi-agent system that supports several protocols at the application layer of the OSI model. At the moment HTTP, XMPP, and ZeroMQ are supported, for example. We would like to add high-bandwidth real-time streaming possibilities. It is logical to use RTP for that.
So, to recapitulate, we already have a connection to the other party that we can use for signaling. We only want to negotiate a new channel for data communication.
However, the current signaling standards all seem to be tied to their application. These current "standards" seem to be SIP, RTSP, and Jingle. They all seem to use RTP or SRTP on the application layer, and UDP on the transport layer. See e.g. XEP-0167.
The only thing we want to negotiate is another connection to that party that can be used for data transmission. In the Session Description Protocol all kinds of stuff about media shows up, optional phone numbers, etc. If someone can point at a signaling protocol that is meant to be application-agnostic, that would be great!
I'm a big fan of XMPP and I think you'll get what you need with it. However, since you already have HTTP as well, I want to mention that PubSubHubbub can also be used for that!
The current version of the protocol applies to any MIME type that can be transported with HTTP, so that would work.
In practice it's just a webhooks API, which makes it easy to use and to scale via load balancing.
"Is there an application-agnostic signaling protocol?"
Yes, there are lots, and you already mention a number of them, such as XMPP, SIP and RTSP. You could also add the brand new WebRTC protocol to the list.
"We would like to add high-bandwidth real-time streaming possibilities. It is logical to use RTP for that."
Yes. RTP is lightweight and, as its name suggests, was designed for carrying real-time traffic. It's also popular, so you will be able to find numerous existing implementations.
"The only thing we want to negotiate is another connection to that party that can be used for data transmission. In the Session Description Protocol all kinds of stuff about media shows up, optional phone numbers, etc. If someone can point at a signaling protocol that is meant to be application-agnostic, that would be great!"
I'm not sure what you mean here. Session Description Protocol (SDP) is a standard way to describe the media capabilities of a device. It's commonly used in SIP and RTSP (and XMPP has something equivalent) however it's separate from those protocols and if you don't want to use it you are free to come up with your own way of describing media.
You may be getting overwhelmed by some of the SDP examples, and they can indeed get very complicated when there are multiple streams and codecs offered. However an SDP payload can also be very simple; below is an SDP example for an RTSP server offering a single MJPEG video stream.
o=- - 0 IN IP4
t=0 0
m=video 0 RTP/AVP 26
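To show how little there is to a simple SDP payload, here is a minimal Python sketch that splits a description into session-level lines and media sections. The complete description used below is a hypothetical one written for this example (the address is a documentation placeholder); it is not the original poster's exact payload. In the RTP/AVP profile, payload type 26 is JPEG.

```python
def parse_sdp(text: str) -> dict:
    """Parse SDP lines of the form <type>=<value> into session and media parts."""
    session, media = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if len(line) < 2 or line[1] != "=":
            continue  # skip blank or malformed lines
        key, value = line[0], line[2:]
        if key == "m":
            # m=<media> <port> <proto> <fmt> ...
            kind, port, proto, *formats = value.split()
            media.append({"kind": kind, "port": int(port),
                          "proto": proto, "formats": formats, "lines": []})
        elif media:
            # Lines after an m= line belong to that media section.
            media[-1]["lines"].append((key, value))
        else:
            session.append((key, value))
    return {"session": session, "media": media}


# Hypothetical complete description; 192.0.2.1 is a placeholder address.
example = """v=0
o=- 0 0 IN IP4 192.0.2.1
s=-
t=0 0
m=video 0 RTP/AVP 26
"""

parsed = parse_sdp(example)
print(parsed["media"])
```

Everything an application needs to open the extra channel (media kind, port, transport profile and payload format) sits in that single m= line.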
If you just need a signalling protocol that is system and application agnostic, XMPP is the way to go.

SIP over websockets to true SIP

I'm trying to implement a SIP server that an HTML SIP client (made using sipml5) can connect to. During my research I came across SIP over WebSockets, which might be useful to me. However, I am unsure whether a user agent connecting through SIP over WebSockets to a compatible server would then be able to successfully call someone using an incompatible server (i.e. calling from SIP over WebSockets to true SIP).
I know webrtc2sip can be used for connecting to legacy networks, but I would rather avoid using another proxy if at all possible. So, is it possible to connect to a compatible SIP server using SIP over WebSockets and then make a call from this user agent to another that does not support SIP over WebSockets, without using a gateway?
You are right, SIP over WebSockets is a draft, not a specification, and I do not know many SIP vendors who support this draft.
A possible solution is a true WebSocket-to-SIP gateway. For example, Flashphoner Web Call Server is implemented as a gateway that talks WebSocket to the browser and SIP (TCP and UDP) to SIP servers. It is therefore compatible with any server that supports RFC 3261, the standard SIP specification.
Brief signaling scheme is:
Browser - [Websockets] - Web Call Server - [SIP TCP, UDP] - any SIP Server
Brief streaming scheme:
Browser - [WebRTC = SRTP, DTLS, ICE, STUN ] - Web Call Server - [RTP UDP] - any SIP/RTP Server
An alternative is to use Kamailio, as it understands both plain SIP and SIP over WebSocket.
When you say "implementing a SIP server", do you mean a simple registrar or proxy server, or do you want call control logic, presence and other features?
In all cases Kamailio fulfills the requirements, and it is open source.
The Mobicents SIP Servlets examples already provide a B2BUA application that takes care of that for you. The media is peer to peer (or goes through a TURN relay server), but if you need to bridge to a media server, you can patch the SDP body so that each party's media goes through the media server (provided it supports the WebRTC media requirements, such as the codecs and DTLS-SRTP) to add capabilities like conferencing and recording.

WebRTC vs Websockets: If WebRTC can do Video, Audio, and Data, why do I need Websockets? [closed]

So I'm looking to build a chat app that will allow video, audio, and text. I spent some time researching into Websockets and WebRTC to decide which to use. Since there are plenty of video and audio apps with WebRTC, this sounds like a reasonable choice, but are there other things I should consider?
Feel free to share your thoughts.
Things like:
Because it is new, WebRTC is available only in some browsers, while WebSockets seems to be available in more browsers.
Scalability - Websockets uses a server for session and WebRTC seems to be p2p.
Multiplexing/multiple chatrooms - Used in Google+ Hangouts, and I'm still viewing demo apps on how to implement.
Server - Websockets needs RedisSessionStore or RabbitMQ to scale across multiple machines.
WebRTC is designed for high-performance, high quality communication of video, audio and arbitrary data. In other words, for apps exactly like what you describe.
WebRTC apps need a service via which they can exchange network and media metadata, a process known as signaling. However, once signaling has taken place, video/audio/data is streamed directly between clients, avoiding the performance cost of streaming via an intermediary server.
WebSocket on the other hand is designed for bi-directional communication between client and server. It is possible to stream audio and video over WebSocket (see here for example), but the technology and APIs are not inherently designed for efficient, robust streaming in the way that WebRTC is.
As other replies have said, WebSocket can be used for signaling.
I maintain a list of WebRTC resources; I strongly recommend you start by looking at the 2013 Google I/O presentation about WebRTC.
WebSockets use TCP.
WebRTC is mainly UDP.
Thus the main reason for using WebRTC instead of WebSockets is latency.
With WebSocket streaming you will have either high latency or choppy playback with low latency. With WebRTC you can achieve low-latency and smooth playback, which is crucial for VoIP communications.
Just try testing these technologies with some network loss, e.g. 2%. You will see high delays in the WebSocket stream.
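The effect of a single lost packet can be seen in a toy Python model. This is a simplified sketch, not a network simulator: the 20 ms packet interval, 50 ms one-way delay and 200 ms retransmission timeout are assumed numbers chosen for illustration.

```python
from typing import List, Optional


def delivery_times(n: int, lost: int, reliable: bool, interval_ms: int = 20,
                   delay_ms: int = 50, rto_ms: int = 200) -> List[Optional[int]]:
    """Time (ms) at which each of n packets reaches the application.

    Packet number `lost` is dropped once. A reliable, in-order transport
    (TCP-like) retransmits it after rto_ms and holds back later packets until
    it arrives. An unreliable transport (UDP-like) just never delivers it.
    """
    times: List[Optional[int]] = []
    ready = 0  # earliest time the next in-order packet can be handed over
    for i in range(n):
        sent = i * interval_ms
        arrival = sent + delay_ms
        if i == lost:
            if not reliable:
                times.append(None)  # lost for good, nothing waits for it
                continue
            arrival = sent + rto_ms + delay_ms  # arrives after retransmission
        if reliable:
            arrival = max(arrival, ready)  # head-of-line blocking
            ready = arrival
        times.append(arrival)
    return times


tcp_like = delivery_times(6, lost=2, reliable=True)
udp_like = delivery_times(6, lost=2, reliable=False)
print(tcp_like)  # [50, 70, 290, 290, 290, 290]
print(udp_like)  # [50, 70, None, 110, 130, 150]
```

With the reliable transport, every packet behind the lost one is stuck until the retransmission arrives, so playback either stalls or needs a large buffer. With the unreliable one, a single frame is missing and everything else arrives on time.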
WebSockets:
Ratified IETF standard (RFC 6455) with support across all modern browsers and even legacy browsers using the web-socket-js polyfill.
Uses an HTTP-compatible handshake and default ports, making it much easier to use with existing firewall, proxy and web server infrastructure.
Much simpler browser API. Basically one constructor with a couple of callbacks.
Client/browser to server only.
Only supports reliable, in-order transport because it is built on TCP. This means packet drops can delay all subsequent packets.
WebRTC:
Just beginning to be supported by Chrome and Firefox. MS has proposed an incompatible variant. The DataChannel component is not yet compatible between Firefox and Chrome.
WebRTC is browser to browser in ideal circumstances, but even then almost always requires a signaling server to set up the connections. The most common signaling server solutions right now use WebSockets.
Transport layer is configurable, with the application able to choose whether the connection is in-order and/or reliable.
Complex and multilayered browser API. There are JS libs that provide a simpler API, but these are young and rapidly changing (just like WebRTC itself).
webRTC or websockets? Why not use both.
When building a video/audio/text chat, webRTC is definitely a good choice since it uses peer to peer technology and once the connection is up and running, you do not need to pass the communication via a server (unless using TURN).
When setting up the WebRTC communication you have to involve some sort of signaling mechanism. WebSockets could be a good choice here, but WebRTC is the way to go for the video/audio/text data itself. Chat rooms are handled in the signaling.
But, as you mention, not every browser supports webRTC, so websockets can sometimes be a good fallback for those browsers.
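As a rough illustration of "chat rooms are handled in the signaling", here is a minimal in-memory relay in Python. The class and method names are invented for this sketch. In a real server each `send` callable would be the send method of a peer's WebSocket connection; here plain lists stand in for the connections.

```python
import json
from collections import defaultdict
from typing import Callable, Dict


class SignalingHub:
    """Relays opaque signaling messages (offer, answer, ICE candidates)
    between the peers that joined the same room."""

    def __init__(self) -> None:
        # room name -> {peer id -> function that delivers a text message}
        self.rooms: Dict[str, Dict[str, Callable[[str], None]]] = defaultdict(dict)

    def join(self, room: str, peer_id: str, send: Callable[[str], None]) -> None:
        self.rooms[room][peer_id] = send

    def leave(self, room: str, peer_id: str) -> None:
        self.rooms[room].pop(peer_id, None)

    def relay(self, room: str, sender: str, payload: dict) -> int:
        """Forward payload to every other peer in the room; return the count."""
        message = json.dumps({"from": sender, **payload})
        delivered = 0
        for peer_id, send in self.rooms[room].items():
            if peer_id != sender:
                send(message)
                delivered += 1
        return delivered


hub = SignalingHub()
alice_inbox, bob_inbox, carol_inbox = [], [], []
hub.join("room-1", "alice", alice_inbox.append)
hub.join("room-1", "bob", bob_inbox.append)
hub.join("room-2", "carol", carol_inbox.append)

# Alice's offer reaches Bob (same room) but not Carol (different room).
hub.relay("room-1", "alice", {"type": "offer", "sdp": "<SDP text>"})
print(bob_inbox)
```

The server never looks inside the offer or answer; it only routes messages by room, and the media then flows peer to peer.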
Security is one aspect you missed.
With Websockets the data has to go via a central webserver which typically sees all the traffic and can access it.
With WebRTC the data is end-to-end encrypted and does not pass through a server (except sometimes TURN servers are needed, but they have no access to the body of the messages they forward).
Depending on your application this may or may not matter.
If you are sending large amounts of data, the saving in cloud bandwidth costs due to webRTC's P2P architecture may be worth considering too.
Comparing websocket and webrtc is unfair.
WebSocket is built on top of TCP. Unlike raw TCP, message boundaries can be detected from the header of each WebSocket frame.
Typically, WebRTC makes use of WebSocket. The signaling for WebRTC is not defined; it is up to the service provider what kind of signaling to use. It may be SIP, HTTP, JSON or any text/binary message.
The signaling messages can be sent and received using WebSocket.
WebRTC provides the peer-to-peer connection.
Before a peer-to-peer connection can be created, the peers need a handshaking process to establish it.
WebSockets can play the role of that handshaking process.
WebSocket and WebRTC can be used together: WebSocket as the signaling channel for WebRTC, and WebRTC as the video/audio/text channel. WebRTC can run directly over UDP or through a TURN relay, and a TURN relay can also run over TCP, HTTP or HTTPS.
Many projects use Websocket and WebRTC together.

Interoperability of SIP/H.323/IAX2

I am curious to know whether interoperability exists between these three protocols. For example, can a call that originates over SIP go through H.323? A link to an article or book about this topic would be much appreciated. Thanks.
SIP, H.323 and IAX2 are all different protocols and are not directly interoperable. That is, you cannot connect a SIP phone to an H.323 device and make a call.
The problems these protocols solve are all similar (e.g. making a voice or video call). Protocol converters and other devices (like gateways) are available and can do the conversion.
You may also have to transcode the audio and video data from one codec to another, but you may also have to do that on a SIP-SIP or H.323-H.323 call.
Many PBXes and softswitches support both SIP and H.323; Asterisk supports all three (SIP, H.323 and IAX2).

How to establish a TCP Socket connection from a web browser (client side)?

I've read about WebSockets but they don't seem to be pure "sockets", because there is an application layer protocol over them. "ws:"
Is there any way of doing a pure socket connection from a web browser, to enliven webpages?
Here are my random stabs in the dark
Applets sockets provided by Java (need java installed)
Flash sockets provided by Flash (need flash installed)
But about HTML5, Why are they called WebSockets if they aren't Sockets?
Is the websocket protocol so simple to implement that it is "almost"-sockets?
"I've read about WebSockets but they don't seem to be pure "sockets", because there is an application layer protocol over them."
"[Is the] websocket protocol so simple to implement that [it is] "almost"-sockets?"
Allowing regular socket connections directly from the browser is never going to happen because it opens up a huge risk. WebSockets is about as close to raw sockets from the browser as you are going to get. The initial WebSockets handshake is similar to an HTTP handshake (allowing web servers to proxy/bridge it) and adds CORS type security. In addition, WebSockets is a message based transport (rather than streaming as raw TCP) and this is done using a two byte header on each message frame.
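To give a feel for how HTTP-like the handshake is: the client sends an ordinary HTTP Upgrade request with a random Sec-WebSocket-Key header, and the server proves it understood by returning a Sec-WebSocket-Accept value derived from that key. A minimal Python sketch of that derivation, using the sample key from RFC 6455:

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key and expected result taken from RFC 6455.
result = accept_key("dGhlIHNhbXBsZSBub25jZQ==")
print(result)  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

After the server replies with "101 Switching Protocols" and this header, the same TCP connection stops carrying HTTP and starts carrying WebSocket frames.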
Even flash is not able to quite make raw TCP connections. Flash sockets also add CORS security, but instead of an in-band handshake, flash socket connections make a connection to port 843 on the target server to request a security policy file.
"Is there any way of doing a pure socket connection from a web browser, to enliven webpages?"
Yes, you can use my websockify bridge/proxy which allows a WebSockets enabled browser to connect directly to a TCP socket via websockify.
"But about HTML5, Why are they called WebSockets if they aren't Sockets?"
WebSockets are a transport built on TCP sockets. After the handshake there is very minimal overhead (typically just a two byte header).
I can't improve on Kanaka's answers to your secondary questions, and I know this question is a year old. But for the main question, Is there any way of doing a pure socket connection from a web browser, to enliven webpages? There is a project called the Java / JavaScript Socket Bridge that might be what you (or anyone coming across this page from a Google search) are looking for. The advantage of this method over what others have mentioned is that it does not require either a client-side or a server-side service to be run. So, for instance, if you wanted to implement an IRC client purely in JavaScript but your web host does not allow you sufficient rights to proxy the connection, this Java applet would be the way to go. The only concern is making sure the client has Java installed and allowed.
You can just send data between a client and a server with WebSockets. Simply speaking, the only difference that WebSockets introduces is that the client:
adds some header bytes, like the type of data and the length
adds masks and encodes the data using them
The server also has to add header bytes, but does not need to encode the data.
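The header and masking steps described above can be sketched in Python as follows. This is a minimal illustration of the RFC 6455 framing for a single unfragmented frame; it leaves out fragmentation, control frames and error handling, and the function names are my own.

```python
import os
import struct


def encode_frame(payload: bytes, opcode: int = 0x1, mask: bool = True) -> bytes:
    """Build one unfragmented frame. Clients mask their frames, servers do not."""
    first = 0x80 | opcode  # FIN bit plus opcode (0x1 = text, 0x2 = binary)
    length = len(payload)
    mask_bit = 0x80 if mask else 0
    if length < 126:
        header = struct.pack("!BB", first, mask_bit | length)
    elif length < 65536:
        header = struct.pack("!BBH", first, mask_bit | 126, length)
    else:
        header = struct.pack("!BBQ", first, mask_bit | 127, length)
    if not mask:
        return header + payload
    key = os.urandom(4)  # fresh random masking key for every frame
    masked = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return header + key + masked


def decode_frame(frame: bytes):
    """Return (opcode, payload) for one complete frame, unmasking if needed."""
    opcode = frame[0] & 0x0F
    masked = bool(frame[1] & 0x80)
    length = frame[1] & 0x7F
    offset = 2
    if length == 126:
        (length,) = struct.unpack_from("!H", frame, offset)
        offset += 2
    elif length == 127:
        (length,) = struct.unpack_from("!Q", frame, offset)
        offset += 8
    key = b""
    if masked:
        key = frame[offset:offset + 4]
        offset += 4
    data = frame[offset:offset + length]
    if masked:
        data = bytes(b ^ key[i % 4] for i, b in enumerate(data))
    return opcode, data


server_frame = encode_frame(b"Hello", mask=False)  # two-byte header + payload
client_frame = encode_frame(b"Hello", mask=True)   # header + 4-byte key + masked payload
print(server_frame)                # b'\x81\x05Hello'
print(decode_frame(client_frame))  # (1, b'Hello')
```

For short messages the server-to-client overhead really is just two bytes; client-to-server frames carry four more bytes for the masking key.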
If you implement the protocol correctly (server side, that is, since the browser already has an implementation), you can use it with ease to send text and binary data. (Although browser support is narrow, especially for the latter.)
The benefit of WebSocket is that it is HTTP-based, so you can also use it in environments where HTTP proxies are used. WebSocket therefore has higher infrastructure compatibility than plain TCP.
Additionally http/WebSocket is providing you some features which you otherwise have to specify on your own:
NAT keepalive
Multiplexing via URI
If you are asking for data to be pushed from the server, this is widely termed COMET or Reverse Ajax.
WebSockets are still not very popular, as there are inherent firewall issues and only minimal support so far from popular browsers.
You can take a look at as this is one of the most popular implementations (but native to Unix/Linux only for now; for Windows they suggest using a VirtualBox or VMware based implementation).
