I have been building a live chat room in my WebSocket app, using socket.io.
I also have a second feature that needs real-time updates and is unrelated to the live chat. Should I run it as a separate app on a different port?
Does that provide any technical benefit for performance or scalability? Or should a website use a single app that handles all WebSocket data?
I'm no expert on servers, so I was wondering whether there is a technically correct approach for this.
Please note: I'm not looking for opinions, only the factual pros and cons of separating versus centralising, assuming there are any at all.
The documentation for a lot of WebSocket frameworks doesn't go into much detail on this particular aspect.
WebSocket Apps and Ports
Q: So I wanted to know should I run a separate app on a different port?
The short answer is yes. Unrelated services may grow at different rates, so separate the functionality as much as you can; that lets you deploy one or more services on the same box on different ports.
The longer answer is that you should do this to separate things internally, but it sure would be great if you could let your clients access all your services over ports 80 or 443 - with WebSockets you can use the WebSocket URL to distinguish services and share the same port.
To achieve this, you can introduce a WebSocket gateway application that supports port-sharing. Then your clients can hit port 80 for every WebSocket service, but you utilize different ports for your separate apps.
If your WebSocket gateway application supports port-sharing for WebSockets, it means your server accepts connections on the same port with different paths, e.g.
ws://example.com:80/chat_service
ws://example.com:80/unrelated_service
In general to get traffic over the Internet at large it's better to use ports 80 and 443, port-shared if you can, to pierce through any number of intermediary proxies/firewalls along the way.
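To make the port-sharing idea concrete, here is a minimal sketch of a path-routing gateway using only Node's built-in http and net modules. It is not how any particular gateway product works; the routing table, the internal ports 8081/8082 and the function names are assumptions for illustration. It replays the WebSocket upgrade handshake to the backend chosen by the URL path and then pipes bytes both ways.

```javascript
const http = require('http');
const net = require('net');

// Hypothetical routing table: public URL path -> internal port of the app.
const routes = { '/chat_service': 8081, '/unrelated_service': 8082 };

// Returns the internal port for a request URL, or null if no app is registered.
function pickBackendPort(url) {
  const path = url.split('?')[0];
  return Object.prototype.hasOwnProperty.call(routes, path) ? routes[path] : null;
}

function createGateway() {
  // Plain HTTP requests are refused; this gateway only handles upgrades.
  const server = http.createServer((req, res) => {
    res.writeHead(426, { 'Content-Type': 'text/plain' });
    res.end('WebSocket upgrade required\n');
  });

  server.on('upgrade', (req, socket, head) => {
    const port = pickBackendPort(req.url);
    if (port === null) {
      socket.end('HTTP/1.1 404 Not Found\r\n\r\n');
      return;
    }
    const upstream = net.connect(port, '127.0.0.1', () => {
      // Replay the original handshake request to the chosen backend.
      let raw = `${req.method} ${req.url} HTTP/${req.httpVersion}\r\n`;
      for (let i = 0; i < req.rawHeaders.length; i += 2) {
        raw += `${req.rawHeaders[i]}: ${req.rawHeaders[i + 1]}\r\n`;
      }
      upstream.write(raw + '\r\n');
      if (head.length) upstream.write(head);
      // From here on the gateway just relays bytes in both directions.
      socket.pipe(upstream);
      upstream.pipe(socket);
    });
    upstream.on('error', () => socket.destroy());
    socket.on('error', () => upstream.destroy());
  });

  return server;
}

// Usage (binding port 80 usually needs elevated privileges):
// createGateway().listen(80);
```

Clients then connect to ws://example.com:80/chat_service or ws://example.com:80/unrelated_service, while each app listens on its own internal port and can be scaled or moved independently.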
Q: Does this provide any technical benefit on performance and scalability ? Or should a website use one app which handles all web socket data?
Just like an application server, there is no reason why one WebSocket server should not serve multiple different web applications. If you expect the applications will experience different growth rates, you should separate them into separate web apps, sharing libraries.
For performance, if your applications push a larger amount of data over the WebSocket to the clients, you might want to consider the total network interface traffic of all services to make sure you are not maxing out the bandwidth of the network card. This is independent of your separate port question.
For scalability, if the functionality is unrelated it will be more scalable to keep things separate and share libraries if necessary. This is because you may experience different growth levels for the unrelated services.
Summary
I'd suggest the following combination to keep things simple for clients and scalable for the servers:
Use a WebSocket gateway that can port-share. All clients can use 80 or 443 with separate paths.
Configure the WebSocket gateway to proxy through to separate apps on separate ports to allow for the servers to scale independently.
                                                     +----------+
                                              +------| App:Chat |
+--------+                   +---------+      |      +----------+
| Client |-----internet------| gateway |------+      (port 8081)
+--------+                   +---------+      |
                              (port 80)       |      +----------+
                                              +------| App:App2 |
                                                     +----------+
                                                     (port 8082)
Gateway solutions like the Kaazing WebSocket gateway offer this type of functionality and are free for development use up to 50 connections at a time.
(Disclaimer: I used to work for Kaazing as a server engineer).
I don't have a definitive answer for you but two things that may be helpful
1) There is a great book from O'Reilly on browser networking which you can actually read online for free to learn how the series of tubes that we know as the internet actually works. Here is the chapter on WebSockets (which is in the latter portion of the book) http://chimera.labs.oreilly.com/books/1230000000545/ch17.html. I just skimmed this and found a paragraph on multiplexing which seems relevant to your research.
2) One reason that springs to mind for having separate connections is that each connection acts as a FIFO queue, so if your second use case involves large payloads, the chat session would essentially hang while such a payload transfers. Surely there are other reasons to go one way or the other, and hopefully someone else can give you a more complete answer.
Related
I have several servers handling the same requests and several clients sending requests. The servers are routers to keep/track the identity of the clients and the clients are dealers which round robin servers. Does this dealer/router pair without a broker make sense? It works and fits my need but I don't see this pattern on the official guides.
If you are trying to balance your request loads between servers at the client, yes you are right. This pattern is named client-side-load-balancing. You can read about that here
But I'm not sure about your implementation.
You can read more about it in the Google SRE book.
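As a rough illustration of client-side load balancing, here is a minimal round-robin picker. It is plain JavaScript rather than ZeroMQ code, and the class name and endpoint strings are made up for the example; a dealer socket connected to several routers does a similar rotation for you.

```javascript
// Cycles through a fixed list of server endpoints, one per request.
class RoundRobinBalancer {
  constructor(endpoints) {
    if (!endpoints.length) throw new Error('at least one endpoint is required');
    this.endpoints = endpoints.slice();
    this.next = 0;
  }

  // Returns the endpoint for the next request and advances the cursor.
  pick() {
    const endpoint = this.endpoints[this.next];
    this.next = (this.next + 1) % this.endpoints.length;
    return endpoint;
  }
}

// Hypothetical server addresses.
const balancer = new RoundRobinBalancer([
  'tcp://10.0.0.1:5555',
  'tcp://10.0.0.2:5555',
  'tcp://10.0.0.3:5555',
]);
for (let i = 0; i < 4; i += 1) {
  console.log(balancer.pick()); // the fourth pick wraps around to the first server
}
```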
I have a concox GT06 device from which I want to send tracking data to my AWS Server.
The coding protocol manual that comes with it only explains the data structure and protocol.
How does my server receive the GPS data collected by my tracker?
Verify that your server allows you to open sockets, which most low-cost hosting solutions do NOT allow for security reasons (I recommend using an Amazon EC2 virtual machine as your platform).
Choose a port on which your application will listen for incoming data, verify that it is open (if not, open it), and code your application (I use C++) to listen on that port.
Compile and run your application on the server (and make sure that it stays alive).
Configure your tracker (usually by sending an SMS to it) to send data to your server's IP address and to the port your application is listening on.
If you are, as I suspect, just beginning, expect to invest 2 to 3 weeks to develop this solution from scratch. You might also consider looking for a predeveloped tracking platform, which may or may not be acceptable in terms of data security.
You can find examples and tutorials online. I am usually very open with my code and would gladly send a copy of the socket server, but in this case, for security reasons, I cannot do so.
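The listening part of the steps above can be sketched in Node.js (instead of C++) with the built-in net module. Treat the framing as an assumption to check against your protocol manual: the sketch assumes each packet starts with 0x78 0x78, followed by a one-byte length that counts everything up to the two stop bytes. The function names and port 5023 are hypothetical.

```javascript
const net = require('net');

// Splits a byte stream into packets, assuming the framing
// 0x78 0x78 | length (1 byte) | <length bytes> | 2 stop bytes.
// Returns the complete frames plus any trailing partial data.
function extractFrames(buffer) {
  const frames = [];
  let offset = 0;
  while (buffer.length - offset >= 5) {
    if (buffer[offset] !== 0x78 || buffer[offset + 1] !== 0x78) {
      offset += 1; // skip bytes until a start marker is found
      continue;
    }
    const total = 2 + 1 + buffer[offset + 2] + 2;
    if (buffer.length - offset < total) break; // wait for more data
    frames.push(buffer.subarray(offset, offset + total));
    offset += total;
  }
  return { frames, rest: buffer.subarray(offset) };
}

// Creates a TCP server that calls onFrame(frame, socket) for each packet.
function createTrackerServer(onFrame) {
  return net.createServer((socket) => {
    let pending = Buffer.alloc(0);
    socket.on('data', (chunk) => {
      const { frames, rest } = extractFrames(Buffer.concat([pending, chunk]));
      pending = rest; // keep the incomplete tail for the next chunk
      frames.forEach((frame) => onFrame(frame, socket));
    });
    socket.on('error', () => socket.destroy());
  });
}

// Usage: log each packet as hex on a port of your choosing.
// createTrackerServer((frame) => console.log(frame.toString('hex'))).listen(5023);
```

Decoding the packet contents, and sending any replies the device expects, still has to follow the protocol manual; this only gets the raw packets onto your server.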
Instead of parsing TCP or UDP packets yourself, you can simplify things by putting a middleware backend that specializes in data parsing, e.g. flespi, in between.
With this approach you can use an HTTP REST API to fetch each new portion of data that trackers send to your dedicated IP:port (called a channel), or even send standardized commands to connected devices over HTTP REST.
It is also possible to open an MQTT connection using standard libraries and receive the device messages, converted into JSON, over MQTT in real time, which is even better than REST thanks to almost zero latency.
If you are using Python, take a look at the open-source flespi_receiver library. With about 10 lines of code you can have fully parsed JSON messages from a Concox GT06 on your EC2 instance.
I am new to XMPP. I want to use it for a chat application that is accessible from web and mobile. I have searched a lot about how XMPP works behind the scenes but could not find anything very clear. What is the actual role of XMPP? XMPP is not a protocol for transferring data, because it uses BOSH or WebSockets, and XMPP is not for storing data, because many server-side implementations use external databases. So what exactly is XMPP doing in the process of chatting?
XMPP is a protocol.
Protocols can be and usually are layered. You can build a protocol layered on a protocol layered on a protocol.
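XMPP is layered on BOSH or WebSockets (in the browser case)
BOSH is layered on HTTP(S); WebSockets starts with an HTTP(S) handshake and then runs over the same TCP (or TLS) connection
HTTP(S) is layered on TCP
TCP is layered on IP
IP is layered on Ethernet (or another link layer)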
For further reading, I recommend familiarizing yourself with the OSI model.
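To illustrate the layering, here is a small sketch that wraps one chat message the way the XMPP-over-BOSH-over-HTTP stack would: the XMPP stanza is the innermost payload, BOSH wraps it in a body element, and HTTP carries that body in a POST. The helper names, the sid/rid values and the /http-bind path are assumptions for the example; a real session obtains its sid from the server and a library normally does all of this for you.

```javascript
// Escapes text for safe use inside XML attributes and element content.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/'/g, '&apos;')
    .replace(/"/g, '&quot;');
}

// XMPP layer: what is being said, and to whom.
function buildMessageStanza(to, text) {
  return `<message xmlns='jabber:client' to='${escapeXml(to)}' type='chat'>` +
    `<body>${escapeXml(text)}</body></message>`;
}

// BOSH layer: wraps stanzas so they can travel in HTTP request bodies.
function wrapInBosh(stanza, sid, rid) {
  return `<body xmlns='http://jabber.org/protocol/httpbind' ` +
    `sid='${escapeXml(sid)}' rid='${rid}'>${stanza}</body>`;
}

// HTTP layer: carries the BOSH body as an ordinary POST.
function buildHttpRequest(host, path, body) {
  return [
    `POST ${path} HTTP/1.1`,
    `Host: ${host}`,
    'Content-Type: text/xml; charset=utf-8',
    `Content-Length: ${Buffer.byteLength(body)}`,
    '',
    body,
  ].join('\r\n');
}

// Hypothetical addresses and session values.
const stanza = buildMessageStanza('friend@example.com', 'Hello & welcome');
const boshBody = wrapInBosh(stanza, 'example-sid', 1001);
console.log(buildHttpRequest('example.com', '/http-bind', boshBody));
```

TCP, IP and the link layer then wrap that HTTP request in turn, but those layers are handled by the operating system rather than by your application.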
When you want to create an application which implements the XMPP protocol, you also need an implementation of every layer below it. If you are smart, you will look for a library which provides an implementation of the highest layer you can find and of all layers below it.
Or when you really want to learn how each of these protocols works exactly, it would be a great exercise to read the protocol specifications and build your own protocol stack starting as low as your environment allows and working up. But do not do this when you have the goal to create a market-ready product. An implementation created and tested by people who knew what they were doing will likely work much better than what you will build and save you a lot of time.
Is it possible to run a Node.js TCP socket-oriented application in the cloud, more specifically on Heroku or AppFog?
It's not going to be a web application, but a server for access with a client program. The basic idea is to use the capabilities of the Cloud - scaling and an easy to use platform.
I know that such application could easily run on IaaS like Amazon AWS, but I would really like to take advantage of the PaaS features of Heroku or AppFog.
I am reasonably sure that doesn't answer the question at hand: "Is it possible to run a Node.js TCP Socket oriented application". All PaaS companies (including Nodejitsu) support HTTP[S]-only reverse proxies for incoming connections.
Generally with node.js + any PaaS with a socket oriented connection you want to use WebSockets, but:
Heroku does not support WebSockets and will only hold open your connection for 55-seconds (see: https://devcenter.heroku.com/articles/http-routing#timeouts)
AppFog does not support WebSockets, but I'm not sure how they handle long-held HTTP connections.
Nodejitsu supports WebSockets and will hold your connections open until closed or reset. Our node.js powered reverse-proxies make this very cheap for us.
We have plans to support front-facing TCP load-balancing with custom ports in the future. Stay tuned!
AppFog and Heroku give your app a single arbitrary port to listen on, which is mapped from port 80. You don't get to pick your port. If you need to keep a connection open for extended periods of time, see my edit below. If your client does not need to maintain an open connection, you should consider creating a RESTful API which emits JSON for your client app to consume. Port 80 is more than adequate for this, and Node.js and Express make a superb combo for creating APIs on a PaaS.
AppFog
https://docs.appfog.com/languages/node#node-walkthrough
var port = process.env.VCAP_APP_PORT || 5000;
Heroku
https://devcenter.heroku.com/articles/nodejs
var port = process.env.PORT || 5000;
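As a rough sketch of the RESTful alternative, here is a tiny JSON endpoint that listens on whatever port the platform assigns. It uses Node's built-in http module rather than Express to stay self-contained; the /status route and its payload are made up for the example.

```javascript
const http = require('http');

// Use the port the platform hands us, falling back to 5000 locally.
const port = process.env.PORT || process.env.VCAP_APP_PORT || 5000;

// Answers GET /status (a hypothetical route) with JSON, everything else with 404.
function handleRequest(req, res) {
  if (req.method === 'GET' && req.url === '/status') {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ status: 'ok' }));
    return;
  }
  res.writeHead(404, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify({ error: 'not found' }));
}

const server = http.createServer(handleRequest);

// Start the server when deploying:
// server.listen(port, () => console.log('listening on ' + port));
```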
EDIT: As mentioned by indexzero, AppFog and Heroku support HTTP[S] only and close long-held connections. AppFog will keep the connection open as long as there is activity. This can be worked around by using Socket.io or a third-party solution like Pusher.
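// Socket.io server (0.9-era API; io.configure was removed in Socket.IO 1.0)
var io = require('socket.io').listen(port);
...
io.configure(function () {
  // Use XHR long polling only, since the platform does not pass WebSockets through
  io.set("transports", ["xhr-polling"]);
  // Seconds each poll is held open, kept short so the platform does not cut it off
  io.set("polling duration", 12);
});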
tl;dr - with the current state of the world, it's simply not possible; you must purchase a virtual machine with its own public IP address.
All PaaS providers I've found have an HTTP router in front of all of their applications. This allows them to house hundreds of thousands of applications under a single IP address, vastly improving scalability, which is how they can offer application hosting for free. So in the HTTP case, the Host header is used to uniquely identify applications.
In the TCP case however, an IP address must be used to identify an application. Therefore, in order for this to work, PaaS providers would be forced to allocate you one from their IPv4 range. This would not scale for two main reasons: the IPv4 address space having been completely exhausted and the slow pace of "legacy" networks would make it hard to physically move VMs. ("legacy" networks refer to standard/non-SDN networks.)
The solutions to these two problems are IPv6 and SDN, though I foresee ubiquitous SDN arriving before IPv6 does, at which point it could be used to solve the various IPv4 problems. Amazon already uses SDN in its datacenters, though there is still a long way to go. In the meantime, just purchase a virtual machine or Linux container instance with a public IP address and run your TCP servers there.
I'm looking into getting an openfire server started and setting up a strophe.js client to connect to it. My concern is that using http-bind might be costly in terms of performance versus making a straight on XMPP connection.
Can anyone tell me whether my concern is relevant or not? And if so, to what extent?
The alternative would be to use a flash proxy for all communication with OpenFire.
Thank you
BOSH is more verbose than normal XMPP, especially when idle. An idle BOSH connection might be about 2 HTTP requests per minute, while a normal connection can sit idle for hours or even days without sending a single packet (in theory, in practice you'll have pings and keepalives to combat NATs and broken firewalls).
But, the only real way to know is to benchmark. Depending on your use case, and what your clients are (will be) doing, the difference might be negligible, or not.
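Before benchmarking, a back-of-the-envelope estimate can help. The sketch below only turns the "about 2 HTTP requests per minute per idle connection" figure into a server-wide request rate; the function name and the 600-user example are assumptions, and real traffic depends on your BOSH wait settings and on how chatty the clients are.

```javascript
// Estimated HTTP requests per second generated by idle BOSH connections alone.
function idleRequestsPerSecond(connections, requestsPerMinute = 2) {
  return (connections * requestsPerMinute) / 60;
}

// 600 idle users at roughly 2 requests per minute each:
console.log(idleRequestsPerSecond(600)); // prints 20
```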
Basics:
Socket - zero overhead.
HTTP - requests even on IDLE session.
I doubt that you will have 1M users at once, but if you are aiming for it, then a connectionless protocol like HTTP will be much better, as I'm not sure that any OS can support that volume of connected sockets.
Also, you can tie your OpenFires together, form a farm, and you'll have nice scalability there.
We used Openfire and BOSH with about 400 concurrent users in the same MUC channel.
What we noticed is that Openfire leaks memory. We had about 1.5-2 GB of memory in use and got constant out-of-memory exceptions.
Also, the BOSH implementation of Openfire is pretty bad. We then switched to Punjab, which was better but couldn't solve the Openfire issue.
We're now using ejabberd with their built-in http-bind implementation and it scales pretty well. Load on the server having the ejabberd running is nearly 0.
At the moment we face the problem that our 5 webservers which we use to handle the chat load are sometimes overloaded at about 200 connected Users.
I'm trying to use websockets now but it seems that it doesn't work yet.
Maybe redirecting the http-bind not via Apache rewrite rule but directly on a loadbalancer/proxy would solve the issue but I couldn't find a way on how to do this atm.
Hope this helps.
I ended up using node.js and http://code.google.com/p/node-xmpp-bosh as I faced some difficulties to connect directly to Openfire via BOSH.
I have a production site running with node.js configured to proxy all BOSH requests and it works like a charm (around 50 concurrent users). The only downside so far: in the Openfire admin console you will not see the actual IP address of the connected clients; only the local server address will show up, as Openfire gets the connection from the node.js server.