Node.js TCP Socket Server on the Cloud [Heroku/AppFog]

Is it possible to run a Node.js TCP-socket-oriented application on the Cloud, more specifically on Heroku or AppFog?
It's not going to be a web application, but a server accessed by a client program. The basic idea is to use the capabilities of the Cloud: scaling and an easy-to-use platform.
I know that such an application could easily run on IaaS like Amazon AWS, but I would really like to take advantage of the PaaS features of Heroku or AppFog.

I am reasonably sure that doesn't answer the question at hand: "Is it possible to run a Node.js TCP-socket-oriented application?" All PaaS companies (including Nodejitsu) support HTTP[S]-only reverse proxies for incoming connections.
Generally with node.js + any PaaS, you want to use WebSockets for a socket-oriented connection, but:
Heroku does not support WebSockets and will only hold your connection open for 55 seconds (see: https://devcenter.heroku.com/articles/http-routing#timeouts)
AppFog does not support WebSockets, but I'm not sure how they handle long-held HTTP connections.
Nodejitsu supports WebSockets and will hold your connections open until closed or reset. Our node.js-powered reverse proxies make this very cheap for us.
We have plans to support front-facing TCP load balancing with custom ports in the future. Stay tuned!

AppFog and Heroku give your app a single arbitrary port to listen on, which is mapped from port 80; you don't get to pick your port. If you need to keep a connection open for extended periods of time, see my edit below. If your client does not need to maintain an open connection, you should consider creating a RESTful API that emits JSON for your client app to consume. Port 80 is more than adequate for this, and Node.js and Express make a superb combo for creating APIs on PaaS (see the sketch after the port snippets below).
AppFog
https://docs.appfog.com/languages/node#node-walkthrough
var port = process.env.VCAP_APP_PORT || 5000;
Heroku
https://devcenter.heroku.com/articles/nodejs
var port = process.env.PORT || 5000;
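For the RESTful approach suggested above, here is a minimal sketch, assuming Express; the route and payload are illustrative, not from either platform's docs:

// Minimal Express API emitting JSON on the platform-assigned port.
var express = require('express');
var app = express();

// Covers both platforms: Heroku sets PORT, AppFog sets VCAP_APP_PORT.
var port = process.env.PORT || process.env.VCAP_APP_PORT || 5000;

app.get('/api/status', function (req, res) {
  res.json({ status: 'ok', time: Date.now() }); // illustrative payload
});

app.listen(port);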
EDIT: As mentioned by indexzero, AppFog and Heroku support HTTP[S] only and close long-held connections. AppFog will keep the connection open as long as there is activity. This can be worked around by using Socket.io or a third-party solution like Pusher.
// Socket.io server (0.9-era API): force XHR long polling so the
// connection is re-established within the platform's timeout window.
var io = require('socket.io').listen(port);
...
io.configure(function () {
  io.set("transports", ["xhr-polling"]); // disable the websocket transport
  io.set("polling duration", 12);        // poll duration, in seconds
});
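For completeness, a matching browser-side sketch (Socket.io 0.9-era client API; the app URL is a placeholder):

// Browser side: the client library negotiates the polling transport.
var socket = io.connect('http://yourapp.herokuapp.com');
socket.on('connect', function () {
  socket.send('hello from the client');
});
socket.on('message', function (data) {
  console.log(data);
});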

tl;dr - with the current state of the world, it's simply not possible; you must purchase a virtual machine with its own public IP address.
All PaaS providers I've found have an HTTP router in front of all of their applications. This allows them to house hundreds of thousands of applications under a single IP address, which vastly improves scalability and is how they can offer application hosting for free. In the HTTP case, the Host header is used to uniquely identify each application.
In the TCP case, however, an IP address must be used to identify an application. For this to work, PaaS providers would be forced to allocate you an address from their IPv4 range. This would not scale, for two main reasons: the IPv4 address space has been completely exhausted, and the slow pace of "legacy" networks would make it hard to physically move VMs. ("Legacy" here means standard, non-SDN networks.)
The solutions to these two problems are IPv6 and SDN, though I foresee ubiquitous SDN arriving before IPv6 does, which could then be used to solve the various IPv4 problems. Amazon already uses SDN in its datacenters, though there is still a long way to go. In the meantime, just purchase a virtual machine or Linux container instance with a public IP address and run your TCP servers there.
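To make the Host-header routing concrete, here is a minimal sketch of the idea in Node.js; the hostname-to-backend map is hypothetical:

// A toy HTTP router: many apps behind one IP, selected by the Host header.
var http = require('http');

// Hypothetical map of application hostnames to backend addresses.
var backends = {
  'app-a.example.com': { host: '10.0.0.11', port: 5000 },
  'app-b.example.com': { host: '10.0.0.12', port: 5000 }
};

http.createServer(function (req, res) {
  var backend = backends[req.headers.host];
  if (!backend) {
    res.writeHead(404);
    return res.end('Unknown application\n');
  }
  // Forward the request to the chosen backend and stream the reply back.
  var upstream = http.request({
    host: backend.host,
    port: backend.port,
    method: req.method,
    path: req.url,
    headers: req.headers
  }, function (upstreamRes) {
    res.writeHead(upstreamRes.statusCode, upstreamRes.headers);
    upstreamRes.pipe(res);
  });
  upstream.on('error', function () { res.writeHead(502); res.end(); });
  req.pipe(upstream);
}).listen(80);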

Related

Load balancer and WebSockets

Our infrastructure is composed of:
1 F5 load balancer
3 nodes
We have an application that uses websockets, so when a user goes to our site, it opens a websocket to the balancer, which connects it to the first available node, and it works as expected.
Our trouble arrives with maintenance tasks: when we have to update our software, we need to take one node offline at a time, deploy the new release, and then turn it back on. During this task the balancer drops the open websocket connections to that node, and the clients retry after a few seconds, connecting to the first available node; this is an inconvenience for the client because they could miss a signal (or more).
How can we keep the connection between the client and the balancer while changing the backend websocket server? Is the load balancer enough to achieve our goal, or do we need to change our infrastructure?
To avoid this kind of problem I recommend reading about Azure SignalR. With it you don't need to think about things like the load balancer, a Redis backplane, and the other infrastructure that you would otherwise possibly need for a WebSockets connection.
Basically the clients will not connect to your node directly but will be redirected to Azure SignalR. You can read more about it here: https://learn.microsoft.com/en-us/azure/azure-signalr/signalr-overview
Since it is important for your application to maintain the connection, I don't see any other way to achieve zero connection drops to your nodes, given that you need to shut them down.
It's important to understand that the F5 is a full TCP proxy. This means that the F5 is the server to the client and the client to the server. If you are using the websockets protocol then you must apply a websockets profile to the F5 Virtual Server in order for the websockets application to be handled properly by the Load Balancer.
Details of the websockets profile can be found here: https://support.f5.com/csp/article/K14754
If both a websockets and an HTTP profile are applied to the Virtual Server - meaning that you have websockets and web traffic using the same port and LB nodes - then the F5 will allow the websockets traffic through as passthrough. Also keep in mind that if this is an HTTPS virtual server, you will need to ensure that client-side and server-side HTTPS profiles (SSL offload) are applied to the Virtual Server.
While there are a variety of ways you can fiddle with load balancers to minimize the downtime caused by a software upgrade, none of them solve the real problem: your application-layer protocol does not tolerate small network outages.
Even if you have a perfect load balancer and your software deploys cause zero downtime, the customer's computer may be on flaky wifi that drops the network for half a second, or on ethernet when someone reconfigures routing on their LAN, and so on.
I'd suggest having your server maintain a queue of messages for each client (up to some size/time limit) so that when a client drops a connection, whether due to load balancers/upgrades or any other reason, it can continue without disruption.
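A minimal sketch of that idea, assuming each client has a stable ID and reports the last sequence number it processed when it reconnects (all names here are illustrative):

// Per-client bounded queue of sequence-numbered messages.
var queues = {};       // clientId -> array of { seq, payload }
var nextSeq = {};      // clientId -> last sequence number assigned
var MAX_QUEUE = 1000;  // size limit; oldest messages are dropped first

function enqueue(clientId, payload) {
  var seq = nextSeq[clientId] = (nextSeq[clientId] || 0) + 1;
  var q = queues[clientId] = queues[clientId] || [];
  q.push({ seq: seq, payload: payload });
  if (q.length > MAX_QUEUE) q.shift(); // enforce the size limit
  return seq;
}

// On reconnect, replay everything newer than the client's last seen seq,
// so a dropped websocket (upgrade, flaky wifi, ...) loses no signals.
function replaySince(clientId, lastSeq, send) {
  (queues[clientId] || []).forEach(function (msg) {
    if (msg.seq > lastSeq) send(msg);
  });
}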

Scalability of websockify?

I'm new to websockify, so here's my situation.
Our company has servers written in C# that handle about 1000 to 2000 concurrent raw TCP socket connections from Flash and mobile clients playing a game online. We are considering porting the Flash client to HTML5, using Websockify and porting our native TCP-based protocol on the client side, while keeping native TCP on the server side (so the mobile client still works).
So I assume the WebSocket client and the Websockify server talk via the WebSocket protocol, and Websockify and our server talk via the TCP protocol.
If that's right, can Websockify handle that number of connections, and how is its performance?
There are implementations of websockify in several different languages. The Python implementation is the default and has the most additional functionality (auth, logging, etc.). However, the basic function of websockify is just to bridge transports (WebSockets to TCP sockets), so it's actually not that difficult to implement; see the sketch below. There is a C version that you might look at for maximum efficiency, although it is quite dated and probably buggy.
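As a sketch of how simple the bridge is, here is the core idea in Node.js (not websockify itself; assumes the 'ws' npm package, and the game-server address is a placeholder):

// WebSocket-to-TCP bridge: pipe bytes both ways between the two sockets.
var WebSocket = require('ws');
var net = require('net');

var wss = new WebSocket.Server({ port: 8080 });

wss.on('connection', function (ws) {
  var tcp = net.connect({ host: '127.0.0.1', port: 6000 }); // game server

  ws.on('message', function (data) { tcp.write(data); }); // browser -> server
  tcp.on('data', function (data) { ws.send(data); });     // server -> browser

  ws.on('close', function () { tcp.end(); });
  tcp.on('close', function () { ws.close(); });
  tcp.on('error', function () { ws.close(); });
});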
That being said, the Python version of websockify is fairly scalable. Each new connection to websockify starts a new child process, so it should scale linearly with the CPU/memory of your host (separate processes mean no GIL contention). Also, websockify is horizontally scalable if a single host can't handle the load of all connections; in other words, you could just put a load balancer (one that supports WebSockets) in front of multiple websockify servers.
Also, websockify (the Python version) is easy to configure to support multiple targets per instance of websockify. I've added a wiki page describing how to do that.

Does Azure Redis work over http?

Does Azure Redis support transport over HTTP? I am aware of the setting that allows me to choose whether to enable SSL or not, but it seems to me the connection to Azure Redis happens over TCP.
"Every Redis Cluster node requires two TCP connections open. The normal Redis TCP port used to serve clients, for example 6379, plus the port obtained by adding 10000 to the data port, so 16379 in the example."
I have also posted this question on the Microsoft forum. It can be found here.
No, Redis (and Azure's as well) does not use HTTP, but rather a text-based protocol called RESP. There are third-party servers that let you do that, such as Lark, Webdis and tinywebdis.
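To illustrate, RESP is plain text over TCP, so you can talk to Redis with a raw socket. Host and port below are placeholders; Azure Redis additionally requires AUTH and, by default, SSL, which this sketch skips:

// Send an inline RESP command over a raw TCP socket and print the reply.
var net = require('net');

var client = net.connect({ host: '127.0.0.1', port: 6379 }, function () {
  client.write('PING\r\n');        // inline command form of RESP
});

client.on('data', function (data) {
  console.log(data.toString());    // "+PONG\r\n"
  client.end();
});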

Shall I use WebSocket on ports other than 80?

Shall I use WebSocket on non-80 ports? Does it ruin the whole purpose of using existing web/HTTP infrastructures? And I think it no longer fits the name WebSocket on non-80 ports.
If I use WebSocket over other ports, why not just use TCP directly? Or are there any special benefits in the WebSocket protocol itself?
And since the current WebSocket handshake is in the form of an HTTP UPGRADE request, does it mean I have to enable the HTTP protocol on the port so that the WebSocket handshake can be accomplished?
Shall I use WebSocket on non-80 ports? Does it ruin the whole purpose of using existing web/HTTP infrastructures? And I think it no longer fits the name WebSocket on non-80 ports.
You can run a webSocket server on any port that your host OS allows and that your client will be allowed to connect to.
However, there are a number of advantages to running it on port 80 (or 443).
Networking infrastructure is generally already deployed and open on port 80 for outbound connections from the places that clients live (like desktop computers, mobile devices, etc...) to the places that servers live (like data centers). So, new holes in the firewall or router configurations, etc... are usually not required in order to deploy a webSocket app on port 80. Configuration changes may be required to run on different ports. For example, many large corporate networks are very picky about what ports outbound connections can be made on and are configured only for certain standard and expected behaviors. Picking a non-standard port for a webSocket connection may not be allowed from some corporate networks. This is the BIG reason to use port 80 (maximum interoperability from private networks that have locked down configurations).
Many webSocket apps running from the browser wish to leverage existing security/login/auth infrastructure already being used on port 80 for the host web page. Using that exact same infrastructure to check authentication of a webSocket connection may be simpler if everything is on the same port.
Some server infrastructures for webSockets (such as socket.io in node.js) use a combined server infrastructure (single process, one listener) to support both HTTP requests and webSockets. This is simpler if both are on the same port.
If I use WebSocket over other ports, why not just use TCP directly? Or are there any special benefits in the WebSocket protocol itself?
The webSocket protocol was originally defined to work from a browser to a server. There is no generic TCP access from a browser so if you want a persistent socket without custom browser add-ons, then a webSocket is what is offered. As compared to a plain TCP connection, the webSocket protocol offers the ability to leverage HTTP authentication and cookies, a standard way of doing app-level and end-to-end keep-alive ping/pong (TCP offers hop-level keep-alive, but not end-to-end), a built in framing protocol (you'd have to design your own packet formats in TCP) and a lot of libraries that support these higher level features. Basically, webSocket works at a higher level than TCP (using TCP under the covers) and offers more built-in features that most people find useful. For example, if using TCP, one of the first things you have to do is get or design a protocol (a means of expressing your data). This is already built-in with webSocket.
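To make the framing point concrete, here is a sketch of the kind of length-prefixed framing you would have to hand-roll over raw TCP (purely illustrative), which webSocket gives you for free:

// Hand-rolled TCP framing: 4-byte big-endian length prefix + JSON body.
function sendFrame(socket, message) {
  var body = Buffer.from(JSON.stringify(message));
  var header = Buffer.alloc(4);
  header.writeUInt32BE(body.length, 0);
  socket.write(Buffer.concat([header, body]));
}

// Receiving side: accumulate bytes until a complete frame is available.
function makeFrameReader(onMessage) {
  var buffered = Buffer.alloc(0);
  return function (chunk) {
    buffered = Buffer.concat([buffered, chunk]);
    while (buffered.length >= 4) {
      var len = buffered.readUInt32BE(0);
      if (buffered.length < 4 + len) break; // frame not complete yet
      onMessage(JSON.parse(buffered.slice(4, 4 + len).toString()));
      buffered = buffered.slice(4 + len);
    }
  };
}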
And since the current WebSocket handshake is in the form of an HTTP UPGRADE request, does it mean I have to enable the HTTP protocol on the port so that the WebSocket handshake can be accomplished?
You MUST have an HTTP server running on the port that you wish to use webSocket on, because all webSocket requests start with an HTTP request. It wouldn't have to be a heavily featured HTTP server, but it does have to handle the initial HTTP request.
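A minimal sketch of that arrangement (using the 'ws' npm package; names are illustrative): the HTTP server answers normal requests, and Upgrade requests are handed to the webSocket implementation on the same port.

// One listener serving both plain HTTP and webSocket on the same port.
var http = require('http');
var WebSocket = require('ws');

var server = http.createServer(function (req, res) {
  // Non-Upgrade HTTP requests land here.
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('plain HTTP response\n');
});

// Attaching to the server hooks its 'upgrade' event for the handshake.
var wss = new WebSocket.Server({ server: server });

wss.on('connection', function (ws) {
  ws.send('webSocket handshake complete');
});

server.listen(80);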
Yes - use port 443 (i.e., the HTTPS port) instead.
There's little reason these days to use port 80 (HTTP) for anything other than a redirect to port 443 (HTTPS), as certificates (via services like LetsEncrypt) are easy and free to set up.
The only possible exceptions to this rule are local development, and non-internet facing services.
Should I use a non-standard port?
I suspect this is the intent of your question. To this, I'd argue that doing so adds an unnecessary layer of complication with no obvious benefits. It doesn't add security, and it doesn't make anything easier.
But it does mean that specific firewall exceptions need to be made to host and connect to your websocket server. This means that people accessing your services from a corporate/school/locked down environment are probably not going to be able to use it, unless they can somehow convince management that it is mandatory. I doubt there are many good reasons to exclude your userbase in this way.
But there's nothing stopping you from doing it either...
In my opinion, yes you can. 80 is the default port, but you can change it to any port you like.

Do I *really* need RPC and NETBIOS to use transactional NServiceBus queues between local servers and Amazon EC2?

We have been trying - without success - to get transactional message queues working between local servers and our cloud servers up in Amazon EC2.
We're using NServiceBus, and have got the pub/sub examples and various other trivial apps working locally between here and EC2, but trying to spin up the components of our actual application is proving... vexatious.
As far as I can work out, to allow a local server (DYLAN-PC) to send a message transactionally via a queue on an Amazon EC2 instance, I will need to:
Enable NETBIOS name resolution (e.g. via the /etc/lmhosts file) at both ends
Allow RPC connections to be initiated from either end (so open port 135 for RPC plus various other ports)
Configure MSTDC on both systems, enabling remote connections and inbound/outbound connections
Have I missed something? In particular, the requirement to allow NetBIOS in an age where everything (including Active Directory!) runs on DNS seems particularly archaic. Are we doing something stupid trying to use MSMQ between sites like this? This is the first big project where we've tried this kind of distributed architecture, and the deployment/configuration is starting to hurt so much I'm convinced we've taken a wrong turn somewhere... a little perspective or advice would be gratefully received!
If you're looking to build a geographically distributed system where you can't arrange a VPN between the sites, you should be using the gateway capabilities of NServiceBus to communicate over alternate transports (like HTTP) between those sites.
RPC is required for reading from remote queues.
If you push to remote queues and pull from local queues, you won't be using RPC.
