NLB and persistent connections - windows

We are trying to implement Windows Network Load Balancing (NLB) for one of our high-performance applications; let's call it the middleware. The middleware connects to three channels through sockets, and the connections are persistent, meaning the clients stay connected between transactions. Since there are only three connections, we would like to distribute the work across nodes per transaction rather than per connection. What approach should we take?

It seems that Windows NLB only works with on-demand connections, and obviously it has to in order to distribute load. We will be using MSMQ to distribute the transaction load itself across all nodes.
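A minimal sketch of that idea, assuming a shared queue sits between the three persistent connections and the worker nodes. Here `queue.Queue` only stands in for MSMQ, threads stand in for nodes, and `run_middleware` and `handle` are hypothetical names, not part of any NLB or MSMQ API:

```python
import queue
import threading

def run_middleware(channels, worker_count, handle):
    """Fan transactions from a few persistent channels out to many workers.

    `channels` is a list of iterables, each standing in for one persistent
    socket; `handle` is the (hypothetical) per-transaction business logic.
    """
    work = queue.Queue()      # stand-in for the MSMQ queue
    results = queue.Queue()

    def reader(channel):
        for txn in channel:   # one reader per persistent connection
            work.put(txn)

    def worker(name):
        while True:
            txn = work.get()
            if txn is None:   # sentinel: no more work
                break
            results.put((name, handle(txn)))

    readers = [threading.Thread(target=reader, args=(c,)) for c in channels]
    workers = [threading.Thread(target=worker, args=(f"node-{i}",))
               for i in range(worker_count)]
    for t in readers + workers:
        t.start()
    for t in readers:
        t.join()
    for _ in workers:         # one sentinel per worker, after all real work
        work.put(None)
    for t in workers:
        t.join()
    return [results.get() for _ in range(results.qsize())]
```

The point of the shape is that the number of workers is independent of the number of connections: the three readers only enqueue, and any free worker picks up the next transaction.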


Simple Server to PUSH lots of data to Browser?

I'm building a Web Application that consumes data pushed from Server.
Each message is JSON and could be large (hundreds of kilobytes). Messages are sent a couple of times per minute, and the order doesn't matter.
The server should be able to persist messages that have not yet been delivered, potentially storing a couple of megabytes per client for a couple of days until the client comes back online. There's a limit on the storage size for unsent messages, say 20 MB per client, and old undelivered messages get deleted when this limit is exceeded.
The server should be able to handle around 1,000 simultaneous connections. How could this be implemented simply?
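To make the storage rule concrete, here is a rough in-memory sketch of a per-client buffer with a byte cap that evicts the oldest messages first. The `Outbox` class and its method names are made up for illustration; a real server would keep this on disk or in a broker:

```python
from collections import deque

class Outbox:
    """Undelivered messages for one client, capped by total bytes."""

    def __init__(self, max_bytes=20 * 1024 * 1024):
        self.max_bytes = max_bytes
        self.size = 0
        self.messages = deque()   # oldest on the left

    def add(self, payload: bytes) -> bool:
        if len(payload) > self.max_bytes:
            return False          # a single message can never fit
        self.messages.append(payload)
        self.size += len(payload)
        while self.size > self.max_bytes:   # evict oldest first
            self.size -= len(self.messages.popleft())
        return True

    def drain(self):
        """Hand over everything pending once the client is back online."""
        pending = list(self.messages)
        self.messages.clear()
        self.size = 0
        return pending
```

With roughly 1,000 clients at 20 MB each the worst case is about 20 GB, which is why the sketch should be read as the eviction logic only, not as a storage recommendation.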
Possible Solutions
I was thinking of storing messages as files on disk and having the browser poll every second to check for new messages, served by nginx or something similar. Are there any nginx configs or modules for such use cases?
Or maybe it's better to use an MQTT server or a message queue like RabbitMQ with some browser adapter?
Actually, MQTT supports the concept of sessions that persist across client connections, but the client must first connect and request a "non-clean" session. After that, if the client is disconnected, the broker will hold all the QoS=1 or 2 messages destined for that client until it reconnects.
With MQTT v3.x, technically, the server is supposed to hold all the messages for all these disconnected clients forever! Each message maxes out at a 256 MB payload, but the server is supposed to hold everything that you give it. This created a big problem for servers, which MQTT v5 came in to fix. And most real-world brokers have configurable settings around this.
But MQTT shines if the connections are over unreliable networks (wireless, cell modems, etc) that may drop and reconnect unexpectedly.
If the clients are connected over fairly reliable networks, AMQP with RabbitMQ is considerably more flexible, since clients can create and manage the individual queues. But the neat thing is that you can mix the two protocols using RabbitMQ, as it has an MQTT plugin. So, smaller clients on an unreliable network can connect via MQTT, and other clients can connect via AMQP, and they can all communicate with each other.
MQTT is most likely not what you are looking for. The protocol is meant to be lightweight and, as the comments pointed out, the protocol specifies that there may only exist "Control Packets of size up to 268,435,455 (256 MB)". Clearly, this is much too small for your use case.
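For reference, the 268,435,455 figure falls out of how MQTT encodes the packet's remaining length: at most four bytes, each carrying 7 bits of data plus a continuation bit. A small sketch of that encoding (the function name is mine, not from any MQTT library):

```python
def encode_remaining_length(n: int) -> bytes:
    """Encode n as an MQTT variable byte integer (7 data bits per byte)."""
    if not 0 <= n <= 268_435_455:
        raise ValueError("does not fit in four bytes")
    out = bytearray()
    while True:
        n, digit = divmod(n, 128)
        if n > 0:
            digit |= 0x80      # continuation bit: more bytes follow
        out.append(digit)
        if n == 0:
            return bytes(out)

# Four bytes of 7 data bits each give the upper bound quoted above.
MAX_REMAINING_LENGTH = 128 ** 4 - 1
```

So the limit is a property of the wire format, independent of whatever smaller limit a particular broker is configured with.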
Moreover, if a client isn't connected (and subscribed on that particular topic) at the time of the message being published, the message will never be delivered. EDIT: As @Brits pointed out, this only applies to QoS 0 pubs/subs.
Like JD Allen mentioned, you need a queuing service like RabbitMQ or AMQ. There are countless other such services/libraries/packages in existence, so please investigate more.
If you want to roll your own, it might be worth considering using AWS SQS and wrapping some of your own application logic around it. That'll likely be a bit hacky though, so take that suggestion with a grain of salt.

Multiple websocket channels, single ws object?

I will be subscribing to multiple websocket channels on the same server. Writing a manager that assigns the various types of updates I receive to different queues, based on tags present in the JSON, is possible, but it would save programming time to just create multiple websocket client objects in my app, so that each websocket object subscribes to only a single channel.
Is this a sensible idea or should I stick to a single websocket client?
The correct answer really depends on your architecture. However, as a general rule:
Stick to a single websocket client if you can.
Servers have a limit on the number of connections they can handle, meaning that with every new Websocket client, you're getting closer to your server's limits (even if the Websocket does absolutely nothing except remain open).
If each client opens two Websocket connections, the number of clients the server can handle is cut in half. Open four connections per client and the server's capacity drops to 25%.
This directly translates to money and costs since running another server will increase your expenses. Also, the moment you have to scale beyond a single server, you add backend costs.
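The manager described in the question does not have to be much code. A minimal sketch, assuming each message is a JSON object carrying a `channel` field (that field name, the class name and the method names are all assumptions; use whatever tag your server actually sends):

```python
import json
import queue
from collections import defaultdict

class ChannelDispatcher:
    """Route messages from one websocket to per-channel queues."""

    def __init__(self, tag="channel"):
        self.tag = tag                        # hypothetical JSON field name
        self.queues = defaultdict(queue.Queue)

    def on_message(self, raw: str):
        """Call this from the single websocket's message callback."""
        msg = json.loads(raw)
        self.queues[msg.get(self.tag, "unknown")].put(msg)

    def next(self, channel: str):
        """Non-blocking read; returns None when nothing is queued."""
        try:
            return self.queues[channel].get_nowait()
        except queue.Empty:
            return None
```

Each consumer in the app then reads only from its own queue, which gives the same separation as one websocket object per channel without the extra connections.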

Web server and ZeroMQ patterns

I am running an Apache server that receives HTTP requests and connects to a daemon script over ZeroMQ. The script implements the Multithreaded Server pattern: it successfully receives the request, dispatches it to one of its worker threads, performs the action, and responds back to the server, which in turn responds back to the client. Everything is done synchronously, as the client needs to receive a success or failure response to its request.
As the number of users is growing into a few thousand, I am looking into potentially improving this. The first thing I looked at is the different ZeroMQ patterns, and whether what I am using is optimal for my scenario. I've read the guide, but I find it challenging to understand all the details and differences across patterns. I was looking, for example, at the Load Balancing Message Broker pattern. It seems quite a bit more complicated to implement than what I am currently using, and if I understand things correctly, its advantages are:
Actual load balancing vs the round-robin task distribution that I currently have
Asynchronous requests/replies
Is that everything? Am I missing something? Given the description of my problem, and the synchronous requirement of it, what would you say is the best pattern to use? Lastly, how would the answer change, if I want to make my setup distributed (i.e. having the Apache server load balance the requests across different machines). I was thinking of doing that by simply creating yet another layer, based on the Multithreaded Server pattern, and have that layer bridge the communication between the web server and my workers.
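To see why "actual load balancing" can matter compared with round-robin, here is a toy simulation with no ZeroMQ involved. It assumes all tasks are queued up front and compares the time until the last task finishes; the function names and the task durations are made up for illustration:

```python
import heapq

def round_robin_makespan(durations, workers):
    """Task i always goes to worker i % workers, busy or not."""
    busy_until = [0.0] * workers
    for i, d in enumerate(durations):
        busy_until[i % workers] += d
    return max(busy_until)

def load_balanced_makespan(durations, workers):
    """Each task goes to whichever worker becomes free first."""
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    for d in durations:
        heapq.heappush(free_at, heapq.heappop(free_at) + d)
    return max(free_at)

# Alternating slow and fast tasks is the bad case for round-robin:
# with two workers, one of them receives every slow task.
durations = [10, 1, 10, 1, 10, 1]
```

With two workers, round-robin piles all three slow tasks onto the same worker, while handing each task to the first free worker spreads them out. If your requests all take roughly the same time, the difference largely disappears.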
Some thoughts about the subject...
Keep it simple
I would try to keep things simple and "plain" ZeroMQ as long as possible. To increase performance, I would simply change your backend script to send requests out from a DEALER socket and move the request-handling code into its own program. Then you could run multiple worker servers on different machines to handle more requests.
I assume this was the approach you took:
I was thinking of doing that by simply creating yet another layer, based on the Multithreaded Server pattern, and have that layer bridge the communication between the web server and my workers.
The only problem here is that there is no request retry in the backend. If a worker fails to handle a given task, it is lost forever. However, one could write the worker servers so that they handle all the requests they have received before shutting down. With this kind of setup it is possible to update backend workers without clients noticing any outage. This will not save requests that are lost if a server crashes.
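A rough sketch of the "finish what you accepted before shutting down" behaviour, using a thread and an in-process queue in place of a ZeroMQ socket. `DrainingWorker` and `handle` are hypothetical names, and the sketch assumes `submit` and `shutdown` are called from the same thread:

```python
import queue
import threading

class DrainingWorker:
    """Worker that finishes every accepted request before it stops."""

    _STOP = object()

    def __init__(self, handle):
        self.handle = handle              # hypothetical request handler
        self.inbox = queue.Queue()
        self.done = []
        self.accepting = True
        self.thread = threading.Thread(target=self._run)
        self.thread.start()

    def submit(self, request) -> bool:
        if not self.accepting:
            return False                  # caller should try another worker
        self.inbox.put(request)
        return True

    def _run(self):
        while True:
            request = self.inbox.get()
            if request is self._STOP:
                break
            self.done.append(self.handle(request))

    def shutdown(self):
        self.accepting = False            # refuse new work first
        self.inbox.put(self._STOP)        # lands behind accepted requests
        self.thread.join()
```

As noted above, this only covers planned restarts; a crash still loses whatever was sitting in the inbox.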
I have the feeling that in common scenarios this kind of approach would be more than enough.
Mongrel2 seems to handle quite a few of the things you have already implemented. It might be worthwhile to check it out. It probably does not completely solve your problems, but it provides tested infrastructure for distributing the workload. This could be used to deliver requests to multithreaded servers running on different machines.
One solution to increase the robustness of the setup is a broker. In this scenario the broker's main role would be to provide robustness by implementing a queue for the requests. I understood that all the requests the workers handle are basically of the same type. If requests had different types, the broker could also do lookups to find the correct server for each request.
Using the queue provides a way to ensure that every request is handled by some worker even if worker servers crash. This does not come without a price. The broker is itself a single point of failure. If it crashes or is restarted, all messages could be lost.
These problems can be avoided, but it requires quite a lot of work: the requests could be persisted to disk, and servers could be clustered. The need has to be weighed against the payoff. Does one want to spend time writing a message broker or the actual system?
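To give a feel for what "persisted to disk" involves at its very simplest, here is a sketch of a request queue backed by an append-only journal file that is replayed on startup. The class name and record format are invented for illustration, and it ignores compaction, concurrency and partial writes, which is exactly the work a real broker does for you:

```python
import json
import os

class JournaledQueue:
    """Request queue that survives a restart by replaying a journal file."""

    def __init__(self, path):
        self.path = path
        self.pending = {}                     # request id -> payload
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                for line in f:                # replay history in order
                    self._apply(json.loads(line))

    def _apply(self, rec):
        if rec["op"] == "put":
            self.pending[rec["id"]] = rec["payload"]
        else:                                 # "ack"
            self.pending.pop(rec["id"], None)

    def _log(self, rec):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(rec) + "\n")
            f.flush()
            os.fsync(f.fileno())              # on disk before we proceed
        self._apply(rec)

    def put(self, req_id, payload):
        self._log({"op": "put", "id": req_id, "payload": payload})

    def ack(self, req_id):
        self._log({"op": "ack", "id": req_id})
```

After a crash, constructing a new `JournaledQueue` on the same path leaves `pending` holding every request that was put but never acknowledged.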
If a message broker seems like a good idea, the time required to implement one can be reduced by using an existing product (like RabbitMQ). The downside is that there could be a lot of unwanted features, and adding new things is not as straightforward as with a self-made broker.
Writing your own broker could turn into reinventing the wheel. Many brokers provide similar things: security, logging, a management interface, and so on. It seems likely that these will eventually be needed in a homemade solution too. But if not, a single homemade broker that does one thing and does it well can be a good choice.
Even if a broker product is chosen, I think it is a good idea to hide the broker behind a ZeroMQ proxy, a dedicated piece of code that sends messages to and receives messages from the broker. Then no other part of the system has to know anything about the broker, and it can be easily replaced.
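The same idea can be shown in-process: the rest of the code depends on a small interface, and only one class knows which broker sits behind it. Note this is an adapter inside the application rather than a separate ZeroMQ proxy process, and `MessageBus`, `InMemoryBus` and `submit_order` are illustrative names only:

```python
from collections import defaultdict, deque
from typing import Optional, Protocol

class MessageBus(Protocol):
    """The only interface the rest of the system is allowed to see."""
    def send(self, topic: str, body: bytes) -> None: ...
    def receive(self, topic: str) -> Optional[bytes]: ...

class InMemoryBus:
    """Stand-in implementation; a broker-backed class would replace it."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def send(self, topic: str, body: bytes) -> None:
        self._queues[topic].append(body)

    def receive(self, topic: str) -> Optional[bytes]:
        q = self._queues[topic]
        return q.popleft() if q else None

def submit_order(bus: MessageBus, order_id: str) -> None:
    """Application code depends on MessageBus, never on a broker client."""
    bus.send("orders", order_id.encode())
```

Swapping brokers then means writing one new class with the same two methods, and nothing else in the system changes.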
Using a broker is somewhat heavy on developer time. You need either time to implement the broker or time to get used to some product. I would avoid this route until it is clearly needed.
Some links
Comparison between broker and brokerless

ØMQ N-to-M message queue

I am considering whether we can replace our message-queue middleware with ØMQ.
I have two sets of servers.
Servers in the first set don't talk to each other; they only append requests to a specific message queue.
Servers in the second set don't talk to each other either; they only receive requests from a specific message queue and handle them.
It looks like a producer-consumer model.
And I think it can be replaced by ØMQ's Freelance pattern.
But the questions are:
How to support dynamic discovery for both server & clients?
How to support dynamic discovery for both server & clients?
There are probably a hundred ways you could implement that, and the right one depends greatly on your situation. If all the servers will always be on the same LAN, you could bootstrap by using the broadcast address on the local network and asking all responders who they are. Quick and dirty.
I would personally implement a bootstrap service that everyone knows about. They can all ask this always-available service who is 'online' for the type of server they're after.
Another option is pub-sub. This would require a central publisher. Newly connecting nodes would notify the publisher, which would notify all other nodes of the new join, possibly including the new node's ID, ip:port (if desired), etc. All nodes will still be able to communicate if the publisher crashes, since it's only used for global notifications, and a backup publisher could be used to make the system failsafe. Each node can also send heartbeats to the publisher, with the publisher notifying all other nodes when a node leaves or crashes.
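The bookkeeping behind a bootstrap service with heartbeats is small. A sketch of just that part, with no networking; the `Registry` class, the `kind` labels and the five-second timeout are assumptions for illustration:

```python
import time

class Registry:
    """Bootstrap service state: who is online, grouped by server type."""

    def __init__(self, timeout=5.0, clock=time.monotonic):
        self.timeout = timeout        # seconds allowed without a heartbeat
        self.clock = clock            # injectable so it can be tested
        self.nodes = {}               # (kind, address) -> last heartbeat

    def heartbeat(self, kind, address):
        """Doubles as registration: the first heartbeat is the join."""
        self.nodes[(kind, address)] = self.clock()

    def online(self, kind):
        now = self.clock()
        # Drop nodes that have gone quiet, then list the survivors.
        self.nodes = {k: t for k, t in self.nodes.items()
                      if now - t <= self.timeout}
        return sorted(addr for (k, addr) in self.nodes if k == kind)
```

A producer would call something like `online("consumer")` to learn which endpoints to connect to, and every node would call `heartbeat` on a timer; how those calls travel over the wire (REQ/REP, pub-sub, plain HTTP) is a separate choice.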

What is the best way to deliver real-time messages to Client that can not be requested

We need to deliver real-time messages to our clients, but their servers are behind a proxy and we cannot initiate a connection to them, so the webhook variant won't work.
What is the best way to deliver real-time messages considering that:
the client is behind a proxy
the client can be offline for a long period of time, and all messages must still be delivered
the protocol/approach must be common enough that even a PHP developer could easily use it
I have in mind three variants:
WebSocket - the client opens a websocket connection, and we send the messages that were stored in the DB along with the messages coming in real time.
RabbitMQ - all messages are stored in a durable, persistent queue. What if the partner does not read from the queue for some time?
HTTP GET - the partner pulls messages in blocks. With this approach it is hard to pick an optimal pull interval.
Any suggestions would be appreciated. Thanks!
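For the HTTP GET variant (and equally for replaying stored messages when a WebSocket reconnects), the usual trick is a cursor: the partner remembers the id of the last message it processed and asks for everything after it. A minimal in-memory sketch, with `MessageLog`, `after_id` and `limit` as made-up names:

```python
class MessageLog:
    """Append-only log that a partner pulls from in blocks."""

    def __init__(self):
        self.entries = []                 # list of (id, message)
        self.next_id = 1

    def publish(self, message):
        self.entries.append((self.next_id, message))
        self.next_id += 1

    def pull(self, after_id=0, limit=100):
        """Return up to `limit` messages with id > after_id, plus the new cursor."""
        block = [(i, m) for i, m in self.entries if i > after_id][:limit]
        cursor = block[-1][0] if block else after_id
        return block, cursor
```

Because the cursor lives on the partner's side, it does not matter how long they were offline: the next pull simply continues from where they stopped, as long as the messages are still retained.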
Since you seem to have to store messages when your peer is not connected, the question applies to any other solution equally: what if the peer is not connected and messages are queueing up?
RabbitMQ is great if you want loose coupling: separating the producer and the consumer sides. The broker will store messages for you if no consumer is connected. This can indeed fill up memory and/or disk space on the broker after some time; in that case RabbitMQ raises a resource alarm and blocks publishers until resources are freed.
In general, RabbitMQ is a great tool for messaging-based architectures like the one you describe:
Load balancing: you can use multiple publishers and/or consumers, thus sharing load.
Flexibility: you can configure multiple exchanges/queues/bindings if your business logic needs it. You can easily change routing on the broker without reconfiguring multiple publisher/consumer applications.
Flow control: RabbitMQ also gives you some built-in methods for flow control - if a consumer is too slow to keep up with publishers, RabbitMQ will slow down publishers.
You can refactor the architecture later easily. You can set up multiple brokers and link them via shovel/federation. This is very useful if you need your app to work via multiple data centers.
You can easily spot if one side is slower than the other, since queues will start growing if your consumers can't read fast enough from a queue.
High availability and fault tolerance. RabbitMQ is very good at these (thanks to Erlang).
So I'd recommend it over the other two (which might be good for a small-scale app, but you might outgrow them quickly if requirements change and you need to scale things up).
Edit: something I missed - if it's not vital to deliver all messages, you can configure queues with a TTL (a message will be discarded after a timeout) or with a length limit (this caps the number of messages in the queue; by default the oldest messages are dropped once the limit is reached, and the queue can instead be configured to reject new ones).
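Both limits are set through optional queue arguments at declaration time. A small sketch that only builds the arguments table; the helper name, the queue name and the numbers are placeholders, and the pika call in the comment is an outline of how it might be passed, not tested against a broker here:

```python
def bounded_queue_arguments(ttl_seconds=None, max_messages=None,
                            overflow="drop-head"):
    """Build the optional-arguments table for a RabbitMQ queue declaration."""
    args = {}
    if ttl_seconds is not None:
        args["x-message-ttl"] = int(ttl_seconds * 1000)   # milliseconds
    if max_messages is not None:
        args["x-max-length"] = int(max_messages)
        # "drop-head" discards the oldest message when the queue is full;
        # "reject-publish" refuses the new one instead.
        args["x-overflow"] = overflow
    return args

# With the pika client this would be passed roughly as:
#   channel.queue_declare(queue="partner-events", durable=True,
#                         arguments=bounded_queue_arguments(3600, 10000))
```

Keep in mind that either setting means some messages can be silently lost, which conflicts with the "all messages must be delivered" requirement unless the partner can tolerate gaps.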
