Vert.x Event Bus Scalability - cluster-computing

One question on Vert.x event bus scalability. I am planning to use Vert.x in a smart-device (small form factor) application and a remote management application. The initial estimate is that there will be close to 100K smart devices and 3-4 servers hosting the management application. In this case, can you please advise on using the event bus between the smart devices and the web application (in cluster mode)? My primary requirement for using the event bus is to send dynamic notifications originating from the devices to the management servers and to take corrective steps in case of system failure.
I posted another query recently and one of the users pointed out that internally Vert.x uses NetSockets (plain TCP) for the event bus, backed by Hazelcast for cluster-mode discovery. If that is the case, my assumption is that scalability will be limited by the number of sockets the management server can handle. Is this right?
I would also appreciate it if anyone could point me to any benchmark tests done on the Vert.x event bus in terms of message-processing performance.

My primary requirement for using the event bus is to send dynamic notifications originating from the devices to the management servers and to take corrective steps in case of system failure.
No, use regular HTTP requests for this. The EventBus, and indeed any concurrent two-way networking model, is fundamentally unsuitable for this use case. Absolutely do not run Hazelcast on the clients; using a SockJS EventBus bridge is possible, but it is error-prone enough that you will almost certainly spend more time getting it right than you would spend writing a simple HTTP endpoint for this heartbeat behaviour.
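For illustration, here is a minimal sketch of such a heartbeat endpoint in Vert.x (the path, port, and payload handling are assumptions made up for this example, not something from the question):

```java
import io.vertx.core.AbstractVerticle;
import io.vertx.core.Vertx;

// Sketch: devices POST their status to a plain HTTP endpoint instead of
// joining the clustered EventBus. Path and port are illustrative only.
public class HeartbeatVerticle extends AbstractVerticle {

  @Override
  public void start() {
    vertx.createHttpServer()
        .requestHandler(req -> {
          if ("/heartbeat".equals(req.path())) {
            req.bodyHandler(body -> {
              // body contains the device notification; persist/process it here
              req.response().setStatusCode(200).end("ok");
            });
          } else {
            req.response().setStatusCode(404).end();
          }
        })
        .listen(8080);
  }

  public static void main(String[] args) {
    Vertx.vertx().deployVerticle(new HeartbeatVerticle());
  }
}
```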
My assumption is that scalability will be limited by the number of sockets the management server can handle. Is this right?
No. Your scalability will be limited by however you persist the data you receive from the devices. Hazelcast's maps are fine for this (accessed via vertx.sharedData()), but it really depends on whether you fully understand what you want.
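As an example of that, a minimal sketch of writing per-device state into a Hazelcast-backed cluster-wide map via vertx.sharedData() (the map name and key layout are made up, and this assumes Vert.x is started in clustered mode):

```java
import io.vertx.core.Vertx;
import io.vertx.core.shareddata.AsyncMap;

// Sketch: persist the latest status reported by a device into a
// Hazelcast-backed cluster-wide map. Names are illustrative only.
public class DeviceStatusStore {

  public static void storeStatus(Vertx vertx, String deviceId, String status) {
    vertx.sharedData().<String, String>getClusterWideMap("device-status", res -> {
      if (res.succeeded()) {
        AsyncMap<String, String> map = res.result();
        map.put(deviceId, status, putRes -> {
          if (putRes.failed()) {
            putRes.cause().printStackTrace();
          }
        });
      } else {
        res.cause().printStackTrace();
      }
    });
  }
}
```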

Related

Microservice State Synchronization

We are working on an application that has a WebSocket connection to every client. For high availability and load balancing purposes, we would like to scale the receiving microservice. As the WebSocket connection is used to propagate a client's state to every other client, it is important to synchronize the current state of a client with all other instances of the receiving microservice. It is also important that the state is reset when a client disconnects.
To give you some specs:
We are using Docker Swarm
It's a Node.js backend and an Angular 9 frontend
We have looked into multiple ideas, for example:
Redis Cache (The state would not be deleted if the instance fails.)
Queues/Topics (This would mean every instance has to keep track of the current state of all clients.)
WebSockets between instances (This looks promising but is not really scalable.)
What is the best practice to sync the state of a micro service between multiple instances while making sure that there are no inconsistencies? How are you solving this issue? Are we missing something obvious? Any tips and tricks?
We appreciate any suggestions.
This might not be 100% what you want to hear, but generally people advise that all microservices should be stateless.
An overall application, of course, has state, and databases, persistent event streams, or key-value caches (e.g. Redis) are excellent ways of persisting it. Ideally this is bounded per service, though; otherwise you risk ending up with a distributed monolith.
It is hard to say in your particular case, but perhaps rethink how state is stored conceptually and make that more explicit: determine what is cache (for performance) and what is genuine state that should be persisted externally (e.g. to Redis and a database), so that many service instances can use it immediately, making them truly disposable processes.
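To illustrate the idea only, here is a small sketch in Java using the Jedis client (your backend is Node.js, so the actual client library would differ; the key naming is made up): state is written to Redis on every update and deleted when the WebSocket disconnects, so any instance can serve any client.

```java
import redis.clients.jedis.Jedis;

// Sketch: keep per-client state in Redis instead of in the service instance,
// so any replica can read it and it survives an instance failure.
// Key names and fields are illustrative only.
public class ClientStateStore {

  private final Jedis redis = new Jedis("localhost", 6379);

  // Called when a client pushes a state update over its WebSocket.
  public void saveState(String clientId, String field, String value) {
    redis.hset("client:" + clientId + ":state", field, value);
  }

  // Called when the WebSocket disconnects: reset the client's state.
  public void clearState(String clientId) {
    redis.del("client:" + clientId + ":state");
  }
}
```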

Should a websocket connection be general or specific?

Should a websocket connection be general or specific?
For example, if I were building a stock trading system, I'd likely have real-time stock prices, real-time trade information, real-time updates to the order book, and perhaps real-time chat to enable traders to collude and manipulate the market. Should I have one WebSocket to handle all of the above data flow, or is it better to have several WebSockets to handle different topics?
It all depends. Let's look at your options, assuming your stock trader, your chat, and your order book are built as separate servers/micro-services.
One WebSocket for each server
You can have each server running their own WebSocket server, streaming events relevant to that server.
Pros
It is a simple approach. Each server is independent.
Cons
Scales poorly. The number of open TCP connections comes at a price as the number of concurrent users increases.
Increased complexity when you need to replicate the servers for redundancy, as all replicas need to broadcast the same events.
You also have to build your own fallback for recovering when client data goes stale due to a lost WebSocket connection.
You need to create event handlers on the client for each type of event.
You might have to add version handling to prevent data races if initial data is fetched over HTTP while events are sent on a separate WebSocket connection.
Publish/Subscribe event streaming
There are many publish/subscribe solutions available, such as Pusher, PubNub, or SocketCluster. The idea is usually that your servers publish events on a topic/subject to a message queue, which is listened to by WebSocket servers that forward the events to the connected clients (sketched below).
Pros
Scales more easily. The server only needs to send one message, while you can add more WebSocket servers as the number of concurrent users increases.
Cons
You most likely still have to handle recovery from events lost during a disconnect. It still might require versioning to handle data races. And you still need to write handlers for each type of event.
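As a rough illustration of this pattern, here is a sketch that uses Redis pub/sub as a stand-in for the message queue (the answer mentions hosted services such as Pusher and PubNub instead; channel and payload format are made up):

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPubSub;

// Sketch of the pub/sub fan-out idea with Redis as the message queue.
// The producing server publishes once; each WebSocket server subscribes
// and forwards the event to its own connected clients.
public class PriceFanOut {

  // On the producing server: publish a single event to a topic.
  public static void publishPrice(String symbol, double price) {
    try (Jedis jedis = new Jedis("localhost", 6379)) {
      jedis.publish("prices", symbol + "=" + price);
    }
  }

  // On each WebSocket server: subscribe and forward to connected clients.
  public static void listen() {
    Jedis jedis = new Jedis("localhost", 6379);
    jedis.subscribe(new JedisPubSub() {
      @Override
      public void onMessage(String channel, String message) {
        // forward "message" to every WebSocket client watching prices
      }
    }, "prices");
  }
}
```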
Realtime API gateway
This part is a more shameless plug, as it covers Resgate, an open source project I've been involved in myself. But it also applies to solutions such as Firebase. By a "realtime API gateway", I mean an API gateway that not only handles HTTP requests but also operates bidirectionally over WebSocket.
With web clients, you are seldom interested in events as such; you are interested in changes of state. Events are just a means to describe those changes. By fetching the data through a gateway, the gateway can keep track of which resources the client is currently interested in. It will then keep the client up to date for as long as the data is being used.
Pros
Scales well. The client requires no custom code for event handling, as the system updates the client data for you. It handles recovery from lost connections. No data races. Simple to work with.
Cons
Primarily for client-rendered web sites (using React, Vue, Angular, etc.), as it works poorly with server-rendered pages. Harder to apply to already existing HTTP APIs.

Why should I use JMS for MOM

I am really curious about this topic.
I will create a communication mechanism for internal systems and may also need connections to some external clients. The internal modules are themselves distributed systems.
I need to create an ESB between those modules. The system should provide high performance with millions of subscribers.
Both publish/subscribe and point-to-point communication are needed.
When I first started thinking about the implementation, I planned to put a REST API in front, with the REST API communicating with a JMS bus. The JMS bus would provide communication between the internal systems.
Unfortunately, as per my investigation, using JMS can cause serious problems with performance and scalability, and it looks like JMS may be needless: I could create adapters over the internal modules and have them communicate via REST services instead.
Does anyone have any idea why I should use JMS for internal communication?
Both REST and JMS/MQ enable communication between remote (and local) systems. The scenarios below may help you decide.
Some reasons for using JMS in your case:
If your producer emits messages at a much higher rate than the consumer can handle, persistent messaging will help. This also implies that you are fine with the transaction/message being processed later.
All systems are not up all the time.
You need a publish subscribe mechanism (topic).
Messages are not critical, and old messages can be discarded when load is high.
Reasons for using a REST API (without any JMS involved):
You want an immediate response that the transaction has completed; for example, a hotel booking.
All systems need to be up all the time for processing to complete.
You would want to use JMS (or enterprise messaging) when you don't have to rely on all the systems being available. So if one of your internal systems was down for some reason, a REST interface to that system would fail, but a JMS interface would not, as you are communicating with the MOM.
With some MOMs you are not limited to communicating via JMS, so you can have different runtimes communicate with the MOM.
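For reference, a minimal JMS publish/subscribe sketch (ActiveMQ is used here purely as an example MOM; the broker URL and topic name are made up):

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.jms.Topic;

import org.apache.activemq.ActiveMQConnectionFactory;

// Sketch: publish a message to a JMS topic. Subscribers that are down can
// still pick the message up later from the MOM if they use durable
// subscriptions, which is the decoupling REST alone does not give you.
public class EventPublisher {

  public static void main(String[] args) throws Exception {
    ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
    Connection connection = factory.createConnection();
    connection.start();

    Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
    Topic topic = session.createTopic("internal.events");
    MessageProducer producer = session.createProducer(topic);

    TextMessage message = session.createTextMessage("order-created:42");
    producer.send(message);

    connection.close();
  }
}
```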

Web server and ZeroMQ patterns

I am running an Apache server that receives HTTP requests and connects to a daemon script over ZeroMQ. The script implements the Multithreaded Server pattern (http://zguide.zeromq.org/page:all#header-73): it receives the request and dispatches it to one of its worker threads, which performs the action and responds back to the server, and the server responds back to the client. Everything is done synchronously, as the client needs to receive a success or failure response to its request.
As the number of users grows into the few thousands, I am looking into potentially improving this. The first thing I looked at is the different patterns of ZeroMQ, and whether what I am using is optimal for my scenario. I've read the guide, but I find it challenging to understand all the details and differences across patterns. I was looking, for example, at the Load Balancing Message Broker pattern (http://zguide.zeromq.org/page:all#header-73). It seems quite a bit more complicated to implement than what I am currently using, and if I understand things correctly, its advantages are:
Actual load balancing vs the round-robin task distribution that I currently have
Asynchronous requests/replies
Is that everything? Am I missing something? Given the description of my problem and its synchronous requirement, what would you say is the best pattern to use? Lastly, how would the answer change if I wanted to make my setup distributed (i.e. have the Apache server load balance the requests across different machines)? I was thinking of doing that by simply creating yet another layer, based on the Multithreaded Server pattern, and having that layer bridge the communication between the web server and my workers.
Some thoughts about the subject...
Keep it simple
I would try to keep things simple and use "plain" ZeroMQ as long as possible. To increase performance, I would simply change your backend script to send requests out from a DEALER socket and move the request-handling code into its own program. Then you could run multiple worker servers on different machines to get more requests handled (a rough sketch of this layout follows at the end of this section).
I assume this was the approach you took:
I was thinking of doing that by simply creating yet another layer, based on the Multithreaded Server pattern, and have that layer bridge the communication between the web server and my workers.
The only problem here is that there is no request retry in the backend. If a worker fails to handle a given task, it is lost forever. However, one could write the worker servers so that they handle all the requests they have received before shutting down. With this kind of setup it is possible to update backend workers without clients noticing any outages. This will not save requests that are lost if a server crashes.
I have the feeling that in common scenarios this kind of approach would be more than enough.
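Here is a rough sketch of that layout using the Java ZeroMQ binding (JeroMQ): a small broker process binds a ROUTER socket for the web server and a DEALER socket for the workers, so workers can run on any machine. The ports and addresses are made up.

```java
import org.zeromq.SocketType;
import org.zeromq.ZContext;
import org.zeromq.ZMQ;

// Sketch of the ROUTER/DEALER layout: the broker binds a ROUTER socket for
// clients (the web server, using REQ) and a DEALER socket for workers
// (using REP), and ZMQ.proxy shuttles requests and replies between them.
public class RequestBroker {

  public static void main(String[] args) {
    try (ZContext context = new ZContext()) {
      ZMQ.Socket frontend = context.createSocket(SocketType.ROUTER);
      frontend.bind("tcp://*:5559");   // web server connects here with REQ

      ZMQ.Socket backend = context.createSocket(SocketType.DEALER);
      backend.bind("tcp://*:5560");    // workers connect here with REP

      // Blocks forever, forwarding requests to workers and replies back.
      ZMQ.proxy(frontend, backend, null);
    }
  }
}
```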
Mongrel2
Mongrel2 seems to handle quite a few of the things you have already implemented. It might be worthwhile to check it out. It probably does not completely solve your problems, but it provides tested infrastructure for distributing the workload. This could be used to deliver requests to multithreaded servers running on different machines.
Broker
One solution for increasing the robustness of the setup is a broker. In this scenario the broker's main role would be to provide robustness by implementing a queue for the requests. I understood that all the requests the workers handle are basically of the same type. If requests had different types, the broker could also do lookups to find the correct server for each request.
Using a queue provides a way to ensure that every request is handled by some worker even if worker servers crash. This does not come without a price: the broker is itself a single point of failure, and if it crashes or is restarted, all messages could be lost.
These problems can be avoided, but that requires quite a lot of work: the requests could be persisted to disk, or the servers could be clustered. The need has to be weighed against the payoff: do you want to spend your time writing a message broker or the actual system?
If a message broker seems like a good idea, the time required to implement one can be reduced by using an existing product (like RabbitMQ). The negative side effect is that there may be a lot of unwanted features, and adding new things is not as straightforward as with a self-made broker.
Writing your own broker can turn into reinventing the wheel. Many brokers provide similar things: security, logging, a management interface and so on. It seems likely that these will eventually be needed in a home-made solution as well. But if not, then a single home-made broker that does one thing and does it well can be a good choice.
Even if a broker product is chosen, I think it is a good idea to hide the broker behind a ZeroMQ proxy: a dedicated piece of code that sends and receives messages from the broker. Then no other part of the system has to know anything about the broker, and it can easily be replaced.
Using a broker is somewhat heavy on developer time. You either need time to implement the broker or time to get used to a product. I would avoid this route until it is clearly needed.
Some links
Comparison between broker and brokerless
RabbitMQ
Mongrel2

Factors Affecting the Performance of Messaging Middleware

I am planning to integrate messaging middleware into my web application. Right now I am testing different messaging middleware such as RabbitMQ, JMS, HornetQ, etc.
The examples provided with these products work, but they are not giving the desired results.
So I want to know: which factors that affect performance should one keep an eye on?
Which areas should a developer take care of to improve the performance of messaging middleware?
I'm the project lead for HornetQ but I will try to give you a generic answer that could be applied to any message system you choose.
A common question that I see is people asking why a single producer / single consumer won't give you the expected performance.
When you send a message and ask for confirmation right away, you have to wait for:
The message transfer from client to server
The message being persisted on the disk
The server acknowledging receipt of the message by sending a callback to the client
Similarly, when you receive a message and ACK it to the server:
The ACK is sent from client to server
The ACK is persisted
The server sends back a callback saying the ACK was processed
And if you need confirmation for all your message sends and message ACKs, you have to wait for these steps, since hardware is involved in persisting to disk and sending bits over the network.
Message systems are designed to scale with many producers and many consumers. That is, if many clients are producing, they all share the resources available at the server, and the same goes for consumers.
There are ways to speed up a single producer or single consumer:
One is by using transactions: you minimize the blocks and disk syncs performed while persisting at the server, as well as roundtrips on the network. (This is actually the same with any database.) A sketch of this follows after this list.
Another is by using callbacks instead of blocking at the consumer. (JMS 2 is proposing a callback similar to the ConfirmationHandler in HornetQ.)
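To illustrate the first point, here is a rough sketch of batching sends inside a transacted JMS session, so the disk sync and confirmation roundtrip are paid once per commit instead of once per message (the queue name is made up):

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;

// Sketch: send a batch of messages inside one transaction. The persistence
// sync and the confirmation roundtrip happen once per commit rather than
// once per send.
public class BatchedSender {

  public static void sendBatch(ConnectionFactory factory, String[] payloads) throws Exception {
    Connection connection = factory.createConnection();
    connection.start();

    // 'true' makes the session transacted; the acknowledge mode is then ignored.
    Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
    Queue queue = session.createQueue("orders");
    MessageProducer producer = session.createProducer(queue);

    for (String payload : payloads) {
      producer.send(session.createTextMessage(payload));
    }
    session.commit();   // one confirmation for the whole batch

    connection.close();
  }
}
```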
Also: most providers I know have a performance section in their docs with requirements and suggestions for that specific product. You should look at each product individually.
