Factors Affecting Low Performance of Middleware Messaging Software - jms

I am planning to integrate messaging middleware into my web application. Right now I am testing different messaging middleware such as RabbitMQ, JMS, HornetQ, etc.
The examples provided with this software work, but they do not give the desired results.
So I want to know which factors affect performance and should be kept in mind.
Which areas should a developer take care of to improve the performance of messaging middleware?

I'm the project lead for HornetQ but I will try to give you a generic answer that could be applied to any message system you choose.
A common question that I see is people asking why a single producer / single consumer won't give you the expected performance.
When you send a message and ask for confirmation right away, you need to wait for:
The message transfer from client to server
The message being persisted on the disk
The server acknowledging receipt of the message by sending a callback to the client
Similarly when you are receiving a message, you ACK to the server:
The ACK is sent from client to server
The ACK is persisted
The server sends back a callback saying that the ACK was received
And if you need confirmation for all your message sends and message ACKs, you have to wait for each of these steps, because hardware is involved in persisting to disk and in sending bits over the network.
Message systems are designed to scale up with many producers and many consumers. That is, if many clients are producing, they should all share the resources available at the server, along with all the consumers.
There are ways to speed up a single producer or single consumer:
One is to use transactions. That way you minimize the blocking and syncs performed on disk while persisting at the server, as well as the round trips on the network. (This is actually the same with any database.)
Another is to use callbacks instead of blocking at the consumer. (JMS 2 is proposing a callback similar to the ConfirmationHandler on HornetQ.)
Also, most providers I know have a performance section in their docs with requirements and suggestions for that specific product. You should look at each product individually.
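To make the cost of per-message confirmation concrete, here is a minimal sketch of the transaction idea. It is a toy cost model, not HornetQ or JMS code: `SimulatedBroker`, `sendConfirmEach` and `sendInTransactions` are made-up names, and each blocking confirmation is simply counted as one network round trip plus one disk sync.

```javascript
// Toy cost model, not a real client: every blocking confirmation is counted
// as one network round trip plus one disk sync on the broker.
class SimulatedBroker {
  constructor() {
    this.roundTrips = 0;
    this.diskSyncs = 0;
    this.stored = [];
  }

  // Persist a group of messages and confirm them together.
  commit(batch) {
    this.roundTrips += 1;
    this.diskSyncs += 1;
    this.stored.push(...batch);
  }
}

// Confirm every message individually: one blocking wait per message.
function sendConfirmEach(broker, messages) {
  for (const message of messages) {
    broker.commit([message]);
  }
}

// Group messages into transactions: one blocking wait per batch.
function sendInTransactions(broker, messages, batchSize) {
  for (let i = 0; i < messages.length; i += batchSize) {
    broker.commit(messages.slice(i, i + batchSize));
  }
}

const messages = Array.from({ length: 1000 }, (_, i) => `msg-${i}`);

const perMessage = new SimulatedBroker();
sendConfirmEach(perMessage, messages);

const batched = new SimulatedBroker();
sendInTransactions(batched, messages, 100);

console.log(perMessage.roundTrips, batched.roundTrips); // 1000 10
```

The batch size of 100 is arbitrary. A larger batch means fewer waits, but more messages are redone if a transaction fails.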

Related

Simple Server to PUSH lots of data to Browser?

I'm building a Web Application that consumes data pushed from Server.
Each message is JSON and could be large (hundreds of kilobytes). Messages are sent a couple of times per minute, and the order doesn't matter.
The server should be able to persist messages that have not been delivered yet, potentially storing a couple of megabytes per client for a couple of days, until the client comes online. There's a limit on the storage size for unsent messages, say 20 MB per client, and old undelivered messages get deleted when this limit is exceeded.
The server should be able to handle around 1,000 simultaneous connections. How could this be implemented simply?
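The storage rule described above (a per-client cap, with the oldest undelivered messages deleted first) could be sketched roughly like this. `ClientMailbox` is a hypothetical in-memory stand-in; a real server would persist the entries to disk or a database, and `Buffer` assumes a Node.js runtime.

```javascript
// Hypothetical per-client store for undelivered messages with a byte cap.
class ClientMailbox {
  constructor(maxBytes) {
    this.maxBytes = maxBytes;
    this.usedBytes = 0;
    this.pending = []; // oldest first
  }

  // Queue a JSON payload; evict the oldest entries until it fits.
  enqueue(payload) {
    const body = JSON.stringify(payload);
    const size = Buffer.byteLength(body, "utf8");
    if (size > this.maxBytes) return false; // can never fit, reject it
    while (this.usedBytes + size > this.maxBytes) {
      const dropped = this.pending.shift();
      this.usedBytes -= dropped.size;
    }
    this.pending.push({ body, size });
    this.usedBytes += size;
    return true;
  }

  // Hand over everything that is waiting, e.g. when the client reconnects.
  drain() {
    const bodies = this.pending.map((entry) => entry.body);
    this.pending = [];
    this.usedBytes = 0;
    return bodies;
  }
}

const mailbox = new ClientMailbox(20 * 1024 * 1024); // 20 MB, as in the question
mailbox.enqueue({ id: 1, data: "x".repeat(1024) });
```

Because order doesn't matter in this use case, evicting strictly oldest-first is only one possible policy.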
Possible Solutions
I was thinking of storing messages as files on disk and having the browser poll every second to check for new messages, serving them with NGinx or something like that. Are there configs / modules for NGinx for such use cases?
Or maybe it's better to use an MQTT server or a message queue like RabbitMQ with some browser adapter?
Actually, MQTT supports the concept of sessions that persist across client connections, but the client must first connect and request a "non-clean" session. After that, if the client is disconnected, the broker will hold all the QoS=1 or 2 messages destined for that client until it reconnects.
With MQTT v3.x, technically, the server is supposed to hold all the messages for all these disconnected clients forever! Each message maxes out at a 256MB payload, but the server is supposed to hold everything you give it. This created a big problem for servers, which MQTT v5 came in to fix. And most real-world brokers have configurable settings around this.
But MQTT shines if the connections are over unreliable networks (wireless, cell modems, etc) that may drop and reconnect unexpectedly.
If the clients are connected over fairly reliable networks, AMQP with RabbitMQ is considerably more flexible, since clients can create and manage the individual queues. But the neat thing is that you can mix the two protocols using RabbitMQ, as it has an MQTT plugin. So, smaller clients on an unreliable network can connect via MQTT, and other clients can connect via AMQP, and they can all communicate with each other.
MQTT is most likely not what you are looking for. The protocol is meant to be lightweight and as the comments pointed out, the protocol specifies that there may only exist "Control Packets of size up to 268,435,455 (256 MB)" source. Clearly, this is much too small for your use case.
Moreover, if a client isn't connected (and subscribed on that particular topic) at the time of the message being published, the message will never be delivered. EDIT: As @Brits pointed out, this only applies to QoS 0 pubs/subs.
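The session behaviour discussed in these answers can be illustrated with a toy model. This is not an MQTT client or broker, and `ToyMqttBroker` is a made-up class. It assumes the MQTT 3.1.1 rules that a non-clean session keeps QoS 1 and 2 messages for an offline subscriber, and that the delivery QoS is the lower of the publish QoS and the subscription QoS. Offline QoS 0 messages are simply dropped here, which is an assumption about typical broker behaviour.

```javascript
// Toy model of MQTT 3.1.1 session semantics; it does not speak the protocol.
class ToyMqttBroker {
  constructor() {
    this.sessions = new Map(); // clientId -> session state
  }

  connect(clientId, { cleanSession }) {
    let session = this.sessions.get(clientId);
    if (cleanSession || !session) {
      session = { subscriptions: new Map(), queued: [], inbox: [] };
      this.sessions.set(clientId, session);
    }
    session.cleanSession = cleanSession;
    session.online = true;
    // Flush anything that was held while the client was away.
    session.inbox.push(...session.queued);
    session.queued = [];
    return session;
  }

  disconnect(clientId) {
    const session = this.sessions.get(clientId);
    if (!session) return;
    if (session.cleanSession) this.sessions.delete(clientId);
    else session.online = false;
  }

  subscribe(clientId, topic, qos) {
    this.sessions.get(clientId).subscriptions.set(topic, qos);
  }

  publish(topic, payload, qos) {
    for (const session of this.sessions.values()) {
      if (!session.subscriptions.has(topic)) continue;
      // Delivery QoS is the lower of publish QoS and subscription QoS.
      const effectiveQos = Math.min(qos, session.subscriptions.get(topic));
      if (session.online) session.inbox.push(payload);
      else if (effectiveQos >= 1) session.queued.push(payload);
      // Offline and QoS 0: the message is dropped for this client.
    }
  }
}

const mqtt = new ToyMqttBroker();
mqtt.connect("browser-1", { cleanSession: false });
mqtt.subscribe("browser-1", "updates", 1);
mqtt.disconnect("browser-1");
mqtt.publish("updates", "kept", 1);
mqtt.publish("updates", "dropped", 0);
const resumed = mqtt.connect("browser-1", { cleanSession: false });
console.log(resumed.inbox); // [ 'kept' ]
```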
Like JD Allen mentioned, you need a queuing service like Rabbit MQ or AMQ. There are countless other such services/libraries/packages in existence so please investigate more.
If you want to roll your own, it might be worth considering using AWS SQS and wrapping some of your own application logic around it. That'll likely be a bit hacky though, so take that suggestion with a grain of salt.

Should a websocket connection be general or specific?

e.g. If I were building a stock trading system, I'd likely have real time stock prices, real time trade information, real time updates to the order book, perhaps real time chat to enable traders to collude and manipulate the market. Should I have one websocket to handle all the above data flow, or is it better to have several websockets to handle different topics?
It all depends. Let's look at your options, assuming your stock trader, your chat, and your order book are built as separate servers/micro-services.
One WebSocket for each server
You can have each server running their own WebSocket server, streaming events relevant to that server.
Pros
It is a simple approach. Each server is independent.
Cons
Scales poorly. The number of open TCP connections will come at a price as the number of concurrent users increases. Increased complexity when you need to replicate the servers for redundancy, as all replicas need to broadcast the same events. You also have to build your own fallback for recovering from client data going stale due to a lost WebSocket connection. Need to create event handlers on the client for each type of event. Might have to add version handling to prevent data races if the initial data is fetched over HTTP while events are sent on the separate WebSocket connection.
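As an illustration of the version handling mentioned in the cons above, here is a minimal sketch. It assumes (this is not stated in the answer) that the HTTP snapshot and every WebSocket event carry a monotonically increasing `version` number. `applyEvent` and the field names are hypothetical.

```javascript
// Hypothetical client-side guard: the snapshot fetched over HTTP and every
// WebSocket event carry a monotonically increasing `version` number.
function applyEvent(state, event) {
  // An event at or below the snapshot version is already reflected in it.
  if (event.version <= state.version) return state;
  return { ...state, ...event.changes, version: event.version };
}

// Snapshot loaded over HTTP at version 7.
let orderBook = { version: 7, bestBid: 101.5, bestAsk: 101.7 };

// Events arrive over the WebSocket; the first one raced with the HTTP fetch.
const events = [
  { version: 7, changes: { bestBid: 101.5 } }, // stale, ignored
  { version: 8, changes: { bestAsk: 101.6 } }, // applied
];
for (const event of events) {
  orderBook = applyEvent(orderBook, event);
}
console.log(orderBook); // { version: 8, bestBid: 101.5, bestAsk: 101.6 }
```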
Publish/Subscribe event streaming
There are many publish/subscribe solutions available, such as Pusher, PubNub or SocketCluster. The idea is often that your servers publish events on a topic/subject to a message queue, which is listened to by WebSocket servers that forward the events to the connected clients.
Pros
Scales more easily. The server only needs to send one message, while you can add more WebSocket servers as the number of concurrent users increases.
Cons
You most likely still have to handle recovery from events lost during disconnect. Still might require versioning to handle data races. And still need to write handlers for each type of event.
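The fan-out idea behind this option can be sketched in a few lines. Everything here is an in-process stand-in: `TopicBus` plays the role of the message queue and `FanOutNode` plays the role of one WebSocket server. Neither is the API of Pusher, PubNub or SocketCluster.

```javascript
// In-process stand-in for the message queue; a real setup would use a broker.
class TopicBus {
  constructor() {
    this.listeners = new Map(); // topic -> array of callbacks
  }
  subscribe(topic, callback) {
    if (!this.listeners.has(topic)) this.listeners.set(topic, []);
    this.listeners.get(topic).push(callback);
  }
  publish(topic, event) {
    for (const callback of this.listeners.get(topic) || []) callback(event);
  }
}

// Stand-in for one WebSocket server: it forwards bus events to its clients.
class FanOutNode {
  constructor(bus, topic) {
    this.clients = [];
    bus.subscribe(topic, (event) => {
      for (const client of this.clients) client.received.push(event);
    });
  }
  accept(name) {
    const client = { name, received: [] };
    this.clients.push(client);
    return client;
  }
}

const bus = new TopicBus();
const nodeA = new FanOutNode(bus, "prices");
const nodeB = new FanOutNode(bus, "prices");
const alice = nodeA.accept("alice");
const bob = nodeB.accept("bob");

// The backend publishes once; every connected client gets a copy.
bus.publish("prices", { symbol: "ACME", price: 42 });
console.log(alice.received.length, bob.received.length); // 1 1
```

Adding capacity then means adding more nodes that subscribe to the same topic. The publishing server does not change.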
Realtime API gateway
This part is more shameless, as it covers Resgate, an open source project I've been involved in myself. But it also applies to solutions such as Firebase. With the term "realtime API gateway", I mean an API gateway that not only handles HTTP requests, but operates bidirectionally over WebSocket as well.
With web clients, you are seldom interested in events - you are interested in changes of state. Events are just a means to describe the changes. By fetching the data through a gateway, the gateway can keep track of which resources the client is currently interested in. It will then keep the client up to date for as long as the data is being used.
Pros
Scales well. Client requires no custom code for event handling, as the system updates the client data for you. Handles recovery from lost connections. No data races. Simple to work with.
Cons
Primarily for client rendered web sites (using React, Vue, Angular, etc), as it works poorly with sites with server-rendered pages. Harder to apply to already existing HTTP API's.
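To show what "the gateway keeps track of which resources the client is interested in" might look like, here is a toy sketch. `ToyGateway` and its methods are invented for illustration and are not the Resgate or Firebase API. The "client" is just an object with a `view` map instead of a real WebSocket connection.

```javascript
// Toy gateway: remembers which resources each client fetched and pushes the
// new state to exactly those clients when a resource changes.
class ToyGateway {
  constructor() {
    this.resources = new Map();   // resourceId -> current state
    this.subscribers = new Map(); // resourceId -> Set of client objects
  }

  // Fetching a resource also registers the client's interest in it.
  fetch(client, resourceId) {
    if (!this.subscribers.has(resourceId)) {
      this.subscribers.set(resourceId, new Set());
    }
    this.subscribers.get(resourceId).add(client);
    client.view[resourceId] = this.resources.get(resourceId);
  }

  // The client no longer uses the resource, so stop updating it.
  release(client, resourceId) {
    const interested = this.subscribers.get(resourceId);
    if (interested) interested.delete(client);
  }

  // A backend service changes state; interested clients are updated.
  update(resourceId, state) {
    this.resources.set(resourceId, state);
    for (const client of this.subscribers.get(resourceId) || []) {
      client.view[resourceId] = state;
    }
  }
}

const gateway = new ToyGateway();
gateway.update("orderbook.ACME", { bestBid: 101.5 });

const trader = { view: {} };
gateway.fetch(trader, "orderbook.ACME");
gateway.update("orderbook.ACME", { bestBid: 101.6 });
console.log(trader.view["orderbook.ACME"]); // { bestBid: 101.6 }

gateway.release(trader, "orderbook.ACME");
gateway.update("orderbook.ACME", { bestBid: 101.7 }); // trader is not updated
```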

Why do you need a message queue for a chat with web sockets?

I have seen a lot of examples on the internet of chats using web sockets and RabbitMQ (https://github.com/videlalvaro/rabbitmq-chat), however I do not understand why a message queue is needed for a chat application.
Why is it not OK to send the message from the browser via web sockets to the server, and then have the server broadcast that message to the rest of the active browsers, again using web sockets with a broadcast method? (maybe I am missing something)
Pseudo code examples (using socket.io):
// client (browser)
socket.emit("message", "my great message that will be received by all");
// server (it can be any server, but let's say it is also written in JavaScript)
socket.on("message", function (msg) {
  // relay the message to every other connected client
  socket.broadcast.emit("message", msg);
});
// the rest of the browsers
socket.on("message", function (msg) {
  // display the message on the screen
});
i don't think RabbitMQ should be used for a chat room, personally. at least, not in the "chat" or "room" part of the application.
unless your chat rooms don't care about history at all - and i think most do care about that - a message queue like RMQ doesn't make much sense.
you would be better off storing the message in a database and keeping a marker for each user to say what message they last saw.
now, you may end up needing something like RMQ to facilitate the process of the chat application. you can offload processing from the web servers, for example, by pushing all messages through RMQ to a back-end service that updates the database and cache layers.
this would allow you to scale the front-end web servers much faster, and support more users per web server. and that sounds like a good use of RMQ, but is not specific to chat apps. it's just good practice for scaling web apps / systems.
the key, in my experience, is that RMQ is not responsible for delivery of the messages to the users / chat rooms. that happens through websockets or similar technologies that are designed to be used per user.
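The suggestion above (store messages in a database and keep a marker per user for the last message they saw) could look roughly like this. `ChatHistory` is a hypothetical in-memory stand-in for the database tables. A real implementation would use message ids or timestamps from the database rather than an array index.

```javascript
// In-memory stand-in for the messages table and the per-user markers.
class ChatHistory {
  constructor() {
    this.messages = [];        // append-only, oldest first
    this.lastSeen = new Map(); // userId -> number of messages already seen
  }

  post(userId, text) {
    this.messages.push({ userId, text });
  }

  // Return what the user has not seen yet and advance their marker.
  fetchUnread(userId) {
    const from = this.lastSeen.get(userId) || 0;
    const unread = this.messages.slice(from);
    this.lastSeen.set(userId, this.messages.length);
    return unread;
  }
}

const room = new ChatHistory();
room.post("ann", "hello");
room.post("bob", "hi ann");
const firstVisit = room.fetchUnread("cat");  // both messages
room.post("ann", "anyone else here?");
const secondVisit = room.fetchUnread("cat"); // only the newest message
console.log(firstVisit.length, secondVisit.length); // 2 1
```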
Simple answer ...
For a simple chat app you don't need a queue (e.g. signalr would do exactly this without the queue).
Typically, though, real-world applications are not just "a simple chat app". The queue might represent the current state of the room for new users joining, perhaps, so the server knows what list of messages to serve up when that happens.
Also, it's worth noting that message queues are often used when you want reliable messaging (e.g. Service Bus), to ensure that all messages definitely get to where they should go even if the first attempt fails. So it's likely that the queue is included in many examples as a default primer for later problem solving.
I may be late with this answer, as the messaging domain has changed rapidly in the last few years. Applications like WhatsApp do not store messages in their database, and also provide E2E encryption.
Coming to RabbitMQ, it supports the MQTT protocol, which is ideal for low latency, high scalability applications. Thus using such queuing services offloads the heavy work from your server and provides features like scalability and security.
uhmm I didn't understand exactly what you are looking for...
but in RabbitMQ you always publish a message to an exchange and consume the message using a queue.
to "broadcast that message" you need to consume it.
hope it helps

Web server and ZeroMQ patterns

I am running an Apache server that receives HTTP requests and connects to a daemon script over ZeroMQ. The script implements the Multithreaded Server pattern (http://zguide.zeromq.org/page:all#header-73): it receives the request, dispatches it to one of its worker threads, performs the action, and responds back to the server, which then responds back to the client. Everything is done synchronously, as the client needs to receive a success or failure response to its request.
As the number of users is growing into a few thousands, I am looking into potentially improving this. The first thing I looked at is the different patterns of ZeroMQ, and whether what I am using is optimal for my scenario. I've read the guide but I find it challenging to understand all the details and differences across patterns. I was looking, for example, at the Load Balancing Message Broker pattern (http://zguide.zeromq.org/page:all#header-73). It seems quite a bit more complicated to implement than what I am currently using, and if I understand things correctly, its advantages are:
Actual load balancing vs the round-robin task distribution that I currently have
Asynchronous requests/replies
Is that everything? Am I missing something? Given the description of my problem, and the synchronous requirement of it, what would you say is the best pattern to use? Lastly, how would the answer change, if I want to make my setup distributed (i.e. having the Apache server load balance the requests across different machines). I was thinking of doing that by simply creating yet another layer, based on the Multithreaded Server pattern, and have that layer bridge the communication between the web server and my workers.
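The difference between round-robin distribution and "actual load balancing" raised in the question can be shown with a small simulation. It does not use ZeroMQ. The two functions are hypothetical, the durations are made-up units of work, and "give the task to the worker that becomes free first" is only an approximation of a broker that dispatches to workers that have signalled they are ready.

```javascript
// Toy scheduler comparison; durations are made-up units of work.
function roundRobinMakespan(durations, workerCount) {
  const busyUntil = new Array(workerCount).fill(0);
  durations.forEach((duration, i) => {
    busyUntil[i % workerCount] += duration; // fixed rotation, ignores load
  });
  return Math.max(...busyUntil);
}

function readyWorkerMakespan(durations, workerCount) {
  const busyUntil = new Array(workerCount).fill(0);
  for (const duration of durations) {
    // Give the task to whichever worker becomes free first.
    const next = busyUntil.indexOf(Math.min(...busyUntil));
    busyUntil[next] += duration;
  }
  return Math.max(...busyUntil);
}

const durations = [10, 1, 1, 1, 10, 1, 1, 1];
console.log(roundRobinMakespan(durations, 2));  // 22
console.log(readyWorkerMakespan(durations, 2)); // 13
```

With uniform task durations the two strategies give the same result. The gap only appears when some requests are much slower than others.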
Some thoughts about the subject...
Keep it simple
I would try to keep things simple and use "plain" ZeroMQ as long as possible. To increase performance, I would simply change your backend script to send requests out from a dealer socket and move the request handling code to its own program. Then you could just run multiple worker servers on different machines to get more requests handled.
I assume this was the approach you took:
"I was thinking of doing that by simply creating yet another layer, based on the Multithreaded Server pattern, and have that layer bridge the communication between the web server and my workers."
The only problem here is that there is no request retry in the backend. If a worker fails to handle a given task, it is lost forever. However, one could write the worker servers so that they handle all the requests they have received before shutting down. With this kind of setup it is possible to update backend workers without clients noticing any outage. This will not save requests that get lost if the server crashes.
I have the feeling that in common scenarios this kind of approach would be more than enough.
Mongrel2
Mongrel2 seems to handle quite a few of the things you have already implemented. It might be worthwhile to check it out. It probably does not completely solve your problems, but it provides tested infrastructure to distribute the workload. This could be used to deliver the requests to multithreaded servers running on different machines.
Broker
One solution to increase the robustness of the setup is a broker. In this scenario the broker's main role would be to provide robustness by implementing a queue for the requests. I understood that all the requests the workers handle are basically of the same type. If requests had different types, then the broker could also do lookups to find the correct server for each request.
Using the queue provides a way to ensure that every request is handled by some worker even if worker servers crash. This does not come without a price. The broker is itself a single point of failure. If it crashes or is restarted, all messages could be lost.
These problems can be avoided, but it requires quite a lot of work: the requests could be persisted to disk, and servers could be clustered. The need has to be weighed against the payoffs. Does one want to spend time writing a message broker or the actual system?
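A minimal sketch of the queueing role described above: a request leaves the queue only once a worker has handled it, and it is requeued if the worker fails. `RetryingQueue` is a hypothetical in-memory class. It has no persistence or clustering, so it still has the single-point-of-failure problem mentioned above.

```javascript
// Toy broker queue: a request leaves the queue only after a worker succeeds.
class RetryingQueue {
  constructor(maxAttempts) {
    this.maxAttempts = maxAttempts;
    this.pending = [];
    this.failed = []; // requests that exhausted their attempts
  }

  submit(request) {
    this.pending.push({ request, attempts: 0 });
  }

  // Hand the oldest request to a worker; requeue it if the worker throws.
  dispatch(worker) {
    const job = this.pending.shift();
    if (!job) return undefined;
    job.attempts += 1;
    try {
      return worker(job.request);
    } catch (error) {
      if (job.attempts < this.maxAttempts) this.pending.push(job);
      else this.failed.push(job.request);
      return undefined;
    }
  }
}

const queue = new RetryingQueue(3);
queue.submit("resize-image-17");

const crashingWorker = () => { throw new Error("worker crashed"); };
const healthyWorker = (request) => `done:${request}`;

queue.dispatch(crashingWorker);               // fails, request is requeued
const result = queue.dispatch(healthyWorker); // retried on another worker
console.log(result); // done:resize-image-17
```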
If a message broker seems like a good idea, the time required to implement one can be reduced by using an existing product (like RabbitMQ). A negative side effect is that there could be a lot of unwanted features, and adding new things is not as straightforward as with a self-made broker.
Writing your own broker could amount to reinventing the wheel. Many brokers provide similar things: security, logging, a management interface and so on. It seems likely that these will eventually be needed in a home-made solution as well. But if not, then a single home-made broker which does one thing and does it well can be a good choice.
Even if a broker product is chosen, I think it is a good idea to hide the broker behind a ZeroMQ proxy, dedicated code that sends messages to and receives messages from the broker. Then no other part of the system has to know anything about the broker, and it can be easily replaced.
Using a broker is somewhat heavy on developer time. You either need time to implement the broker or time to get used to some product. I would avoid this route until it is clearly needed.
Some links
Comparison between broker and brokerless
RabbitMQ
Mongrel2

What is the best way to deliver real-time messages to Client that can not be requested

We need to deliver real-time messages to our clients, but their servers are behind a proxy, and we cannot initialize a connection; webhook variant won't work.
What is the best way to deliver real-time messages considering that:
client that is behind a proxy
client can be off for a long period of time, and all messages must be delivered
the protocol/way must be common enough, so that even a PHP developer could easily use it
I have in mind three variants:
WebSocket - the client opens a websocket connection, and we send the messages that were stored in the DB along with messages coming in in real time.
RabbitMQ - all messages are stored in a durable, persistent queue. What if the partner does not read from the queue for some time?
HTTP GET - the partner pulls messages in blocks. With this approach it is hard to pick an optimal pull interval.
Any suggestions would be appreciated. Thanks!
Since you seem to have to store messages when your peer is not connected, the question applies to any other solution equally: what if the peer is not connected and messages are queueing up?
RabbitMQ is great if you want loose coupling: separating the producer and the consumer sides. The broker will store messages for you if no consumer is connected. This can indeed fill up memory and/or disk space on the broker after some time - in this case RabbitMQ raises a resource alarm and blocks publishers until resources are freed.
In general, RabbitMQ is a great tool for messaging-based architectures like the one you describe:
Load balancing: you can use multiple publishers and/or consumers, thus sharing load.
Flexibility: you can configure multiple exchanges/queues/bindings if your business logic needs it. You can easily change routing on the broker without reconfiguring multiple publisher/consumer applications.
Flow control: RabbitMQ also gives you some built-in methods for flow control - if a consumer is too slow to keep up with publishers, RabbitMQ will slow down publishers.
You can refactor the architecture later easily. You can set up multiple brokers and link them via shovel/federation. This is very useful if you need your app to work via multiple data centers.
You can easily spot if one side is slower than the other, since queues will start growing if your consumers can't read fast enough from a queue.
High availability and fault tolerance. RabbitMQ is very good at these (thanks to Erlang).
So I'd recommend it over the other two (which might be good for a small-scale app, but you might outgrow them quickly if requirements change and you need to scale things up).
Edit: something I missed - if it's not vital to deliver all messages, you can configure queues with a TTL (messages will be discarded after a timeout) or with a length limit (this caps the number of messages in the queue; once it is reached, RabbitMQ by default drops the oldest messages, and it can be configured to reject new ones instead).
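For reference, the TTL and length limit mentioned here are set through optional queue arguments when the queue is declared. The argument names below are RabbitMQ's own; the values are arbitrary examples, and how the object is passed depends on the client library in use.

```javascript
// Optional RabbitMQ queue arguments; the values are examples only.
const queueArguments = {
  "x-message-ttl": 86400000, // discard messages older than 24 hours (milliseconds)
  "x-max-length": 100000,    // keep at most this many messages in the queue
  "x-overflow": "drop-head", // at the limit, drop the oldest message (the default)
};
```

Setting `x-overflow` to `reject-publish` instead makes the broker refuse new messages once the limit is reached, rather than dropping the oldest ones.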

Resources