Why do you need a message queue for a chat with web sockets? - socket.io

I have seen a lot of examples on the internet of chats using web sockets and RabbitMQ (https://github.com/videlalvaro/rabbitmq-chat), however I do not understand why it is need it a message queue for a chat application.
Why it is not ok to send the message from the browser via web sockets to the server and then the server to broadcast that message to the rest of active browsers using again web sockets with broadcast method? (maybe I am missing something)
Pseudo code examples (using socket.io):
// client (browser)
socket.emit("message","my great message that will be received by all"
// server (any server can be, but let's just say that it is also written in JavaScript
socket.on("message", function(msg) {
socket.broadcast.emit(data);
});
// the rest of the browsers
socket.on("message", function(msg) {
// display on the screen the message
});

i don't think RabbitMQ should be used for a chat room, personally. at least, not in the "chat" or "room" part of the application.
unless your chat rooms don't care about history at all - and i think most do care about that - a message queue like RMQ doesn't make much sense.
you would be better off storing the message in a database and keeping a marker for each user to say what message they last saw.
now, you may end up needing something like RMQ to facilitate the process of the chat application. you can offload process from the web servers, for example, and push all messages through RMQ to a back-end service that updates the database and cache layers, for example.
this would allow you to scale the front-end web servers much faster, and support more users per web server. and that sounds like a good use of RMQ, but is not specific to chat apps. it's just good practice for scaling web apps / systems.
the key, in my experience, is that RMQ is not responsible for delivery of the messages to the users / chat rooms. that happens through websockets or similar technologies that are designed to be used per user.

Simple answer ...
For a simple chat app you don't need a queue (e.g. signalr would do exactly this without the queue).
Typically though real world applications are not just "a simple chat app", the queue might represent the current state of the room for new users joining perhaps, so the server knows what list of messages to serve up when that happens.
Also it's worth noting that message queues are often implemented when you want reliable messaging (e.g. Service bus) to ensure that all messages definitely get to where they should go even if the first attempt fails. So it's likely that the queue is included in many examples as a default primer in to later problem solving.

I may be late for the answer as the messaging domain changed rapidly in last few years. Applications like WhatsApp do not store messages in their database, and also provide E2E encryption.
Coming to RabbitMQ, they support MQTT protocol which is ideal for low latency high scalability applications. Thus using such queuing services offload the heavy work from your server and provide features like scalability and security.

uhmm I didn't understand exactly for are you looking for...
but In RabbiMQ you always publish a messages to an exchange and consume the message using a queue.
to "broadcast that message" you need to consume it.
hope it helps

Related

Send, Publish and Request/Response in MasstTransit

Recently I am trying to use MassTransit in our microservice ecosystem.
According to MassTransit vocabulary and from documents my understanding is :
Publish: Sends a message to 1 or many subscribers (Pub/Sub Pattern) to propagate the message.
Send: Used to send messages in fire and forget fashion like publish, but instead It is just used for one receiver. The main difference with Publish is that in Send if your destination didn't receive a message, it would return an exception.
Requests: uses request/reply pattern to just send a message and get a response in a different channel to be able to get response value from the receiver.
Now, my question is according to the Microservice concept, to follow the event-driven design, we use Publish to propagate messages(Events) to the entire ecosystem. but what is exactly the usage (use case) of Send here? Just to get an exception if the receiver doesn't exist?
My next question is that is it a good approach to use Publish, Send and Requests in a Microservices ecosystem at the same time? like publish for propagation events, Send for command (fire and forget), and Requests for getting responses from the destination.
----- Update
I also found here which Chris Patterson clear lots of things. It also helps me a lot.
Your question is not related to MassTransit. MassTransit implements well-known messaging patterns thoughtfully described on popular resources such as Enterprise Integration Patterns
As Eben wrote in his answer, the decision of what pattern to use is driven by intent. There are also technical differences in the message delivery mechanics for each pattern.
Send is for commands, you tell some other service to do something. You do not wait for a reply (fire and forget), although you might get a confirmation of the action success or failure by other means (an event, for example).
It is an implementation of the point-to-point channel, where you also can implement competing consumers to scale the processing, but those will be instances of the same service.
With MassTransit using RabbitMQ it's done by publishing messages to the endpoint exchange rather than to the message type exchange, so no other endpoints will get the message even though they can consume it.
Publish is for events. It's a broadcast type of delivery or fan-out. You might be publishing events to which no one is listening, so you don't really know who will be consuming them. You also don't expect any response.
It is an implementation of the publish-subscribe channel.
MassTransit with RabbitMQ creates exchanges for each message type published and publishes messages to those exchanges. Consumers create bindings between their endpoint exchanges and message exchanges, so each consumer service (different apps) will get those in their independent queues.
Request-response can be used for both commands that need to be confirmed, or for queries.
It is an implementation of the request-reply message pattern.
MassTransit has nice diagrams in the docs explaining the mechanics for RabbitMQ.
Those messaging patterns are frequently used in a complex distributed system in different combinations and variations.
The difference between Send and Publish has to do with intent.
As you stated, Send is for commands and Publish is for events. I worked on a large enterprise system once running on webMethods as the integration engine/service bus and only events were used. I can tell you that it was less than ideal. If the distinction had been there between commands and events it would've made a lot more sense to more people. Anyway, technically one needs a message enqueued and on that level it doesn't matter, which is why a queueing mechanism typically would not care about such semantics.
To illustrate this with a silly example: Facebook places and Event on my timeline that one of my friends is having a birthday on a particular day. I can respond directly (send a message) or I could publish a message on my timeline and hope my friend sees it. Another silly example: You send an e-mail to PersonA and CC 4 others asking "Please produce report ABC". PersonA would be expected to produce the report or arrange for it to be done. If that same e-mail went to all five people as the recipient (no CC) then who gets to do it? I know, even for Publish one could have a 1-1 recipient/topic but what if another endpoint subscribed? What would that mean?
So the sender is responsible, still configurable as subscriptions are, to determine where to Send the message to. For my own service bus I use an implementation of an IMessageRouteProvider interface. A practical example in a system I once developed was where e-mails received had to have their body converted to an image for a content store (IBM FileNet P8 if memory serves). For reasons I will not go into the systems were stopped each night at 20h00 and restarted at 6h00 in the morning. This led to a backlog of usually around 8000 e-mails that had to be converted. The conversion endpoint would process a conversion in about 2 seconds but that still takes a while to work through. In the meantime the web front-end folks could request PDF files for conversion to paged TIFF files. Now, these ended up at the end of the queue and they would have to wait hours for that to come back. The solution was to implement another conversion endpoint, with its own queue, and have the web front-end configured to send the same message type, e.g. ConvertDocumentCommand to that "priority" queue for processing. Pretty easy to do. Now, if that had been a publish how would I do that split? The same event going to 2 different endpoints under different circumstances? Well, you could have another subscription store for your system but now you'd need to maintain both. There could be another answer such as coding this logic into the send bit but that is a design choice and would require coding changes.
In my own Shuttle.Esb service bus I only have Send and Publish. For request/response both the sender and receiver have an inbox and a request would be sent (Send) to the receiver and it in turn could reply (also a Send but uses the sender's URI).

Should a websocket connection be general or specific?

Should a websocket connection be general or specific?
e.g. If I was building a stock trading system, I'd likely to have real time stock prices, real time trade information, real time updates to the order book, perhaps real time chat to enable traders to collude and manipulate the market. Should I have one websocket to handle all the above data flow or is it better to have several websocket to handle different topics?
It all depends. Let's look at your options, assuming your stock trader, your chat, and your order book are built as separate servers/micro-services.
One WebSocket for each server
You can have each server running their own WebSocket server, streaming events relevant to that server.
Pros
It is a simple approach. Each server is independent.
Cons
Scales poorly. The number of open TCP connections will come at a price as the number of concurrent users increases. Increased complexity when you need to replicate the servers for redundancy, as all replicas needs to broadcast the same events. You also have to build your own fallback for recovering from client data going stale due to lost WebSocket connection. Need to create event handlers on the client for each type of event. Might have to add version handling to prevent data races if initial data is fetched over HTTP, while events are sent on the separate WebSocket connection.
Publish/Subscribe event streaming
There are many publish/subscribe solutions available, such as Pusher, PubNub or SocketCluster. The idea is often that your servers publish events on a topic/subject to a message queue, which is listened to by WebSocket servers that forwards the events to the connected clients.
Pros
Scales more easily. The server only needs to send one message, while you can add more WebSocket servers as the number of concurrent users increases.
Cons
You most likely still have to handle recovery from events lost during disconnect. Still might require versioning to handle data races. And still need to write handlers for each type of event.
Realtime API gateway
This part is more shameless, as it covers Resgate, an open source project I've been involved in myself. But it also applies to solutions such as Firebase. With the term "realtime API gateway", I mean an API gateway that not only handles HTTP requests, but operates bidirectionally over WebSocket as well.
With web clients, you are seldom interested in events - you are interested in change of state. Events are just means to either describe the changes. By fetching the data through a gateway, it can keep track on which resources the client is currently interested in. It will then keep the client up to date for as long as the data is being used.
Pros
Scales well. Client requires no custom code for event handling, as the system updates the client data for you. Handles recovery from lost connections. No data races. Simple to work with.
Cons
Primarily for client rendered web sites (using React, Vue, Angular, etc), as it works poorly with sites with server-rendered pages. Harder to apply to already existing HTTP API's.

Web Notifications (HTML5) - How it works?

I'm trying to understand whether the HTML5 Web Notifications API can help me out, but I'm falling short in understanding how it works.
I'd like user_a to be able to send user_b a message within my webapp.
I'd like user_b to receive a notification of this.
Can the web notifications API help here? Does it let me specifically target a user (rather than notify everyone the site has been updated_? I can't see how I would create an alert for one person.
Can anyone help me understand a little more?
The notifications API is client side, so it needs to get events from another client-side technology. Here, read THIS: http://nodejs.org/api/. Just kidding. Node.js+socket.io is probably the best way to go here, you can emit events to one or all clients (broadcast). That's a push scenario. Or each user could be pulling their notifications from the server.
HTML5 Web Notifications API gives you ability to display desktop notifications that your application has generated.
What you are trying to achieve is a different thing and web notification is just a part of your scenario.
Depending upon how you are managing your application, for chat and messaging purpose as humbolight mentioned, you should look into node.js. it will provide you the necessary back-end to manage sending and receiving messages between users.
To notify a user that (s)he has received a message, you can opt for ajax polling on client side.
Simply create a javascript that pings the server every x seconds and checks if there is any notification or new message available for this user.
If response is successful, then you can use HTML5 notification API to show a message to user that (s)he has a new message.
The main problem with long polling is server load, and bandwidth usage even when there are no messages, and if number of users are in thousands then you can expect your server always busy responding to poll calls.
An alternate is to use Server Sent Events API, where you send a request to server and then server PUSHES the notifications/messages to the client as soon as they are available.
This reduces the unnecessary client->server polling and seems much better option in your case.
To get started you can check a good tutorial at
HTML5Rocks
What you're looking for is WebSocket. It's the technology that allows a client (browser) to open a persistent connection to the server and receive data from it at the server's whim, rather than having to "poll" the server to see if there's anything new.
Other answers here have already mentioned node.js, but Node is simply one (though arguably the best) option for implementing websockets on your server. You might also be comfortable with Ratchet, which is a websocket server library for PHP, or Tornado which is in Python.
How you handle your real-time communication is up to you. Websockets are merely the underlying technology that you can use to pass data back and forth. The client side of this will be fairly easy, but on the server side, you'll need a mechanism for websocket handlers to get information from each other. Look at tools like ZeroMQ for handling queues, and Memcached or Redis to handle large swaths of data which don't need to be stored permanently.

Factors Affected for Low Performance of middleware Messaging Softwares

I am planning to inegrate messaging middleware in my web application. Right now I am tesing different messaging middleware software like RabbitMQ,JMS, HornetQ, etc..
Examples provided with this softwares are working but its not giving as desired results.
So, I want to know that which are the factors which are responsible to improve peformance that one should keep in eyes?
Which are the areas, a developer should take care of to improve the performance of middleware messaging software?
I'm the project lead for HornetQ but I will try to give you a generic answer that could be applied to any message system you choose.
A common question that I see is people asking why a single producer / single consumer won't give you the expected performance.
When you send a message, and are asking confirmation right away, you need to wait:
The message transfer from client to server
The message being persisted on the disk
The server acknowledging receipt of the message by sending a callback to the client
Similarly when you are receiving a message, you ACK to the server:
The ACK is sent from client to server
The ACK is persisted
The server sends back a callback saying that the callback was achieved
And if you need confirmation for all your message-sends and mesage-acks you need to wait these steps as you have a hardware involved on persisting the disk and sending bits on the network.
Message Systems will try to scale up with many producers and many consumers. That is if many are producing they should all use the resources available at the server shared for all the consumers.
There are ways to speed up a single producer or single consumer:
One is by using transactions. So, you minimize the blocks and syncs you perform on disk while persisting at the server and roundtrips on the network. (This is actually the same on any database)
Another one, is by using Callbacks instead of blocking at the consumer. (JMS 2 is proposing a Callback similar to the ConfirmationHandler on HornetQ).
Also: most providers I know will have a performance section on their docs with requirements and suggestions for that specific product. You should look individually at each product

Progress notifications from HTTP/REST service

I'm working on a web application that submits tasks to a master/worker system that farms out the tasks to any of a series of worker instances. The work queue master runs as a separate process (on a separate machine altogether) and tasks are submitted to the master via HTTP/REST requests. Once tasks are submitted to the work queue, client applications can submit another HTTP request to get status information about tasks.
For my web application, I'd like it to provide some sort of progress bar view that gives the user some indication of how far along task processing has come. The obvious way to implement this would be an AJAX progress meter widget that periodically polls the work queue for status on the tasks that have been submitted. My question is, is there a better way to accomplish this without the frequent polling?
I've considered having the client web application open up a server socket on which it could listen for notifications from the work master. Another similar thought I've had is to use XMPP or a similar protocol for the status notifications. (Of course, the master/worker system would need to be updated to provide notifications either way but I own the code for that so can make any necessary updates myself.)
Any thoughts on the best way to set up a notification system like this? Is the extra effort involved worth it, or is the simple polling solution the way to go?
Polling
The client keeps polling the server to get the status of the response.
Pros
Being really RESTful means cacheable and scaleable.
Cons
Not the best responsiveness if you do not want to poll your server too much.
Persistent connection
The server does not close its HTTP connection with the client until the response is complete. The server can send intermediate status through this connection using HTTP multiparts.
Comet is the most famous framework to implement this behaviour.
Pros
Best responsiveness, almost real-time notifications from the server.
Cons
Connection limit is limited on a web server, keeping a connection open for too long might, at best load your server, at worst open the server to Denial of Service attacks.
Client as a server
Make the server post status updates and the response to the client as if it were another RESTful application.
Pros
Best of every worlds, no resources are wasted waiting for the response, either on the server or on the client side.
Cons
You need a full HTTP server and web application stack on the client
Firewalls and routers with their default "no incoming connections at all" will get in the way.
Feel free to edit to add your thoughts or a new method!
I guess it depends on a few factors
How accurate the feedback can be (1 percent, 5 percent, 50 percent) Accurate feedback makes it worth pursuing some kind of progress bar and comet style push. If you can only say "Busy... hold on... almost there... done" then a simple ajax "are we there yet" poll is certainly easier to code.
How timely the Done message has to be seen by the client
How long each task takes (1 second, 10 seconds, 10 minutes)
1 second makes it a bit moot. 10 seconds makes it worth it. 10 minutes means you're better off suggesting the user goes for a coffee break :-)
How many concurrent requests there will be
Unless you've got a "special" server, live push style systems tend to eat connections and you'll be maxed out pretty quickly. Having to throw more webservers in for a fancy progress bar might hurt the budget.
I've got some sample code on 871184 that shows a hand rolled "forever frame" which seems to work out well. The project I developed that for isn't hammered all that hard though, the operations take a few seconds and we can give pretty accurate percent. The code uses asp.net and jquery, but the general techniques will work with any server and javascript framework.
edit As John points out, status reporting probably isn't the job of the RESTful service. But there's nothing that says you can't open an iframe on the client that hooks to a page on the server that polls the service. Theory says the server and the service will at least be closer to one another :-)
Look into Comet. You make a single request to the server and the server blocks and holds the connection open until an update in status occurs. Once that happens the response is sent and committed. The browser receives this response, handles it and immediately re-requests the same URL. The effect is that of events being pushed to the browser. There are pros and cons and it might not be appropriate for all use cases but would provide the most timely status updates.
My opinion is to stick with the polling solution, but you might be interested in this Wikipedia article on HTTP Push technologies.
REST depends on HTTP, which is a request/response protocol. I don't think you're going to get a pure HTTP server calling the client back with status.
Besides, status reporting isn't the job of the service. It's up to the client to decide when, or if, it wants status reported.
One approach I have used is:
When the job is posted to the server, the server responds back a pubnub-channel id (one could alternatively use Google's PUB-SUB kind of service).
The client on browser subscribes to that channel and starts listening for messages.
The worker/task server publishes status on that pubnub channel to update the progress.
On receiving messages on the subscribed pubnub-channel, the client updates the web UI.
You could also use self-refreshing iframe, but AJAX call is much better. I don't think there is any other way.
PS: If you would open a socket from client, that wouldn't change much - PHP browser would show the page as still "loading", which is not very user-friendly. (assuming you would push or flush buffer to have other things displayed before)

Resources