I would like to know whether ZeroMQ already solves the following problem, or whether the application sitting on top of ZeroMQ needs to take care of it.
1) A central publisher publishes data to all subscribers. The data is mostly static, something like configuration, but it can be modified at any point in time.
2) Multiple subscribers subscribe to messages from this publisher. A subscriber can join at any point in time.
3) If the data changes, the publisher should publish only the diff to the existing subscribers.
4) If a subscriber joins later, the publisher should publish all the data (current configuration) to the new subscriber.
The ZeroMQ guide suggests the following for dealing with slow subscribers, but it does not solve the problem above:
http://zguide.zeromq.org/page:all#Slow-Subscriber-Detection-Suicidal-Snail-Pattern
The Clone pattern from the Guide does precisely what you want.
The problem I'm seeing with your setup is that it requires all the subscribers to have the same state. If all subscribers are at version 7 and you publish the 7-to-8 diff, then they all update to version 8. But this requires tightly coupled state synchronization between the nodes. How would you handle the case where subscribers get out of sync?
Consider this alternative setup:
the "publisher" has a single ROUTER socket that it binds
each "subscriber" has a single DEALER socket that connects to the ROUTER
a REQ socket can't be used here because its strict request/reply cycle would prohibit the publisher from pushing unsolicited "update-hints" (details to follow)
when a subscriber i joins the network, it sends an "update" request to the publisher, so that the publisher learns the subscriber's identity and its current version, version[i]
the publisher responds with the diffs necessary to bring subscriber i up to date
if data changes on the publisher (i.e., a new version) it sends an "update-hint" to all of the known subscribers
when a subscriber receives an "update-hint," it performs an "update" request
(optional) subscribers periodically send an "update" request (infrequent polling)
This approach has the following benefits:
the publisher is the server; the subscribers are clients
the publisher never initiates the sending of any actual data - it only responds to requests from clients (that is, the "update-hints" don't count as sending actual data)
the subscribers are all independently keeping themselves up to date (eventual consistency) even though they may be out of sync intermittently
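A minimal sketch of the publisher side of this protocol, assuming pyzmq, a JSON wire format, and integer versions; the port, field names, and helper names below are illustrative, not part of any existing API, and a real server would poll the ROUTER socket and the data source together rather than blocking in a loop:

import json
import zmq

ctx = zmq.Context()
router = ctx.socket(zmq.ROUTER)
router.bind("tcp://*:5570")

current_version = 0
diffs = {}      # version -> diff that upgrades (version - 1) to version
known = set()   # identities of subscribers that have contacted us

def on_new_version(diff):
    # a configuration change happened: store the diff and hint every known subscriber
    global current_version
    current_version += 1
    diffs[current_version] = diff
    hint = json.dumps({"type": "update-hint", "version": current_version}).encode()
    for identity in known:
        router.send_multipart([identity, hint])

while True:
    # a DEALER request arrives as [identity, payload]
    identity, raw = router.recv_multipart()
    request = json.loads(raw)            # e.g. {"type": "update", "version": 7}
    known.add(identity)
    missing = [diffs[v] for v in range(request["version"] + 1, current_version + 1)]
    reply = {"type": "diffs", "to_version": current_version, "diffs": missing}
    router.send_multipart([identity, json.dumps(reply).encode()])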
I am implementing a REST API that internally places a message on a message queue and receives a message as a response on a different topic.
How could the API implementation handle publishing and consuming different messages and respond to the client?
What if it never receives a message?
How does the service handle this time-out scenario?
Example
I am implementing a REST API to process an order. The implementation internally publishes a series of messages to verify the payment, update inventory, and prepare shipping info. Finally, it sends the response back to the client.
Queues are too low-level an abstraction to implement your requirements directly. Look at an orchestration solution like temporal.io, which makes programming such async systems trivial.
Disclaimer: I'm one of the founders of the Temporal open source project.
How could the API implementation handle publishing and consuming different messages and respond to the client?
Even though messaging systems can be used in an RPC-like fashion:
there is a request topic/queue and a reply topic/queue
with a request identifier in the messages' header/metadata
this type of communication defeats the promise of the messaging system: decoupling components in time and space.
Back to your example. If ServiceA receives the request, it publishes a message to topicA and returns a 202 Accepted status code to indicate that the request has been received but not yet fully processed. In the response you can include a URL at which the consumer of ServiceA's API can retrieve the latest status of its previously issued request.
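As a rough illustration of that flow, here is a hedged sketch using Flask; the routes, payload fields, and the publish call are assumptions for illustration, not an existing API:

from uuid import uuid4
from flask import Flask, jsonify, request, url_for

app = Flask(__name__)

@app.route("/orders", methods=["POST"])
def create_order():
    order_id = str(uuid4())
    payload = request.get_json()
    # publish_order_requested(order_id, payload)   # hypothetical publish to topicA
    # 202 Accepted plus a Location header the client can poll for the current status
    body = {"orderId": order_id, "status": "REQUESTED"}
    return jsonify(body), 202, {"Location": url_for("order_status", order_id=order_id)}

@app.route("/orders/<order_id>", methods=["GET"])
def order_status(order_id):
    # look up the latest known status of the previously issued request (hypothetical store)
    return jsonify({"orderId": order_id, "status": "PAYMENT_VERIFIED"})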
What if it never receives a message?
In that case the request-related data remains in the same state it was in at the time the message was published.
How does the service handle this time-out scenario?
You can create scheduled jobs to clean up requests that never finished or got stuck. Based on your business requirements you can simply delete them or hand them over to manual processing by customer service.
Order placement use case
Rather than creating a customer-facing service which waits for all the processing to be done, you can define several statuses/stages of the process:
Order requested
Payment verified
Items locked in inventory
...
Order placed
You can inform your customers about these status/stage changes via websocket, push notification, e-mail, etc. The orchestration of this order placement flow can be achieved, for example, via the Saga pattern.
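For illustration only, a small sketch of how those stages and notifications might be modeled; the stage names and the notify callback are assumptions, not part of any framework:

from enum import Enum, auto

class OrderStage(Enum):
    ORDER_REQUESTED = auto()
    PAYMENT_VERIFIED = auto()
    ITEMS_LOCKED = auto()
    ORDER_PLACED = auto()

def advance(order_id, stage, notify):
    # a real implementation would persist the new stage first;
    # here we only inform the customer (websocket, push, e-mail, ...)
    notify(order_id, f"Your order is now: {stage.name}")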
I am facing an issue when decoupling two systems with an event/message broker like Apache Kafka. The issue is related to a frontend triggering actions in a backend:
How does the producer (frontend service) know that the published event has been properly handled by all of the backend services (the consumers), if the publisher knows neither the "identities" nor the number of consuming backends?
To be precise: users can, for example, change their email address using a frontend UI. An associated service publishes that "change request" event to an appropriate topic within Kafka. The UI form is then "locked" to prevent subsequent change requests until the change event has been fully processed by every consumer. But it is unclear how to detect this state.
You can use another topic to publish handled jobs. So your front-end publishes to one topic and your back-end publishes to another once it is done.
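A hedged sketch of that round trip, assuming the kafka-python client; the topic names, broker address, and requestId field are made up for illustration:

import json
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(bootstrap_servers="localhost:9092",
                         value_serializer=lambda v: json.dumps(v).encode())

# frontend: publish the change request, tagged with a correlation id
producer.send("email-change-requests", {"requestId": "42", "newEmail": "a@b.example"})
producer.flush()

# frontend: watch the "handled" topic and unlock the form when its own request shows up
consumer = KafkaConsumer("email-change-handled",
                         bootstrap_servers="localhost:9092",
                         value_deserializer=lambda v: json.loads(v.decode()))
for record in consumer:
    if record.value["requestId"] == "42":
        break   # the backend has reported the request as fully handled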
In Kafka terms, neither the producer nor consumer are considered backend - they're both clients connecting to a broker, which is generally considered to be the backend.
A producer will know that it has produced a message successfully, by virtue of the acks setting. A consumer will read a message, and then at a later point, its offset will be updated to a point corresponding to the last message it read. However, there is generally no interaction between a producer and a consumer, and they are generally completely unaware of one another.
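A small, hedged illustration of those two mechanisms with kafka-python; the topic and group names are assumptions:

from kafka import KafkaProducer, KafkaConsumer

# acks="all": the send is only confirmed once all in-sync replicas have the message
producer = KafkaProducer(bootstrap_servers="localhost:9092", acks="all")
producer.send("some-topic", b"payload").get(timeout=10)   # raises if not acknowledged

# the consumer tracks its own position; committing records how far it has read
consumer = KafkaConsumer("some-topic",
                         bootstrap_servers="localhost:9092",
                         group_id="example-group",
                         enable_auto_commit=False)
for record in consumer:
    consumer.commit()   # offset now points past the last message read
    break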
Most of the articles on the web dealing with WebSockets are about in-memory chat.
I'm interested in a less instant kind of chat, one that is persistent, like the comments on a blog post.
I have a cluster of two servers handling client requests.
I wonder what the best strategy would be for pushing database updates to the corresponding clients.
As I'm using Heroku to run this cluster (of 2 web dynos), I have of course read this tutorial, which aims to build a chat room shared between all clients.
It uses Redis to centralize incoming messages; each server listens for new messages and propagates them to web clients through websocket connections.
My use case differs in that I have a Neo4j database into which every message written by any client is persisted.
My goal is to notify each client from a specific room that a new message/comment has just been persisted by a client.
With an architecture similar to the tutorial linked above, how could I propagate only the new messages to a user? Is there an easy and efficient way to tell Redis:
"(WebSocket saying) When my client initiates the websocket connection, I take care of querying all the persisted messages and sending them to the client; I then want you (Redis) to feed me all the NEW messages that I haven't yet sent to the client, so that I can deliver them as well."
How can I prevent Redis from publishing the whole conversation each time a websocket connection is made? That would lead to duplicates, since the database query has already provided the existing content at that moment.
This is actually a pretty common scenario, where you have three components:
A cluster of stateless web servers that maintain open connections with all clients (load balanced across the cluster, obviously)
A persistent main data storage - Neo4j in your case
A messaging/queueing backend for broadcasting messages across channels (thus across the server cluster) - Redis
Your requirement is for new clients to receive an initial feed of the recent messages, and any subsequent messages in real time. All of this is implemented in your connection handlers.
Essentially, this is what your (pseudo-)code should look like:
class ConnectionHandler:
    def on_init(self):
        # called when a client opens its websocket connection
        self.redis = redis.get_connection()
        self.send("hello, here are all the recent messages")
        recent_msgs = fetch_msgs_from_neo4j()   # initial feed from the database
        self.send(recent_msgs)
        # from this point on, only messages published after the initial feed arrive
        self.redis.add_listener(self.on_msg)
        self.send("now listening for new messages")

    def on_msg(self, msg):
        # invoked by the Redis listener for each newly broadcast message
        self.send("new message: ")
        self.send(msg)
The exact implementation really depends on your environment, but this is the general flow of things.
I'm just starting to understand and experiment with ZeroMQ.
It's not clear to me how I could have two-way communication between more than two actors (publishers and subscribers), so that each component is able both to read from and write to the MQ.
This would allow me to create an event-driven architecture, because each component could listen for an event and reply with another event.
Is there a way to do this with ZeroMQ directly, or should I implement my own solution on top of it?
If you want simple two-way communication between two nodes, you can simply set up a publishing socket on each node and let each connect to the other.
In a many-to-many setup this quickly becomes tricky to handle. Basically, it sounds like you want some kind of central node that all nodes can "connect" to, receive messages from, and, if some conditions on the subscriber are met, send messages to.
Since ZeroMQ is a simple "power socket" and not a message queue (hence its name: ZeroMQ, Zero Message Queue), this is not feasible out of the box.
A simple alternative could be to let each node set up a UDP broadcast socket (not using ZeroMQ, just regular sockets). All nodes can listen in on whatever takes place and "publish" their own messages back onto the socket, effectively sending them to any nodes listening. This setup works on a LAN and in settings where it is OK for messages to get lost (like periodic state updates). If the messages need to be reliable (and possibly durable) you need a more advanced, full-blown message queue.
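A bare-bones sketch of that broadcast idea with plain Python sockets, just to show the shape of it; the port number is arbitrary:

import socket

# sender: broadcast a (possibly lossy) state update to everyone on the LAN
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
sender.sendto(b"state-update", ("255.255.255.255", 9999))

# receiver: every node listens on the same port and sees all broadcasts
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("", 9999))
data, addr = receiver.recvfrom(4096)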
If you can do without durable message queues, you can create a solution based on a central node, a central message handler, to which all nodes can subscribe and send data. Basically, create a "server" with one REP (reply) socket (for incoming data) and one PUB (publisher) socket (for outgoing data). Each client then publishes data to the server's REP socket over a REQ (request) socket and sets up a SUB (subscriber) socket connected to the server's PUB socket.
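A minimal pyzmq sketch of that central node and one client; the ports and payloads are illustrative, the two parts would run as separate programs, and each real process would create its own zmq.Context():

import zmq

ctx = zmq.Context()

# --- server process ---
rep = ctx.socket(zmq.REP)    # incoming data from clients
rep.bind("tcp://*:5559")
pub = ctx.socket(zmq.PUB)    # outgoing broadcast to all subscribers
pub.bind("tcp://*:5560")
while True:
    msg = rep.recv()
    rep.send(b"ACK")         # REQ/REP requires a reply before the next request
    pub.send(msg)            # fan the message out to every subscriber

# --- client process ---
req = ctx.socket(zmq.REQ)
req.connect("tcp://localhost:5559")
sub = ctx.socket(zmq.SUB)
sub.connect("tcp://localhost:5560")
sub.setsockopt(zmq.SUBSCRIBE, b"")   # subscribe to everything
req.send(b"temp|46.2"); req.recv()   # publish via the server...
print(sub.recv())                    # ...and receive whatever anyone published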
Check out the ZeroMQ guide regarding the various messaging patterns available.
To spice it up a bit, you could add event "topics", including server-side filtering, by splitting up the outgoing messages (on the server's PUB socket) into two message parts (see multi-part messages), where the first part specifies the "topic" and the second part contains the payload (e.g. temp|46.2, speed|134). This way, each client can register its interest in any topic (or all of them) and let the server deliver only the matching messages. See this example for details.
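Continuing the sketch above, topic filtering with multi-part messages might look like this; the topic names are made up:

# server side: first frame is the topic, second frame the payload
pub.send_multipart([b"temp", b"46.2"])
pub.send_multipart([b"speed", b"134"])

# client side: prefix subscription; only matching messages are delivered
sub.setsockopt(zmq.SUBSCRIBE, b"temp")
topic, payload = sub.recv_multipart()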
Basically, ZeroMQ is "just" an abstraction over regular sockets that provides a set of messaging patterns to build your solution on top of. It relieves you of a lot of tedious work and provides exceptional scalability and performance, but it takes some getting used to. Check out the ZeroMQ Guide for more details.
I'm writing a server/client game, and a typical scenario looks like this: one client (clientA) sends a message to the server, where a MessageDrivenBean handles such messages. After the MDB has finished its job, it sends the result message back to another client (clientB).
In my opinion I only need two queues for such communication, one for input and the other for output. Creating a new queue for each connection is not a good idea, right?
The input queue is relatively clear: if several clients send messages at the same time, the messages simply wait in the queue, and since there are multiple MDB instances in the server, that should not be a big performance issue.
But I am not so clear about the output queue. Should I use a topic instead of a queue? With a queue, every client listens on the output queue; one of them gets the new message and checks a property to determine whether the message is addressed to it. If not, it rolls back the transaction, and the message goes back to the queue, ready for another client... That should work, but it must be very slow. If I use a topic instead, every client gets a copy of the message and simply ignores it if it is not the addressee. That should be better, right?
I'm new to messaging systems. Are there any suggestions about my implementation? Thanks!
To begin with, choosing JMS as a gaming platform is, well, unusual: businesses use JMS brokers for delivery reliability and transaction support. Do you really need this heavy lifting in a game? Shouldn't you use your own HTTP-based protocol instead, for example?
That said, two queues are a standard pattern for point-to-point communication. Creating a queue for each new connection is definitely not OK: message-driven beans are attached to queues at deployment time, so you would not be able to respond to queue-creation events. Besides, queues are not meant to be created and destroyed in short cycles; they are designed to be long-lived entities. If you need to deliver a message to one specific client, have the client listen on the server's response queue with a message selector set to filter only the messages intended for that client (see the javax.jms.Message API).
With topics it's exactly as you noted — each connected client will get a copy of the message — so again, it's not a good pattern to send to n clients a message that has to be discarded by n-1 clients.
MaDa;
You could stick with one output queue (or topic) and simply tag each message with a header that identifies the intended client. Clients can then listen on the queue/topic using a selector. Hopefully your JMS implementation has efficient server-side selector evaluation.