Can AMQP clients be both a publisher and subscriber? - amqp

I'm just starting to research AMQP and I'm wondering if I'd be using it for something it's not designed for. Here's something like what I want to do:
ClientA goes about its business and publishes its state to some exchange (correct me if I use the wrong terms anywhere).
ClientB connects to the same broker and says "What publishers are publishing here? I choose you, ClientA. What is going on?"
ClientA says "My foo is bar and my baz is true"
ClientB says "OK. Set your baz to false"
Edit for a less abstract example:
ClientA talks/listens to a hardware device, say a video projector. When ClientB comes online, it wants to find any projector clients (like ClientA) that are connected, then learn the status of the projectors (is the lamp on?) and, if it needs to, change that status (turn the lamp off). So ClientA keeps some state (lamp is off), can send it out when requested, and can also respond to commands from the exchange, converting them and passing them to the projector (turn lamp on).

I'm finding it hard to follow your example, but it sounds like you want these A and B types to have back-and-forth conversations with each other. Is that correct?
AMQP is better suited for asynchronous message passing, and to add the kind of point-to-point style you're describing requires that you set up request and reply queues so that clients can both send and receive messages. It's certainly possible to have clients both publish and consume messages.

This is possible, and it would make sense if the different actors in your example are networked devices, because AMQP would provide a loosely coupled way of messaging.
One thing to watch out for is the last abstract line where client B says "OK, set some attribute". That sounds suspiciously like a scenario where subroutine calls return some value and then the next step takes place. AMQP can certainly simulate that kind of RPC, but it works better when processes can send a message and don't have to wait for completion.
If most of your messaging doesn't involve waiting for turnaround replies, then AMQP sounds like a fit for what you are doing. But if most of your needs are RPC, then it may not be the best choice.
AMQP really shines when there are future possibilities, for instance in your scenario, if you needed to add a couple thousand projectors, 10,000 client Bs, and several other device types that also need to exchange status. The loose coupling of AMQP makes it easy to add other applications to the broker, just by declaring new exchanges.
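To make the projector scenario concrete, here is a minimal sketch of a client that both consumes and publishes. It does not use a real AMQP library: `Broker` is a hypothetical in-memory stand-in for an exchange with bindings, and the routing keys `projector.command` and `projector.status`, the message fields, and the class names are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class Broker:
    """In-memory stand-in for an exchange: routing key -> bound inboxes."""

    def __init__(self):
        self.bindings = defaultdict(list)

    def bind(self, routing_key, inbox):
        self.bindings[routing_key].append(inbox)

    def publish(self, routing_key, message):
        # Deliver a copy of the message to every inbox bound to this key.
        for inbox in self.bindings[routing_key]:
            inbox.append(dict(message))


class ProjectorClient:
    """ClientA: consumes commands and publishes its own state."""

    def __init__(self, broker, name):
        self.broker, self.name = broker, name
        self.state = {"lamp_on": True}
        self.inbox = deque()
        broker.bind("projector.command", self.inbox)

    def pump(self):
        while self.inbox:
            msg = self.inbox.popleft()
            if msg["type"] == "set":
                if msg["target"] != self.name:
                    continue  # command addressed to another projector
                self.state[msg["key"]] = msg["value"]
            # A query, or a set that was applied, is answered with a status report.
            self.broker.publish("projector.status", {"name": self.name, **self.state})


class ControllerClient:
    """ClientB: publishes commands and consumes status reports."""

    def __init__(self, broker):
        self.broker = broker
        self.inbox = deque()
        self.known = {}
        broker.bind("projector.status", self.inbox)

    def discover(self):
        self.broker.publish("projector.command", {"type": "query"})

    def set(self, target, key, value):
        self.broker.publish(
            "projector.command",
            {"type": "set", "target": target, "key": key, "value": value},
        )

    def pump(self):
        while self.inbox:
            msg = self.inbox.popleft()
            self.known[msg["name"]] = {k: v for k, v in msg.items() if k != "name"}


broker = Broker()
projector = ProjectorClient(broker, "projector-1")
controller = ControllerClient(broker)

controller.discover(); projector.pump(); controller.pump()
first_seen = dict(controller.known)      # what ClientB learned from the query
controller.set("projector-1", "lamp_on", False)
projector.pump(); controller.pump()      # ClientA applies it and reports back
```

Neither side waits on the other: each just drains its inbox when it gets around to it, which is the asynchronous style the answers above recommend. With a real broker the `pump()` calls would be replaced by the client library's consume callbacks.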

Related

How does a microservice return data to the caller when using a message broker? or a message queue?

I am pretty new to microservices, and I am trying to figure out how to set up a microservice architecture in which my publisher, which emits an event, can receive a response with data from the consumer.
From what I have read about message brokers and message queues, it seems like it's one-way communication. The producer emits an event (or rather, sends a message) which is handled by the message broker, and then the consumer consumes that event and performs some action.
This allows for decoupled code, which is part of what I'm looking for, but I don't understand whether the consumer is able to return any data to the caller.
Say for example I have a microservice that communicates with an external API to fetch data. I want to be able to send a message or emit an event from my front-facing server, which then calls the service that fetches data, parses the data, and then returns that data back to my server1 (front-facing server).
Is there a way to make message brokers or queues bidirectional? Or are they only usable in one direction? I keep reading that message brokers allow services to communicate with each other, but I only find examples in which data flows one way.
Even reading the RabbitMQ documentation hasn't really made it very clear to me how I could do this.
In general, when talking about messaging, it's one-way.
When you send a letter to someone you're not opening up a mind-meld so that they telepathically communicate their response to you.
Instead, you include a return address (or some other means of contacting you).
So to map a request-response interaction when communicating with explicit messaging (e.g. via a message queue), the solution is the same: you include some directions which the recipient can/will interpret as "send a response here". That could, for instance be, "publish a message on this queue with this correlation ID".
Your publisher then, after sending this message, subscribes to the queue it's designated and waits for a message with the expected correlation ID.
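The return-address idea can be sketched without a broker. In the sketch below, Python's standard `queue.Queue` stands in for broker queues; the queue name `frontend.replies`, the message fields `reply_to` and `correlation_id`, and the upper-casing "work" are assumptions chosen for illustration, not any particular broker's API.

```python
import queue
import threading
import uuid

request_queue = queue.Queue()                        # the service's request queue
reply_queues = {"frontend.replies": queue.Queue()}   # reply queues, looked up by name


def fetch_service():
    """Consumer: handles each request and replies to the address it names."""
    while True:
        msg = request_queue.get()
        if msg is None:  # shutdown sentinel
            break
        reply = {
            "correlation_id": msg["correlation_id"],
            "body": msg["body"].upper(),  # placeholder for the real work
        }
        reply_queues[msg["reply_to"]].put(reply)


def call(body, timeout=2.0):
    """Publisher side: send a request, then wait for the matching reply."""
    correlation_id = str(uuid.uuid4())
    request_queue.put(
        {"reply_to": "frontend.replies", "correlation_id": correlation_id, "body": body}
    )
    while True:
        # Raises queue.Empty if nothing arrives within the timeout.
        reply = reply_queues["frontend.replies"].get(timeout=timeout)
        if reply["correlation_id"] == correlation_id:
            return reply["body"]
        # Otherwise it is a stale reply to some earlier request; ignore it.


worker = threading.Thread(target=fetch_service, daemon=True)
worker.start()
result = call("weather for oslo")
request_queue.put(None)
worker.join()
```

The consumer never needs to know who the caller is; it only follows the directions in the message. The timeout matters: the caller has to decide what to do when no reply ever comes.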
Needless to say, this is fairly elaborate: you are, in some sense, reimplementing a decent portion of a session protocol like TCP on top of a datagram protocol like IP (albeit in this case, we may have some stronger reliability guarantees than we'd get from IP). It's worth noting that this sort of request-response interaction intrinsically couples the two parties (we can't really say "sender and receiver": each is the other's audience), so we're basically putting in some effort to decouple the two sides and then some more effort to recouple them.
With that in mind, if the actual business use case calls for a request-response interaction like this, consider implementing it with an actual request-response protocol (e.g. REST over HTTP or gRPC...) and accept that you have this coupling.
Alternatively, if you really want to pursue loose coupling, go for broke and embrace the asynchronicity at the heart of the universe (maybe that way lies true enlightenment?). Have your publisher return success with that correlation ID as soon as it has sent its message. Meanwhile, have a different service track the state of those correlation IDs and expose a query interface (CQRS, hooray!). Your client can then check at any time whether the thing it wanted succeeded, even if its connection to your publisher gets interrupted.
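A rough sketch of that fully asynchronous variant, assuming a plain list as the queue and a dict as the tracking service's storage. The function names `submit`, `process_next` and `query` are hypothetical, and for brevity the publisher itself records the initial "pending" state rather than a separate service doing so.

```python
import uuid

work_queue = []     # stand-in for the broker queue
status_store = {}   # stand-in for the storage behind the query interface


def submit(payload):
    """Publisher: enqueue the work and return at once with a tracking ID."""
    correlation_id = str(uuid.uuid4())
    status_store[correlation_id] = {"state": "pending", "result": None}
    work_queue.append({"correlation_id": correlation_id, "payload": payload})
    return correlation_id


def process_next():
    """Consumer: do the work and record the outcome under the same ID."""
    msg = work_queue.pop(0)
    status_store[msg["correlation_id"]] = {
        "state": "done",
        "result": len(msg["payload"]),  # placeholder for the real work
    }


def query(correlation_id):
    """Query interface the client can poll at any time."""
    return status_store.get(correlation_id, {"state": "unknown", "result": None})


ticket = submit("fetch external data")
before = query(ticket)   # the work has not run yet
process_next()
after = query(ticket)    # the outcome is now visible under the same ID
```

The caller holds nothing but the ticket, so it can disconnect, reconnect, or ask from a different process and still find out what happened.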
Queues are the wrong level of abstraction for request-reply. You can build an application out of them, but it would be nontrivial to support and operate.
The solution is to use an orchestration system like temporal.io or AWS Step Functions. These services out of the box provide state management, asynchronous communication, and automatic recovery in case of various types of failures.

How to get data into a ZMQ_PUB service?

Can a publisher service receive data from an external source and send it to the subscribers?
In the wuserver.cpp example, the data are generated from the same script.
Can I write a ZMQ_PUBLISHER entity, which receives data from external data source / application ... ?
In this affirmation:
There is one more important thing to know about PUB-SUB sockets: you do not know precisely when a subscriber starts to get messages. Even if you start a subscriber, wait a while, and then start the publisher, the subscriber will always miss the first messages that the publisher sends. This is because as the subscriber connects to the publisher (something that takes a small but non-zero time), the publisher may already be sending messages out.
Does this mean, that a PUB-SUB ZeroMQ pattern is performed to a best effort - UDP style?
Q1: Can I write a ZMQ_PUBLISHER entity, which receives data from external data source/application?
A1: Oh sure, this is why ZeroMQ helps us so much in designing smart distributed-systems. Just imagine the PUB-side process also having other { .bind() | .connect() }-calls, so as to establish links to the data-feeder(s), and you have the scheme you wished for. In distributed-systems this gives you a new freedom to smartly integrate heterogeneous systems so they talk to each other in a very efficient way.
Q2:Does this mean, that a PUB-SUB ZeroMQ pattern is performed to a best effort - UDP style?
A2: No, it has another meaning. Newly declared subscriber entities start, at some uncertain moment, to negotiate their respective subscription-topic filtering, and such a ( distributed ) process takes some a-priori unknown time. Until the new / changed topic-filter policy has been established, nothing goes into the SUB-side egress interface to meet a .recv()-call, so no one can indeed tell when that will happen, can he?
On a higher level, there is another well known dichotomy of ZeroMQ -- Zero-Warranty Principle -- expect to either get delivered a complete message or none at all, which prevents the framework users from a need to handle any kind of damaged / inconsistent message-payloads. Either OK, or None. That's a great warranty. The more for distributed-systems.

How can I limit total concurrent subscriber connections to a ZeroMQ publisher endpoint?

When building a pub-sub service using ZeroMQ on a Linux system, is there any way to enforce concurrent subscriber limits?
For example, I might want to create a ZeroMQ publisher service on a resource-limited system, and want to prevent overloading the system by setting a limit of, say, 100 concurrent connections to the tcp publisher endpoint. After that limit is reached, all subsequent connection attempts from ZeroMQ subscribers would fail.
I understand ZeroMQ doesn't provide notifications about connect/disconnect, but I've been looking for socket options that might allow such limits -- so far, no luck.
Or is this something that should be handled at some other level, perhaps within the protocol?
Yes, ZeroMQ is a Can-Do messaging framework:
Besides the trivial Formal Communication Pattern Framework elements ( the library primitives ), the strongest powers behind the ZeroMQ is the ability to develop one's own messaging system(s).
In your case, it is enough to enrich the scene with a few additional things ... a SUB-process -> PUB-process message-flow-channel, so as to allow the PUB-side process to count the number of SUB-process instances concurrently connected and to allow for a disconnect ( a step delegated rather "back" to a SUB-process side suicide move, as the classical PUB-process, intentionally, has no instrumentation to manage subscriptions ) once a limit is dynamically reached.
Plus add some dynamics for the inter-node signalling to start re-counting and/or to equip the SUB-process side(s) with a self-advertising mechanism to push-keepAliveSIG-s to the PUB-side and expect this signalling to be a weak and informative-only indication as there are many real-world collisions, where decentralised node simply fail to deliver a "guaranteed-delivery" message(s) and a well designed, distributed, low-latency, high-performance system has to cope well with this reality and have the self-healing state-recovery policies designed and in-built into own behaviour.
( Fig. courtesy imatix/ZeroMQ )
The ZeroMQ library can be thought of as a very powerful LEGO-tool-box for designing cool distributed systems, than a ready-made / batteries-included, stiff, quasi-solution-for-just-a-few-academic-cases ( well, it might be considered such, but just for some no-brainer's life, while our lives are much more colourful & teasing, aren't they ? )
So, "How to?"
Worth, definitely worth a few days to read the both of Pieter Hintjens' books & a few weeks for shifting one's mind to start designing with the ZeroMQ full-powers on one's side.
With just a few Python add-on habits ( a zmq.Context() early-setup, and not forgetting a finally: aContext.term() )
There's no way that I'm aware of to configure ZMQ to limit connections automatically... however, you have other options to accomplish what you're looking for. Perhaps the "traditional" way to accomplish this is with a second set of "network communication" sockets... perhaps REQ/REP from subscriber to publisher, asking for permission to connect.
You also have the option, depending on your version of ZMQ (and I've never used it and I can't find it in 5 minutes of searching, so I don't know how recent your version must be) to use XPUB/XSUB sockets, which can accomplish bi-directional communication. You can connect with XSUB, send a subscribe request, then receive a positive or negative response (you might have to play with your subscriber topics to communicate directly with just the single subscriber, I'm not sure), and react accordingly.
Either way, you'll be allowing a connection of some sort between the two systems and then either allowing it or terminating it depending on the situation. This could be less than completely ideal since you'll have to carve out a little overhead to handle connections that you'll be refusing... let's say you're saturated at 100 clients and all of a sudden get 100 new subscribe requests... you may or may not be able to cope with that sort of burst traffic.
You can test out the overhead in alternative communication mediums... like you could publish a webservice that indicates subscriber status that a client could check first, but that may not be any better to have clients connecting that way.
If you're absolutely at the limit of your resources, you'll have to set up a second server to handle subscriber status:
Server 1 is your publisher. You could set it up with a PUB socket and a REP socket.
Server 2 is your status server. It has a REQ socket. Have it subscribe to something like "system-status" or some such thing as that. It will also have your mechanism for communicating with new subscribers, be that a ZMQ socket or a web service or whatever else.
A client will request status from your status server. The status server will send a request to your publisher, which will increment its subscriber count and reply with success, or keep its subscriber count and reply with failure. This success or failure will be communicated back to the subscriber, which will use that information to connect or not.
Disconnections will have to be communicated in a similar way... and you'll have to use some sort of heartbeating round-robin to confirm clients weren't a victim of catastrophic failure.
This will allow your publisher to make intelligent choices about whether it has resources or not. If you just want to set a static number, you don't even need the connection between the status server and the publisher, you can just keep count on the status server... but just to ensure the overall health of the network then it's probably best not to go that simplistic route.
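The bookkeeping behind that permission step might look something like the sketch below. It covers only the counting and heartbeat logic, not the ZMQ sockets that would carry the requests; `SubscriberRegistry` and its method names are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
class SubscriberRegistry:
    """Grants at most `limit` slots; silent subscribers eventually lose theirs."""

    def __init__(self, limit, heartbeat_timeout):
        self.limit = limit
        self.heartbeat_timeout = heartbeat_timeout
        self.last_seen = {}  # subscriber id -> time of last heartbeat

    def _expire(self, now):
        # Drop subscribers that have not sent a heartbeat recently enough.
        dead = [
            sid for sid, seen in self.last_seen.items()
            if now - seen > self.heartbeat_timeout
        ]
        for sid in dead:
            del self.last_seen[sid]

    def request_slot(self, subscriber_id, now):
        """Return True if the subscriber may connect, False if we are full."""
        self._expire(now)
        if subscriber_id in self.last_seen or len(self.last_seen) < self.limit:
            self.last_seen[subscriber_id] = now
            return True
        return False

    def heartbeat(self, subscriber_id, now):
        if subscriber_id in self.last_seen:
            self.last_seen[subscriber_id] = now

    def release(self, subscriber_id):
        """Clean disconnect: free the slot immediately."""
        self.last_seen.pop(subscriber_id, None)


registry = SubscriberRegistry(limit=2, heartbeat_timeout=10.0)
granted = [registry.request_slot(name, now=0.0) for name in ("a", "b", "c")]
registry.heartbeat("a", now=8.0)                 # "a" stays alive, "b" goes silent
late = registry.request_slot("c", now=15.0)      # "b" has timed out by now
```

Here "c" is refused while both slots are taken and admitted once "b" has missed its heartbeat window. Note that this is cooperative: a subscriber that skips the permission step can still connect to the PUB endpoint.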
Anyway, those are just some ideas to accomplish what you're looking for. ZMQ gives you options with which to craft your solutions moreso than actual solutions.

ZeroMQ distribution pattern

I currently have a pub/sub system running which allows clients to connect to a central message routing daemon, subscribe for a range of messages, and then start chattering away. The routing daemon tracks and maintains each subscriber's messages of interest (based on a simple tag) and delivers the appropriate messages of interest as each of the subscribers produce them. Essentially, each connection is considered a potential publisher OR subscriber AND USUALLY both, the daemon handles the routing and delivery as needed.
For example, three clients all connect and subscribe for their message tag(s) (MT) of interest:
Client 1(C1) subscribes to MT => 123
Client 2(C2) subscribes to MT => 123 & 456
Client 3(C3) subscribes to MT => 123 & 456 & 789
C1 produces MT 456: daemon delivers a copy to C2 and C3
C2 produces MT 123: daemon delivers a copy to C1 and C3 (not self)
C3 produces MT 999: daemon delivers it to none (nobody subscribed)
ZeroMQ came up in a discussion with a coworker and after tinkering with it for a few days I don't think I'm seeing the proper pattern for implementing/replacing the system that we currently have in place. Additionally, I would like to use EPGM in order to take advantage of the multicast gains and to eliminate the TCP based daemon, monkey in the middle, that I currently have.
Any suggestions?
It's possible to design a system like that using ZeroMQ. Basically speaking, you may create a daemon that binds two sockets: PULL to receive messages from clients and PUB to publish messages. Each client connects a SUB socket and a PUSH socket to the server. EPGM might be used for the PUB/SUB sockets, but the PUSH/PULL sockets are still TCP.
The disadvantage of this design is that topic filtering and dropping a client's own messages must be done manually. For example, you might create a message of three parts:
Topic
ID of producer
Message body
The client should read messages part by part, immediately dropping the tail of any message it's not interested in. Working with PUB/SUB message envelopes is described in detail in this section of the guide: http://zguide.zeromq.org/page:all#Pub-Sub-Message-Envelopes. Client filtering shouldn't affect performance, since all PGM packets must be delivered to all connected receivers anyway.
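The manual filtering on the three-part message can be sketched in plain Python. The frames are modelled as a list of byte strings, which is the shape a multipart receive such as pyzmq's `recv_multipart()` would hand back; the helper names and the subscription table (taken from the C1/C2/C3 example in the question) are assumptions for illustration.

```python
def make_message(topic, producer_id, body):
    """Build the three-part message: topic, producer ID, body."""
    return [str(topic).encode(), producer_id.encode(), body.encode()]


def accept(frames, my_id, my_topics):
    """Return the body if this client wants the message, else None."""
    topic, producer_id = frames[0].decode(), frames[1].decode()
    if topic not in my_topics:
        return None  # not subscribed: drop without touching the body
    if producer_id == my_id:
        return None  # our own message, echoed back by the daemon
    return frames[2].decode()


# Subscriptions from the question's example.
subscriptions = {
    "C1": {"123"},
    "C2": {"123", "456"},
    "C3": {"123", "456", "789"},
}


def deliveries(frames):
    """Which clients would keep this message after filtering?"""
    return sorted(
        client
        for client, topics in subscriptions.items()
        if accept(frames, client, topics) is not None
    )


from_c1 = deliveries(make_message(456, "C1", "payload"))
from_c2 = deliveries(make_message(123, "C2", "payload"))
from_c3 = deliveries(make_message(999, "C3", "payload"))
```

Putting the topic in the first frame means a client can decide from the envelope alone, and the producer ID in the second frame is what lets C2 skip its own message on topic 123.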
This design is very simple yet pretty effective. It doesn't cover reliability, high availability, failure recovery and other important aspects - it's all doable with ZeroMQ and covered in the guide. Probably the best feature of ZeroMQ is the ability to start with something simple and add functionality as necessary without pain and/or major rewrites.
Something very similar (plus state snapshots, reliability and many more) is described in the chapter "Reliable Pub-Sub (Clone Pattern)" of the guide: http://zguide.zeromq.org/page:all#toc119
BTW, it's also possible to design p2p system with the central daemon used only as a name server, but it will be definitely more complex.

Is there an enterprise message queue which can drop duplicate messages (first value stays)?

I am looking for a message queue with these requirements. Couldn't find it; maybe the closest was the rabbitmq-lvc plugin (but I need the first value in the line to stick and stay in front).
Would anyone know a technology to support these?
message queue is FIFO
if a duplicate message is being enqueued, the message queue itself either rejects or drops it.
For example, producers put these three messages (each with a discriminator value) into the queue in this sequence: M1(discriminator=7654), M2(discriminator=2435), M3(discriminator=7654).
Now I want the message queue to see that M3 has the same discriminator value as M1 and thus drop/reject M3. Consumers receive only: M1, M2.
Thanks
Tom
I don't know the other transports but I know that WebSphere MQ doesn't do this and I believe that the explanation why would apply broadly across the category. I'd be very surprised to find that any messaging transport actually provides this. Here are a few reasons why:
Async messages are supposed to be atomic. Different vendors make their own accommodations for message affinity (a relationship between two or more messages) but as a rule, message affinity is to be avoided. Your use case not only requires the transport to deal with message affinity, but to do so over an indeterminate interval between related messages.
Message payload is a blob. For performance reasons, WMQ doesn't touch message payloads except for things like compression or code page conversion. Anything that requires parsing the message payload is a job for WebSphere Message Broker, DataPower or WebSphere ESB. I would expect any messaging transport which claims to be performant would face similar issues because parsing payloads results in longer code paths and non-linear performance degradation. The exception is message properties but WMQ uses these for selection only and I expect that is generally the case.
Stateless operation. As a transport, the state of the application may be stored in a persistent message but the state of the transport layer should not depend on the state of the application across different units of work. Again, an ESB type of product is best suited when you want to delegate management of some of the application state to the messaging layer and especially when such management spans many units of work.
Assured delivery. WMQ was designed to never lose your persistent message. If the app explicitly sets expiry the message might go away because the sender said it was OK to do so. If the message is non-persistent it might go away, but only in an exceptional condition and, again, because the sender said it was OK to do so. The use case you describe might result in a message going away not because the sender said it was OK, or even because the recipient said it was OK but because of an interaction with some unrelated 3rd party who happened to beat you to the queue with a duplicate value. What if that first message has an invalid header or code page problem and gets rolled back? What if I as an attacker spew out garbage messages with all possible 4-digit values for discriminator?
As I said, I don't know the other messaging products so there may be something out there which meets your requirement and if so I'll be interested to read about it. However in the event that nobody replies, this post may shed some light on the reasons why.
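If the transport won't do it, the first-value-wins behaviour from the question can be approximated in application code in front of the queue. The sketch below is an in-process illustration only, not a feature of WebSphere MQ or any other product; `FirstWinsQueue` is a hypothetical name, and it assumes a discriminator is remembered forever, which the question does not specify.

```python
from collections import deque


class FirstWinsQueue:
    """FIFO queue that drops any message whose discriminator was seen before."""

    def __init__(self):
        self._items = deque()
        self._seen = set()  # grows without bound unless entries are expired

    def put(self, discriminator, message):
        """Enqueue and return True, or return False for a duplicate."""
        if discriminator in self._seen:
            return False
        self._seen.add(discriminator)
        self._items.append(message)
        return True

    def get(self):
        return self._items.popleft()

    def __len__(self):
        return len(self._items)


q = FirstWinsQueue()
accepted = [
    q.put(7654, "M1"),
    q.put(2435, "M2"),
    q.put(7654, "M3"),  # same discriminator as M1, so it is dropped
]
delivered = [q.get() for _ in range(len(q))]
```

The ever-growing `_seen` set is exactly the state the answer above warns about: it ties the queue to application history, and a flood of garbage discriminators would fill it. Any real version would need an expiry policy and a decision about what happens when the first message is rolled back.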
