ZeroMQ - Can we check subscribers before sending a message? - zeromq

The classic ZeroMQ PUB pattern is something like:
1. format your complete message
2. send your message
3. ( managed by ZMQ ) if there is a subscriber to the topic, then send it, else trash it?
What I've noticed in one of my applications, is that the formatting of some of the messages is very heavy and takes a lot of time. When I don't have a subscriber for the topic, I do all this work for nothing.
I was wondering if there was a way to check whether a topic was subscribed before formatting the rest of the message.
I understand there'd be a TOCTOU problem :
1. check the topic is subscribed ( it's not )
2. ( ZMQ receives a subscription for the topic )
3. data is not sent...
or
1. check the topic is subscribed ( it is )
2. start formatting message
3. ( ZMQ receives an unsubscription for the topic )
4. send to socket, data is not sent ( wasted time )
... and I'm OK with both.
I've tried with multi-part messages ( sending the "header/topic" first, without formatting the rest of the message ) but:
- it doesn't seem to do what I mean here
- my subscribers also have to handle the multi-part messages ( they can't just do a simple zmq_recv() ), which is a bit annoying
Any idea? I think I see where to patch in xpub.cpp, adding a method that would copy/paste part of xpub::xsend() ( https://github.com/zeromq/libzmq/blob/656205b5f9159677d325cff5e6e26c97f95d8cd7/src/xpub.cpp#L289 ), but I'm not even sure that's something the ZMQ community would be interested in.

In case one has never worked with ZeroMQ, one may enjoy first looking at "ZeroMQ Principles in less than Five Seconds" before diving into further details.
Q : "Can we check subscribers before sending a message?"
Yes, we can.
If you really need this, note that the XPUB archetype receives the incoming subscription-management messages ( when they arrive ), which can be used for doing something like this.
That does not mean one can rely on this blindly. Unless you work in a fully restricted environment, where rigid version-control and enforcement policies are in place, there may always be a client that does not use a more recent version, in which the topic-filtering is performed on the (X)PUB-side. Such a client filters on the SUB-side and may not deliver its subscription-management records to the (X)PUB-side, as the newer versions expect. Verify that every subscriber does deliver them before starting to trust a test-before-send policy.
Damned version management :o)
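A minimal sketch of such a test-before-send check, under stated assumptions: the class SubscriptionTracker and its methods feed() and wanted() are hypothetical names, not part of any ZeroMQ binding. It relies on the documented XPUB behaviour that each subscription-management message is a single frame, one flag byte ( 1 = subscribe, 0 = unsubscribe ) followed by the topic prefix, and it assumes the default, non-verbose XPUB mode, where only the first subscribe and the last unsubscribe for a given prefix are passed up.

```python
class SubscriptionTracker:
    """Tracks topic prefixes announced in XPUB subscription frames (hypothetical helper)."""

    def __init__(self):
        self.prefixes = set()

    def feed(self, frame: bytes) -> None:
        # One flag byte, then the topic prefix: 0x01 = subscribe, 0x00 = unsubscribe.
        if not frame:
            return
        flag, prefix = frame[0], frame[1:]
        if flag == 1:
            self.prefixes.add(prefix)
        elif flag == 0:
            self.prefixes.discard(prefix)

    def wanted(self, topic: bytes) -> bool:
        # Mirror the prefix filter: an empty prefix matches every topic.
        return any(topic.startswith(p) for p in self.prefixes)


tracker = SubscriptionTracker()
tracker.feed(b"\x01weather.")             # a subscriber asks for "weather."
if tracker.wanted(b"weather.paris"):
    pass  # only now do the heavy formatting and the send
```

In a real publisher the frames would come from non-blocking reads on the XPUB socket before each expensive formatting step. The TOCTOU windows described in the question remain; the check only avoids most of the wasted work.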
You may also know that the topic-filtering ( since ever, and hopefully it will remain so ) does not require any formatting, much less any multi-part messaging overheads. It works as a plain prefix match on the leading bytes of the message, the performance of which has been tuned up, so who would ever want to waste a single [ns] on add-on overhead costs in this domain?
Welcome to the Art of Zen-of-Zero

Related

Does ZeroMQ implement total ordered multicast for message delivery?

I came across this document http://zguide.zeromq.org/page:all but couldn't find anything regarding totally ordered multicast. How does ZeroMQ order its messages?
Q : "How does ZeroMQ order its messages?"
Welcome to the lands of Zen-of-Zero. ZeroMQ has been designed to be ultra-fast, exceptionally smart and not to do a single step beyond what is necessary.
This said, there has always been ( and it still seems to be valid in 2020/Q2 ) a Zero-Warranty for a message to be delivered. Seen from the other side, users receive a warranty that any message that was delivered is a binary copy of the originator-side message payload, full stop. There are no other warranties ( i.e. the very same holds for any (re)-ordering ).

How to get data into a ZMQ_PUB service?

Can a publisher service receive data from an external source and send them to the subscribers?
In the wuserver.cpp example, the data are generated from the same script.
Can I write a ZMQ_PUBLISHER entity, which receives data from external data source / application ... ?
Regarding this statement:
There is one more important thing to know about PUB-SUB sockets: you do not know precisely when a subscriber starts to get messages. Even if you start a subscriber, wait a while, and then start the publisher, the subscriber will always miss the first messages that the publisher sends. This is because as the subscriber connects to the publisher (something that takes a small but non-zero time), the publisher may already be sending messages out.
Does this mean, that a PUB-SUB ZeroMQ pattern is performed to a best effort - UDP style?
Q1: Can I write a ZMQ_PUBLISHER entity, which receives data from external data source/application?
A1: Oh sure, this is why ZeroMQ helps us so much in designing smart distributed-systems. Just imagine the PUB-side process also making other { .bind() | .connect() }-calls, so as to establish such other links to data-feeder(s), and you are ready to operate the scheme you wished to have. In distributed-systems this gives you a new freedom to integrate heterogeneous systems so that they talk to each other in a very efficient way.
Q2: Does this mean that a PUB-SUB ZeroMQ pattern is performed to a best effort - UDP style?
A2: No, it has another meaning. Newly declared subscriber entities start, at some uncertain moment, to negotiate their respective subscription-topic filtering, and such a ( distributed ) process takes some a-priori unknown time. Until the new / changed topic-filter policy has been established, nothing goes into the SUB-side egress interface to meet a .recv()-call, so no one can tell when that will happen.
On a higher level, there is another well-known principle of ZeroMQ, the Zero-Warranty Principle: expect to get delivered either a complete message or none at all, which saves the framework users from having to handle any kind of damaged / inconsistent message-payloads. Either OK, or None. That's a great warranty, all the more so for distributed-systems.

ZMQ pattern for requests without replies

I am using ZMQ to allow clients to connect to a server and send commands to it. The commands come in at high frequency, and do not need any reply. I am considering using a REQ/REP socket, but it feels wasteful to send empty replies. I do not wish to use PUB/SUB or PUSH/PULL because I want the clients to initiate the connection. Is there a more suitable pattern than REQ/REP to use here?
(cit.:) "because I want the clients to initiate the connection." ( ? )
One can always let the clients initiate the connection, so the PUSH/PULL Scalable Formal Communication Pattern seems very much on target, even with reversed .bind()/.connect() calls. Or did you mean something else?
If you remain negative about PUSH/PULL ( as observed so far ) for some other reason, you may escape from the strict hard-wired step-locking of REQ/REP ( and also from its associated risk of falling into unsalvageable deadlocks ) in one of two ways: first, by using the extended archetype XREQ/XREP, since renamed DEALER/ROUTER ( see the API documentation for implementation details ), or ( if using API 4.2+ ) by unlocking the REQ-hardwired FSA duties via .setsockopt( ZMQ_REQ_RELAXED, 1 ), given the fact noted above that REP answers will never be sent from the server-side / processed on the REQ-side client(s). If going this way, be cautious, as ZMQ_REQ_CORRELATE may get set to 1, in which case the messages become multi-frame(d): the REQ-id# gets loaded into a newly injected "service"-frame before the REQ's client-payload gets onto the wire. This may confuse the server-part of the message-receiving / processing code.
More courageous designers may use the PAIR/PAIR Formal Pattern archetype, as it does not impose any strict formal behaviour, but read the API specs carefully.

What is the ZeroMQ PUB/SUB internal behaviour?

I'm trying to get my head around the behaviour of zmq with PUB/SUB.
Q1: I can't find a real reason why, with the PUSH/PULL socket combo, I can create a queue that actually keeps in memory the messages it can't deliver ( the consumer is not available ), while with PUB/SUB I can't.
Q2: Is there any technical whitepaper or document that describes in detail the internals of the sockets?
EDIT:
This example of a PUSH/PULL streamer works as expected ( the worker joins late or restarts and gets the messages queued in the feeder ). The PUB/SUB forwarder does not behave in the same way.
While Q1 is hard to answer / fully address without a SLOC ...
there is still a chance that your code ( though as yet unpublished, which StackOverflow so strongly encourages users to include in the form of an MCVE, and you may already have felt or soon might feel some flames for not doing so ) has just forgotten to set a subscription topic-filter:
import zmq
aSubSOCKET = zmq.Context.instance().socket( zmq.SUB )
aSubSOCKET.setsockopt( zmq.SUBSCRIBE, b"" )          # -> recv "EVERYTHING" / NO-TOPIC-FILTER
aSubSOCKET.setsockopt( zmq.SUBSCRIBE, b"GOOD-NEWS" ) # -> recv only messages that start with "GOOD-NEWS"
A2: Yes, there are exhaustive descriptions of all ZeroMQ API calls.
Besides the API manpage collection for ØMQ/2.1.1 and other versions, there is a great book published online as a pdf, "Code Connected, Vol.1", from Pieter HINTJENS himself.
Worth reading. A lot of insights into general distributed-processing area and ZeroMQ way.

Detect dropped messages in ZeroMQ Queues

Since it does not seem to be possible to query/inspect the underlying queues/buffers of ZeroMQ sockets to see how much they are utilized, is there some way to detect when a message is dropped due to full buffers in a Publisher socket when sent/queued?
For example, if the publisher queue is full, the zmq_send operation will simply drop the message.
Basically, what I want to achieve is a way to detect situations where the queues are getting stressed and/or full to be able to (later on) tune the solution to work better. One alternative way would be to add a sequence number to each message and do a simple calculation in the subscriber but I can never be sure that a message was lost due to full buffers in the publisher.
There is an example for this in the ZeroMQ Guide (which you should read and digest if you want to use 0MQ happily): http://zguide.zeromq.org/page:all#Slow-Subscriber-Detection-Suicidal-Snail-Pattern
The mechanism is as you answered yourself, to add a sequence number in the message, and allow the subscriber to detect gaps and take appropriate action. For most pubsub scenarios you can raise the default HWM, which is 1,000, to something much higher; it depends on your average message size.
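The subscriber-side calculation can be sketched as follows. This is a minimal illustration, not the Guide's code: the class GapDetector and its observe() method are hypothetical names, and it assumes the publisher stamps each message with an integer sequence number that grows by one per message.

```python
class GapDetector:
    """Counts sequence-number gaps seen by one subscriber (hypothetical helper)."""

    def __init__(self):
        self.expected = None   # next sequence number we expect to see
        self.lost = 0          # running total of missed messages

    def observe(self, seq: int) -> int:
        """Record a received sequence number; return how many were skipped before it."""
        missed = 0
        if self.expected is not None and seq > self.expected:
            missed = seq - self.expected
        self.lost += missed
        # A lower-than-expected number (e.g. a publisher restart) simply resyncs.
        self.expected = seq + 1
        return missed


detector = GapDetector()
for seq in (0, 1, 2, 5, 6):
    detector.observe(seq)
```

As the question notes, a gap only says that messages were lost somewhere between publisher and subscriber; it cannot prove that a full publisher-side buffer was the cause.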
I know this is an old post but here is what I did when recently facing the same issue.
I opted to use a DEALER/ROUTER and set the ZMQ_SNDHWM option to 1. I also provided the timeout parameter on each zmq_send(). The timeout could be anything from 10 ms to 3 seconds, depending on what your scenario is ( a local or remote send ).
If the message is not sent within the timeout or the send-buffer is full the zmq_send() will return false. That enabled me to set up a retry queue in front of zmq. I know it's not a perfect solution but for me it worked just fine. What puzzles me though is the meaning of true/false returned by the DEALER-socket zmq_send(). I have not been able to find the answer to that question. Whether it indicates that the message has been buffered or that the message has been delivered to the ROUTER has eluded me. In my case I got the results needed anyway.
Just for the record this was done using netmq but I guess it applies to ZeroMQ as well.
I do agree with james though. ZeroMQ ( and netmq ) should at least provide a way to inspect the queue ( and get the messages out ) and also a way to tell the various sockets not to drop messages. The best option would be to send messages not delivered in a timely fashion, according to the configured options, to some sort of deadletter queue. The deadletter queue could then be handled separately.
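The retry queue placed in front of the socket, as described in this answer, could look roughly like the sketch below. Everything here is an assumption made for illustration: RetryQueue is a hypothetical class, and try_send stands for any callable that returns True when the underlying send was accepted and False when it timed out or the send-buffer was full.

```python
from collections import deque
from typing import Callable


class RetryQueue:
    """Keeps unsent messages in FIFO order and retries them (hypothetical helper)."""

    def __init__(self, try_send: Callable[[bytes], bool]):
        self._try_send = try_send   # assumed: returns False on timeout / full buffer
        self._pending = deque()

    def send(self, msg: bytes) -> bool:
        """Queue msg, then flush as much as possible; True if nothing is left pending."""
        self._pending.append(msg)
        return self.flush()

    def flush(self) -> bool:
        """Retry pending messages oldest first; stop at the first failure."""
        while self._pending:
            if not self._try_send(self._pending[0]):
                return False
            self._pending.popleft()
        return True

    def __len__(self) -> int:
        return len(self._pending)
```

Messages that stay pending for too long could be moved to a separate dead-letter list instead of being retried forever, which is one way to approximate the deadletter queue suggested above.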
