Detect dropped messages in ZeroMQ Queues - zeromq

Since it does not seem to be possible to query/inspect the underlying ZeroMQ queues/buffers sockets to see how much they are utilized, is there some way to detect when a message is dropped due to full buffers in a Publisher socket when sent/queued?
For example, if the publisher queue is full, the zmq_send operation will simply drop the message.
Basically, what I want to achieve is a way to detect situations where the queues are getting stressed and/or full to be able to (later on) tune the solution to work better. One alternative way would be to add a sequence number to each message and do a simple calculation in the subscriber but I can never be sure that a message was lost due to full buffers in the publisher.

There is an example for this in the ZeroMQ Guide (which you should read and digest if you want to use 0MQ happily): http://zguide.zeromq.org/page:all#Slow-Subscriber-Detection-Suicidal-Snail-Pattern
The mechanism is as you answered yourself, to add a sequence number in the message, and allow the subscriber to detect gaps and take appropriate action. For most pubsub scenarios you can raise the default HWM, which is 1,000, to something much higher; it depends on your average message size.

I know this is an old post but here is what I did when recently facing the same issue.
I opted to use a DEALER/ROUTER and set the ZMQ_SNDHWM option to 1. Also I provided the timeout parameter on each zmq_send(). The timeout could be anything between 10 ms to 3 seconds, depending on what your scenario is ( a local or remote send ).
If the message is not sent within the timeout or the send-buffer is full the zmq_send() will return false. That enabled me to set up a retry queue in front of zmq. I know it's not a perfect solution but for me it worked just fine. What puzzles me though is the meaning of true/false returned by the DEALER-socket zmq_send(). I have not been able to find the answer to that question. Whether it indicates that the message has been buffered or that the message has been delivered to the ROUTER has eluded me. In my case I got the results needed anyway.
Just for the record this was done using netmq but I guess it applies to ZeroMQ as well.
I do agree wtih james though. ZeroMQ ( and netmq ) should at least provide a way to inspect the queue ( and get the messages out ) and also a way to tell the various sockets not to drop messages. The best option would be to send messages not delivered in timely fashion according to the configured options to some sort of deadletter queue. The deadletter queue could then be handled separately.

Related

ZeroMQ PUSH / PULL - how to know which events are pending in SEND BUFFER queue?

We have a service pair doing PUSH/PULL pattern of message communication. As mentioned in the docs, if the PULL service is down or not running, then a sender will queue up to high water mark number of events and by default a .send() after that will block.
Now, while an app is in the blocking state, the app could be killed or something else may happen, leading up to loosing those messages in the queue.
I understand PUSH/PULL is not the best method if we want that kind of reliability and should probably use some of the other method listed at: https://zguide.zeromq.org/docs/chapter4/ but is there a way in PUSH/PULL method to get event call back on the events still on queue on say app exit/periodic callbacks/signals?
I also understand, that I could use NOBLOCK or ZMQ_IMMEDIATE or ZMQ_SNDTIMEO in such situation and catch the error and use application level recovery (similar to DLQ pattern) but I was looking into things available from the ZeroMQ library.
Q : "... how to know which events are pending in SEND BUFFER queue ?"
A :Well,having used ZeroMQ since v2.1, v3.x, till v4.x in 2022-Q1, there has never been a way, how a user-level code may interact with ZeroMQ internal queues and/or state(s) as there was no such method in c-API to do so.
Q : "... is there a way in PUSH/PULL method to get event call back on the events still on queue on say app exit/periodic callbacks/signals?"
A :Well, let's solve this by using a concurrently operated signalling-socket, for receiving POSACK-messages from "live"-clients, i.e. those, that can and do receive messages - thus being able to back-throttle messages for those, that did not respond in reasonable TAT. Using a mix of several, properly selected Scalable Formal Communications Patterns archetypes to work in cooperation, helps solve this "soft"-signalling control. Without an ambition to solve all details, a set of one-PUB.bind() / many-SUB.connect()-sockets for selectively directed payload-transport with subscription-based controls and one-PULL.bind() / many-PUSH.connect()-s for "soft"-control signalling of still-alive-heartbeats, traffic back-throttling and similar services

How to get data a ZMQ_PUB service?

Can I publisher service receive data from an external source and send them to the subscribers?
In the wuserver.cpp example, the data are generated from the same script.
Can I write a ZMQ_PUBLISHER entity, which receives data from external data source / application ... ?
In this affirmation:
There is one more important thing to know about PUB-SUB sockets: you do not know precisely when a subscriber starts to get messages. Even if you start a subscriber, wait a while, and then start the publisher, the subscriber will always miss the first messages that the publisher sends. This is because as the subscriber connects to the publisher (something that takes a small but non-zero time), the publisher may already be sending messages out.
Does this mean, that a PUB-SUB ZeroMQ pattern is performed to a best effort - UDP style?
Q1: Can I write a ZMQ_PUBLISHER entity, which receives data from external data source/application?
A1: Oh sure, this is why ZeroMQ is so helping us in designing smart distributed-systems. Just imagine the PUB-side process to also have other { .bind() | .connect() }-calls, so as to establish such other links to data-feeder(s), and you are done to operate the wished to have scheme. In distributed-systems this gives you a new freedom to smart integrate heterogeneous systems to talk to each other in a very efficient way.
Q2:Does this mean, that a PUB-SUB ZeroMQ pattern is performed to a best effort - UDP style?
A2: No, it has another meaning. The newly declared subscriber entities at some uncertain moment start to negotiate their respective subscription-topic filtering and such a ( distributed ) process takes some a-priori unknown time. Unless until the new / changed topic-filter policy was established, there is nothing to go into the SUB-side exgress interface to meet a .recv()-call, so no one can indeed tell, when that will get happened, can he?
On a higher level, there is another well known dichotomy of ZeroMQ -- Zero-Warranty Principle -- expect to either get delivered a complete message or none at all, which prevents the framework users from a need to handle any kind of damaged / inconsistent message-payloads. Either OK, or None. That's a great warranty. The more for distributed-systems.

ZeroMQ: Can I use kvmsg class to add sequence numbers to messages? If yes, how?

I'm implementing a mechanism to detect packet loss in ZeroMQ PUSH/PULL socket type.
1) I was wondering if kvmsg can be used for the same?
2) I would like the client to detect gaps in sequence numbers if there are any loss of packets and implement a resend mechanism accordingly.
Assuming that kvmsg can cope with arbitrary message structures, then yes. Alternatives include Google Protocol Buffers, XML, etc.
In general one does this by adding a field to the messages that you send, perhaps called "sequence". The software you've written for the PUSH end will set this to zero for the very first message, 1 for the next, incrementing by 1 for each message. The PULL end then simply checks the sequence.
However, the real question is, why is this required by your application? ZMQ guarantees ( in normal circumstances ) delivery of messages. That's kinda the whole point of it. PUSH/PULL means that exactly one PULLer will receive a PUSHed message. If you have one PUSH and one PULL, every PUSHed message will be delivered in the correct order with no loss to the PULLer, barring catastrophic network failures. AFAIK it will even deal with temporary network problems for you, managing reconnection, etc, and still deliver messages in the correct order.
Messages that cannot be sent because the outgoing queue on the PUSH end is full will result in the zmq_send() returning an error, so the PUSH end already knows that a message wasn't sent.
Is there something else more complex about the application?

read messages from JMS MQ or In-Memory Message store by count

I want to read messages from JMS MQ or In-memory message store based on count.
Like I want to start reading the messages when the message count is 10, until that i want the message processor to be idle.
I want this to be done using WSO2 ESB.
Can someone please help me?
Thanks.
I'm not familiar with wso2, but from an MQ perspective, the way to do this would be to trigger the application to run once there are 10 messages on the queue. There are trigger settings for this, specifically TRIGTYPE(DEPTH).
To expand on Morag's answer, I doubt that WS02 has built-in triggers that would monitor the queue for depth before reading messages. I suspect it just listens on a queue and processes messages as they arrive. I also doubt that you can use MQ's triggering mechanism to directly execute the flow conveniently based on depth. So although triggering is a great answer, you need a bit of glue code to make that work.
Conveniently, there's a tutorial that provides almost all the information necessary to do this. Please see Mission:Messaging: Easing administration and debugging with circular queues for details. That article has the scripts necessary to make the Q program work with MQ triggering. You just need to make a couple changes:
Instead of sending a command to Q to delete messages, send a command to move them.
Ditch the math that calculates how many messages to delete and either move them in batches of 10, or else move all messages until the queue drains. In the latter case, make sure to tell Q to wait for any stragglers.
Here's what it looks like when completed: The incoming messages land on some queue other than the WS02 input queue. That queue is triggered based on depth so that the Q program (SupportPac MA01) copies the messages to the real WS02 input queue. After the messages are copied, the glue code resets the trigger. This continues until there are less than 10 messages on the queue, at which time the cycle idles.
I got it by pushing the message to db and get as per the count required as in this answer of me take a look at my answer

Is there an enterprise message queue which can drop duplicate messages (first value stays)?

I am looking looking for a message queue with these requirements. Couldn't find it; maybe the closest was the rabbitmq-lvc plugin (but I need the first value in the line to stick and stay in front).
Would anyone know a technology to support these?
message queue is FIFO
if a duplicate message is being enqueued, the message queue itself either rejects or drops it.
For example, producers put these three messages (each with a discriminator value) into the queue in this sequence: M1(discriminator=7654), M2(discriminator=2435), M3(discriminator=7654).
Now I want the message queue to see that M3 has the same discriminator value as M1 and thus drop/reject M3. Consumers receive only: M1, M2.
Thanks
Tom
I don't know the other transports but I know that WebSphere MQ doesn't do this and I believe that the explanation why would apply broadly across the category. I'd be very surprised to find that any messaging transport actually provides this. Here are a few reasons why:
Async messages are supposed to be atomic. Different vendors make their own accommodations for message affinity (a relationship between two or more messages) but as a rule, message affinity is to be avoided. Your use case not only requires the transport to deal with message affinity, but to do so over an indeterminate interval between related messages.
Message payload is a blob. For performance reasons, WMQ doesn't touch message payloads except for things like compression or code page conversion. Anything that requires parsing the message payload is a job for WebSphere Message Broker, DataPower or WebSphere ESB. I would expect any messaging transport which claims to be performant would face similar issues because parsing payloads results in longer code paths and non-linear performance degradation. The exception is message properties but WMQ uses these for selection only and I expect that is generally the case.
Stateless operation. As a transport, the state of the application may be stored in a persistent message but the state of the transport layer should not depend on the state of the application across different units of work. Again, an ESB type of product is best suited when you want to delegate management of some of the application state to the messaging layer and especially when such management spans many units of work.
Assured delivery. WMQ was designed to never lose your persistent message. If the app explicitly sets expiry the message might go away because the sender said it was OK to do so. If the message is non-persistent it might go away, but only in an exceptional condition and, again, because the sender said it was OK to do so. The use case you describe might result in a message going away not because the sender said it was OK, or even because the recipient said it was OK but because of an interaction with some unrelated 3rd party who happened to beat you to the queue with a duplicate value. What if that first message has an invalid header or code page problem and gets rolled back? What if I as an attacker spew out garbage messages with all possible 4-digit values for discriminator?
As I said, I don't know the other messaging products so there may be something out there which meets your requirement and if so I'll be interested to read about it. However in the event hat nobody replies, this post may shed some light on the reasons why.

Resources