ZeroMQ high message latency from send to receive of pub/sub IPC messages

I have been having an issue with some of our new ZeroMQ IPC pub/sub applications, all on the same machine. I did a timing analysis and realized that if the receiving process had lost the CPU, it would not wake back up to receive the new message until 200-450 us after the message was sent. To put that in perspective, the same test using UDP wakes up and processes the message in 40 us. I also ran a test where the sender sends the ZeroMQ message followed by a UDP message, and the receiver blocks on the UDP recvfrom call; once that returns, it receives the ZeroMQ message. This completes within 50 us consistently. So the ZeroMQ message is available, but the thread hasn't been woken back up to pull it off for what I would consider a long time compared to other messaging protocols.

My question is whether this is expected ZeroMQ behavior, or whether there is a configuration setting or something else that would affect it. If we keep the process in a spin wait, the issue goes away, so it seems the mechanism used to tell the process that a message is available and it is time to run and pull it off the queue is delayed for some reason.

I am running RHEL 8 and, I believe, ZMQ 4.x. I don't understand what I am missing, since the documentation says that for the message sizes I am using, latency should be around 40 us. Thanks in advance.
I broke it down to a simple ZMQ broker, sender, and receiver and ran the test as stated above. Messages get timestamped upon sending, one every 200 ms, so as to allow the client to lose the CPU while blocking on the recv call. We receive each message at the client and store the delta times in an array. After a thousand, we print out the results (so as not to have impacted our timing with the reporting), and that is where I see they are all 250 us +- 150 us.
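For illustration, the spin-wait workaround mentioned above amounts to replacing the blocking recv with a non-blocking one in a tight loop. Below is a minimal sketch using JeroMQ (the Java binding); the endpoint is a placeholder, and the same pattern applies in any binding:

    import org.zeromq.SocketType;
    import org.zeromq.ZContext;
    import org.zeromq.ZMQ;

    public class BusyPollSub {
        public static void main(String[] args) {
            try (ZContext ctx = new ZContext()) {
                ZMQ.Socket sub = ctx.createSocket(SocketType.SUB);
                sub.connect("ipc:///tmp/feed");   // hypothetical endpoint
                sub.subscribe(new byte[0]);       // subscribe to everything
                while (!Thread.currentThread().isInterrupted()) {
                    // Non-blocking recv: the thread spins instead of sleeping in
                    // the kernel, so it never has to be woken up by the scheduler.
                    byte[] msg = sub.recv(ZMQ.DONTWAIT);
                    if (msg != null) {
                        long rxNanos = System.nanoTime();
                        // ... compare rxNanos against the timestamp in msg ...
                    }
                }
            }
        }
    }

The trade-off is a core pinned at 100%; whether that is acceptable depends on the deployment.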

Related

How to grab the latest message sent from each connection

I have a ZMQ_PULL/ZMQ_PUSH socket connection.
I have multiple ZMQ_PUSH connections pushing to a single ZMQ_PULL connection.
ZMQ_PUSH connection 1----->
ZMQ_PUSH connection 2-----> ZMQ_PULL
ZMQ_PUSH connection N----->
I do not need every message, I just need the latest message that was sent. I am doing some inference on the back end and am streaming the results to the ZMQ_PULL socket.
I have set the ZMQ_PULL socket to Conflate=true
"If set, a socket shall keep only one message in its inbound/outbound queue, this message being the last message received/the last message to be sent. Ignores ZMQ_RCVHWM and ZMQ_SNDHWM options."
But after testing I realized I actually need the last message from each connection, not just the last message overall. So, with 3 connections, it would grab from each connection in round robin, so that I constantly have the latest from each connection.
Is there an option that is like Conflate, but instead of for all messages, it is for each connection?
Docs: http://api.zeromq.org/4-0:zmq-setsockopt
Is there an option that is like Conflate, but instead of for all messages, it is for each connection?
No.
The documentation you cite explains that 0MQ does not currently offer direct support for such a single-socket use case. You could certainly code it up and submit an upstream PR so that future revs of 0MQ offer such functionality.
Given that you'll need app-level support to make this work with 0MQ 4.3, the simplest approach would be to maintain N ZMQ_PULL sockets with ZMQ_CONFLATE set, as you're already aware.
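A minimal sketch of that layout, using JeroMQ for concreteness (endpoints are hypothetical; note this reverses the usual direction, with each pusher binding and the puller connecting, so each connection gets its own socket):

    import org.zeromq.SocketType;
    import org.zeromq.ZContext;
    import org.zeromq.ZMQ;

    public class ConflatePerConnection {
        public static void main(String[] args) {
            String[] endpoints = { "tcp://host1:5555", "tcp://host2:5555" };
            try (ZContext ctx = new ZContext()) {
                ZMQ.Poller poller = ctx.createPoller(endpoints.length);
                ZMQ.Socket[] pulls = new ZMQ.Socket[endpoints.length];
                for (int i = 0; i < endpoints.length; i++) {
                    pulls[i] = ctx.createSocket(SocketType.PULL);
                    pulls[i].setConflate(true);     // keep only the newest message;
                    pulls[i].connect(endpoints[i]); // must be set before connect
                    poller.register(pulls[i], ZMQ.Poller.POLLIN);
                }
                while (!Thread.currentThread().isInterrupted()) {
                    poller.poll(100);               // wait up to 100 ms for activity
                    for (int i = 0; i < endpoints.length; i++) {
                        if (poller.pollin(i)) {
                            byte[] latest = pulls[i].recv(0);
                            // ... latest message from connection i ...
                        }
                    }
                }
            }
        }
    }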
An alternate approach would be to assign a dedicated thread or process to keep draining the existing muxed socket, updating a shared-memory data structure that interested clients could consult. The idea is to burn a core on keeping the queue mostly empty, while doing no processing, just focusing on communications. Then other cores can examine the "most recent message" and each one embarks on some expensive processing, while another core continues to keep the queue drained. This is essentially offering the 0MQ service proposed above, but at a different place in the stack, up a level, within your application.
To do this in a distributed way, the "queue draining service" would need to know about idle workers. That is, a worker could publish a brief "I just completed an expensive task" message, which would trigger the drainer to post a fresh work item, never using shared memory at all. This lets the drainer worry about eliding duplicate messages that arrived when no one was available to immediately start work on them and that have since been superseded by a more recent message.
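A rough sketch of the drainer, again in Java for concreteness. Since a plain ZMQ_PULL socket doesn't tell you which connection a message came from, this assumes each pusher prefixes its payload with an id frame; names and the endpoint are hypothetical:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import org.zeromq.SocketType;
    import org.zeromq.ZContext;
    import org.zeromq.ZMQ;

    public class Drainer implements Runnable {
        // Shared structure the worker cores consult for the most recent message.
        private final Map<String, byte[]> latest = new ConcurrentHashMap<>();
        private final ZContext ctx = new ZContext();

        public byte[] latestFor(String connectionId) {
            return latest.get(connectionId);
        }

        @Override
        public void run() {
            ZMQ.Socket pull = ctx.createSocket(SocketType.PULL);
            pull.bind("tcp://*:5555");          // hypothetical endpoint
            while (!Thread.currentThread().isInterrupted()) {
                String id = pull.recvStr(0);    // frame 1: sender-assigned id
                byte[] payload = pull.recv(0);  // frame 2: the message body
                latest.put(id, payload);        // overwrite = conflate per connection
            }
        }
    }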

JMS consumer inside a Netty handler?

I'm designing a quite complicated system and was wondering what the best way is to put a JMS consumer (ActiveMQ, VM protocol, non-persistent) inside a Netty handler.
Let me explain: I have several clients connecting to my Netty server using websockets. For every client connection I create a JMS consumer that listens for interesting messages on one or more topics. If an interesting message arrives, I need to do an extra step (additional filtering) before sending the message to the client over the websocket.
Is the following a good way to do this?
inside a SimpleChannelInboundHandler I declare a private non-static consumer
the consumer is initialized in channelActive
the consumer is destroyed in channelInactive
when a message is received by the consumer, I do the extra filtering and send it using ctx.channel().write()
In this setup I'm a bit worried that the consumer might turn into a slow consumer and slow everything down, because the websocket goes over the internet.
I came up with a more complex design to decouple the "receiving of a message by the consumer" from the "sending of the message through the websocket":
inside a SimpleChannelInboundHandler I declare a private non-static consumer
the consumer is initialized in channelActive
the consumer is destroyed in channelInactive
when a message is received by the consumer, I put it in a blocking queue
every minute I let a thread (created for every client) look in the queue and send the messages it finds to the client using ctx.channel().write()
At this point I'm a bit worried about the extra thread per client.
Or is there maybe a better way to accomplish this task?
This is a classic slow-consumer problem, and the first step to resolving it is to determine what the appropriate action is when a slow consumer is detected. If it is acceptable for the slow consumer to miss messages, then the solution is some variation on dropping messages or unsubscribing them from the feed. For example, if it's acceptable that the client misses messages then, when one is received from JMS, check whether the channel is writable; if it isn't, drop the message. If you want to give yourself a bit more of a buffer (although OS buffers are quite large), you can track the number of write-completion futures that haven't completed (i.e. the messages haven't been written to the OS send buffer) and drop messages if there are too many outstanding write requests.
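A minimal sketch of the writability check, assuming a JMS MessageListener feeds each client's Netty channel and that messages are TextMessages (the frame conversion is illustrative):

    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.MessageListener;
    import javax.jms.TextMessage;
    import io.netty.channel.Channel;
    import io.netty.handler.codec.http.websocketx.TextWebSocketFrame;

    public class DroppingForwarder implements MessageListener {
        private final Channel channel;

        public DroppingForwarder(Channel channel) {
            this.channel = channel;
        }

        @Override
        public void onMessage(Message msg) {
            try {
                if (channel.isWritable()) {  // outbound buffer below high watermark
                    String body = ((TextMessage) msg).getText();
                    channel.writeAndFlush(new TextWebSocketFrame(body));
                }
                // else: drop the message; this client is too slow to keep up
            } catch (JMSException e) {
                throw new RuntimeException(e);
            }
        }
    }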
If the client may not miss messages, and is consistently slow, then the problem is more difficult. One option might be to divert messages to a JMS queue with a specific header value, then open a new consumer that reads messages from that queue using a JMS selector (sketched below). This will put more load on the JMS server, but might be appropriate for temporary slowness, and hopefully it won't interfere with your main topic feeds. Alternatively, you might want to stash the messages in a different store, such as a database, so you can poll for messages when they can be sent. If you do this right, a single polling thread can cope with many clients (query for clients which have outstanding messages, then, for each client, load a bunch of messages). However, this isn't as convenient as using JMS.
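A sketch of the diversion idea, with hypothetical queue and property names (note that a received message's properties are read-only until cleared):

    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.MessageConsumer;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;

    public class SlowClientDivert {
        // Re-route a message to a holding queue, stamped with the client id.
        public static void divert(Session session, Message message, String clientId)
                throws JMSException {
            Queue holding = session.createQueue("slow.clients");
            message.clearProperties();   // received properties are read-only
            message.setStringProperty("clientId", clientId);
            MessageProducer producer = session.createProducer(holding);
            producer.send(message);
            producer.close();
        }

        // Later, read back only this client's backlog using a JMS selector.
        public static Message nextBacklog(Session session, String clientId)
                throws JMSException {
            MessageConsumer consumer = session.createConsumer(
                    session.createQueue("slow.clients"),
                    "clientId = '" + clientId + "'");
            Message m = consumer.receive(1000);  // poll with a timeout
            consumer.close();
            return m;
        }
    }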
I wouldn't go with option 2 because the blocking queue is only going to solve the problem temporarily, and you can achieve the same thing by tracking how many write operations are waiting to complete.

Can a client receive multiple messages off the queue before acknowledging them?

My program will be receiving messages rather slowly, and I want them to persist in the queue until I have received all of them and acknowledged all of them. I don't know whether I have enough messages until I receive a bunch of them.
My question: will the queue block, waiting for the acknowledgement from the first message before delivering the second?
Well, I ran a test on this using the sample producer/consumer code. The consumer actually has some code for this (if you switch it over to ClientAcknowledge): it receives a bunch of messages (10 of them) and only acks the last one.
When setting the acknowledge mode to Session.CLIENT_ACKNOWLEDGE, you can get as many messages as you need. The messages will be locked on the server, so no other consumer can retrieve them in the meantime. So the answer is no, the queue won't block (even though there might be provider-specific settings that can do that, which I don't know about).
However, you can only acknowledge all of them at once. So when you have received 10 messages and you acknowledge one of them (it doesn't matter which), all 10 messages will be acknowledged.
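A minimal sketch of that test against ActiveMQ (broker URL and queue name are placeholders):

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.Message;
    import javax.jms.MessageConsumer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import org.apache.activemq.ActiveMQConnectionFactory;

    public class BatchAckDemo {
        public static void main(String[] args) throws Exception {
            ConnectionFactory factory =
                    new ActiveMQConnectionFactory("tcp://localhost:61616");
            Connection connection = factory.createConnection();
            connection.start();
            Session session = connection.createSession(false, Session.CLIENT_ACKNOWLEDGE);
            Queue queue = session.createQueue("TEST.QUEUE");
            MessageConsumer consumer = session.createConsumer(queue);

            Message last = null;
            for (int i = 0; i < 10; i++) {
                last = consumer.receive(5000); // messages stay locked but unacknowledged
            }
            if (last != null) {
                last.acknowledge();            // acknowledges all 10 in this session
            }
            connection.close();
        }
    }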
For reference, see Controlling Message Acknowledgment.

Shut down ZeroMQ receiving end without loss

I am developing a (Python/pyzmq) ZeroMQ server that receives incoming messages through a PULL socket.
Now, there will be times when I will do a clean restart of the server to upgrade it. My question is: can I somehow stop receiving incoming messages (on my PULL socket) so that a restart does not lose any messages? I am thinking of something like calling close() on the socket and then recv()ing the last messages. Possibly setting the high-water mark to zero would yield a similar result.
If none of the above works, I might be better off converting my socket to a REP socket and fetching each message one by one, ACKing them every time. Since this would be synchronous, I guess it would be slower.
Right, 0MQ won't offer this type of reliable delivery by itself. You should definitely use a scheme with ACKs.
See Chapter Four - Reliable Request-Reply of the zguide.
I'm using clrzmq with ZMQ 3.2.2, and I got the above functionality by setting the following properties on the PULL socket:
setting the receive high-water mark on the PULL socket to the number of msgs I'm willing to keep in memory
setting the buffer size to an appropriate value
When I no longer wish to receive messages, I call socket.disconnect() on the receiving channel.
After the disconnect, the channel will no longer get new msgs. If you set the high-water mark on the sender side, it will start keeping the msgs in the sender queue (so that events will not get lost).
When the channel is disconnected, receive calls will still succeed while there are msgs in the receiver queue. I'm using receive with a timeout, so if it times out after the channel is disconnected, I assume the queue is empty and I can dispose of the channel and restart the service.
When the service is back up, all msgs stored in the sender queue will get dispatched.
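For illustration, here is roughly the same sequence sketched with JeroMQ's Java API (the answer above uses clrzmq, but the socket options map one-to-one across bindings, including pyzmq; the endpoint is a placeholder):

    import org.zeromq.SocketType;
    import org.zeromq.ZContext;
    import org.zeromq.ZMQ;

    public class DrainAndStop {
        public static void main(String[] args) {
            try (ZContext ctx = new ZContext()) {
                ZMQ.Socket pull = ctx.createSocket(SocketType.PULL);
                pull.setRcvHWM(10000);                   // msgs we're willing to hold locally
                pull.setReceiveTimeOut(500);             // ms; recv returns null on timeout
                pull.connect("tcp://localhost:5555");

                // ... normal receive loop runs until a restart is requested ...

                pull.disconnect("tcp://localhost:5555"); // stop accepting new messages
                byte[] msg;
                while ((msg = pull.recv(0)) != null) {   // drain what is already queued
                    process(msg);
                }
                // Timeout hit: local queue is empty; safe to dispose and restart.
            }
        }

        private static void process(byte[] msg) { /* application-specific */ }
    }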

ActiveMQ: what happens if a client terminates before acknowledging?

I have a persistent queue, non-transacted, client-acknowledge, the consumers read with jms.prefetchPolicy.queuePrefetch=1&wireFormat.maxInactivityDuration=50000
and once a consumer processes a message, it acks the message.
If the consumer reads the message and, before it can send an ack, the process terminates abruptly, what happens in ActiveMQ? (What ActiveMQ parameters come into play here?)
How is that different from the case where the consumer takes 10 minutes to process the message (so the consumer task is alive and working)? How does ActiveMQ know the message is still being worked on? (Does it monitor the TCP/IP connection, and if the connection dies, assume the message will not be acked?)
How do I determine whether a message is a "poison pill", i.e. one that makes the consumers crash? (The redelivery count seems to be valid only if the consumer task does not die; is there an internal counter in the message that says "it has been read n times without being successfully acked"?)
As an experiment, I sent 6 messages, one of them being a "poison pill" (it kills the consumer before the consumer can send the ack), with 2 simultaneous consumers running (and consumers automatically restarting to bring the count back to 2 whenever one dies). Looking at the queue (using jconsole; I enabled JMX with broker.setUseJmx(true)), 4 messages were delivered and 2 are in-flight. Why would there be 2 in-flight instead of just one?
I've been reading the ActiveMQ and JMS specs for a while without clear/conclusive answers, so any insights on what parameters come into play, and whether there are any known bugs, would be greatly appreciated.
This is purely based on my understanding of JMS - may not be completely correct:
If the consumer reads the message and, before it can send an ack, the process terminates abruptly, what happens in ActiveMQ?
My understanding is that since this happens in the context of a session with the JMS provider, the JMS provider knows if the session is no longer active or has failed, and any message not acknowledged as part of the session will be redelivered when the session is re-established.
How do I determine if a message is a "poison pill", i.e. it makes the consumers crash?
As you have mentioned, the JMS provider keeps track of the number of times the message was redelivered, possibly in the header of the message.
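For example, with ActiveMQ you can read the standard JMSXDeliveryCount property before processing (a sketch; the threshold is arbitrary):

    import javax.jms.JMSException;
    import javax.jms.Message;

    public class PoisonPillCheck {
        private static final int MAX_DELIVERIES = 5;

        // Returns true if the broker reports this message has already been
        // delivered more times than we are willing to retry.
        public static boolean isPoisonPill(Message msg) throws JMSException {
            return msg.getJMSRedelivered()
                    && msg.getIntProperty("JMSXDeliveryCount") > MAX_DELIVERIES;
        }
    }

ActiveMQ also has a RedeliveryPolicy: once maximumRedeliveries is exceeded, the broker routes the message to a dead-letter queue (ActiveMQ.DLQ by default), which is the usual way poison pills get quarantined.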
4 messages were delivered, 2 are in flight
Not sure about this point
