I have a simple C# application that sends a message via MSMQ to a remote server over HTTPS.
I have specified a TimeToReachQueue value for the message, and an administration queue so I can receive positive or negative acknowledgments regarding delivery of the message (I specified "FullReachQueue" as the AcknowledgeType.)
About 10 seconds after I send the message, I get a message in my local administration queue saying "The message reached the queue." However, after the TimeToReachQueue interval expires (be it 30 seconds or 5 minutes), I get a second message saying "The time-to-reach-queue has elapsed."
Every reference to the "TimeToReachQueue" interval I can find says you will only get a negative acknowledgment if the message didn't reach the queue within the specified interval.
When I check the remote server's queue, the message is actually in the destination queue, so how can I receive messages saying both that it did and didn't arrive?
Are there any outgoing queues on the remote server pointing back to the original sending machine that contain ack messages of any kind?
We have a PHP app that forwards messages from RabbitMQ to connected devices down a WebSocket connection (PHP AMQP pecl extension v1.7.1 & RabbitMQ 3.6.6).
Messages are consumed from an array of queues (1 per websocket connection), and are acknowledged by the consumer when we receive confirmation over the websocket that the message has been received (so we can requeue messages that are not delivered in an acceptable timeframe). This is done in a non-blocking fashion.
99% of the time, this works perfectly, but very occasionally we receive an error "RabbitMQ PRECONDITION_FAILED - unknown delivery tag ". This closes the channel. In my understanding, this exception is a result of one of the following conditions:
The message has already been acked or rejected.
An ack is attempted over a channel the message was not delivered on.
An ack is attempted after the message timeout (ttl) has expired.
We have implemented protections for each of the above cases, yet the problem continues.
I realise there are a number of implementation details that could impact this, but at a conceptual level, are there any other failure cases that we have not considered and should be handling? Or is there a better way of achieving the functionality described above?
"PRECONDITION_FAILED - unknown delivery tag" usually happens because of double ack-ing, ack-ing on wrong channels or ack-ing messages that should not be ack-ed.
So in some cases you are trying to execute basic.ack twice for the same message, or calling basic.ack on a different channel from the one the message was delivered on.
(Solution below)
Quoting Jan Grzegorowski from his blog:
If you are struggling with the 406 error message which is included in
the title of this post you may be interested in reading the whole story.
Problem
I was using amqplib for connecting a NodeJS based message processor with
a RabbitMQ broker. Everything seemed to be working fine, but from time to
time a 406 (PRECONDITION-FAILED) message showed up in the log:
"Error: Channel closed by server: 406 (PRECONDITION-FAILED) with message "PRECONDITION_FAILED - unknown delivery tag 1"
Solution
Keeping things simple:
You have to ACK messages in the same order as they arrive in your system
You can't ACK messages on a different channel from the one they arrived on
If you break any of these rules you will face the 406 (PRECONDITION-FAILED) error message.
Original answer
It can happen if you set the no-ack option of a consumer to true. In that mode the broker does not expect acknowledgements, so you shouldn't call the ack function manually:
https://www.rabbitmq.com/amqp-0-9-1-reference.html#basic.consume.no-ack
The solution: set no-ack flag to false.
If you acknowledge the same message twice, you can get this error.
A variation on the double-ack case described above:
there is a less obvious way to ack a message more than once. When you ack a message with the multiple parameter set to true, every outstanding message delivered before it on that channel is acknowledged as well.
If you then try to ack one of those earlier messages individually, you are acking it a second time, and that produces the error.
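The multiple-flag case above can be sketched with a toy model. This is not RabbitMQ code and uses no client library; `ToyChannel` and `UnknownDeliveryTag` are hypothetical names, and the tracking logic is a simplification of what a broker does per channel.

```python
class UnknownDeliveryTag(Exception):
    """Stands in for the broker's 406 PRECONDITION_FAILED channel error."""


class ToyChannel:
    """Simplified model of per-channel tracking of unacknowledged deliveries."""

    def __init__(self):
        self._next_tag = 1
        self._unacked = set()

    def deliver(self):
        """Simulate delivering one message and return its delivery tag."""
        tag = self._next_tag
        self._next_tag += 1
        self._unacked.add(tag)
        return tag

    def ack(self, tag, multiple=False):
        """Ack one tag, or every outstanding tag up to and including it."""
        if tag not in self._unacked:
            raise UnknownDeliveryTag(f"unknown delivery tag {tag}")
        if multiple:
            self._unacked = {t for t in self._unacked if t > tag}
        else:
            self._unacked.remove(tag)


channel = ToyChannel()
first, second, third = channel.deliver(), channel.deliver(), channel.deliver()

channel.ack(second, multiple=True)  # settles tags 1 and 2 in one call

error = None
try:
    channel.ack(first)  # tag 1 was already settled by the call above
except UnknownDeliveryTag as exc:
    error = str(exc)

channel.ack(third)  # tag 3 was never settled, so this still succeeds
```

In this model the second ack of tag 1 fails even though the code never called `ack(first)` before, which is the confusing part of the multiple flag.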
Make sure you have the correct application.properties:
If you use the RabbitTemplate without any channel configuration, use "simple":
spring.rabbitmq.listener.simple.acknowledge-mode=manual
In this case, if you use "direct" instead of "simple", you will get the same error message. The "direct" variant looks like this:
spring.rabbitmq.listener.direct.acknowledge-mode=manual
I see that a SENDER channel goes into RETRY mode once the LONGRTS count starts. It remains in RETRY mode and is restarted after LONGTMR(1200) seconds. My question is: does the SENDER channel come back to RUNNING as soon as a message arrives, without waiting for LONGTMR to complete, or does it wait for the full LONGTMR interval?
A SENDER channel will go into STATUS(RETRY) - a.k.a. Retry Mode - when the connection to its partner fails.
To begin with, on the assumption that many network failures are very short lived, a SENDER channel will try a small number of fairly close together attempts to re-make the network connection. It will try 10 times at 60 seconds apart, to re-make the connection. This is known as the "short retries".
These values, 10 attempts at 60 seconds apart, are set in the SENDER channel fields called SHORTRTY and SHORTTMR.
If after these first 10 attempts, the SENDER channel has still not managed to get reconnected to the network partner, it will now move to "long retries". It is now operating with the assumption that the network outage is a longer one, for example the partner queue manager machine is having maintenance applied, or there has been some other major outage, and not just a network blip.
The SENDER channel will now try what it hopes is an infinite number of slightly more spaced apart attempts to re-make the connection. It will try 999999999 times at 1200 seconds apart, to re-make the connection.
These values, 999999999 attempts at 1200 seconds apart, are set in the SENDER channel fields called LONGRTY and LONGTMR.
You can see how many attempts are left by using the DISPLAY CHSTATUS command and looking at the SHORTRTS and LONGRTS fields. These show how many of the 10 or 999999999 attempts are left. If you see SHORTRTS(0), then you know the SENDER is in "long retry mode".
If, on any of these attempts to re-make the connection, it is successful, it will stop retrying and you will see the SENDER channel show STATUS(RUNNING). Note that the success is due to the network connection having been successfully made, and is nothing to do with whether a message arrives or not.
It will not continue making retry attempts after it successfully connects to the partner (until the next time the connection is lost of course).
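As a rough illustration of the schedule described above, the following sketch lists when each attempt would occur using the quoted values (10 at 60 seconds, then 999999999 at 1200 seconds). The function names are hypothetical, and it assumes the first attempt happens one interval after the failure and that attempts themselves take no time.

```python
from itertools import islice


def retry_offsets(short_count=10, short_interval=60,
                  long_count=999_999_999, long_interval=1200):
    """Yield (phase, seconds since the failure) for each reconnection attempt."""
    elapsed = 0
    for _ in range(short_count):
        elapsed += short_interval
        yield ("short", elapsed)
    for _ in range(long_count):
        elapsed += long_interval
        yield ("long", elapsed)


def reconnect_time(restored_at, **schedule):
    """Return the first attempt made at or after the network is restored."""
    for phase, offset in retry_offsets(**schedule):
        if offset >= restored_at:
            return phase, offset
    return None  # retries exhausted before the network came back


attempts = list(islice(retry_offsets(), 12))
early = reconnect_time(90)    # network back during the short retries
late = reconnect_time(2000)   # network back during the long retries
```

Under these assumptions, a partner that becomes reachable 2000 seconds after the failure is picked up by the attempt at 3000 seconds. The channel reconnects on the next scheduled attempt, regardless of whether any message is waiting.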
If your channel is in STATUS(RETRY) you should look in the AMQERR01.LOG to discover the reason for the failure. It may be something you can fix at the SENDER end or it may be something that needs to be fixed at the RECEIVER end, for example restarting the queue manager or the listener.
If I acknowledge the same message twice using the Delivery.Ack method, my consumer channel just closes by itself.
Is this expected behaviour? Has anyone experienced this?
The reason I am acknowledging the same message twice is a special case where I have to break the original message into copies and process them on the consumer. Once the consumer has processed everything, it loops and acks everything. Since there are copies of the entity, it acks the same message twice and my consumer channel shuts down.
According to the AMQP reference, a channel exception is raised when a message gets acknowledged for the second time:
A message MUST not be acknowledged more than once. The receiving peer
MUST validate that a non-zero delivery-tag refers to a delivered
message, and raise a channel exception if this is not the case.
A second call to Ack(...) for the same message will not return an error, but the channel gets closed due to this exception received from the server:
Exception (406) Reason: "PRECONDITION_FAILED - unknown delivery tag ?"
It is possible to register a listener via Channel.NotifyClose to observe this exception.
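For the split-into-copies case in the question, one option is to deduplicate acknowledgements on the consumer side. The sketch below shows the idea in Python rather than the asker's client library; `AckOnce` is a hypothetical helper, and `sent.append` stands in for the real ack call.

```python
class AckOnce:
    """Wraps an ack callable so each delivery tag is acknowledged at most once."""

    def __init__(self, ack):
        self._ack = ack
        self._acked = set()

    def __call__(self, delivery_tag):
        """Ack the tag if it has not been acked yet; return whether it was sent."""
        if delivery_tag in self._acked:
            return False
        self._ack(delivery_tag)
        self._acked.add(delivery_tag)
        return True


sent = []                        # records the acks that reach the channel
ack_once = AckOnce(sent.append)  # sent.append is a stand-in for the real ack

copies = [7, 7, 7, 8]            # three copies of delivery 7, one of delivery 8
results = [ack_once(tag) for tag in copies]
```

Delivery tags are scoped to a channel, so a guard like this would need one instance per channel. The set also grows without bound in this sketch; a long-running consumer would need to prune it.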
I am using a Huawei E303 modem and I am not seeing new messages via AT commands. Huawei's application "Mobile Partner" is able to read these messages but I don't understand how.
I have set AT+CMGF=1 and when I try AT+CPMS? I get +CPMS:"SM",0,30,"SM",0,30,"SM",0,30
When a new message arrives I get a +CMTI: "SM", 0 every single time. That value never increments. AT+CMGL="ALL" returns OK but no messages.
I am now out of ideas. How can I read a message if counting them always returns zero?
What is amazing is that the Huawei application can read incoming messages and can send messages without any problems.
I think I tried to start a channel that was already running, or something similar. Whenever I start the sender channel, the receiver channel goes to a PAUSED state. I looked it up and found something about the AdoptNewMCA configuration, but I'm not sure how to set it at the queue manager level. How do I fix this cleanly? Merely stopping and restarting the channels does not do it.
Error log says:
/02/2012 12:38:41 PM - Process(19161.269) User(mqm) Program(amqrmppa)
Host() Installation(Installation1)
VRMF(7.1.0.0) QMgr(QM_TEST2)
AMQ9514: Channel 'QM_TEST1.TO.QM_TEST2' is in use.
EXPLANATION: The requested operation failed because channel
'QM_TEST1.TO.QM_TEST2' is currently active.
ACTION: Either end the channel
manually, or wait for it to close, and retry the operation.
----- amqrcsia.c : 1042 -------------------------------------------------------
08/02/2012 12:38:41 PM - Process(19161.269) User(mqm) Program(amqrmppa)
Host(...) Installation(Installation1)
VRMF(7.1.0.0) QMgr(QM_TEST2)
AMQ9999: Channel 'QM_TEST1.TO.QM_TEST2' to host '17.2.33.44' ended abnormally.
EXPLANATION: The channel program running under process ID 19161 for
channel 'QM_TEST1.TO.QM_TEST2' ended abnormally. The host name is
'17.2.33.44'; in some cases the host name cannot be
determined and so is shown as '????'.
ACTION: Look at previous error
messages for the channel program in the error logs to determine the
cause of the failure. Note that this message can be excluded
completely or suppressed by tuning the "ExcludeMessage" or
"SuppressMessage" attributes under the "QMErrorLog" stanza in qm.ini.
Further information can be found in the System Administration Guide.
----- amqrmrsa.c : 887 --------------------------------------------------------
When looking these things up, I'd start first with the product manuals. In this case, the Infocenter topic on channel states says that a channel in PAUSED state is waiting on a retry interval. The sub-topic on channel errors explains why sending or receiving channels can be in retry:
If a channel is unable to put a message to the target queue because
that queue is full or put inhibited, the channel can retry the
operation a number of times (specified in the message-retry count
attribute) at a time interval (specified in the message-retry interval
attribute). Alternatively, you can write your own message-retry exit
that determines which circumstances cause a retry, and the number of
attempts made. The channel goes to PAUSED state while waiting for the
message-retry interval to finish.
So if you stop your channels, you should see a message in the XMitQ on the sending side. If you GET-enable that queue you can browse the message, look at the header and see which queue it is destined for. On the receiving side, look to see if that queue is full.
This is the classic fast-sender, slow-consumer problem. If the consumer can't keep up, the messages back up on the receiving QMgr, then the channel goes into retry and they begin to back up on the sending QMgr. You need to monitor depth and input handles on request queues.
Make sure a DLQ is set.
Try reducing the message retry count to 1 to speed up use of the DLQ.