I'm seeing a bunch of messages on the dead letter queue and I don't understand what causes this.
I'm using MQ Explorer to browse such messages. Here's what I see in the Dead-Letter Header:
This doesn't tell me what the real cause of the problem is. How can I find out ?
I've read this article from IBM and it tells that the reason is likely a badly formatted message. In what way badly formatted?
(note: I'm in control of both producer and consumer)
you miss the forest for the trees :-)
The cause is in the field 'Reason' .
MQRC_BACKOUT_THRESHOLD_REACHED
This is described here in Knowledge Center
MQRC_BACKOUT_THRESHOLD_REACHED (0x93A; 2362)
Cause
The message has reached the Backout Threshold defined on the QLOCAL, but no Backout
Queue is defined. On platforms where you cannot define the Backout
Queue, the message has reached the JMS-defined backout threshold of
20.
Action
If this is not wanted, define the Backout Queue for the relevant QLOCAL. Also look for the cause of the multiple backouts.
As I expected the MQRC_BACKOUT_THRESHOLD_REACHED reason is really just a knock-on effect. To find the real reason you would need to look into the logs on either the consumer-side or the producer-side depending on what you see in the Put application name field in the screenshot from the dead-letter header (above).
I've now learned that MQ Classes for JMS produces log files in current dir named mqjms.log.x. By looking at this I can see the true reason for the problem:
July 19, 2016 4:48:33 PM CEST[Queue Service thread] com.ibm.msg.client.wmq.common.internal.messages.WMQReceiveMarshal
A received message could not be correctly parsed.
EXPLANATION:
The message with messageID = '414D512064657620202020202020202012048D5720064E04' and correlationID = '310000000000000000000000000000000000000000000000' could not be correctly parsed. The last successful data read from the message was at position '192' in buffer '0000: 5246 4820 0000 0002 0000 00c0 0000 0111 RFH ............
0010: 0000 04b8 4d51 5354 5220 2020 0000 0000 ....MQSTR ....
0020: 0000 04b8 0000 0020 3c6d 6364 3e3c 4d73 ....... <mcd><Ms
0030: 643e 6a6d 735f 7465 7874 3c2f 4d73 643e d>jms_text</Msd>
0040: 3c2f 6d63 643e 2020 0000 0074 3c6a 6d73 </mcd> ...t<jms
0050: 3e3c 4473 743e 7175 6575 653a 2f2f 2f6d ><Dst>queue:///m
0060: 7971 7565 7565 3c2f 4473 743e 3c54 6d73 yqueue</Dst><Tms
0070: 3e31 3436 3839 3339 3731 3338 3234 3c2f >1468939713824</
0080: 546d 733e 3c45 7870 3e31 3436 3839 3339 Tms><Exp>1468939
0090: 3734 3338 3234 3c2f 4578 703e 3c43 6964 743824</Exp><Cid
00a0: 3e49 443a 3331 3c2f 4369 643e 3c44 6c76 >ID:31</Cid><Dlv
00b0: 3e31 3c2f 446c 763e 3c2f 6a6d 733e 2020 >1</Dlv></jms>
00c0: 3c64 6174 614d 7367 2073 656e 7454 696d <mymessage .....
.................
' with exception '
Message : java.lang.Exception
Class : class java.lang.Exception
Stack : com.ibm.msg.client.wmq.internal.WMQConsumerShadow.getMsg(WMQConsumerShadow.java:1900)
: com.ibm.msg.client.wmq.internal.WMQSyncConsumerShadow.receiveInternal(WMQSyncConsumerShadow.java:231)
: com.ibm.msg.client.wmq.internal.WMQConsumerShadow.receive(WMQConsumerShadow.java:1471)
: com.ibm.msg.client.wmq.internal.WMQMessageConsumer.receive(WMQMessageConsumer.java:659)
: com.ibm.msg.client.jms.internal.JmsMessageConsumerImpl.receiveInboundMessage(JmsMessageConsumerImpl.java:1036)
: com.ibm.msg.client.jms.internal.JmsMessageConsumerImpl.receive(JmsMessageConsumerImpl.java:461)
: com.ibm.msg.client.jms.internal.JmsMessageConsumerImpl.receive(JmsMessageConsumerImpl.java:495)
: com.ibm.mq.jms.MQMessageConsumer.receive(MQMessageConsumer.java:209)
: org.mycorp.client.base.connection.QueueService.recvMessages(QueueService.java:129)
: org.mycorp.client.base.connection.QueueService.run(QueueService.java:92)
: java.lang.Thread.run(Thread.java:745)
' with MQMD 'version:2(0x2) report:0(0x0) msgType:8(0x8) expiry:300(0x12c) feedback:0(0x0) encoding:273(0x111) codedCharSetId:1208(0x4b8) format:'MQHRF2 ' priority:4(0x4) persistence:0(0x0) msgId:414D512064657620202020202020202012048D5720064E04 correlId:310000000000000000000000000000000000000000000000 backoutCount:0(0x0) replyToQ:' ' replyToQMgr:'dev ' userIdentifier:'peter ' accountingToken:160105150000008D3439C9CC13CC025B66F34BE903000000000000000000000B applIdentityData:' ' putApplType:28(0x1c) putApplName:'Carrefour Server ' putDate:'20160719' putTime:'14483382' applOriginData:' ' groupId:000000000000000000000000000000000000000000000000 msgSeqNumber:1(0x1) physicalMsgOffset:0(0x0) msgFlags:0(0x0) originalLength:-1(0xffffffff) '
EXPLANATION:
null
ACTION:
null
So there you have it. Somehow I've managed to create a JMS message which is invalid and without the producer-side noticing that there's a problem.
I'll need to figure that out but that'll be a topic for another post.
In short the answer to the question is: The backout is just a knock-on effect. The real reason is - as the doc from IBM says - a badly formatted message. The only method to figure that out is to look into whatever logs are dumped by either the message producer or (more likely) the message consumer. In my case - since no remote queues are involved - my message consumer is not an intermediary Queue Manager but my actual destination application. That's where I found the log.
Related
I know the same question title has been used but this is a different one.
I tried to integrate the implementation of spring boot keyvault into my own project by copying all the jars without using a maven pom.
I met the errors shown at the bottom of this post.
I can solve the SLF4j multiple binding error by removing an SLF4J error but java.lang.VerifyError remains there. This error is probably caused by jar conflicts but I am totally at a loss how to troubleshoot the issue.
The original application can run without issues even with all the newly added jars that are used for spring boot keyvault solution.
The Spring boot keyvalut solution also works if it is not integrated into my application.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/Users/****/Documents/japserpublic-master/japserpublic-master/TestJasperReports2/keyVaultLib/slf4j-simple-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/Users/****/Documents/japserpublic-master/japserpublic-master/TestJasperReports2/keyVaultLib/slf4j.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.SimpleLoggerFactory]
Exception in thread "main" java.lang.VerifyError: Stack map does not match the one at exception handler 77
Exception Details:
Location:
com/fasterxml/jackson/databind/deser/std/StdDeserializer._parseDate(Lcom/fasterxml/jackson/core/JsonParser;Lcom/fasterxml/jackson/databind/DeserializationContext;)Ljava/util/Date; #77: astore
Reason:
Type 'com/fasterxml/jackson/core/JsonParseException' (current frame, stack[0]) is not assignable to 'com/fasterxml/jackson/core/exc/StreamReadException' (stack map, stack[0])
Current Frame:
bci: #69
flags: { }
locals: { 'com/fasterxml/jackson/databind/deser/std/StdDeserializer', 'com/fasterxml/jackson/core/JsonParser', 'com/fasterxml/jackson/databind/DeserializationContext' }
stack: { 'com/fasterxml/jackson/core/JsonParseException' }
Stackmap Frame:
bci: #77
flags: { }
locals: { 'com/fasterxml/jackson/databind/deser/std/StdDeserializer', 'com/fasterxml/jackson/core/JsonParser', 'com/fasterxml/jackson/databind/DeserializationContext' }
stack: { 'com/fasterxml/jackson/core/exc/StreamReadException' }
Bytecode:
0x0000000: 2bb6 0035 aa00 0000 0000 0081 0000 0003
0x0000010: 0000 000b 0000 007a 0000 0081 0000 0081
0x0000020: 0000 0034 0000 0041 0000 0081 0000 0081
0x0000030: 0000 0081 0000 0071 2a2b b600 11b6 0012
0x0000040: 2cb6 006b b02b b600 4742 a700 223a 052c
0x0000050: 2ab4 0002 2bb6 006e 126f 03bd 0004 b600
0x0000060: 70c0 002d 3a06 1906 b600 4c42 bb00 7159
0x0000070: 21b7 0072 b02a 2cb6 0073 c000 71b0 2a2b
0x0000080: 2cb6 0074 b02c 2ab4 0002 2bb6 0025 c000
0x0000090: 71b0
Exception Handler Table:
bci [69, 74] => handler: 77
bci [69, 74] => handler: 77
Stackmap Table:
same_frame(#56)
same_frame(#69)
same_locals_1_stack_item_frame(#77,Object[#367])
append_frame(#108,Long)
chop_frame(#117,1)
same_frame(#126)
same_frame(#133)
at com.fasterxml.jackson.datatype.jsr310.JavaTimeModule.<init>(JavaTimeModule.java:118)
at com.azure.core.implementation.serializer.jackson.JacksonAdapter.initializeObjectMapper(JacksonAdapter.java:258)
at com.azure.core.implementation.serializer.jackson.JacksonAdapter.<init>(JacksonAdapter.java:74)
at com.azure.core.implementation.serializer.jackson.JacksonAdapter.createDefaultSerializerAdapter(JacksonAdapter.java:108)
at com.azure.identity.implementation.IdentityClient.<clinit>(IdentityClient.java:60)
at com.azure.identity.implementation.IdentityClientBuilder.build(IdentityClientBuilder.java:50)
at com.azure.identity.ManagedIdentityCredential.<init>(ManagedIdentityCredential.java:33)
at com.azure.identity.DefaultAzureCredential.<init>(DefaultAzureCredential.java:37)
at com.azure.identity.DefaultAzureCredentialBuilder.build(DefaultAzureCredentialBuilder.java:18)
at utilities.Test.main(Test.java:34)
Print your dependencies tree, and find out which lib related to the conflict jar and try to omit the old version. For maven: Maven dependency resolution (conflicted)
Adding the following to pom.xml fixed the issue for me
<dependencyManagement>
<dependencies>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.9.8</version>
</dependency>
</dependencies>
</dependencyManagement>
Only add jackson-databind maybe not work. I add all confict jackson jar to solve this exception.
I created a kafka cluster with 3 brokers and following details:
Created 3 topics, each one with replication factor=3 and partitions=2.
Created 2 producers each one writing to one of the topics.
Created a Streams application to process messages from 2 topics and write to the 3rd topic.
It was all running fine till now but I suddenly started getting the following warning when starting the Streams application:
[WARN ] 2018-06-08 21:16:49.188 [Stream3-4f7403ad-aba6-4d34-885d-60114fc9fcff-StreamThread-1] org.apache.kafka.clients.consumer.internals.Fetcher [Consumer clientId=Stream3-4f7403ad-aba6-4d34-885d-60114fc9fcff-StreamThread-1-restore-consumer, groupId=] Attempt to fetch offsets for partition Stream3-KSTREAM-OUTEROTHER-0000000005-store-changelog-0 failed due to: Disk error when trying to access log file on the disk.
Due to this warning, Streams application is not processing anything from the 2 topics.
I tried following things:
Stopped all brokers, deleted kafka-logs directory for each broker and restarted the brokers. It didn't solve the issue.
Stopped zookeeper and all brokers, deleted zookeeper logs as well as kafka-logs for each broker, restarted zookeeper and brokers and created the topics again. This too didn't solve the issue.
I am not able to find anything related to this error on official docs or web. Does anyone have an idea of why am I getting this error suddenly?
EDIT:
Out of 3 brokers, 2 brokers(broker-0 and broker-2) continously emit these logs:
Broker-0 logs:
[2018-06-09 02:03:08,750] INFO [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Retrying leaderEpoch request for partition initial11_topic-1 as the leader reported an error: NOT_LEADER_FOR_PARTITION (kafka.server.ReplicaFetcherThread)
[2018-06-09 02:03:08,750] INFO [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Retrying leaderEpoch request for partition initial12_topic-0 as the leader reported an error: NOT_LEADER_FOR_PARTITION (kafka.server.ReplicaFetcherThread)
Broker-2 logs:
[2018-06-09 02:04:46,889] INFO [ReplicaFetcher replicaId=2, leaderId=1, fetcherId=0] Retrying leaderEpoch request for partition initial11_topic-1 as the leader reported an error: NOT_LEADER_FOR_PARTITION (kafka.server.ReplicaFetcherThread)
[2018-06-09 02:04:46,889] INFO [ReplicaFetcher replicaId=2, leaderId=1, fetcherId=0] Retrying leaderEpoch request for partition initial12_topic-0 as the leader reported an error: NOT_LEADER_FOR_PARTITION (kafka.server.ReplicaFetcherThread)
Broker-1 shows following logs:
[2018-06-09 01:21:26,689] INFO [GroupMetadataManager brokerId=1] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2018-06-09 01:31:26,689] INFO [GroupMetadataManager brokerId=1] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2018-06-09 01:39:44,667] ERROR [KafkaApi-1] Number of alive brokers '0' does not meet the required replication factor '1' for the offsets topic (configured via 'offsets.topic.replication.factor'). This error can be ignored if the cluster is starting up and not all brokers are up yet. (kafka.server.KafkaApis)
[2018-06-09 01:41:26,689] INFO [GroupMetadataManager brokerId=1] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
I again stopped zookeeper and brokers, deleted their logs and restarted. As soon as I create the topics again, I start getting the above logs.
Topic details:
[zk: localhost:2181(CONNECTED) 3] get /brokers/topics/initial11_topic
{"version":1,"partitions":{"1":[1,0,2],"0":[0,2,1]}}
cZxid = 0x53
ctime = Sat Jun 09 01:25:42 EDT 2018
mZxid = 0x53
mtime = Sat Jun 09 01:25:42 EDT 2018
pZxid = 0x54
cversion = 1
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 52
numChildren = 1
[zk: localhost:2181(CONNECTED) 4] get /brokers/topics/initial12_topic
{"version":1,"partitions":{"1":[2,1,0],"0":[1,0,2]}}
cZxid = 0x61
ctime = Sat Jun 09 01:25:47 EDT 2018
mZxid = 0x61
mtime = Sat Jun 09 01:25:47 EDT 2018
pZxid = 0x62
cversion = 1
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 52
numChildren = 1
[zk: localhost:2181(CONNECTED) 5] get /brokers/topics/final11_topic
{"version":1,"partitions":{"1":[0,1,2],"0":[2,0,1]}}
cZxid = 0x48
ctime = Sat Jun 09 01:25:32 EDT 2018
mZxid = 0x48
mtime = Sat Jun 09 01:25:32 EDT 2018
pZxid = 0x4a
cversion = 1
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 52
numChildren = 1
Any clue?
I found out the issue. It was due to following incorrect config in server.properties of broker-1:
advertised.listeners=PLAINTEXT://10.23.152.109:9094
Mistakenly port for advertised.listeners got changed to same as port of advertised.listeners of broker-2.
Has anyone already found this issue at capturing traffic of snapchat?
Every https data deriving from other sites via application (of ios, android) are captured successfully by Fiddler but a few apps (appstore, snapchat) don't show anything just that request:
CONNECT app.snapchat.com:443 HTTP/1.1
Host: app.snapchat.com
User-Agent: Snapchat/10.8.1.0 (iPhone8,1; iOS 10.2.1; gzip)
Connection: keep-alive
Connection: keep-alive
A SSLv3-compatible ClientHello handshake was found. Fiddler extracted the parameters below.
Version: 3.3 (TLS/1.2)
Random: 59 23 9E E1 1C 23 49 F1 A1 21 6E 60 C5 94 AB E2 9F 09 10 C3 E0 C3 99 9B 78 9B 97 1F 74 69 5F 1C
"Time": 2089.12.12. 15:48:57
SessionID: empty
Extensions:
server_name app.snapchat.com
elliptic_curves secp256r1 [0x17], secp384r1 [0x18], secp521r1 [0x19]
ec_point_formats uncompressed [0x0]
signature_algs sha256_rsa, sha1_rsa, sha384_rsa, sha512_rsa, sha256_ecdsa, sha1_ecdsa, sha384_ecdsa, sha512_ecdsa
NextProtocolNego empty
ALPN http/1.1, http/1.0
status_request OCSP - Implicit Responder
SignedCertTimestamp (RFC6962) empty
extended_master_secret empty
Ciphers:
[00FF] TLS_EMPTY_RENEGOTIATION_INFO_SCSV
[C02C] TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
[C02B] TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
[C024] TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384
[C023] TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256
[C00A] TLS1_CK_ECDHE_ECDSA_WITH_AES_256_CBC_SHA
[C009] TLS1_CK_ECDHE_ECDSA_WITH_AES_128_CBC_SHA
[C030] TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
[C02F] TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
[C028] TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384
[C027] TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
[C014] TLS1_CK_ECDHE_RSA_WITH_AES_256_CBC_SHA
[C013] TLS1_CK_ECDHE_RSA_WITH_AES_128_CBC_SHA
Compression:
[00] NO_COMPRESSION
What would be done for the unimpeded working?
Extending this issue more what's the reason the fiddler (or other interceptors) isn't able to capture all https data but their Connect handshakes?
Probably this is due to certificate pinning. Snapchat is know to use this to prevent MITM attacks.
Take a look at this answer here:
https://stackoverflow.com/a/40543302/1353689
and the links related to it as well.
I have written a Websphere MQ listener using Spring boot (JMS). I have configured the backout queue at queue level with threshold as 0
As part of program I am throwing JMSException immediately I receive message so that my message goes to back out queue.
The problem that I am facing is that message is getting continuously re-delivered to listener
JMSMessage class: jms_text
JMSType: null
JMSDeliveryMode: 2
JMSExpiration: 0
JMSPriority: 0
JMSMessageID: ID:414d51204d5148554244313020202020583e8e2b26af9905
JMSTimestamp: 1484639118180
JMSCorrelationID: null
JMSDestination: null
JMSReplyTo: null
JMSRedelivered: true
JMSXAppID: WebSphere MQ Client for Java
JMSXDeliveryCount: 98
JMSXUserID: a450922
JMS_IBM_Character_Set: UTF-8
JMS_IBM_Encoding: 546
JMS_IBM_Format: MQSTR
JMS_IBM_MsgType: 8
JMS_IBM_PutApplType: 28
JMS_IBM_PutDate: 20170117
JMS_IBM_PutTime: 07451818
#JmsListener(destination = "${ibm.mq.incomingqueue}", containerFactory = "defaultJmsListenerContainerFactory")
public void onMessage(TextMessage message) throws JMSException {
System.out.println("Here" + message.toString());
throw new JMSException("reason");
}
BOTHRESH must be greater than or equal to 1. Setting BOTHRESH to 0 disables it.
Reference the How WebSphere Application Server handles poison messages by IBM's
Paul Titheridge.
When WebSphere MQ is the JMS provider
By default, queues created with WebSphere MQ have the Backout threshold property (known in WebSphere MQ terms as BOTHRESH) set to 0. Therefore, the default behaviour of WebSphere MQ is never to back out poison messages.
When I run apache bench I get results like:
Command: abs.exe -v 3 -n 10 -c 1 https://mysite
Connection Times (ms)
min mean[+/-sd] median max
Connect: 203 213 8.1 219 219
Processing: 78 177 88.1 172 359
Waiting: 78 169 84.6 156 344
Total: 281 389 86.7 391 564
I can't seem to find the definition of Connect, Processing and Waiting. What do those numbers mean?
By looking at the source code we find these timing points:
apr_time_t start, /* Start of connection */
connect, /* Connected, start writing */
endwrite, /* Request written */
beginread, /* First byte of input */
done; /* Connection closed */
And when request is done some timings are stored as:
s->starttime = c->start;
s->ctime = ap_max(0, c->connect - c->start);
s->time = ap_max(0, c->done - c->start);
s->waittime = ap_max(0, c->beginread - c->endwrite);
And the 'Processing time' is later calculated as
s->time - s->ctime;
So if we translate this to a timeline:
t1: Start of connection
t2: Connected, start writing
t3: Request written
t4: First byte of input
t5: Connection closed
Then the definitions would be:
Connect: t1-t2 Most typically the network latency
Processing: t2-t5 Time to receive full response after connection was opened
Waiting: t3-t4 Time-to-first-byte after the request was sent
Total time: t1-t5
From http://chestofbooks.com/computers/webservers/apache/Stas-Bekman/Practical-mod_perl/9-1-1-ApacheBench.html:
Connect and Waiting times
The amount of time it took to establish the connection and get the first bits of a response
Processing time
The server response time—i.e., the time it took for the server to process the request and send a reply
Total time
The sum of the Connect and Processing times
I equate this to:
Connect time: the amount of time it took for the socket to open
Processing time: first byte + transfer
Waiting: time till first byte
Total: Sum of Connect + Processing
Connect: Time it takes to connect to remote host
Processing: Total time minus time it takes to connect to remote host
Waiting: Response first byte receive minus last byte sent
Total: From before connect till after the connection is closed