Apache Camel + HTTPS REST API (POST)

I'm a newbie to Apache Camel, and lately I've been trying to make a POST request to an HTTPS REST API.
I have gone through many posts and the documentation, but I still couldn't get the gist of it.
Please find my code below:
from("timer:aTimer?period=20s")
    .process(ex -> ex.getIn().setBody(
        "{\n" +
        " \"userId\": 777,\n" +
        " \"title\": \"sample\",\n" +
        " \"body\": \"my body\"\n" +
        " }"
    ))
    .setHeader(Exchange.HTTP_METHOD, constant("POST"))
    .setHeader(Exchange.CONTENT_TYPE, constant("application/json"))
    .to("restlet:https://jsonplaceholder.typicode.com/posts")
    .log("${body}");
Whenever I run my application, I get the error below.
Started
INFO DefaultCamelContext - Apache Camel 2.20.1 (CamelContext: camel-1) is starting
INFO ManagedManagementStrategy - JMX is enabled
INFO DefaultTypeConverter - Type converters loaded (core: 192, classpath: 14)
INFO DefaultCamelContext - StreamCaching is not in use. If using streams then its recommended to enable stream caching. See more details at http://camel.apache.org/stream-caching.html
Mar 05, 2018 3:20:45 PM org.restlet.ext.httpclient.HttpClientHelper start
INFO: Starting the Apache HTTP client
INFO DefaultCamelContext - Route: route1 started and consuming from: timer://aTimer?period=20s
INFO DefaultCamelContext - Total 1 routes, of which 1 are started
INFO DefaultCamelContext - Apache Camel 2.20.1 (CamelContext: camel-1) started in 0.879 seconds
INFO DefaultCamelContext - Apache Camel 2.20.1 (CamelContext: camel-1) is shutting down
INFO DefaultShutdownStrategy - Starting to graceful shutdown 1 routes (timeout 300 seconds)
INFO DefaultShutdownStrategy - Waiting as there are still 1 inflight and pending exchanges to complete, timeout in 300 seconds. Inflights per route: [route1 = 1]
INFO DefaultShutdownStrategy - There are 1 inflight exchanges:
InflightExchange: [exchangeId=ID-ubuntu-Latitude-6430U-1520243444162-0-1, fromRouteId=route1, routeId=route1, nodeId=to1, elapsed=0, duration=3018]
INFO DefaultShutdownStrategy - Waiting as there are still 1 inflight and pending exchanges to complete, timeout in 299 seconds. Inflights per route: [route1 = 1]
INFO DefaultShutdownStrategy - There are 1 inflight exchanges:
InflightExchange: [exchangeId=ID-ubuntu-Latitude-6430U-1520243444162-0-1, fromRouteId=route1, routeId=route1, nodeId=to1, elapsed=0, duration=4020]
INFO DefaultShutdownStrategy - Waiting as there are still 1 inflight and pending exchanges to complete, timeout in 298 seconds. Inflights per route: [route1 = 1]
INFO DefaultShutdownStrategy - There are 1 inflight exchanges:
InflightExchange: [exchangeId=ID-ubuntu-Latitude-6430U-1520243444162-0-1, fromRouteId=route1, routeId=route1, nodeId=to1, elapsed=0, duration=5023]
Mar 05, 2018 3:20:51 PM org.restlet.ext.httpclient.internal.HttpMethodCall sendRequest
WARNING: An error occurred during the communication with the remote HTTP server.
javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?
at sun.security.ssl.InputRecord.handleUnknownRecord(InputRecord.java:710)
at sun.security.ssl.InputRecord.read(InputRecord.java:527)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397)
at org.apache.http.conn.ssl.SSLSocketFactory.createLayeredSocket(SSLSocketFactory.java:573)
at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:557)
at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:414)
at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180)
at org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:144)
at org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:134)
at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:610)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:445)
at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:835)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at org.restlet.ext.httpclient.internal.HttpMethodCall.sendRequest(HttpMethodCall.java:339)
at org.restlet.ext.httpclient.internal.HttpMethodCall.sendRequest(HttpMethodCall.java:363)
at org.restlet.engine.adapter.ClientAdapter.commit(ClientAdapter.java:81)
at org.restlet.engine.adapter.HttpClientHelper.handle(HttpClientHelper.java:119)
at org.restlet.Client.handle(Client.java:153)
at org.restlet.Restlet.handle(Restlet.java:342)
at org.restlet.Restlet.handle(Restlet.java:355)
at org.apache.camel.component.restlet.RestletProducer.process(RestletProducer.java:179)
at org.apache.camel.processor.SendProcessor.process(SendProcessor.java:148)
at org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:548)
at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:201)
at org.apache.camel.processor.Pipeline.process(Pipeline.java:138)
at org.apache.camel.processor.Pipeline.process(Pipeline.java:101)
at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:201)
at org.apache.camel.component.timer.TimerConsumer.sendTimerExchange(TimerConsumer.java:197)
at org.apache.camel.component.timer.TimerConsumer$1.run(TimerConsumer.java:79)
at java.util.TimerThread.mainLoop(Timer.java:555)
at java.util.TimerThread.run(Timer.java:505)
WARN TimerConsumer - Error processing exchange. Exchange[ID-ubuntu-Latitude-6430U-1520243444162-0-1]. Caused by: [org.apache.camel.component.restlet.RestletOperationException - Restlet operation failed invoking https://jsonplaceholder.typicode.com:80/443:posts with statusCode: 1001 /n responseBody:HTTPS/1.1 - Communication Error (1001) - The connector failed to complete the communication with the server]
org.apache.camel.component.restlet.RestletOperationException: Restlet operation failed invoking https://jsonplaceholder.typicode.com:80/443:posts with statusCode: 1001 /n responseBody:HTTPS/1.1 - Communication Error (1001) - The connector failed to complete the communication with the server
at org.apache.camel.component.restlet.RestletProducer.populateRestletProducerException(RestletProducer.java:304)
at org.apache.camel.component.restlet.RestletProducer$1.handle(RestletProducer.java:190)
at org.restlet.engine.adapter.ClientAdapter$1.handle(ClientAdapter.java:90)
at org.restlet.ext.httpclient.internal.HttpMethodCall.sendRequest(HttpMethodCall.java:371)
at org.restlet.engine.adapter.ClientAdapter.commit(ClientAdapter.java:81)
at org.restlet.engine.adapter.HttpClientHelper.handle(HttpClientHelper.java:119)
at org.restlet.Client.handle(Client.java:153)
at org.restlet.Restlet.handle(Restlet.java:342)
at org.restlet.Restlet.handle(Restlet.java:355)
at org.apache.camel.component.restlet.RestletProducer.process(RestletProducer.java:179)
at org.apache.camel.processor.SendProcessor.process(SendProcessor.java:148)
at org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:548)
at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:201)
at org.apache.camel.processor.Pipeline.process(Pipeline.java:138)
at org.apache.camel.processor.Pipeline.process(Pipeline.java:101)
at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:201)
at org.apache.camel.component.timer.TimerConsumer.sendTimerExchange(TimerConsumer.java:197)
at org.apache.camel.component.timer.TimerConsumer$1.run(TimerConsumer.java:79)
at java.util.TimerThread.mainLoop(Timer.java:555)
at java.util.TimerThread.run(Timer.java:505)
ERROR DefaultErrorHandler - Failed delivery for (MessageId: ID-ubuntu-Latitude-6430U-1520243444162-0-2 on ExchangeId: ID-ubuntu-Latitude-6430U-1520243444162-0-1). Exhausted after delivery attempt: 1 caught: org.apache.camel.component.restlet.RestletOperationException: Restlet operation failed invoking https://jsonplaceholder.typicode.com:80/443:posts with statusCode: 1001 /n responseBody:HTTPS/1.1 - Communication Error (1001) - The connector failed to complete the communication with the server
Message History
---------------------------------------------------------------------------------------------------------------------------------------
RouteId ProcessorId Processor Elapsed (ms)
[route1 ] [route1 ] [timer://aTimer?period=20s ] [ 5321]
[route1 ] [process1 ] [Processor#0x33ae3bf8 ] [ 4]
[route1 ] [setHeader1 ] [setHeader[CamelHttpMethod] ] [ 0]
[route1 ] [setHeader2 ] [setHeader[Content-Type] ] [ 0]
[route1 ] [to1 ] [restlet:https://jsonplaceholder.typicode.com/443:posts ] [ 5308]
Stacktrace
---------------------------------------------------------------------------------------------------------------------------------------
org.apache.camel.component.restlet.RestletOperationException: Restlet operation failed invoking https://jsonplaceholder.typicode.com:80/443:posts with statusCode: 1001 /n responseBody:HTTPS/1.1 - Communication Error (1001) - The connector failed to complete the communication with the server
at org.apache.camel.component.restlet.RestletProducer.populateRestletProducerException(RestletProducer.java:304)
at org.apache.camel.component.restlet.RestletProducer$1.handle(RestletProducer.java:190)
at org.restlet.engine.adapter.ClientAdapter$1.handle(ClientAdapter.java:90)
at org.restlet.ext.httpclient.internal.HttpMethodCall.sendRequest(HttpMethodCall.java:371)
at org.restlet.engine.adapter.ClientAdapter.commit(ClientAdapter.java:81)
at org.restlet.engine.adapter.HttpClientHelper.handle(HttpClientHelper.java:119)
at org.restlet.Client.handle(Client.java:153)
at org.restlet.Restlet.handle(Restlet.java:342)
at org.restlet.Restlet.handle(Restlet.java:355)
at org.apache.camel.component.restlet.RestletProducer.process(RestletProducer.java:179)
at org.apache.camel.processor.SendProcessor.process(SendProcessor.java:148)
at org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:548)
at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:201)
at org.apache.camel.processor.Pipeline.process(Pipeline.java:138)
at org.apache.camel.processor.Pipeline.process(Pipeline.java:101)
at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:201)
at org.apache.camel.component.timer.TimerConsumer.sendTimerExchange(TimerConsumer.java:197)
at org.apache.camel.component.timer.TimerConsumer$1.run(TimerConsumer.java:79)
at java.util.TimerThread.mainLoop(Timer.java:555)
at java.util.TimerThread.run(Timer.java:505)
Mar 05, 2018 3:20:52 PM org.restlet.ext.httpclient.HttpClientHelper stop
INFO: Stopping the HTTP client
INFO DefaultShutdownStrategy - Route: route1 shutdown complete, was consuming from: timer://aTimer?period=20s
INFO DefaultShutdownStrategy - Graceful shutdown of 1 routes completed in 3 seconds
INFO DefaultCamelContext - Apache Camel 2.20.1 (CamelContext: camel-1) uptime 7.927 seconds
INFO DefaultCamelContext - Apache Camel 2.20.1 (CamelContext: camel-1) is shutdown in 3.048 seconds
Please help. I've also tried the Apache HTTP4 component, but still no luck.
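One detail in the log above may be worth checking: the failing request is reported as `https://jsonplaceholder.typicode.com:80/443:posts`, that is, an `https` scheme paired with port 80. If the TLS handshake really is sent to the plaintext port (an assumption, not something the log proves), that would fit the `Unrecognized SSL message, plaintext connection?` message. Below is a minimal plain-Java sketch, independent of Camel, for checking which port a URI resolves to. `PortCheck` and `effectivePort` are hypothetical names used only for this illustration.

```java
import java.net.URI;

class PortCheck {
    // Returns the port written in the URI, or the conventional default
    // for the scheme when the URI has no explicit port (getPort() == -1).
    static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        String scheme = uri.getScheme();
        if ("https".equalsIgnoreCase(scheme)) {
            return 443;
        }
        if ("http".equalsIgnoreCase(scheme)) {
            return 80;
        }
        return -1; // unknown scheme: no default assumed
    }

    public static void main(String[] args) {
        // URI as written in the route: no explicit port.
        System.out.println(effectivePort(URI.create("https://jsonplaceholder.typicode.com/posts")));
        // Host and port as they appear in the exception message.
        System.out.println(effectivePort(URI.create("https://jsonplaceholder.typicode.com:80/posts")));
    }
}
```

If the two values differ for your endpoint, the endpoint URI is probably being interpreted differently from what was intended.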

ZAP API Scan failing with error Read timed out

I am able to run an API scan and generate a report when I run the command below from Windows:
docker run -v "$(pwd):/zap/wrk/:rw" -t owasp/zap2docker-weekly zap-api-scan.py -t http://10.170.170.170:1700 /account?field4=448808888888"&"field7=GENERIC01"&"field10=ABC076 -f openapi -r ZAP_Report.htm
Once I switch to running the same command:
docker run -v $(pwd):/zap/wrk/:rw -t owasp/zap2docker-weekly zap-api-scan.py -t http://10.170.170.170:1700/account?field4=448808888888"&"field7=GENERIC01"&"field10=DCF43 -f openapi -r ~/serverkeys/ZAP_REPORT.htm
from Debian, I get an error, and I'm not quite sure what I'm missing:
.....
[ZAP-ActiveScanner-1] WARN org.zaproxy.zap.extension.ascanrules.CommandInjectionScanRule - Command Injection vulnerability check failed for parameter [field10] and payload [';cat /etc/passwd;'] due to an I/O error
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method) ~[?:?]
at java.net.SocketInputStream.socketRead(SocketInputStream.java:115) ~[?:?]
at java.net.SocketInputStream.read(SocketInputStream.java:168) ~[?:?]
at java.net.SocketInputStream.read(SocketInputStream.java:140) ~[?:?]
at java.io.BufferedInputStream.fill(BufferedInputStream.java:252) ~[?:?]
at java.io.BufferedInputStream.read(BufferedInputStream.java:271) ~[?:?]
at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78) ~[commons-httpclient-3.1.jar:D-2021-10-25]
at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106) ~[commons-httpclient-3.1.jar:D-2021-10-25]
at org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1153) ~[zap-D-2021-10-25.jar:D-2021-10-25]
at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1413) ~[commons-httpclient-3.1.jar:D-2021-10-25]
at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:2138) ~[zap-D-2021-10-25.jar:D-2021-10-25]
at org.zaproxy.zap.ZapGetMethod.readResponse(ZapGetMethod.java:112) ~[zap-D-2021-10-25.jar:D-2021-10-25]
at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1162) ~[zap-D-2021-10-25.jar:D-2021-10-25]
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:470) ~[zap-D-2021-10-25.jar:D-2021-10-25]
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:207) ~[zap-D-2021-10-25.jar:D-2021-10-25]
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397) ~[commons-httpclient-3.1.jar:D-2021-10-25]
at org.parosproxy.paros.network.HttpSender.executeMethod(HttpSender.java:430) ~[zap-D-2021-10-25.jar:D-2021-10-25]
at org.parosproxy.paros.network.HttpSender.runMethod(HttpSender.java:672) ~[zap-D-2021-10-25.jar:D-2021-10-25]
at org.parosproxy.paros.network.HttpSender.send(HttpSender.java:627) ~[zap-D-2021-10-25.jar:D-2021-10-25]
at org.parosproxy.paros.network.HttpSender.sendAuthenticated(HttpSender.java:602) ~[zap-D-2021-10-25.jar:D-2021-10-25]
at org.parosproxy.paros.network.HttpSender.sendAuthenticated(HttpSender.java:585) ~[zap-D-2021-10-25.jar:D-2021-10-25]
at org.parosproxy.paros.network.HttpSender.sendAndReceive(HttpSender.java:490) ~[zap-D-2021-10-25.jar:D-2021-10-25]
at org.parosproxy.paros.core.scanner.AbstractPlugin.sendAndReceive(AbstractPlugin.java:315) ~[zap-D-2021-10-25.jar:D-2021-10-25]
at org.parosproxy.paros.core.scanner.AbstractPlugin.sendAndReceive(AbstractPlugin.java:246) ~[zap-D-2021-10-25.jar:D-2021-10-25]
at org.zaproxy.zap.extension.ascanrules.CommandInjectionScanRule.testCommandInjection(CommandInjectionScanRule.java:524) [ascanrules-release-42.zap:?]
at org.zaproxy.zap.extension.ascanrules.CommandInjectionScanRule.scan(CommandInjectionScanRule.java:431) [ascanrules-release-42.zap:?]
at org.parosproxy.paros.core.scanner.AbstractAppParamPlugin.scan(AbstractAppParamPlugin.java:201) [zap-D-2021-10-25.jar:D-2021-10-25]
at org.parosproxy.paros.core.scanner.AbstractAppParamPlugin.scan(AbstractAppParamPlugin.java:126) [zap-D-2021-10-25.jar:D-2021-10-25]
at org.parosproxy.paros.core.scanner.AbstractAppParamPlugin.scan(AbstractAppParamPlugin.java:87) [zap-D-2021-10-25.jar:D-2021-10-25]
at org.parosproxy.paros.core.scanner.AbstractPlugin.run(AbstractPlugin.java:333) [zap-D-2021-10-25.jar:D-2021-10-25]
at java.lang.Thread.run(Thread.java:829) [?:?]
493852 [Thread-6] INFO org.parosproxy.paros.core.scanner.HostProcess - completed host/plugin http://10.170.4.117:8002 | CommandInjectionScanRule in 421.201s with 84 message(s) sent and 0 alert(s) raised.
493853 [Thread-6] INFO org.parosproxy.paros.core.scanner.HostProcess - start host http://10.170.170.170:1700 | DirectoryBrowsingScanRule strength MEDIUM threshold MEDIUM
493988 [Thread-6] INFO org.parosproxy.paros.core.scanner.HostProcess - completed host/plugin http://10.170.170.170:1700 | DirectoryBrowsingScanRule in 0.136s with 2 message(s) sent and 0 alert(s) raised.
493988 [Thread-6] INFO org.parosproxy.paros.core.scanner.HostProcess - start host http://10.170.170.170:1700 | BufferOverflowScanRule strength MEDIUM threshold MEDIUM
494126 [Thread-6] INFO org.parosproxy.paros.core.scanner.HostProcess - completed host/plugin http://10.170.170.170:1700 | BufferOverflowScanRule in 0.137s with 3 message(s) sent and 0 alert(s) raised.
494126 [Thread-6] INFO org.parosproxy.paros.core.scanner.HostProcess - start host http://10.170.170.170:1700 | FormatStringScanRule strength MEDIUM threshold MEDIUM
494287 [Thread-6] INFO org.parosproxy.paros.core.scanner.HostProcess - completed host/plugin http://10.170.170.170:1700 | FormatStringScanRule in 0.161s with 9 message(s) sent and 0 alert(s) raised.
494287 [Thread-6] INFO org.parosproxy.paros.core.scanner.HostProcess - start host http://10.170.170.170:1700 | CrlfInjectionScanRule strength MEDIUM threshold MEDIUM
494560 [Thread-6] INFO org.parosproxy.paros.core.scanner.HostProcess - completed host/plugin http://10.170.170.170:1700 | CrlfInjectionScanRule in 0.273s with 21 message(s) sent and 0 alert(s) raised.
........
........
Is there any additional tracing I can do on the scan to find out why it's timing out?
It appears the scan is terminating before completing, and it's also pointing to /etc/passwd.
You are not necessarily missing anything.
ZAP typically makes loads of requests to the target. Some of those may time out, and that's all this warning is telling you. If you keep getting these, it might be an indication that your site has become unresponsive.
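To illustrate what a `Read timed out` warning means at the socket level, here is a minimal Java sketch (this is not ZAP code). A client connects to a local listener that never replies, and the read gives up once the configured timeout elapses. The class name `ReadTimeoutDemo` and the 200 ms timeout are made up for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class ReadTimeoutDemo {
    // Connects to a loopback listener that accepts the TCP connection
    // but never sends a byte, then reports whether the read timed out.
    static boolean readTimesOut(int timeoutMillis) {
        try (ServerSocket silentServer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(), silentServer.getLocalPort())) {
            client.setSoTimeout(timeoutMillis); // maximum time a read may block
            client.getInputStream().read();     // blocks: the server stays silent
            return false;                       // only reached if data or EOF arrived
        } catch (SocketTimeoutException e) {
            return true;                        // the read gave up after the timeout
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

The connection itself succeeds here; only the response is missing. That is the same situation a scanner is in when the target accepts a request but takes too long to answer it.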

Corda node crashed after Artemis MessagingClient failed, "Artemis MessagingClient failed. Shutting down."

The following error occurs while running two nodes and a notary with Corda OSS 4.3 (Amazon EFS is used for the Artemis service of each node and of the notary).
・nodeA
[INFO ] 2021-03-24T01:53:33,526Z [nioEventLoopGroup-2-1] engine.ConnectionStateMachine. - Transport Error TransportImpl [_connectionEndpoint=org.apache.qpid.proton.engine.impl.ConnectionImpl#d8755f, org.apache.qpid.proton.engine.impl.TransportImpl#720cb721] {localLegalName=O=nodeA, L=Local, C=JP, remoteLegalName=O=nodeB, L=Local, C=JP, serverMode=false}
[INFO ] 2021-03-24T01:53:33,526Z [nioEventLoopGroup-2-1] engine.ConnectionStateMachine. - Error: connection aborted {localLegalName=O=nodeA, L=Local, C=JP, remoteLegalName=O=nodeB, L=Local, C=JP, serverMode=false}
[INFO ] 2021-03-24T01:53:33,527Z [nioEventLoopGroup-2-1] netty.AMQPClient. - Disconnected from [NLBendpoint]:10005
[INFO ] 2021-03-24T01:53:33,527Z [nioEventLoopGroup-2-1] netty.AMQPChannelHandler. - Closed client connection 828af8c0 from [NLBendpoint]:10005 to /xx.xx.x.xx:40438 {allowedRemoteLegalNames=O=nodeB, L=Local, C=JP, localCert=O=nodeA, L=Local, C=JP, remoteAddress=[NLBendpoint]:10005, remoteCert=O=nodeB, L=Local, C=JP, serverMode=false}
[INFO ] 2021-03-24T01:53:33,527Z [nioEventLoopGroup-2-1] bridging.AMQPBridgeManager$AMQPBridge. - Bridge Disconnected {legalNames=O=nodeB, L=Local, C=JP, maxMessageSize=10485760, queueName=internal.peers.DLB29JcZp4kCP2aGGZKGkhw2X5RenndTjEK4xy48iT9643, targets=[NLBendpoint]:10005}
[WARN ] 2021-03-24T01:55:59,747Z [Thread-17936 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$5#2936f48a)] core.client. - AMQ212037: Connection failure has been detected: AMQ119014: Did not receive data from /xxx.0.0.1:53166 within the 60,000ms connection TTL. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
[WARN ] 2021-03-24T01:55:59,748Z [Thread-949 (ActiveMQ-client-global-threads)] core.client. - AMQ212037: Connection failure has been detected: AMQ119011: Did not receive data from server for org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnection#6eb6efd1[ID=e834052e, local= /127.0.0.1:53170, remote=localhost/127.0.0.1:10008] [code=CONNECTION_TIMEDOUT]
[WARN ] 2021-03-24T01:55:59,751Z [Thread-948 (ActiveMQ-client-global-threads)] core.client. - AMQ212037: Connection failure has been detected: AMQ119011: Did not receive data from server for org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnection#505dd5b8[ID=f1885302, local= /127.0.0.1:53166, remote=localhost/127.0.0.1:10008] [code=CONNECTION_TIMEDOUT]
[WARN ] 2021-03-24T01:55:59,751Z [Thread-950 (ActiveMQ-client-global-threads)] core.client. - AMQ212037: Connection failure has been detected: AMQ119011: Did not receive data from server for org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnection#57579387[ID=718e48b8, local= /127.0.0.1:53168, remote=localhost/127.0.0.1:10008] [code=CONNECTION_TIMEDOUT]
[WARN ] 2021-03-24T01:55:59,774Z [nioEventLoopGroup-2-1] netty.AMQPChannelHandler. - Closing channel due to nonrecoverable exception AMQ119014: Timed out after waiting 30,000 ms for response when sending packet 68 {allowedRemoteLegalNames=O=nodeB, L=Local, C=JP, localCert=O=nodeA, L=Local, C=JP, remoteAddress=[NLBendpoint]:10005, remoteCert=O=nodeB, L=Local, C=JP, serverMode=false}
[INFO ] 2021-03-24T01:55:59,775Z [nioEventLoopGroup-2-1] netty.AMQPClient. - Retry connect to [NLBendpoint]:10005
[ERROR] 2021-03-24T01:55:59,779Z [Thread-612] errorAndTerminate. - ArtemisMessagingClient failed. Shutting down.
・notary
[INFO ] 2021-03-24T01:53:34,850Z [nioEventLoopGroup-2-4] engine.ConnectionStateMachine. - Transport Error TransportImpl [_connectionEndpoint=org.apache.qpid.proton.engine.impl.ConnectionImpl#1a1be565, org.apache.qpid.proton.engine.impl.TransportImpl#1e6940e2] {localLegalName=O=Notary1, L=Local, C=JP, remoteLegalName=O=nodeA, L=Local, C=JP, serverMode=false}
[INFO ] 2021-03-24T01:53:34,850Z [nioEventLoopGroup-2-4] engine.ConnectionStateMachine. - Error: connection aborted {localLegalName=O=Notary1, L=Local, C=JP, remoteLegalName=O=nodeA, L=Local, C=JP, serverMode=false}
[INFO ] 2021-03-24T01:53:34,851Z [nioEventLoopGroup-2-4] netty.AMQPClient. - Disconnected from [NLBendpoint]:10008
[INFO ] 2021-03-24T01:53:34,851Z [nioEventLoopGroup-2-4] netty.AMQPChannelHandler. - Closed client connection 9da3b393 from [NLBendpoint]:10008 to /xx.xx.x.xx:33438 {allowedRemoteLegalNames=O=nodeA, L=Local, C=JP, localCert=O=Notary1, L=Local, C=JP, remoteAddress=[NLBendpoint]:10008, remoteCert=O=nodeA, L=Local, C=JP, serverMode=false}
[INFO ] 2021-03-24T01:53:34,851Z [nioEventLoopGroup-2-4] bridging.AMQPBridgeManager$AMQPBridge. - Bridge Disconnected {legalNames=O=nodeA, L=Local, C=JP, maxMessageSize=10485760, queueName=internal.peers.DLHVntq87Ai3vLSuQzG8BoKcc2napU6aU3NPVFwiF73322, targets=[NLBendpoint]:10008}
[INFO ] 2021-03-24T01:54:03,123Z [nioEventLoopGroup-2-3] netty.AMQPClient. - Retry connect to [NLBendpoint]:10005
[WARN ] 2021-03-24T01:54:17,939Z [nioEventLoopGroup-2-2] netty.AMQPChannelHandler. - SSL Handshake timed out {allowedRemoteLegalNames=O=nodeA, L=Local, C=JP, localCert=null, remoteAddress=[NLBendpoint]:10008, remoteCert=null, serverMode=false}
[ERROR] 2021-03-24T01:54:17,939Z [nioEventLoopGroup-2-2] netty.AMQPChannelHandler. - Handshake failure handshake timed out {allowedRemoteLegalNames=O=nodeA, L=Local, C=JP, localCert=null, remoteAddress=[NLBendpoint]:10008, remoteCert=null, serverMode=false}
[INFO ] 2021-03-24T01:56:11,385Z [nioEventLoopGroup-2-2] netty.AMQPClient. - Retry connect to [NLBendpoint]:10005
[INFO ] 2021-03-24T01:56:11,392Z [nioEventLoopGroup-2-3] netty.AMQPClient. - Failed to connect to [NLBendpoint]:10005
[INFO ] 2021-03-24T01:56:13,393Z [nioEventLoopGroup-2-4] netty.AMQPClient. - Retry connect to [NLBendpoint]:10005
[INFO ] 2021-03-24T01:56:13,398Z [nioEventLoopGroup-2-1] netty.AMQPClient. - Failed to connect to [NLBendpoint]:10005
After these logs were written, the nodeA process went down. (The notary process is still running.)
What could be the cause of this problem?
I suspect that the connection to the Artemis service was lost because of a problem connecting to Amazon EFS, since the following lines appear in the OS log:
Mar 24 10:55:51 [serverName] stunnel: LOG5[4]: Connection reset: 1105153036 byte(s) sent to TLS, 839120060 byte(s) sent to socket
Mar 24 10:55:54 [serverName] stunnel: LOG5[5]: Service [efs] accepted connection from xxx.x.x.x:38710
Mar 24 10:55:54 [serverName] stunnel: LOG5[5]: s_connect: connected xx.xx.x.xx:2049
Mar 24 10:55:54 [serverName] stunnel: LOG5[5]: Service [efs] connected remote server from xx.xx.x.xx:51468
Mar 24 10:55:55 [serverName] stunnel: LOG5[5]: Certificate accepted at depth=0: CN=*.efs.ap-northeast-1.amazonaws.com
Mar 24 10:55:55 [serverName] stunnel: LOG3[5]: transfer: s_poll_wait: TIMEOUTclose exceeded: closing
Mar 24 10:55:55 [serverName] stunnel: LOG5[5]: Connection closed: 0 byte(s) sent to TLS, 0 byte(s) sent to socket
Mar 24 10:55:55 [serverName] stunnel: LOG5[6]: Service [efs] accepted connection from xxx.x.x.x:38716
Mar 24 10:55:55 [serverName] stunnel: LOG5[6]: s_connect: connected xx.xx.x.xx:2049
Mar 24 10:55:55 [serverName] stunnel: LOG5[6]: Service [efs] connected remote server from xx.xx.x.xx:51474
I believe we talked about this on Slack, but yes: if you start a Corda node and it can't bind to the P2P port or p2pAddress, that could cause Artemis errors like the ones you're describing.
It might also be something strange going on in your network security group. Make sure you're able to get this working on your local machine and that the nodes can all ping / telnet each other on the ports you expect.
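If telnet is not installed on the hosts, the same reachability check can be sketched in plain Java. This is only an illustration of the check suggested above, not part of Corda. `PortProbe` and `canConnect` are hypothetical names, and the default host and port in `main` are placeholders (10005 is simply the port that appears in the logs).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortProbe {
    // Tries to open a TCP connection to host:port within the given timeout.
    // Returns false on refusal, timeout, or any other I/O failure.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder defaults; pass the real peer address and P2P port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 10005;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 3000));
    }
}
```

Run it from each node against every peer's P2P address. A `false` result from one direction only would point at a firewall or security group rule rather than at the node itself.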

Kerberos problem: GSSException: No valid credentials provided

My application sends data to Kafka, with Kerberos used for authentication. Everything works fine for around 20 days, and then I get the following exception:
2020-01-07 22:22:08.481 DEBUG 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.clients.NetworkClient : Initiating connection to node mkav2.dc.ex.com:9092 (id: 101 rack: null)
2020-01-07 22:22:08.481 DEBUG 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.common.security.authenticator.SaslClientAuthenticator : Set SASL client state to SEND_HANDSHAKE_REQUEST
2020-01-07 22:22:08.481 DEBUG 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.common.security.authenticator.SaslClientAuthenticator : Creating SaslClient: client=lpa/appX.dc.ex.com#DC.EX.COM;service=kafka;serviceHostname=mkav2.dc.ex.com;mechs=[GSSAPI]
2020-01-07 22:22:08.482 DEBUG 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.common.network.Selector : Created socket with SO_RCVBUF = 32768, SO_SNDBUF = 131072, SO_TIMEOUT = 0 to node 101
2020-01-07 22:22:08.482 DEBUG 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.common.security.authenticator.SaslClientAuthenticator : Set SASL client state to RECEIVE_HANDSHAKE_RESPONSE
2020-01-07 22:22:08.482 DEBUG 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.clients.NetworkClient : Completed connection to node 101. Fetching API versions.
2020-01-07 22:22:08.484 DEBUG 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.common.security.authenticator.SaslClientAuthenticator : Set SASL client state to INITIAL
2020-01-07 22:22:08.484 DEBUG 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.common.network.Selector : Connection with mkav2.dc.ex.com/172.10.15.44 disconnected
javax.security.sasl.SaslException: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]) occurred when evaluating SASL token received from the Kafka Broker. Kafka Client will go to AUTH_FAILED state.
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.createSaslToken(SaslClientAuthenticator.java:298)
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.sendSaslToken(SaslClientAuthenticator.java:215)
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.authenticate(SaslClientAuthenticator.java:183)
at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:76)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:376)
at org.apache.kafka.common.network.Selector.poll(Selector.java:326)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:433)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:224)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:162)
at java.lang.Thread.run(Thread.java:748)
Caused by: javax.security.sasl.SaslException: GSS initiate failed
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator$2.run(SaslClientAuthenticator.java:280)
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator$2.run(SaslClientAuthenticator.java:278)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.createSaslToken(SaslClientAuthenticator.java:278)
... 9 common frames omitted
Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
... 14 common frames omitted
2020-01-07 22:22:08.484 DEBUG 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.clients.NetworkClient : Node 101 disconnected.
2020-01-07 22:22:08.484 WARN 24987 --- [fka-producer-network-thread | producer-1] org.apache.kafka.clients.NetworkClient : Connection to node 101 terminated during authentication. This may indicate that authentication failed due to invalid credentials.
After restarting the application, everything works fine for another 20 days or so, and then I get the same exception again. These are the ticket properties in the krb5.conf file:
ticket_lifetime = 86400
renew_lifetime = 604800
Any ideas on why this could be happening?
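For reference, the two krb5.conf values above are given in seconds. The small Java sketch below only converts them into more readable units and does not diagnose the failure. `TicketLifetimes` and its methods are names made up for this example.

```java
import java.time.Duration;

class TicketLifetimes {
    // Converts a lifetime given in seconds to whole hours.
    static long toHours(long seconds) {
        return Duration.ofSeconds(seconds).toHours();
    }

    // Converts a lifetime given in seconds to whole days.
    static long toDays(long seconds) {
        return Duration.ofSeconds(seconds).toDays();
    }

    public static void main(String[] args) {
        System.out.println("ticket_lifetime = " + toHours(86400) + " hours");
        System.out.println("renew_lifetime  = " + toDays(604800) + " days");
    }
}
```

So a ticket is valid for 24 hours and can be renewed for up to 7 days. Both are shorter than the roughly 20 days after which the exception appears; whether that is related to the failure is not established here.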

Apache Atlas quickstart - kafka error

Env: no Kerberos, no Ranger, no HDFS. EC2 with SSL.
I get this error after running $ATLAS_HOME/bin/quick_start.py https://$componentPrivateDNSRecord:21443 with the correct username and password:
Creating sample types:
Created type [DB]
Created type [Table]
Created type [StorageDesc]
Created type [Column]
Created type [LoadProcess]
Created type [View]
Created type [JdbcAccess]
Created type [ETL]
Created type [Metric]
Created type [PII]
Created type [Fact]
Created type [Dimension]
Created type [Log Data]
Creating sample entities:
Exception in thread "main" com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155)
at com.sun.jersey.api.client.filter.HTTPBasicAuthFilter.handle(HTTPBasicAuthFilter.java:105)
at com.sun.jersey.api.client.Client.handle(Client.java:652)
at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
at com.sun.jersey.api.client.WebResource$Builder.method(WebResource.java:634)
at org.apache.atlas.AtlasBaseClient.callAPIWithResource(AtlasBaseClient.java:334)
at org.apache.atlas.AtlasBaseClient.callAPIWithResource(AtlasBaseClient.java:311)
at org.apache.atlas.AtlasBaseClient.callAPI(AtlasBaseClient.java:199)
at org.apache.atlas.AtlasClientV2.createEntity(AtlasClientV2.java:277)
at org.apache.atlas.examples.QuickStartV2.createInstance(QuickStartV2.java:339)
at org.apache.atlas.examples.QuickStartV2.createDatabase(QuickStartV2.java:362)
at org.apache.atlas.examples.QuickStartV2.createEntities(QuickStartV2.java:268)
at org.apache.atlas.examples.QuickStartV2.runQuickstart(QuickStartV2.java:150)
at org.apache.atlas.examples.QuickStartV2.main(QuickStartV2.java:132)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.read(InputRecord.java:503)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:940)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1587)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:347)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:253)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:153)
... 14 more
No sample data added to Apache Atlas Server.
Relevant code:
https://github.com/apache/incubator-atlas/blob/master/webapp/src/main/java/org/apache/atlas/examples/QuickStartV2.java
// This works
quickStartV2.createTypes();
// This fails with the read timeout above
quickStartV2.createEntities();
At first I thought Atlas-to-Kafka connectivity was the issue, but then I saw this:
[ec2-user@ip-10-160-187-181 logs]$ cat atlas_kafka_setup.log
2018-07-25 00:06:14,923 INFO - [main:] ~ Looking for atlas-application.properties in classpath (ApplicationProperties:78)
2018-07-25 00:06:14,926 INFO - [main:] ~ Loading atlas-application.properties from file:/home/ec2-user/atlas/distro/target/apache-atlas-1.0.0-SNAPSHOT-bin/apache-atlas-1.0.0-SNAPSHOT/conf/atlas-application.properties (ApplicationProperties:91)
2018-07-25 00:06:16,512 WARN - [main:] ~ Attempting to create topic ATLAS_HOOK (AtlasTopicCreator:72)
2018-07-25 00:06:17,004 WARN - [main:] ~ Created topic ATLAS_HOOK with partitions 1 and replicas 1 (AtlasTopicCreator:119)
2018-07-25 00:06:17,004 WARN - [main:] ~ Attempting to create topic ATLAS_ENTITIES (AtlasTopicCreator:72)
2018-07-25 00:06:17,024 WARN - [main:] ~ Created topic ATLAS_ENTITIES with partitions 1 and replicas 1 (AtlasTopicCreator:119)
2018-07-25 01:49:45,147 DEBUG - [main:] ~ Calling API [ GET : api/atlas/v2/types/typedefs ] (AtlasBaseClient:319)
2018-07-25 01:49:45,147 DEBUG - [main:] ~ Attempting to configure HTTPS connection using client configuration (SecureClientUtils$4:221)
2018-07-25 01:49:45,166 INFO - [main:] ~ Unable to configure HTTPS connection from configuration. Leveraging JDK properties. (SecureClientUtils$4:240)
2018-07-25 01:49:45,269 DEBUG - [main:] ~ API https://mydns:21443/api/atlas/v2/types/typedefs?name=Dimension returned status 200 (AtlasBaseClient:337)
2018-07-25 01:49:45,270 DEBUG - [main:] ~ Calling API [ GET : api/atlas/v2/types/typedefs ] (AtlasBaseClient:319)
2018-07-25 01:49:45,271 DEBUG - [main:] ~ Attempting to configure HTTPS connection using client configuration (SecureClientUtils$4:221)
2018-07-25 01:49:45,291 INFO - [main:] ~ Unable to configure HTTPS connection from configuration. Leveraging JDK properties. (SecureClientUtils$4:240)
2018-07-25 01:49:45,450 DEBUG - [main:] ~ API https://mydns:21443/api/atlas/v2/types/typedefs?name=Log+Data returned status 200 (AtlasBaseClient:337)
2018-07-25 01:49:45,455 DEBUG - [main:] ~ Calling API [ POST : api/atlas/v2/entity ] <== AtlasEntityWithExtInfo{entity=AtlasEntity{AtlasStruct{typeName='DB', attributes=[owner:John ETL, createTime:1532483385453, name:Sales, description:sales database, locationuri:hdfs://host:8000/apps/warehouse/sales]}guid='-6466195619848', status=null, createdBy='null', updatedBy='null', createTime=null, updateTime=null, version=0, relationshipAttributes=[], classifications=[], },AtlasEntityExtInfo{referredEntities={}}} (AtlasBaseClient:319)
2018-07-25 01:49:45,455 DEBUG - [main:] ~ Attempting to configure HTTPS connection using client configuration (SecureClientUtils$4:221)
2018-07-25 01:49:45,474 INFO - [main:] ~ Unable to configure HTTPS connection from configuration. Leveraging JDK properties. (SecureClientUtils$4:240)
2018-07-25 01:49:33,256 Audit: myuser/10.160.189.35-10.160.189.35 performed request POST https://mydns:21443/api/atlas/v2/types/typedefs (10.160.187.181) at time 2018-07-25T01:49Z
2018-07-25 01:49:45,445 Audit: myuser/10.160.189.35-10.160.189.35 performed request GET https://mydns:21443/api/atlas/v2/types/typedefs?name=Log+Data (10.160.187.181) at time 2018-07-25T01:49Z
2018-07-25 01:49:45,678 Audit: myuser/10.160.189.35-10.160.189.35 performed request POST https://mydns:21443/api/atlas/v2/entity (10.160.187.181) at time 2018-07-25T01:49Z
Both topics are listed by this command:
$KAFKA_HOME/bin/kafka-topics.sh --list --zookeeper localhost:2181
Atlas's application.log does contain the following, though, and I am not sure why:
2018-07-25 02:18:14,991 DEBUG - [NotificationHookConsumer thread-0:] ~ Give up sending metadata request since no node is available (NetworkClient$DefaultMetadataUpdater:625)
2018-07-25 02:18:15,018 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Initialize connection to node -1 for sending metadata request (NetworkClient$DefaultMetadataUpdater:644)
2018-07-25 02:18:15,018 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Initiating connection to node -1 at localhost:9027. (NetworkClient:496)
2018-07-25 02:18:15,018 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Connection with localhost/127.0.0.1 disconnected (Selector:345)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.kafka.common.network.PlaintextTransportLayer.finishConnect(PlaintextTransportLayer.java:51)
at org.apache.kafka.common.network.KafkaChannel.finishConnect(KafkaChannel.java:73)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:309)
at org.apache.kafka.common.network.Selector.poll(Selector.java:283)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:229)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:134)
at java.lang.Thread.run(Thread.java:748)
2018-07-25 02:18:15,018 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Node -1 disconnected. (NetworkClient:463)
2018-07-25 02:18:15,018 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Give up sending metadata request since no node is available (NetworkClient$DefaultMetadataUpdater:625)
2018-07-25 02:18:15,092 DEBUG - [NotificationHookConsumer thread-0:] ~ Initialize connection to node -1 for sending metadata request (NetworkClient$DefaultMetadataUpdater:644)
2018-07-25 02:18:15,092 DEBUG - [NotificationHookConsumer thread-0:] ~ Initiating connection to node -1 at localhost:9027. (NetworkClient:496)
2018-07-25 02:18:15,092 DEBUG - [NotificationHookConsumer thread-0:] ~ Connection with localhost/127.0.0.1 disconnected (Selector:345)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.kafka.common.network.PlaintextTransportLayer.finishConnect(PlaintextTransportLayer.java:51)
at org.apache.kafka.common.network.KafkaChannel.finishConnect(KafkaChannel.java:73)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:309)
at org.apache.kafka.common.network.Selector.poll(Selector.java:283)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:134)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:183)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:973)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:937)
at org.apache.atlas.kafka.AtlasKafkaConsumer.receive(AtlasKafkaConsumer.java:63)
at org.apache.atlas.kafka.AtlasKafkaConsumer.receive(AtlasKafkaConsumer.java:55)
at org.apache.atlas.notification.NotificationHookConsumer$HookConsumer.doWork(NotificationHookConsumer.java:305)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2018-07-25 02:18:15,092 DEBUG - [NotificationHookConsumer thread-0:] ~ Node -1 disconnected. (NetworkClient:463)
2018-07-25 02:18:15,092 DEBUG - [NotificationHookConsumer thread-0:] ~ Give up sending metadata request since no node is available (NetworkClient$DefaultMetadataUpdater:625)
2018-07-25 02:18:15,119 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Initialize connection to node -1 for sending metadata request (NetworkClient$DefaultMetadataUpdater:644)
2018-07-25 02:18:15,119 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Initiating connection to node -1 at localhost:9027. (NetworkClient:496)
2018-07-25 02:18:15,119 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Connection with localhost/127.0.0.1 disconnected (Selector:345)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.kafka.common.network.PlaintextTransportLayer.finishConnect(PlaintextTransportLayer.java:51)
at org.apache.kafka.common.network.KafkaChannel.finishConnect(KafkaChannel.java:73)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:309)
at org.apache.kafka.common.network.Selector.poll(Selector.java:283)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:229)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:134)
at java.lang.Thread.run(Thread.java:748)
2018-07-25 02:18:15,119 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Node -1 disconnected. (NetworkClient:463)
2018-07-25 02:18:15,119 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Give up sending metadata request since no node is available (NetworkClient$DefaultMetadataUpdater:625)
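This fixed it! Atlas was configured to reach Kafka at localhost:9027, where nothing was listening (hence the "Connection refused" errors above). Pointing atlas.kafka.bootstrap.servers at localhost:9092 resolved it:
sed -i 's/atlas.kafka.bootstrap.servers=localhost:9027/atlas.kafka.bootstrap.servers=localhost:9092/' $ATLAS_HOME/conf/atlas-application.properties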
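To catch this kind of mismatch earlier, a quick check along the following lines may help. It is only a sketch: the class BootstrapCheck and its methods are hypothetical helpers, not part of Atlas or Kafka. It reads the first host:port entry of atlas.kafka.bootstrap.servers from the properties text and tries a plain TCP connection to it.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.Properties;

public class BootstrapCheck {
    // Returns the first host:port entry of atlas.kafka.bootstrap.servers.
    static InetSocketAddress firstBootstrapServer(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String servers = props.getProperty("atlas.kafka.bootstrap.servers");
        if (servers == null) {
            throw new IllegalArgumentException("atlas.kafka.bootstrap.servers is not set");
        }
        String first = servers.split(",")[0].trim();
        int colon = first.lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected host:port but got " + first);
        }
        String host = first.substring(0, colon);
        int port = Integer.parseInt(first.substring(colon + 1));
        // Unresolved: no DNS lookup happens while parsing.
        return InetSocketAddress.createUnresolved(host, port);
    }

    // True if a TCP connection to host:port succeeds within the timeout.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The value that was in my atlas-application.properties before the fix.
        InetSocketAddress addr =
                firstBootstrapServer("atlas.kafka.bootstrap.servers=localhost:9027\n");
        boolean up = isReachable(addr.getHostString(), addr.getPort(), 2000);
        System.out.println(addr.getHostString() + ":" + addr.getPort() + " reachable: " + up);
    }
}
```

A successful TCP connection only shows that something is listening on that port, not that it is a Kafka broker, so treat a "reachable: true" result as a first sanity check.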

Spark on Yarn job failed with ExitCode:1 and stderr says "Can't find main class"

We tried to submit the simple SparkPi example to Spark on YARN. The batch file is as follows:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 3 --driver-memory 4g --executor-memory 1g --executor-cores 1 .\examples\target\spark-examples_2.10-1.4.0.jar 10
pause
Our HDFS and YARN work well. We are using Hadoop 2.7.0 and Spark 1.4.1. We have only one node, which acts as both NameNode and DataNode.
When we execute it, it fails, and the log shows the following:
2015-08-21 11:07:22,044 DEBUG [main] | ===============================================================================
2015-08-21 11:07:22,044 DEBUG [main] | Yarn AM launch context:
2015-08-21 11:07:22,044 DEBUG [main] | user class: org.apache.spark.examples.SparkPi
2015-08-21 11:07:22,044 DEBUG [main] | env:
2015-08-21 11:07:22,044 DEBUG [main] | CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__hadoop_conf__<CPS>{{PWD}}/__spark__.jar<CPS>%HADOOP_HOME%\etc\hadoop<CPS>%HADOOP_HOME%\share\hadoop\common\*<CPS>%HADOOP_HOME%\share\hadoop\common\lib\*<CPS>%HADOOP_HOME%\share\hadoop\mapreduce\*<CPS>%HADOOP_HOME%\share\hadoop\mapreduce\lib\*<CPS>%HADOOP_HOME%\share\hadoop\hdfs\*<CPS>%HADOOP_HOME%\share\hadoop\hdfs\lib\*<CPS>%HADOOP_HOME%\share\hadoop\yarn\*<CPS>%HADOOP_HOME%\share\hadoop\yarn\lib\*<CPS>%HADOOP_MAPRED_HOME%\share\hadoop\mapreduce\*<CPS>%HADOOP_MAPRED_HOME%\share\hadoop\mapreduce\lib\*
2015-08-21 11:07:22,060 DEBUG [main] | SPARK_YARN_CACHE_FILES_FILE_SIZES -> 165181064,1420218
2015-08-21 11:07:22,060 DEBUG [main] | SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1440062075415_0026
2015-08-21 11:07:22,060 DEBUG [main] | SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE,PRIVATE
2015-08-21 11:07:22,060 DEBUG [main] | SPARK_USER -> msrabi
2015-08-21 11:07:22,060 DEBUG [main] | SPARK_YARN_MODE -> true
2015-08-21 11:07:22,060 DEBUG [main] | SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1440126441200,1440126441575
2015-08-21 11:07:22,060 DEBUG [main] | SPARK_YARN_CACHE_FILES -> hdfs://msra-sa-44:9000/user/msrabi/.sparkStaging/application_1440062075415_0026/spark-assembly-1.4.0-hadoop2.7.0.jar#__spark__.jar,hdfs://msra-sa-44:9000/user/msrabi/.sparkStaging/application_1440062075415_0026/spark-examples_2.10-1.4.0.jar#__app__.jar
2015-08-21 11:07:22,060 DEBUG [main] | resources:
2015-08-21 11:07:22,060 DEBUG [main] | __app__.jar -> resource { scheme: "hdfs" host: "msra-sa-44" port: 9000 file: "/user/msrabi/.sparkStaging/application_1440062075415_0026/spark-examples_2.10-1.4.0.jar" } size: 1420218 timestamp: 1440126441575 type: FILE visibility: PRIVATE
2015-08-21 11:07:22,060 DEBUG [main] | __spark__.jar -> resource { scheme: "hdfs" host: "msra-sa-44" port: 9000 file: "/user/msrabi/.sparkStaging/application_1440062075415_0026/spark-assembly-1.4.0-hadoop2.7.0.jar" } size: 165181064 timestamp: 1440126441200 type: FILE visibility: PRIVATE
2015-08-21 11:07:22,060 DEBUG [main] | __hadoop_conf__ -> resource { scheme: "hdfs" host: "msra-sa-44" port: 9000 file: "/user/msrabi/.sparkStaging/application_1440062075415_0026/__hadoop_conf__7908628615251032149.zip" } size: 82888 timestamp: 1440126441794 type: ARCHIVE visibility: PRIVATE
2015-08-21 11:07:22,060 DEBUG [main] | command:
2015-08-21 11:07:22,075 DEBUG [main] | {{JAVA_HOME}}/bin/java -server -Xmx4096m -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.app.name=org.apache.spark.examples.SparkPi' '-Dspark.executor.memory=1g' '-Dspark.driver.memory=4g' '-Dspark.master=yarn-cluster' -Dspark.yarn.app.container.log.dir=<LOG_DIR> org.apache.spark.deploy.yarn.ApplicationMaster --class 'org.apache.spark.examples.SparkPi' --jar file:/D:/sp/./examples/target/spark-examples_2.10-1.4.0.jar --arg '10' --executor-memory 1024m --executor-cores 1 --num-executors 3 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr
2015-08-21 11:07:22,075 DEBUG [main] | ===============================================================================
...........(omitting some lines)......
2015-08-21 11:07:23,231 INFO [main] | Application report for application_1440062075415_0026 (state: ACCEPTED)
2015-08-21 11:07:23,247 DEBUG [main] |
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1440126442169
final status: UNDEFINED
tracking URL: http://msra-sa-44:8088/proxy/application_1440062075415_0026/
user: msrabi
2015-08-21 11:07:24,263 TRACE [main] | 1: Call -> MSRA-SA-44/10.190.173.181:8032: getApplicationReport {application_id { id: 26 cluster_timestamp: 1440062075415 }}
2015-08-21 11:07:24,263 DEBUG [IPC Parameter Sending Thread #0] | IPC Client (443384617) connection to MSRA-SA-44/10.190.173.181:8032 from msrabi sending #37
2015-08-21 11:07:24,263 DEBUG [IPC Client (443384617) connection to MSRA-SA-44/10.190.173.181:8032 from msrabi] | IPC Client (443384617) connection to MSRA-SA-44/10.190.173.181:8032 from msrabi got value #37
2015-08-21 11:07:24,263 DEBUG [main] | Call: getApplicationReport took 0ms
2015-08-21 11:07:24,263 TRACE [main] | 1: Response <- MSRA-SA-44/10.190.173.181:8032: getApplicationReport {application_report { applicationId { id: 26 cluster_timestamp: 1440062075415 } user: "msrabi" queue: "default" name: "org.apache.spark.examples.SparkPi" host: "N/A" rpc_port: -1 yarn_application_state: ACCEPTED trackingUrl: "http://msra-sa-44:8088/proxy/application_1440062075415_0026/" diagnostics: "" startTime: 1440126442169 finishTime: 0 final_application_status: APP_UNDEFINED app_resource_Usage { num_used_containers: 1 num_reserved_containers: 0 used_resources { memory: 4608 virtual_cores: 1 } reserved_resources { memory: 0 virtual_cores: 0 } needed_resources { memory: 4608 virtual_cores: 1 } memory_seconds: 0 vcore_seconds: 0 } originalTrackingUrl: "N/A" currentApplicationAttemptId { application_id { id: 26 cluster_timestamp: 1440062075415 } attemptId: 1 } progress: 0.0 applicationType: "SPARK" }}
2015-08-21 11:07:24,263 INFO [main] | Application report for application_1440062075415_0026 (state: ACCEPTED)
.......(omitting some lines where the state are all ACCEPTED and final status are all UNDEFINED).....
2015-08-21 11:07:30,359 INFO [main] | Application report for application_1440062075415_0026 (state: FAILED)
2015-08-21 11:07:30,359 DEBUG [main] |
client token: N/A
diagnostics: Application application_1440062075415_0026 failed 2 times due to AM Container for appattempt_1440062075415_0026_000002 exited with exitCode: 1
For more detailed output, check application tracking page:http://msra-sa-44:8088/cluster/app/application_1440062075415_0026Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1440062075415_0026_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Shell output: 1 file(s) moved.
We then opened stderr, which says:
Error: Could not find or load main class 'Dspark.app.name=org.apache.spark.examples.SparkPi'
This is strange: the string should be a parameter passed to java, yet java seems to have treated it as the main class. We expected to see a main class parameter in the command section of the log, but could not find one.
How can that happen? What should we do to find out what is wrong?
Thank you!
We solved this problem.
The root cause is that, when generating the java command line, Spark wraps the parameters in single quotes ('-Dxxxx'). Single quotes work as quoting characters only on Linux. On Windows, parameters are either left unwrapped or wrapped in double quotes ("-Dxxxx"), so the single-quoted token reaches java with its quotes intact, no longer begins with a dash, and is taken for the main class name. The only way we found to solve this was to edit the Spark source code and recompile it.
This appears to be a known Spark issue (https://issues.apache.org/jira/browse/SPARK-5754).
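To illustrate the idea behind the fix, here is a minimal sketch of platform-dependent quoting. This is not Spark's actual code: the class ArgQuoting and its quote method are hypothetical, and the escaping of embedded quotes is deliberately simplified.

```java
public class ArgQuoting {
    // Quotes one command-line argument for the shell that will launch java.
    static String quote(String arg, boolean windows) {
        if (windows) {
            // Single quotes are not quoting characters on Windows, so use
            // double quotes (simplified escaping of embedded double quotes).
            return "\"" + arg.replace("\"", "\\\"") + "\"";
        }
        // POSIX shells: wrap in single quotes, closing and reopening the
        // quoted string around any embedded single quote.
        return "'" + arg.replace("'", "'\\''") + "'";
    }

    // Assumption: the JVM running this code is on the same OS as the
    // container that will execute the command (true for our single node).
    static boolean isWindows() {
        return System.getProperty("os.name").toLowerCase().startsWith("windows");
    }

    public static void main(String[] args) {
        String opt = "-Dspark.app.name=org.apache.spark.examples.SparkPi";
        System.out.println("Linux:   " + quote(opt, false));
        System.out.println("Windows: " + quote(opt, true));
        System.out.println("Here:    " + quote(opt, isWindows()));
    }
}
```

With the Linux form, the shell strips the single quotes before java sees the argument. With the Windows form, the double quotes are stripped instead, so the option still starts with a dash when it reaches java.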

Resources