Spark 2.1 + Yarn application has already ended

We are running Spark 2.1 on our Ambari cluster.
The Ambari-managed Spark Thrift server is not stable and keeps restarting.
From the log we can see the following:
ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
We found the following link, which describes a solution to this problem:
https://markobigdata.com/2016/08/11/yarn-application-has-already-ended-it-might-have-been-killed-or-unable-to-launch-application-master/
But after we set the parameters as described in the article, the problem still exists.
Please advise: what is the solution for this?
Full log:
tail -f spark-hive-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-master01.sys873dns.com.out
Spark Command: /usr/jdk64/jdk1.8.0_112/bin/java -Dhdp.version=2.6.0.3-8 -cp /usr/hdp/current/spark2-thriftserver/conf/:/usr/hdp/current/spark2-thriftserver/jars/*:/usr/hdp/current/hadoop-client/conf/ -Xmx10000m org.apache.spark.deploy.SparkSubmit --conf spark.driver.memory=50g --properties-file /usr/hdp/current/spark2-thriftserver/conf/spark-thrift-sparkconf.conf --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 --name Thrift JDBC/ODBC Server --executor-cores 7 spark-internal
========================================
Warning: Master yarn-client is deprecated since 2.0. Please use master "yarn" with specified deploy mode instead.
18/02/08 09:38:07 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2320)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:47)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:81)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:745)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
18/02/08 09:38:07 WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
18/02/08 09:38:07 ERROR Utils: Uncaught exception in thread main
java.lang.NullPointerException
Here are the YARN logs as well:
grep -i erro yarn-yarn-resourcemanager-master01.sys873dns.com.log
018-02-08 11:19:00,993 INFO zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server master01.sys873dns.com/23.1.29.61:2181. Will not attempt to authenticate using SASL (unknown error)
2018-02-08 11:19:15,767 ERROR resourcemanager.ResourceManager (LogAdapter.java:error(69)) - RECEIVED SIGNAL 15: SIGTERM
2018-02-08 11:19:27,281 INFO zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server master01.sys873dns.com/23.1.29.61:2181. Will not attempt to authenticate using SASL (unknown error)
2018-02-08 11:29:00,064 INFO zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server master01.sys873dns.com/23.1.29.61:2181. Will not attempt to authenticate using SASL (unknown error)
2018-02-08 11:29:01,839 INFO zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server master01.sys873dns.com/23.1.29.61:2181. Will not attempt to authenticate using SASL (unknown error)
2018-02-08 11:29:15,725 ERROR resourcemanager.ResourceManager (LogAdapter.java:error(69)) - RECEIVED SIGNAL 15: SIGTERM
2018-02-08 11:29:27,033 INFO zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server master03.sys873dns.com/23.1.29.63:2181. Will not attempt to authenticate using SASL (unknown error)
ons.YarnException: Unauthorized request to start container.
2018-02-08 12:56:11,144 INFO amlauncher.AMLauncher (AMLauncher.java:run(273)) - Error launching appattempt_1518089370033_0028_000008. Got exception: org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
2018-02-08 12:59:39,822 INFO amlauncher.AMLauncher (AMLauncher.java:run(273)) - Error launching appattempt_1518089370033_0029_000002. Got exception: org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
2018-02-08 13:00:01,671 INFO amlauncher.AMLauncher (AMLauncher.java:run(273)) - Error launching appattempt_1518089370033_0029_000004. Got exception: org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
2018-02-08 13:00:18,062 INFO amlauncher.AMLauncher (AMLauncher.java:run(273)) - Error launching appattempt_1518089370033_0029_000006. Got exception: org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
2018-02-08 13:00:20,245 INFO amlauncher.AMLauncher (AMLauncher.java:run(273)) - Error launching appattempt_1518089370033_0030_000003. Got exception: org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
2018-02-08 13:00:42,100 INFO amlauncher.AMLauncher (AMLauncher.java:run(273)) - Error launching appattempt_1518089370033_0030_000006. Got exception: org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
2018-02-08 13:00:56,310 INFO amlauncher.AMLauncher (AMLauncher.java:run(273)) - Error launching appattempt_1518089370033_0030_000008. Got exception: org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
2018-02-08 13:00:58,511 INFO amlauncher.AMLauncher (AMLauncher.java:run(273)) - Error launching appattempt_1518089370033_0030_000010. Got exception: org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
2018-02-08 13:00:58,537 INFO rmapp.RMAppImpl (RMAppImpl.java:transition(1063)) - Application application_1518089370033_0030 failed 10 times due to Error launching appattempt_1518089370033_0030_000010. Got exception: org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
Last log:
2018-02-08 14:14:54,410 INFO rmapp.RMAppImpl (RMAppImpl.java:handle(778)) - application_1518089370033_0050 State change from FINAL_SAVING to FAILED
2018-02-08 14:14:54,410 INFO capacity.ParentQueue (ParentQueue.java:removeApplication(385)) - Application removed - appId: application_1518089370033_0050 user: hive leaf-queue of parent: root #applications: 1
2018-02-08 14:14:54,412 INFO integration.RMRegistryOperationsService (RMRegistryOperationsService.java:onApplicationCompleted(119)) - Application application_1518089370033_0050 completed, purging application-level records
2018-02-08 14:14:54,412 INFO integration.RMRegistryOperationsService (RMRegistryOperationsService.java:purgeRecordsAsync(198)) - records under / with ID application_1518089370033_0050 and policy application: {}
2018-02-08 14:14:55,393 INFO rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(422)) - container_e09_1518089370033_0049_10_000001 Container Transitioned from RUNNING to COMPLETED
2018-02-08 14:14:55,393 INFO scheduler.SchedulerNode (SchedulerNode.java:releaseContainer(220)) - Released container container_e09_1518089370033_0049_10_000001 of capacity <memory:10240, vCores:1> on host worker02.sys768.com:45454, which currently has 0 containers, <memory:0, vCores:0> used and <memory:30720, vCores:6> available, release resources=true
2018-02-08 14:14:55,393 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:rememberTargetTransitionsAndStoreState(1209)) - Updating application attempt appattempt_1518089370033_0049_000010 with final state: FAILED, and exit status: -1000
2018-02-08 14:14:55,398 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(809)) - appattempt_1518089370033_0049_000010 State change from LAUNCHED to FINAL_SAVING
2018-02-08 14:14:55,399 INFO integration.RMRegistryOperationsService (RMRegistryOperationsService.java:onContainerFinished(144)) - Container container_e09_1518089370033_0049_10_000001 finished, purging container-level records
2018-02-08 14:14:55,400 INFO integration.RMRegistryOperationsService (RMRegistryOperationsService.java:purgeRecordsAsync(198)) - records under / with ID container_e09_1518089370033_0049_10_000001 and policy container: {}
2018-02-08 14:14:55,408 INFO resourcemanager.ApplicationMasterService (ApplicationMasterService.java:unregisterAttempt(685)) - Unregistering app attempt : appattempt_1518089370033_0049_000010
2018-02-08 14:14:55,408 INFO security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:applicationMasterFinished(124)) - Application finished, removing password for appattempt_1518089370033_0049_000010
2018-02-08 14:14:55,408 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(809)) - appattempt_1518089370033_0049_000010 State change from FINAL_SAVING to FAILED
2018-02-08 14:14:55,408 INFO rmapp.RMAppImpl (RMAppImpl.java:transition(1330)) - The number of failed attempts is 10. The max attempts is 10
2018-02-08 14:14:55,409 INFO rmapp.RMAppImpl (RMAppImpl.java:rememberTargetTransitionsAndStoreState(1123)) - Updating application application_1518089370033_0049 with final state: FAILED
2018-02-08 14:14:55,409 INFO rmapp.RMAppImpl (RMAppImpl.java:handle(778)) - application_1518089370033_0049 State change from ACCEPTED to FINAL_SAVING
2018-02-08 14:14:55,409 INFO recovery.RMStateStore (RMStateStore.java:transition(228)) - Updating info for app: application_1518089370033_0049
2018-02-08 14:14:55,409 INFO capacity.CapacityScheduler (CapacityScheduler.java:doneApplicationAttempt(811)) - Application Attempt appattempt_1518089370033_0049_000010 is done. finalState=FAILED
2018-02-08 14:14:55,409 INFO scheduler.AppSchedulingInfo (AppSchedulingInfo.java:clearRequests(124)) - Application application_1518089370033_0049 requests cleared
2018-02-08 14:14:55,410 INFO capacity.LeafQueue (LeafQueue.java:removeApplicationAttempt(795)) - Application removed - appId: application_1518089370033_0049 user: hive queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
2018-02-08 14:14:55,417 INFO rmapp.RMAppImpl (RMAppImpl.java:transition(1063)) - Application application_1518089370033_0049 failed 10 times due to AM Container for appattempt_1518089370033_0049_000010 exited with exitCode: -1000
For more detailed output, check the application tracking page: http://master02.sys768.com:8088/cluster/app/application_1518089370033_0049 Then click on links to logs of each attempt.
Diagnostics: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1212891131-25.1.53.61-1518077044052:blk_1073741833_1009 file=/hdp/apps/2.6.0.3-8/spark2/spark2-hdp-yarn-archive.tar.gz
Failing this attempt. Failing the application.
2018-02-08 14:14:55,418 INFO rmapp.RMAppImpl (RMAppImpl.java:handle(778)) - application_1518089370033_0049 State change from FINAL_SAVING to FAILED
2018-02-08 14:14:55,418 INFO capacity.ParentQueue (ParentQueue.java:removeApplication(385)) - Application removed - appId: application_1518089370033_0049 user: hive leaf-queue of parent: root #applications: 0
2018-02-08 14:14:55,419 INFO integration.RMRegistryOperationsService (RMRegistryOperationsService.java:onApplicationCompleted(119)) - Application application_1518089370033_0049 completed, purging application-level records
2018-02-08 14:14:55,419 INFO integration.RMRegistryOperationsService (RMRegistryOperationsService.java:purgeRecordsAsync(198)) - records under / with ID application_1518089370033_0049 and policy application: {}
[root@master02 yarn]#
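For anyone triaging a ResourceManager log like the one above, it can help to tally the distinct failure reasons rather than read the grep output line by line. The following is only a rough sketch: the function name count_failure_reasons and the regular expression are assumptions made for illustration, not part of YARN, and the pattern only recognises the "Got exception:" and "Diagnostics:" phrasings that appear in this log.

```python
import re
from collections import Counter

# Assumed pattern: an exception class name that follows "Got exception:" or
# "Diagnostics:", as seen in the ResourceManager log lines quoted above.
REASON = re.compile(r"(?:Got exception|Diagnostics):\s*([\w.$]+(?:Exception|Error))")


def count_failure_reasons(lines):
    """Count how often each exception class is reported as a failure reason."""
    counts = Counter()
    for line in lines:
        match = REASON.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_reasons.py < yarn-yarn-resourcemanager-<host>.log
    for reason, n in count_failure_reasons(sys.stdin).most_common():
        print(n, reason)
```

On the log in this question, such a tally would separate the "Unauthorized request to start container" attempts from the later BlockMissingException on the Spark archive, which are two different failures.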

Related

Apache Nifi - refused to connect to localhost error

When I try to connect to the NiFi UI at http://localhost:8080/nifi, I get the error below:
org.apache.nifi.web.server.JettyServer Failed to start web server... shutting down.
java.net.BindException: Address already in use: bind
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.eclipse.jetty.server.ServerConnector.openAcceptChannel(ServerConnector.java:331)
at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:299)
at org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80)
at org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:235)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
at org.eclipse.jetty.server.Server.doStart(Server.java:398)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
at org.apache.nifi.web.server.JettyServer.start(JettyServer.java:935)
at org.apache.nifi.NiFi.<init>(NiFi.java:158)
at org.apache.nifi.NiFi.<init>(NiFi.java:72)
at org.apache.nifi.NiFi.main(NiFi.java:297)
2020-02-27 11:51:11,834 INFO [Thread-1] org.apache.nifi.NiFi Initiating shutdown of Jetty web server...
2020-02-27 11:51:11,836 INFO [Thread-1] o.eclipse.jetty.server.AbstractConnector Stopped ServerConnector@355ee205{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
2020-02-27 11:51:11,837 INFO [Thread-1] org.eclipse.jetty.server.session node 0 Stopped scavenging
Can anyone suggest what is causing this issue?
NiFi version: 1.9.2, installed on a Windows machine.
Here are the NiFi status logs:
12:33:16.886 [main] DEBUG org.apache.nifi.bootstrap.NotificationServiceManager - Found 0 service elements
12:33:16.896 [main] INFO org.apache.nifi.bootstrap.NotificationServiceManager - Successfully loaded the following 0 services: []
12:33:16.897 [main] INFO org.apache.nifi.bootstrap.RunNiFi - Registered no Notification Services for Notification Type NIFI_STARTED
12:33:16.897 [main] INFO org.apache.nifi.bootstrap.RunNiFi - Registered no Notification Services for Notification Type NIFI_STOPPED
12:33:16.898 [main] INFO org.apache.nifi.bootstrap.RunNiFi - Registered no Notification Services for Notification Type NIFI_DIED
12:33:16.899 [main] DEBUG org.apache.nifi.bootstrap.Command - Status File:
12:33:16.900 [main] DEBUG org.apache.nifi.bootstrap.Command - Properties: {pid=9724}
Failed to determine if Process 9724 is running; assuming that it is not
12:33:16.902 [main] INFO org.apache.nifi.bootstrap.Command - Apache NiFi is not running

The port used by NiFi is already in use by another process.
You can change the web server port in conf/nifi.properties (the nifi.web.http.port property).
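To confirm that something is already listening on the port before editing the configuration, a quick check along these lines may help. This is a minimal sketch, assuming the port 8080 from the question; the helper name port_in_use is made up for illustration.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8080 is the port from the question; adjust if nifi.properties differs.
    print("port 8080 in use:", port_in_use(8080))
```

If this reports the port as in use while NiFi is stopped, another process holds it, and either that process has to be stopped or NiFi moved to a free port.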

Spark streaming job on YARN cluster mode stuck in accepted, then fails with a Timeout Exception

I am running a Spark Streaming application that simply reads messages from a Kafka topic, enriches them, and then writes the enriched messages to another Kafka topic.
I already tried it in Standalone mode (both client and cluster deploy mode) and in YARN client mode, successfully.
When I submit the application in cluster mode it gives me the following messages:
18/01/10 12:13:34 INFO Client: Submitting application application_1515582681419_0001 to ResourceManager
18/01/10 12:13:34 INFO YarnClientImpl: Submitted application application_1515582681419_0001
18/01/10 12:13:35 INFO Client: Application report for application_1515582681419_0001 (state: ACCEPTED)
18/01/10 12:13:35 INFO Client:
client token: N/A
diagnostics: AM container is launched, waiting for AM container to Register with RM
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1515582814080
final status: UNDEFINED
tracking URL: http://ambari1.internal:8088/proxy/application_1515582681419_0001/
user: root
18/01/10 12:13:36 INFO Client: Application report for application_1515582681419_0001 (state: ACCEPTED)
18/01/10 12:13:37 INFO Client: Application report for application_1515582681419_0001 (state: ACCEPTED)
It stays stuck in the ACCEPTED state for around 4-5 minutes, then exits with the following error message:
18/01/10 12:17:00 INFO InputInfoTracker: remove old batch metadata: 1515583000000 ms
18/01/10 12:17:02 ERROR ApplicationMaster: Uncaught exception:
java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:423)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:282)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:768)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:766)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
18/01/10 12:17:02 INFO ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds])
18/01/10 12:17:02 INFO StreamingContext: Invoking stop(stopGracefully=false) from shutdown hook
18/01/10 12:17:02 INFO ReceiverTracker: ReceiverTracker stopped
18/01/10 12:17:02 INFO JobGenerator: Stopping JobGenerator immediately
Fun fact: if I visit the page of the application, I can see that the Spark context has been started and that it processes some messages.
Could anyone help me with this?
PS: These are the resources of my YARN cluster:

The problem might be with the YARN App Timeline Server. Try restarting it.

Are you creating your Spark session with the master set to local? Please check this.
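A master set in the application code takes precedence over the --master flag passed to spark-submit, so a leftover local master can go unnoticed. One quick way to check is to scan the application source for it. The sketch below is only an illustration: the helper name find_local_masters and the regular expression are assumptions, and it only catches literal calls such as .master("local[*]") or .setMaster("local"), not values read from configuration.

```python
import re

# Assumed pattern: a literal .master("local...") or .setMaster("local...") call.
LOCAL_MASTER = re.compile(r"""\.(?:setMaster|master)\(\s*["'](local[^"']*)["']\s*\)""")


def find_local_masters(source):
    """Return every hard-coded local master string found in the source text."""
    return LOCAL_MASTER.findall(source)


if __name__ == "__main__":
    import sys

    # Usage: python find_local_master.py path/to/Job.scala [more files...]
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            for master in find_local_masters(handle.read()):
                print(path, "hard-codes master", master)
```

If a match turns up, removing the hard-coded master and letting spark-submit supply it is the usual fix to try first.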

Job job_* failed with state FAILED due to: Application application_* failed 2 times due to ApplicationMaster for attempt appattempt_* timed out.

I submitted a job to a cluster running Hadoop 2.7.1. 'jps' looks fine on the Master and the Slaves, and "hdfs dfsadmin -report" is fine, but when I run any grep or wordcount example, it fails. Even with a small input file, the job hangs for half an hour to an hour and then fails with the following errors:
15/12/09 08:42:55 INFO impl.YarnClientImpl: Submitted application application_1449645631518_0003
15/12/09 08:42:55 INFO mapreduce.Job: The url to track the job: http://Master:8088/proxy/application_1449645631518_0003/
15/12/09 08:42:55 INFO mapreduce.Job: Running job: job_1449645631518_0003
15/12/09 09:07:12 INFO mapreduce.Job: Job job_1449645631518_0003 running in uber mode : false
15/12/09 09:07:12 INFO mapreduce.Job: map 0% reduce 0%
15/12/09 09:07:12 INFO mapreduce.Job: Job job_1449645631518_0003 failed with state FAILED due to: Application application_1449645631518_0003 failed 2 times due to ApplicationMaster for attempt appattempt_1449645631518_0003_000002 timed out. Failing the application.
15/12/09 09:07:12 INFO mapreduce.Job: Counters: 0
15/12/09 09:07:13 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/12/09 09:07:13 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/hadoop/.staging/job_1449645631518_0004
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://Master:9000/user/hadoop/grep-temp-105897268
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:323)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:265)
at org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:59)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:387)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:301)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:318)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:196)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
at org.apache.hadoop.examples.Grep.run(Grep.java:94)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.examples.Grep.main(Grep.java:103)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
This is the ResourceManager log:
2015-12-09 12:37:11,661 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The number of failed attempts is 2. The max attempts is 2
2015-12-09 12:37:11,661 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating application application_1449645631518_0005 with final state: FAILED
2015-12-09 12:37:11,661 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating info for app: application_1449645631518_0005
2015-12-09 12:37:11,661 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1449645631518_0005 State change from ACCEPTED to FINAL_SAVING
2015-12-09 12:37:11,661 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application Attempt appattempt_1449645631518_0005_000002 is done. finalState=FAILED
2015-12-09 12:37:11,662 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1449645631518_0005_02_000001 Container Transitioned from RUNNING to KILLED
2015-12-09 12:37:11,662 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: Completed container: container_1449645631518_0005_02_000001 in state: KILLED event:KILL
2015-12-09 12:37:11,662 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1449645631518_0005 CONTAINERID=container_1449645631518_0005_02_000001
2015-12-09 12:37:11,662 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_1449645631518_0005_02_000001 of capacity <memory:2048, vCores:1> on host Slave2:48352, which currently has 0 containers, <memory:0, vCores:0> used and <memory:8192, vCores:8> available, release resources=true
2015-12-09 12:37:11,662 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: default used=<memory:0, vCores:0> numContainers=0 user=hadoop user-resources=<memory:0, vCores:0>
2015-12-09 12:37:11,662 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: completedContainer container=Container: [ContainerId: container_1449645631518_0005_02_000001, NodeId: Slave2:48352, NodeHttpAddress: Slave2:8042, Resource: <memory:2048, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 11.11.1.3:48352 }, ] queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0 cluster=<memory:16384, vCores:16>
2015-12-09 12:37:11,663 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: completedContainer queue=root usedCapacity=0.0 absoluteUsedCapacity=0.0 used=<memory:0, vCores:0> cluster=<memory:16384, vCores:16>
2015-12-09 12:37:11,663 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Re-sorting completed queue: root.default stats: default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0
2015-12-09 12:37:11,663 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application attempt appattempt_1449645631518_0005_000002 released container container_1449645631518_0005_02_000001 on node: host: Slave2:48352 #containers=0 available=<memory:8192, vCores:8> used=<memory:0, vCores:0> with event: KILL
2015-12-09 12:37:11,663 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1449645631518_0005 requests cleared
2015-12-09 12:37:11,663 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application removed - appId: application_1449645631518_0005 user: hadoop queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
2015-12-09 12:37:11,663 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1449645631518_0005 failed 2 times due to ApplicationMaster for attempt appattempt_1449645631518_0005_000002 timed out. Failing the application.
2015-12-09 12:37:11,667 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1449645631518_0005 State change from FINAL_SAVING to FAILED
2015-12-09 12:37:11,667 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Application removed - appId: application_1449645631518_0005 user: hadoop leaf-queue of parent: root #applications: 0
2015-12-09 12:37:11,667 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=Application application_1449645631518_0005 failed 2 times due to ApplicationMaster for attempt appattempt_1449645631518_0005_000002 timed out. Failing the application. APPID=application_1449645631518_0005
2015-12-09 12:37:11,668 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1449645631518_0005,name=grep-search,user=hadoop,queue=default,state=FAILED,trackingUrl=http://Master:8088/cluster/app/application_1449645631518_0005,appMasterHost=N/A,startTime=1449663079331,finishTime=1449664631661,finalStatus=FAILED,memorySeconds=3177991,vcoreSeconds=1550,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=<memory:0\, vCores:0>,applicationType=MAPREDUCE
2015-12-09 12:37:11,668 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Cleaning master appattempt_1449645631518_0005_000002
2015-12-09 12:37:12,366 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new applicationId: 6
2015-12-09 12:37:12,710 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...
2015-12-09 12:37:12,711 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...
What is wrong with it? This has troubled me for several days. Thank you very much for your help!

While running a topology in storm we are getting error like this

While running a topology in Storm, we get errors like this:
8983 [Thread-6] INFO com.netflix.curator.framework.imps.CuratorFrameworkImpl -
Starting
9144 [main] INFO **backtype.storm.daemon.nimbus** - Shutting down master
9199 [Thread-6-EventThread] INFO backtype.storm.zookeeper - Zookeeper state upd
ate: :connected:none
9241 [main] INFO backtype.storm.daemon.nimbus - Shut down master
9273 [Thread-6] INFO com.netflix.curator.framework.imps.CuratorFrameworkImpl -
Starting
9306 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] WARN org.apache.zookeeper.serv
er.NIOServerCnxn - EndOfStreamException: Unable to read additional data from cli
ent sessionid 0x143af55728d0003, likely client has closed socket
9354 [main] INFO backtype.storm.daemon.supervisor - Shutting down c094c3b1-a378
-4c4f-af35-9278647c217a:4beddc09-4675-4fb9-8bdc-9cf5013ce9ca
9358 [main] INFO backtype.storm.daemon.supervisor - Shut down c094c3b1-a378-4c4
f-af35-9278647c217a:4beddc09-4675-4fb9-8bdc-9cf5013ce9ca
9361 [main] INFO **backtype.storm.daemon.superviso**r - Shutting down supervisor c0
94c3b1-a378-4c4f-af35-9278647c217a
9364 [Thread-5] INFO **backtype.storm.event** - Event manager interrupted
9369 [Thread-6] INFO backtype.storm.event - Event manager interrupted
9425 [main] INFO **backtype.storm.daemon.supervisor** - Shutting down supervisor 38
6d8d71-c9b5-4b51-bd6e-f9f605034ea0
9428 [Thread-8] INFO backtype.storm.event - Event manager interrupted
9429 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] WARN org.apache.zookeeper.serv
er.NIOServerCnxn - EndOfStreamException: Unable to read additional data from cli
ent sessionid 0x143af55728d0007, likely client has closed socket
9429 [Thread-9] INFO backtype.storm.event - Event manager interrupted
9473 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] WARN org.apache.zookeeper.serv
er.NIOServerCnxn - EndOfStreamException: Unable to read additional data from cli
ent sessionid 0x143af55728d0009, likely client has closed socket
9476 [main] INFO backtype.storm.testing - Shutting down in process zookeeper
9503 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] WARN org.apache.zookeeper.serv
er.NIOServerCnxn - Ignoring exception
**java.nio.channels.ClosedChannelException**: null
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.jav
a:211) ~[na:1.7.0_03]
at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.j
ava:242) ~[zookeeper-3.3.3.jar:3.3.3-1073969]
9510 [main] INFO **backtype.storm.testing** - Done shutting down in process zookeep
er
9513 [main] INFO backtype.storm.testing - Deleting temporary path C:\Users\sowm
iya\AppData\Local\Temp\c9b1bc1a-a950-4098-af77-f81a4d2b112f
9520 [main] INFO backtype.storm.testing - Deleting temporary path C:\Users\sowm
iya\AppData\Local\Temp\7e75c468-18ea-4787-a4ac-496fb108db71
9527 [main] INFO backtype.storm.testing - Unable to delete file: C:\Users\sowmi
ya\AppData\Local\Temp\7e75c468-18ea-4787-a4ac-496fb108db71\version-2\log.1
9529 [main] INFO backtype.storm.testing - Deleting temporary path C:\Users\sowm
iya\AppData\Local\Temp\fa7b3c9b-ac93-4090-b9e2-63f10019e61f
9543 [main] INFO backtype.storm.testing - Deleting temporary path C:\Users\sowm
iya\AppData\Local\Temp\55f1fd11-508e-43bb-b340-0d9b79f3af33
9579 [Thread-6-EventThread] INFO com.netflix.curator.framework.state.Connection
StateManager - State change: SUSPENDED
9580 [ConnectionStateManager-0] WARN com.netflix.curator.framework.state.Connec
tionStateManager - There are no ConnectionStateListeners registered.
9583 [Thread-6-EventThread] WARN backtype.storm.cluster - Received event :disco
nnected::none: with disconnected Zookeeper.
11232 [Thread-6-SendThread(localhost:2000)] WARN org.apache.zookeeper.ClientCnx
n - Session 0x143af55728d000b for server null, unexpected error, closing socket
connection and attempting reconnect
java.net.ConnectException: Connection refused: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.7.0_0
3]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:701
) ~[na:1.7.0_03]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
~[zookeeper-3.3.3.jar:3.3.3-1073969]
13992 [Thread-6-SendThread(localhost:2000)] WARN org.apache.zookeeper.ClientCnx
n - Session 0x143af55728d000b for server null, unexpected error, closing socket
connection and attempting reconnect
java.net.ConnectException: Connection refused: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.7.0_0
3]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:701
) ~[na:1.7.0_03]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
When we try to run the topology jar file, all the processes (nimbus, zookeeper and supervisor) die. Please help us understand why this happens.
Please help us fix this error so that we can proceed.
Thank you,
Sowmiya
Priya

This looks like a ZooKeeper issue: your processes do not seem to be able to connect to ZooKeeper. I can't say more without more information.
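
One quick way to test that hypothesis is to check whether anything is accepting TCP connections on the ZooKeeper port. The following is only a minimal sketch and is not part of Storm or ZooKeeper: the host and port (`localhost:2000`) are taken from the log in the question, and the helper name `can_connect` is made up for illustration.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # localhost:2000 is the in-process ZooKeeper address shown in the log.
    print("ZooKeeper reachable:", can_connect("localhost", 2000))
```

A `False` result only says that nothing is listening on that address at that moment; it does not explain why the server went away.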

MapReduce job hanging, "container" issue

When I run a MapReduce job, it just hangs and eventually fails (after about 20 minutes).
This is the error I see in the web UI on port 8088:
exited with exitCode: -100 due to: Container expired since it was unused.Failing this attempt.. Failing the application.
Any thoughts on what this issue is?
I am running Hadoop 2.2.
Update:
It would appear the issue is related to this:
Container killed by the framework, either due to being released by the application or being 'lost' due to node failures etc. have a special exit code of -100.
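
To pick this exit code out of a longer diagnostics message automatically, something like the following sketch could be used. It is an illustration only: the regular expression is modelled on the message text quoted in this post, and the names `parse_exit` and `FRAMEWORK_KILLED_EXIT_CODE` are hypothetical, not part of any Hadoop API.

```python
import re
from typing import Optional, Tuple

# Modelled on the diagnostics text quoted in this post, e.g.
# "exited with exitCode: -100 due to: Container expired since it was unused."
_EXIT_RE = re.compile(r"exited with\s+exitCode:\s*(-?\d+)\s+due to:\s*(.*)")

# Hypothetical constant name; -100 is the only value discussed here.
FRAMEWORK_KILLED_EXIT_CODE = -100


def parse_exit(diagnostics: str) -> Optional[Tuple[int, str]]:
    """Extract (exit_code, reason) from a diagnostics string, if present."""
    match = _EXIT_RE.search(diagnostics)
    if match is None:
        return None
    return int(match.group(1)), match.group(2).strip()


if __name__ == "__main__":
    line = ("exited with exitCode: -100 due to: Container expired since it "
            "was unused.Failing this attempt.. Failing the application.")
    parsed = parse_exit(line)
    if parsed is not None and parsed[0] == FRAMEWORK_KILLED_EXIT_CODE:
        print("Container was released or lost by the framework:", parsed[1])
```

The point of separating the code from the reason is that -100 indicates the container was released or lost, so the cause is more likely to be found on the cluster side than in the job itself.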
Update 2:
These errors are from the ResourceManager logs:
2013-12-18 04:28:42,544 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: completedContainer queue=root usedCapacity=0.0 absoluteUsedCapacity=0.0 used=<memory:0, vCores:0> cluster=<memory:16384, vCores:16>
2013-12-18 04:28:42,544 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Re-sorting completed queue: root.default stats: default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0
2013-12-18 04:28:42,544 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application appattempt_1387307711170_0002_000002 released container container_1387307711170_0002_02_000001 on node: host: slave-2:42143 #containers=0 available=8192 used=0 with event: EXPIRE
2013-12-18 04:28:42,544 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering app attempt : appattempt_1387307711170_0002_000002
2013-12-18 04:28:42,545 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1387307711170_0002_000002 State change from ALLOCATED to FAILED
2013-12-18 04:28:42,545 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1387307711170_0002 failed 2 times due to AM Container for appattempt_1387307711170_0002_000002 exited with exitCode: -100 due to: Container expired since it was unused.Failing this attempt.. Failing the application.
2013-12-18 04:28:42,546 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Removing info for app: application_1387307711170_0002
2013-12-18 04:28:42,546 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1387307711170_0002 State change from ACCEPTED to FAILED
2013-12-18 04:28:42,546 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hduser OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=Application application_1387307711170_0002 failed 2 times due to AM Container for appattempt_1387307711170_0002_000002 exited with exitCode: -100 due to: Container expired since it was unused.Failing this attempt.. Failing the application. APPID=application_1387307711170_0002
2013-12-18 04:28:42,546 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1387307711170_0002,name=streamjob5941238512810428268.jar,user=hduser,queue=default,state=FAILED,trackingUrl=master-1:8088/cluster/app/application_1387307711170_0002,appMasterHost=N/A,startTime=1387339379570,finishTime=1387340922546,finalStatus=FAILED
2013-12-18 04:28:42,546 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application appattempt_1387307711170_0002_000002 is done. finalState=FAILED
2013-12-18 04:28:42,546 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1387307711170_0002 requests cleared
2013-12-18 04:28:42,546 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application removed - appId: application_1387307711170_0002 user: hduser queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
2013-12-18 04:28:42,547 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Application removed - appId: application_1387307711170_0002 user: hduser leaf-queue of parent: root #applications: 0
2013-12-18 04:28:43,136 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave-2/10.239.132.243:42143. Already tried 39 time(s); maxRetries=45
2013-12-18 04:29:03,157 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave-2/10.239.132.243:42143. Already tried 40 time(s); maxRetries=45
2013-12-18 04:29:23,158 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave-2/10.239.132.243:42143. Already tried 41 time(s); maxRetries=45
2013-12-18 04:29:43,179 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave-2/10.239.132.243:42143. Already tried 42 time(s); maxRetries=45
2013-12-18 04:30:03,183 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave-2/10.239.132.243:42143. Already tried 43 time(s); maxRetries=45
2013-12-18 04:30:23,185 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave-2/10.239.132.243:42143. Already tried 44 time(s); maxRetries=45
2013-12-18 04:30:43,208 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Error launching appattempt_1387307711170_0002_000002. Got exception: org.apache.hadoop.net.ConnectTimeoutException: Call From ip-10-73-169-19/10.73.169.19 to slave-2:42143 failed on socket timeout exception: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=slave-2/10.239.132.243:42143]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:749)
at org.apache.hadoop.ipc.Client.call(Client.java:1351)
at org.apache.hadoop.ipc.Client.call(Client.java:1300)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy69.startContainers(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:96)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:118)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:249)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=slave-2/10.239.132.243:42143]
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:532)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:547)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:642)
at org.apache.hadoop.ipc.Client$Connection.access$2600(Client.java:314)
2013-12-18 04:30:43,208 ERROR org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: LAUNCH_FAILED at FAILED
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:625)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:104)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:566)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:547)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:81)
at java.lang.Thread.run(Thread.java:724)
2013-12-18 19:15:17,626 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Rolling master-key for amrm-tokens
2013-12-18 19:15:17,632 INFO org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager: Rolling master-key for container-tokens
2013-12-18 19:15:17,633 INFO org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager: Going to activate master-key with key-id 422264835 in 900000ms
2013-12-18 19:15:17,637 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Rolling master-key for nm-tokens
2013-12-18 19:15:17,637 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Going to activate master-key with key-id 1883530799 in 900000ms
2013-12-18 19:15:25,884 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
2013-12-18 19:15:25,885 INFO org.apache.hadoop.yarn.server.resourcemanager.security.RMDelegationTokenSecretManager: storing master key with keyID 3
2013-12-18 19:30:17,633 INFO org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager: Activating next master key with id: 422264835
2013-12-18 19:30:17,637 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Activating next master key with id: 1883530799
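
In a log like this one, the repeated "Retrying connect to server" lines show which address the ResourceManager could not reach. A small sketch for summarising them is given below; the regular expression is written against the line format shown in this log only, and the function name `max_retries_by_server` is made up for illustration.

```python
import re
from typing import Dict, Iterable

# Written against lines of the form seen in this log:
# "... Retrying connect to server: slave-2/10.239.132.243:42143.
#  Already tried 39 time(s); maxRetries=45"
_RETRY_RE = re.compile(
    r"Retrying connect to server: (\S+?)\. Already tried (\d+) time\(s\)"
)


def max_retries_by_server(lines: Iterable[str]) -> Dict[str, int]:
    """Map each server address to the highest retry count logged for it."""
    result: Dict[str, int] = {}
    for line in lines:
        match = _RETRY_RE.search(line)
        if match is None:
            continue  # Not a retry line; ignore it.
        server, tried = match.group(1), int(match.group(2))
        result[server] = max(tried, result.get(server, 0))
    return result
```

Run over the excerpt in the question, this would single out `slave-2/10.239.132.243:42143`, the same address named in the `ConnectTimeoutException`, which suggests checking name resolution and network access between the ResourceManager host and that node.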
