Container exited with a non-zero exit code 1 during wordcount - hadoop

When I execute the wordcount program from hadoop-mapreduce-examples using the command below:
hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.5.jar wordcount /wordcount/input/test_input.txt /wordcount/output
it throws the following exception:
Exception from container-launch.
Container id: container_1540539176003_003_01_000002
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:972)
at org.apache.hadoop.util.Shell.run(Shell.java:869)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:235)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
at java.lang.Thread.run(Thread.java:748)
Container exited with a non-zero exit code 1
How do I fix it?
Sorry, I'm new here.
Does it mean there is some memory problem?

You need to start by getting the correct logs.
Look at the URL that tracks the job for the address of the YARN UI.
If that address is not available, you can pass the full application ID to the logs command:
yarn logs -applicationId application_1540...
From there, you can search for a stacktrace generated by the code.
If you've just set up Hadoop, I would guess that hdfs dfs -ls /wordcount/input/ throws an error about the path not existing or about permission being denied.
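As a quick sketch of those checks, assuming the paths from the question above (note that the examples job also refuses to start if the output directory already exists):
hdfs dfs -ls /wordcount/input/                    # verify the input file exists and is readable
hdfs dfs -mkdir -p /wordcount/input               # if it is missing, create the directory...
hdfs dfs -put test_input.txt /wordcount/input/    # ...and upload the local file
hdfs dfs -rm -r -f /wordcount/output              # clear a leftover output directory from a previous run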

Related

Remotely connect to spark on yarn cluster in client mode

I have a remote Spark-on-YARN cluster. If I use RStudio Server (the web version) hosted on that cluster to connect in client mode, I can do the following:
sc <- SparkR::sparkR.init(master = "yarn-client")
However, if I try to use RStudio on my local machine to connect to that Spark cluster the same way, I get errors:
ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master
...
ERROR Utils: Uncaught exception in thread nioEventLoopGroup-2-2
java.lang.NullPointerException
...
ERROR RBackendHandler: createSparkContext on org.apache.spark.api.r.RRDD failed
Error in invokeJava(isStatic = TRUE, className, methodName, ...) :
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
A more detailed error message from the Hadoop application tracking page looks like this:
User: blueivy
Name: SparkR
Application Type: SPARK
Application Tags:
State: FAILED
FinalStatus: FAILED
Started: 27-Oct-2015 11:07:09
Elapsed: 4mins, 39sec
Tracking URL: History
Diagnostics:
Application application_1445628650748_0027 failed 2 times due to AM Container for appattempt_1445628650748_0027_000002 exited with exitCode: 10
For more detailed output, check the application tracking page: http://master:8088/proxy/application_1445628650748_0027/ Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1445628650748_0027_02_000001
Exit code: 10
Stack trace: ExitCodeException exitCode=10:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:267)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1143)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:618)
at java.lang.Thread.run(Thread.java:785)
Container exited with a non-zero exit code 10
Failing this attempt. Failing the application.
My local machine has the same configuration and environment for Hadoop and Spark as the remote cluster: Spark 1.5.1, Hadoop 2.6.0, and Ubuntu 14.04. Can anyone help me find my mistake here?

Issue Running Spark Job on YARN Cluster

I want to run my Spark job in Hadoop YARN cluster mode, and I am using the following command:
spark-submit --master yarn-cluster \
  --driver-memory 1g \
  --executor-memory 1g \
  --executor-cores 1 \
  --class com.dc.analysis.jobs.AggregationJob \
  sparkanalitic.jar param1 param2 param3
I am getting the error below. Kindly suggest what's going wrong and whether the command is correct. I am using CDH 5.3.1.
Diagnostics: Application application_1424284032717_0066 failed 2 times due
to AM Container for appattempt_1424284032717_0066_000002 exited with
exitCode: 15 due to: Exception from container-launch.
Container id: container_1424284032717_0066_02_000001
Exit code: 15
Stack trace: ExitCodeException exitCode=15:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:197)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 15
.Failing this attempt.. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.hdfs
start time: 1424699723648
final status: FAILED
tracking URL: http://myhostname:8088/cluster/app/application_1424284032717_0066
user: hdfs
2015-02-23 19:26:04 DEBUG Client - stopping client from cache: org.apache.hadoop.ipc.Client#4085f1ac
2015-02-23 19:26:04 DEBUG Utils - Shutdown hook called
2015-02-23 19:26:05 DEBUG Utils - Shutdown hook called
Any help would be greatly appreciated.
It can mean a lot of things. For us, we got a similar error message because of an unsupported Java class version, and we fixed the problem by deleting the referenced Java class from our project.
Use this command to see the detailed error message:
yarn logs -applicationId application_1424284032717_0066
You should remove ".setMaster("local")" from the code.
The command looks correct.
What I've come across is that exit code 15 normally indicates a TableNotFoundException. That usually means there's an error in the code you're submitting.
You can check this by visiting the tracking URL.
For me, the exit code issue was solved by placing hive-site.xml in the spark/conf directory.
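For reference, a minimal sketch of that fix; the source path is an assumption, so copy from wherever your hive-site.xml actually lives:
cp /etc/hive/conf/hive-site.xml $SPARK_HOME/conf/    # source path is an assumption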
Remove the line "spark.master": "local[*]" from the Spark configuration file if you are running the job on a cluster. If you run it on a local PC, include it.

Storm-YARN: Application container fails to launch

I am running a Storm (Trident) topology that reads Avro records from Kafka and writes them to HBase.
The topology runs as expected in LocalCluster mode, but with StormSubmitter I'm facing the issues below.
In distributed Hadoop mode, I'm getting the error [1] below while launching the YARN application.
In Hadoop local mode (with one box only), YARN spawns the Nimbus server and the Storm UI, but there are no supervisors running to execute the spouts/bolts in the topology. I guess the reason might be insufficient memory (4 GB to run the topology plus HBase, HDFS, Kafka, ZooKeeper, etc.).
Can you help me understand the reason for this container failure? There are no errors or info in the application logs.
[1] The YARN container fails to launch with the error below:
storm-yarn launch /homext/storm-yarn.yml --queue default -appname storm-yarn-demo --stormZip /tmp/storm-0.9.zip
Application application_1415038356032_0304 failed 2 times due to AM Container for appattempt_1415038356032_0304_000002 exited with exitCode: 127 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
at org.apache.hadoop.util.Shell.run(Shell.java:418)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 127
.Failing this attempt.. Failing the application.
This log is insufficient to diagnose the problem; all it says is that the container failed to launch. You should look at the container output. Check ${yarn.nodemanager.log-dirs} on the nodes: there will be an application folder (application_1415038356032_0304), and in it a container folder for each attempt (...1415038356032_0304_000002) containing the stderr, stdout, and syslog of that attempt. Read those and you'll likely identify the problem.
If those don't exist, look in ${yarn.nodemanager.local-dirs}: you'll find the container launch script (I think it's called container-launch.sh) for this app/container attempt. It contains the actual command used to launch the container. Try running it from a shell prompt and see what you get.
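A sketch of that search; both directories below are placeholders for the actual values of yarn.nodemanager.log-dirs and yarn.nodemanager.local-dirs in your yarn-site.xml:
cat /var/log/hadoop-yarn/userlogs/application_1415038356032_0304/container_*/stderr   # log dir is a placeholder
find /tmp/hadoop-yarn/nm-local-dir -name "*launch*.sh" 2>/dev/null                    # local dir is a placeholder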
If it fails at an early stage then the logs can be found in HDFS under:
/tmp/logs/<user>/logs/
This should give enough information to diagnose the problem.
In my case I found a log file:
/tmp/logs/hdfs/logs/application_1426618997634_0004/vagrant-cdh-node4_8041
With some errors like:
/bin/bash: /usr/lib/jvm/java-7-oracle/bin/java: No such file or directory
And fixing the JAVA_HOME environment variable did the trick.
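For reference, the fix is a one-line change in the Hadoop environment file; the JDK path below is a placeholder for wherever Java is actually installed on each node:
# in $HADOOP_HOME/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64   # placeholder path; must exist on every node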

Hadoop/YARN job FAILED - "exited with exitCode: -1000 due to: Could not find any valid local directory for nmPrivate..."

I am trying to run a MapReduce job with Hadoop, YARN and Accumulo.
I am getting the following output and cannot track down the issue. It looks to be a YARN problem, but I am not sure what it is looking for. I have an nmPrivate folder at $HADOOP_PREFIX/grid/hadoop/hdfs/yarn/logs. Is this the folder it says it cannot find?
14/03/31 08:48:46 INFO mapreduce.Job: Job job_1395942264921_0023 failed with state FAILED due to: Application application_1395942264921_0023 failed 2 times due to AM Container for appattempt_1395942264921_0023_000002 exited with exitCode: -1000 due to: Could not find any valid local directory for nmPrivate/container_1395942264921_0023_02_000001.tokens
.Failing this attempt.. Failing the application.
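That diagnostic typically means the NodeManager could not find a usable directory among those configured in yarn.nodemanager.local-dirs (a full disk, a wrong path, or bad permissions), rather than a missing log folder. A quick sketch of the checks, with placeholder paths:
grep -A 2 "yarn.nodemanager.local-dirs" $HADOOP_PREFIX/etc/hadoop/yarn-site.xml   # see which local dirs are configured
df -h /grid/hadoop/yarn/local    # placeholder; check free space on each configured dir
ls -ld /grid/hadoop/yarn/local   # placeholder; check it exists and is writable by the YARN user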
When I tested spark-submit on YARN in cluster mode:
spark-submit --master yarn --deploy-mode cluster --class org.apache.spark.examples.SparkPi /usr/local/install/spark-2.2.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.2.0.jar 100
I got the same error:
Application application_1532249549503_0007 failed 2 times due to AM Container for appattempt_1532249549503_0007_000002 exited with exitCode: -1000 Failing this attempt.Diagnostics: java.io.IOException: Resource file:/usr/local/install/spark-2.2.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.2.0.jar changed on src filesystem (expected 1531576498000, was 1531576511000)
One suggestion for resolving this kind of error is to revise your core-site.xml or other Hadoop configuration files.
Finally, I fixed the error by setting the property fs.defaultFS in $HADOOP_HOME/etc/hadoop/core-site.xml.
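For illustration, the property looks like this; the hostname and port are placeholders for your NameNode address:
<!-- in $HADOOP_HOME/etc/hadoop/core-site.xml -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode-host:9000</value>
</property>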

Hadoop Pseudo-Distributed Task failed

For some strange reason, I get the exception below when I try to run the wordcount example in pseudo-distributed mode. It says the task exited with status -1.
Environment: Cygwin
java.lang.Throwable: Child Error
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.io.IOException: Task process exit with nonzero status of -1.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
