Hadoop job accepted but not running (Hadoop 2.4.1) - hadoop

I have a distributed Hadoop 2.4.1 cluster. When I run a sample job, it sits in the ACCEPTED state but never runs.
Below is the command-prompt output where it gets stuck idle.
/usr/local/hadoop/share/hadoop/mapreduce$ hadoop jar hadoop-mapreduce-examples-2.4.1.jar pi 3 2
Number of Maps = 3
Samples per Map = 2
14/08/12 14:21:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Starting Job
14/08/12 14:21:20 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/08/12 14:21:21 INFO input.FileInputFormat: Total input paths to process : 3
14/08/12 14:21:21 INFO mapreduce.JobSubmitter: number of splits:3
14/08/12 14:21:21 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1407833440940_0001
14/08/12 14:21:22 INFO impl.YarnClientImpl: Submitted application application_1407833440940_0001
14/08/12 14:21:22 INFO mapreduce.Job: The url to track the job: http://impc1368.htcindia.com:8088/proxy/application_1407833440940_0001/
14/08/12 14:21:22 INFO mapreduce.Job: Running job: job_1407833440940_0001

Check the NodeManager log files for java.net.ConnectException, and check the yarn.resourcemanager.address setting in yarn-site.xml.
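For reference, the relevant yarn-site.xml entries might look like the sketch below; `master-host` is a placeholder, not taken from the post, and should be the actual ResourceManager hostname rather than the 0.0.0.0 default seen in the log:

```xml
<!-- yarn-site.xml: point clients and NodeManagers at the real
     ResourceManager host. "master-host" is a placeholder. -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>master-host</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>master-host:8032</value>
</property>
```

If a NodeManager cannot reach this address, it never registers with the ResourceManager, and submitted applications stay in ACCEPTED with no container to run in.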

Related

Running Hadoop MapReduce word count for the first time fails?

When running the Hadoop word count example the first time it fails. Here's what I'm doing:
Format namenode: $HADOOP_HOME/bin/hdfs namenode -format
Start HDFS/YARN:
$HADOOP_HOME/sbin/start-dfs.sh
$HADOOP_HOME/sbin/start-yarn.sh
$HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager
Run wordcount: hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount input output
(assume the input folder is already in HDFS; I won't list every single command here)
Output:
16/07/17 01:04:34 INFO client.RMProxy: Connecting to ResourceManager at hadoop-master/172.20.0.2:8032
16/07/17 01:04:35 INFO input.FileInputFormat: Total input paths to process : 2
16/07/17 01:04:35 INFO mapreduce.JobSubmitter: number of splits:2
16/07/17 01:04:36 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468688654488_0001
16/07/17 01:04:36 INFO impl.YarnClientImpl: Submitted application application_1468688654488_0001
16/07/17 01:04:36 INFO mapreduce.Job: The url to track the job: http://hadoop-master:8088/proxy/application_1468688654488_0001/
16/07/17 01:04:36 INFO mapreduce.Job: Running job: job_1468688654488_0001
16/07/17 01:04:46 INFO mapreduce.Job: Job job_1468688654488_0001 running in uber mode : false
16/07/17 01:04:46 INFO mapreduce.Job: map 0% reduce 0%
Terminated
And then HDFS crashes, so I can't access http://localhost:50070/
Then I restart everything (repeat step 2), rerun the example, and everything's fine.
How can I fix it for the first run? My HDFS obviously has no data the first time around; maybe that's the problem?
UPDATE:
Running an even simpler example fails as well:
hadoop#8f98bf86ceba:~$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples*.jar pi 3 3
Number of Maps = 3
Samples per Map = 3
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Starting Job
16/07/17 03:21:28 INFO client.RMProxy: Connecting to ResourceManager at hadoop-master/172.20.0.3:8032
16/07/17 03:21:29 INFO input.FileInputFormat: Total input paths to process : 3
16/07/17 03:21:29 INFO mapreduce.JobSubmitter: number of splits:3
16/07/17 03:21:29 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468696855031_0001
16/07/17 03:21:31 INFO impl.YarnClientImpl: Submitted application application_1468696855031_0001
16/07/17 03:21:31 INFO mapreduce.Job: The url to track the job: http://hadoop-master:8088/proxy/application_1468696855031_0001/
16/07/17 03:21:31 INFO mapreduce.Job: Running job: job_1468696855031_0001
16/07/17 03:21:43 INFO mapreduce.Job: Job job_1468696855031_0001 running in uber mode : false
16/07/17 03:21:43 INFO mapreduce.Job: map 0% reduce 0%
Same problem; HDFS terminates.
Your post is too incomplete to deduce what is wrong here. My guess is that hadoop-mapreduce-examples-2.7.2-sources.jar is not what you want; more likely you need hadoop-mapreduce-examples-2.7.2.jar, which contains .class files rather than sources.
HDFS has to be restarted once before MapReduce jobs can run successfully. This is because HDFS creates some data on its first run, and stopping it cleans up that state, so MapReduce jobs can then be run through YARN.
So my solution was:
Start Hadoop: $HADOOP_HOME/sbin/start-dfs.sh
Stop Hadoop: $HADOOP_HOME/sbin/stop-dfs.sh
Start Hadoop again: $HADOOP_HOME/sbin/start-dfs.sh

MapReduce job is stuck on a multi node Hadoop-2.7.1 cluster

I have successfully run Hadoop 2.7.1 on a multi-node cluster (1 namenode and 4 datanodes). But when I run a MapReduce job (the WordCount example from the Hadoop website), it always gets stuck at this point.
[~#~ hadoop-2.7.1]$ bin/hadoop jar WordCount.jar WordCount /user/inputdata/ /user/outputdata
15/09/30 17:54:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/09/30 17:54:57 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/09/30 17:54:58 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/09/30 17:54:59 INFO input.FileInputFormat: Total input paths to process : 1
15/09/30 17:55:00 INFO mapreduce.JobSubmitter: number of splits:1
15/09/30 17:55:00 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1443606819488_0002
15/09/30 17:55:00 INFO impl.YarnClientImpl: Submitted application application_1443606819488_0002
15/09/30 17:55:00 INFO mapreduce.Job: The url to track the job: http://~~~~:8088/proxy/application_1443606819488_0002/
15/09/30 17:55:00 INFO mapreduce.Job: Running job: job_1443606819488_0002
Do I have to specify memory settings for YARN?
NOTE: The DataNode hardware is really old (each node has 1 GB of RAM).
Appreciate your help.
Thank you.
The data nodes' memory (1 GB) is far too scarce to provision even one container to run a mapper/reducer/AM in.
You could try setting the container memory allocation values below in yarn-site.xml to much lower values so that containers can be created on those nodes:
yarn.scheduler.minimum-allocation-mb
yarn.scheduler.maximum-allocation-mb
Also try reducing the values of the following properties in your job configuration:
mapreduce.map.memory.mb
mapreduce.reduce.memory.mb
mapreduce.map.java.opts
mapreduce.reduce.java.opts
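As a rough sketch, the properties listed above could be set along these lines. The values are illustrative guesses for 1 GB nodes, not tested recommendations; the Java heap sizes (-Xmx) are kept below the container sizes so the JVM fits inside its allocation:

```xml
<!-- yarn-site.xml: shrink the container allocation bounds so a
     container can fit on a 1 GB node (values are illustrative). -->
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>128</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>512</value>
</property>

<!-- mapred-site.xml (or per-job configuration): request small
     map/reduce containers and size the JVM heaps to fit inside. -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>256</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>256</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx200m</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx200m</value>
</property>
```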

Job submitting but map reduce not working

I tried to run the example program shipped with Hadoop, but I haven't been able to get any output.
I have included my logs below; please help me solve the issue.
hdfs#localhost:~$ hadoop jar '/opt/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar' wordcount /README.txt /ooo
15/08/21 09:48:26 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8050
15/08/21 09:48:28 INFO input.FileInputFormat: Total input paths to process : 1
15/08/21 09:48:28 INFO mapreduce.JobSubmitter: number of splits:1
15/08/21 09:48:28 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1440130528838_0001
15/08/21 09:48:29 INFO impl.YarnClientImpl: Submitted application application_1440130528838_0001
15/08/21 09:48:29 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1440130528838_0001/
15/08/21 09:48:29 INFO mapreduce.Job: Running job: job_1440130528838_0001
The MapReduce job seems to be working; no error logs appear.
1. Can you give more details from your logs?
2. Was your output folder /ooo created? If yes, what does it contain?
3. Please verify that your input file is not empty.

Why hadoop yarn mapreduce job not working and stop on running job?

I have a MapReduce job and I ran it in YARN mode. But why does my MapReduce job stop and not continue past the "Running job" step? It looks like this:
15/04/04 17:18:21 INFO impl.YarnClientImpl: Submitted application application_1428142358448_0002
15/04/04 17:18:21 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1428142358448_0002/
15/04/04 17:18:21 INFO mapreduce.Job: Running job: job_1428142358448_0002
And it stops there. Is it because of a lack of memory? After start-all.sh, once all daemons have started, I have about 300-350 MB of memory left. I'd appreciate any suggestions on why this happens.
Thanks all.
No, this isn't because of running out of memory; otherwise the logs would have mentioned that clearly. The job seems to be in the RUNNING state and has gotten stuck somewhere; you can check the ApplicationMaster for more details about the job.
I'm sorry, but do you mean this?
15/04/05 14:11:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/04/05 14:11:29 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.163:8050
15/04/05 14:11:30 INFO input.FileInputFormat: Total input paths to process : 1
15/04/05 14:11:31 INFO mapreduce.JobSubmitter: number of splits:1
15/04/05 14:11:31 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1428216622742_0003
15/04/05 14:11:31 INFO impl.YarnClientImpl: Submitted application application_1428216622742_0003
15/04/05 14:11:31 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1428216622742_0003/
15/04/05 14:11:31 INFO mapreduce.Job: Running job: job_1428216622742_0003
Or something else? On my master node's port 8088 there are only tables...

Hadoop 2.6.0 wordcount example not running

I was following the instructions found here and here.
All the web UIs open properly, and then I tried to run the wordcount example.
It went into the ACCEPTED state and didn't run.
[root#localhost hadoop-2.6.0]# yarn jar /usr/local/deployment/WordCount.jar input output
14/12/05 19:15:21 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/12/05 19:15:22 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/12/05 19:15:22 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/12/05 19:15:23 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
14/12/05 19:15:24 INFO mapred.FileInputFormat: Total input paths to process : 30
14/12/05 19:15:25 INFO mapreduce.JobSubmitter: number of splits:30
14/12/05 19:15:25 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1417787106330_0001
14/12/05 19:15:26 INFO impl.YarnClientImpl: Submitted application application_1417787106330_0001
14/12/05 19:15:26 INFO mapreduce.Job: The url to track the job: http://local:8088/proxy/application_1417787106330_0001/
14/12/05 19:15:26 INFO mapreduce.Job: Running job: job_1417787106330_0001
Following output on web interface :
User: root
Name: wordcount
Application Type: MAPREDUCE
Application Tags:
State: ACCEPTED
FinalStatus: UNDEFINED
Can someone tell me a possible reason for this?
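One common cause of an application sitting in ACCEPTED (an assumption here, since the post shows no scheduler details) is that the registered NodeManagers advertise too little memory to satisfy the ApplicationMaster's container request. A sketch of the yarn-site.xml knobs involved, with illustrative values for a small single-node setup:

```xml
<!-- yarn-site.xml: how much memory each NodeManager offers to YARN,
     and the per-container bounds. If the maximum allocation or the
     NodeManager's advertised memory is smaller than the AM's request,
     the application stays in ACCEPTED. Values are illustrative. -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>2048</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>256</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>2048</value>
</property>
```

The ResourceManager UI at port 8088 (Nodes page) shows whether any NodeManager has registered and how much memory it reports.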
