I am trying to run one example MR job on my pseudo distributed cluster in virtualbox VM (RHEL 6.5, 8GB RAM, 100GB HDD) but after submission of the job It stucks there only.
INFO: mapreduce.Job: Running job: job_1437483993_001
The application tracking url (http://localhost:8088/cluster/applicationID) shows the result like this:
User: root
Name : grep-search
Application-Type : mapreduce
Status : Accepted
FinalStatus : Undefined
What I have tried:
modified yarn-site.xml and mapred-site.xml for minimum and maximum allocation for memory following the tutorial (http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/)
ensured that disk space is free enough to accomodate new jobs.
jps shows all the services are running properly.
But no luck. Please guide me through.
Edit:
Here's the log:
[root#master ~]# hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar grep /user/pradeep output23 'dfs[a-z.]+'
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
16/04/27 10:21:09 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8032
16/04/27 10:21:09 WARN mapreduce.JobResourceUploader: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
16/04/27 10:21:09 INFO input.FileInputFormat: Total input paths to process : 4
16/04/27 10:21:10 INFO mapreduce.JobSubmitter: number of splits:4
16/04/27 10:21:11 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1461732411884_0001
16/04/27 10:21:11 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources.
16/04/27 10:21:11 INFO impl.YarnClientImpl: Submitted application application_1461732411884_0001
16/04/27 10:21:11 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1461732411884_0001/
16/04/27 10:21:11 INFO mapreduce.Job: Running job: job_1461732411884_0001
There is some other issue and not the infrastructure. Please paste what are u trying to do in the map reduce code.
Related
I am working on hadoop installation in Windows 7.
Tried to untar the tarfiles from apache site but it was unsuccessful.
I have searched in internet and found below link.
http://toodey.com/2015/08/10/hadoop-installation-on-windows-without-cygwin-in-10-mints/
I was able to install. But when i was trying to execute the examples i was encountered with below errors.
Command executed :
C:\Users\hadoop\hadoop-2.7.1\bin\hadoop.cmd jar C:\Users\hadoop\hadoop-2.7.1\share\hadoop\mapreduce\hadoop-mapreduce-examples-2.7.1.jar wordcount /hadoop/input /hadoop/output
Error :
C:\Users\hadoop\hadoop-2.7.1>C:\Users\hadoop\hadoop-2.7.1\bin\hadoop.cmd jar C:\Users\hadoop\hadoop-2.7.1\share\hadoop\mapreduce\hadoop-mapreduce-examples-2.7.1.jar wordcount /hadoop/input /hadoop/output
16/11/14 17:05:28 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:0000
16/11/14 17:05:30 INFO input.FileInputFormat: Total input paths to process : 3
16/11/14 17:05:30 INFO mapreduce.JobSubmitter: number of splits:3
16/11/14 17:05:30 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1479122512555_0003
16/11/14 17:05:31 INFO impl.YarnClientImpl: Submitted application application_1479122512555_0003
16/11/14 17:05:32 INFO mapreduce.Job: The url to track the job: http://MachineName:8088/proxy/application_1479122512555_0003/
16/11/14 17:05:32 INFO mapreduce.Job: Running job: job_1479122512555_0003
16/11/14 17:05:36 INFO mapreduce.Job: Job job_1479122512555_0003 running in uber mode : false
16/11/14 17:05:36 INFO mapreduce.Job: map 0% reduce 0%
16/11/14 17:05:36 INFO mapreduce.Job: Job job_1479122512555_0003 failed with state FAILED due to: Application application_1479122512555_0003 failed 2 times due to AM Container for appattempt_1479122512555_0003_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://MachineName:8088/cluster/app/application_1479122512555_0003Then, click on links to logs of each attempt.
Diagnostics: null
Failing this attempt. Failing the application.
16/11/14 17:05:36 INFO mapreduce.Job: Counters: 0
Thanks in advance...
Actually this is due to the permission issues on some of the yarn local directories. So here is the solution
Identify the yarn directories specified using the parameter yarn.nodemanager.local-dirs from /etc/gphd/hadoop/conf/yarn-site.xml on yarn node manager.
Delete the files / folders under usercache from all the directories listed in yarn-site.xml and all the node managers.
e.g
rmdir path/to/yarn/nm/usercache/*
I have installed Cloudera 5.8 in a Linux RHEL 7.2 instance of Amazon EC2. I have logged in with SSH and I am trying to run the wordcount example for testing mapreduce operation with the following command:
hadoop jar /opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount archivo.txt output
The problem is that the wordcount program is blocked and it not produces the output. Only the following is prompted:
16/08/11 13:10:02 INFO client.RMProxy: Connecting to ResourceManager at ip-172-31-22-226.ec2.internal/172.31.22.226:8032
16/08/11 13:10:03 INFO input.FileInputFormat: Total input paths to process : 1
16/08/11 13:10:03 INFO mapreduce.JobSubmitter: number of splits:1
16/08/11 13:10:04 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1470929244097_0007
16/08/11 13:10:04 INFO impl.YarnClientImpl: Submitted application application_1470929244097_0007
16/08/11 13:10:04 INFO mapreduce.Job: The url to track the job: http://ip-172-31-22-226.ec2.internal:8088/proxy/application_1470929244097_0007/
16/08/11 13:10:04 INFO mapreduce.Job: Running job: job_1470929244097_0007
And then get blocked since "Running job". After this I have to press Ctrl+C for unblock and it not produces the output.
Anyone that knows why?. I think it is probably a configuration issue and I am new to DataNodes and so on.
Thanks a lot.
Looks like there are no resources (map or reducer slots), job is waiting for resources. You can check the job status on.
http://ip-172-31-22-226.ec2.internal:8088
I was running hadoop program (wordcount) in Horton sandbox. And the situation occurred as below. Especially, this is the program I had ran successfully for many times on exactly the same virtual machine I used, however this time it "failed" without any notification, so it just stuck there. I tried other mapreduce program, the results are similar. Normally, the command lines will notify me with ubermode : false, follows by the Running job..., but this time, it doesn't, and out of no reason.
[root#sandbox ~]# hadoop jar testWC.jar testWC.WCdriver /data/input/pg103.txt /data/output/WC
WARNING: Use "yarn jar" to launch YARN applications.
16/03/11 19:20:01 INFO impl.TimelineClientImpl: Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
16/03/11 19:20:01 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050
16/03/11 19:20:01 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/03/11 19:20:02 INFO input.FileInputFormat: Total input paths to process : 1
16/03/11 19:20:02 INFO mapreduce.JobSubmitter: number of splits:1
16/03/11 19:20:02 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1457723341319_0002
16/03/11 19:20:03 INFO impl.YarnClientImpl: Submitted application application_1457723341319_0002
16/03/11 19:20:03 INFO mapreduce.Job: The url to track the job: http://sandbox.hortonworks.com:8088/proxy/application_1457723341319_0002/
16/03/11 19:20:03 INFO mapreduce.Job: Running job: job_1457723341319_0002
The program just could not move on anymore.
I tried to run the example program present in Hadoop. However, I'm not successful in getting the output.
I have included my logs below. Please help in solving the issue.
hdfs#localhost:~$ hadoop jar '/opt/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar' wordcount /README.txt /ooo
15/08/21 09:48:26 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8050
15/08/21 09:48:28 INFO input.FileInputFormat: Total input paths to process : 1
15/08/21 09:48:28 INFO mapreduce.JobSubmitter: number of splits:1
15/08/21 09:48:28 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1440130528838_0001
15/08/21 09:48:29 INFO impl.YarnClientImpl: Submitted application application_1440130528838_0001
15/08/21 09:48:29 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1440130528838_0001/
15/08/21 09:48:29 INFO mapreduce.Job: Running job: job_1440130528838_0001
The mapreduce seems working, there is no error logs which appears.
1/ Can you please detail furthermore your logs?!
2/ Your output folder /ooo is created?? If yes what its contents?!
3/ Verify please if your input file is not empty.
Mapreduce with YARN fail to move ahead of 0% map and 0% reduce. I am using Cloudera CDH on google compute high memory instance(13 GM RAM). 8 GB free ram is available on the machine. Can you please help me to fix it?
sunny#hadoop-m:~$ hadoop jar /opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/jars/hadoop-mapreduce-examples-2.5.0-cdh5.3.0.jar grep input output 'dfs[a-z.]+'
14/12/24 00:13:53 INFO client.RMProxy: Connecting to ResourceManager at hadoop-m.c.sunny-hadoop-trial.internal/10.240.253.233:8032
14/12/24 00:13:53 WARN mapreduce.JobSubmitter: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
14/12/24 00:13:54 INFO input.FileInputFormat: Total input paths to process : 5
14/12/24 00:13:54 INFO mapreduce.JobSubmitter: number of splits:5
14/12/24 00:13:54 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1419360146634_0001
14/12/24 00:13:54 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources.
14/12/24 00:13:54 INFO impl.YarnClientImpl: Submitted application application_1419360146634_0001
14/12/24 00:13:55 INFO mapreduce.Job: The url to track the job: http://hadoop-m.c.sunny-hadoop-trial.internal:8088/proxy/application_1419360146634_0001/
14/12/24 00:13:55 INFO mapreduce.Job: Running job: job_1419360146634_0001
Resource Manager Output
Some more info about job
yarn-site.xml: http://pastebin.mozilla.org/8113782
mapred-site.xml: http://pastebin.mozilla.org/8113813
Server 's IP got changed because of DHCP service. Client configuration for HDFS and YARN became stale. I needed to update client configuration, I did it with Cloudera manager and now cluster is running fine.