sudo jps not locating MapReduce jobtracker - hadoop

I am running CDH5 on Ubuntu. I have installed everything i need, but when i type in sudo jps, the jobtracker is not displayed. Heres my configuration on mapred-site.xml
mapred.job.tracker.http.address: localhost{50030|50020}
Can someone please explain why this is happening? How can it be fixed?

what is your hadoop version? If it is 0.20.2+ then there is no need of configuring the jobtracker as they have removed the seperate jobtracker functionality. You can find it at localhost:8088.
Remove the configuration and restart the node
If it is an older version then try to manually start it using:
$hadoop jobtracker
if it doe not start, post the error log here

Related

jps command only returns jps

I recently installed Hadoop 3.3.1 on Ubuntu (single node cluster) using this tutorial.
The start-dfs.sh and start-yarn.sh shows all the processing running properly.
However when I use the jps command, it returns only jps, with none of the other nodes, trackers showing.
I am new to Hadoop and lost on how to solve this. Any help is appreciated.

unable to see Task tracker and Jobtracker after Hadoop single node installation 2.5.1

Iam new to Hadoop 2.5.1. As i have already installed Hadoop 1.0.4 previously, i thought installation process would be same so followed following tutorial.
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
Every thing was fine, even i have given these settings in core-site.xml
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
But i have seen in several sites this value as 9000.
And also changes in yarn.xml.
Still everything works fine when i run a mapreduce job. But my question is
when i run command jps it gives me this output..
hduser#secondmaster:~$ jps
5178 ResourceManager
5038 SecondaryNameNode
4863 DataNode
5301 NodeManager
4719 NameNode
6683 Jps
I dont see task tracker and job tracker in jps. Where are these demons running.
And without these deamons how am i able to run Mapreduce job.
Thanks,
Sreelatha K.
From hadoop version hadoop 2.0 onwards, default processing framework has been changed to YARN from Classic Mapreduce. You are using YARN, where you cannot see Jobtracker, Tasker in YARN. Jobtracker and Tasktracker is replaced by Resource manager and Nodemanager respectively in YARN.
But still you have an option to use Classic Mapreduce framework instead of YARN.
In Hadoop 2 there is an alternative method to run MapReduce jobs, called YARN. Since you have made changes in yarn.xml, MapReduce processing happens using YARN, not using the traditional MapReduce framework. That's probably be the reason why you don't see TaskTracker and JobTracker listed after executing the jps command. Note that ResourceManager and NodeManager are the daemons for YARN.
YARN is next generation of Resource Manager who can able to integrate with Apache spark, storm and many more tools you can use to write map-reduce jobs

where is the hadoop task manager UI

I installed the hadoop 2.2 system on my ubuntu box using this tutorial
http://codesfusion.blogspot.com/2013/11/hadoop-2x-core-hdfs-and-yarn-components.html
Everything worked fine for me and now when I do
http://localhost:50070
I can see the management UI for HDFS. Very good!!
But the I am going through another tutorial which tells me that there must be a task manager UI running at http://mymachine.com:50030 and http://mymachine.com:50060
on my machine I cannot open these ports.
I have already done
start-dfs.sh
start-yarn.sh
start-all.sh
is something wrong? why can't I see the task manager UI?
You have installed YARN (MRv2) which runs the ResourceManager. The URL http://mymachine.com:50030 is the web address for the JobTracker daemon that comes with MRv1 and hence you are not able to see it.
To see the ResourceManager UI, check your yarn-site.xml file for the following property:
yarn.resourcemanager.webapp.address
By default, it should point to : resource_manager_hostname:8088
Assuming your ResourceManager runs on mymachine, you should see the ResourceManager UI at http://mymachine.com:8088/
Make sure all your deamons are up and running before you visit the URL for the ResourceManager.
For Hadoop 2[aka YARN/MRV2] - Any hadoop installation version-ed 2.x or higher its at port number 8088. eg. localhost:8088
For Hadoop 1 - Any hadoop installation version-ed lower than 2.x[eg 1.x or 0.x] its at port number 50030. eg localhost:50030
By default HadoopUI location is as below
http://mymachine.com:50070

Hadoop Configuration

I have started configuring Hadoop 2.1.0-beta version for single node. I followed steps mentioned in Michael Noll's Tutorial (http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/#configuring-single-node-clusters-first). Every thing I did and configured well. As a result of JPS, I got that NameNode, DataNode, Secondary NameNode started fine. Then I found out that there is no start-mapred.sh script. So I tried starting the jobtracker using hadoop-daemons.sh (hadoop-daemon.sh --config /home/nayan/dev/hadoop/etc/hadoop/ start jobtracker) and it resulted in failure with message "Sorry, the jobtracker command is no longer supported. You may find similar functionality with the "yarn" shell command.". I do not know what all configuration changes (if any) I need to make. I made changes in "yarn-site.xml" file, as suggested in Hadoop:The Definitive Guide. But could not proceed further. Where can I find out about Yarn. I checked Apache site, but could not figure it out.
You need to check your configuration xml files. Sometimes if you have any problrm in xml then some daemons wont start.
and try to use ./start-all.sh and then JPS
you can use start-yarn.sh to start the ResourceManger and Jobtracker daemons
I usually start everything using these two commands
./start-dfs.sh
./start-yarn.sh
You Should use start-dfs.sh for Hdfs Daemons and start-yarn.sh for Resource manager and nodemanager daemon both are in /bin of hadoop.
./start-dfs.sh or start-dfs.sh will start only HDFS components , while ./start-yarn.sh or start-yarn.sh will start Yarn component like NodeManager , Resource manager etc. If you don't want to start both the components separately , try using this command :
./start-all.sh or start-all.sh (This is deprecated command though).
To answer your question , use ./start-yarn.sh
Cheers!
First have to start the yarn daemons in the YARN( HADOOP 2.x) Environment.
So start with this
at /hadoop_installed_path/sbin$ ./start-yarn.sh
Once the yarn daemons started then we can start df daemons
at /hadoop_installed_path/sbin$ ./start-dfs.sh
1.You should check all the steps in Hadoop The definitive guide.
if it's all proper than use start-all.sh
than run jps.
2.some time You have to close console for reflecting your changes.so close the console and reopen it again and then try jps,
hope this will help.

Hadoop jobtracker UI not accessible

I've configured hadoop 1.0.4 in pseudo-distributed mode. Everything's good, I can put local files in HDFS and run wordcount task. But I just can't access the jobtracker web UI through localhost:50030, localhost:50070 doesn't work neither.
HTTP ERROR 404
Problem accessing /jobtracker.jsp. Reason:
/jobtracker.jsp Powered by Jetty://
I look at the log files, but there's no error...
I used to have some problem with datanode, and jobtracker complained about replication, but that is solved and now all daemons are good (namenode, datanode, jobtracker, tasktracker, secondarynamenode) and no error in any of the log files.
Any suggestions?
Ok finally I solved it myself, I had to re-install the system then re-install hadoop. I think the problem should be that I've previously installed the CDH4 on my system, which is hadoop 2.0.0 and even if I uninstalled all of its packages (debian system) and change the tmp folder of HDFS, but maybe there's still something left. The only way is to restart over.

Resources