Jobtracker is not up - hadoop

I have installed Hadoop on CentOS and changed the host name to slave2. I have also modified
core-site.xml and the mapred configuration files, but the JobTracker, DataNode, and TaskTracker are not starting. Please advise.

Check your daemon logs. They should point you to where the problem is.
When you start Hadoop, you see messages like this:
starting tasktracker, logging to /home/username/hadoop/logs/hadoop-user-tasktracker-user-desktop.out
Open the corresponding .log file; it should give you a clue.
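For example, on a typical install the .log files sit next to that .out file; a quick way to spot the failure (a sketch, assuming the default $HADOOP_HOME/logs location, otherwise whatever HADOOP_LOG_DIR points to) is:
cd $HADOOP_HOME/logs
ls -lt | head                                        # most recently written logs first
tail -n 100 hadoop-*-tasktracker-*.log               # same pattern for the datanode/jobtracker logs
grep -iE "error|exception" hadoop-*-datanode-*.log
A bind or unknown-host error here usually means core-site.xml or mapred-site.xml still references the old hostname.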

Related

Secondary name node is not displaying when I run the jps command

I have Hadoop 3.1.3 running in pseudo-distributed mode; I can upload a file to HDFS and also display its contents.
But when I run the jps command I get the following output:
10912 DataNode
13072 ResourceManager
4480 NodeManager
6584 Jps
664 NameNode
I cannot find the SecondaryNameNode. Is there a problem with my configuration or with the Hadoop installation?
Are you assuming that the SecondaryNameNode is started in pseudo-distributed mode?
If the basic commands work, then it's fine.
You need to look at the log files to know if something is broken before asking elsewhere.
In general, I always suggest using Apache Ambari to provision a Hadoop cluster.
You can start the SecondaryNameNode manually and watch the startup logs to see if there's anything wrong:
hdfs secondarynamenode
If there's no error, run jps again and you should hopefully see SecondaryNameNode listed.
I'd suggest running hdfs --help and checking out all of the options; there's a lot of good stuff there.
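Since this is Hadoop 3.x, you can also start it as a background daemon instead of in the foreground; a minimal sketch, assuming the hdfs script from your 3.1.3 install is on the PATH:
hdfs --daemon start secondarynamenode    # run it in the background (Hadoop 3.x syntax)
jps                                      # SecondaryNameNode should now be listed
hdfs --daemon stop secondarynamenode     # stop it again if you were only testing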

Job tracker and Task tracker don't show up when running the start-all.sh command in Ubuntu for Hadoop

I do get the rest of the processes when I run the jps command.
I am not sure why the JobTracker and TaskTracker are not shown. I have been following a couple of links and couldn't get my problem sorted.
Steps done:
- Formatted the NameNode multiple times.
- Deleted and recreated the tmp folder with appropriate permissions multiple times.
What could be the issue?
Any suggestions would really help, as I am struggling to set up Hadoop on my laptop. I am new to it.
Try starting the JobTracker and TaskTracker separately.
From your Hadoop home directory run
. bin/../libexec/hadoop-config.sh
Then from the Hadoop bin directory run
hadoop-daemon.sh --config $HADOOP_CONF_DIR start jobtracker
hadoop-daemon.sh --config $HADOOP_CONF_DIR start tasktracker
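To confirm they stayed up, a quick check (assuming the default JobTracker web UI port of 50030) would be:
jps | grep -E "JobTracker|TaskTracker"                              # both should be listed now
curl -sL -o /dev/null -w "%{http_code}\n" http://localhost:50030/   # 200 means the JobTracker web UI answered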
You must be using a Hadoop 2.x version, where the JobTracker is replaced by the YARN ResourceManager. Using jps (a JDK is needed) you can check whether the ResourceManager is running. If it is running, its default web URL is (host-name):8088, where you can check your nodes, jobs, and configuration. If it is not running, start the YARN daemons with sbin/start-yarn.sh.
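As a concrete check on a 2.x install, a sketch (run from the Hadoop install directory, assuming default ports):
sbin/start-yarn.sh                                  # starts the ResourceManager and NodeManagers
jps                                                 # ResourceManager / NodeManager should appear
curl -sL http://$(hostname):8088/cluster | head     # the ResourceManager web UI mentioned above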

mapreduce tasks only run on namenode

I have built a Hadoop cluster on three machines; these are the characteristics:
OS: Ubuntu 14.04 LTS
Hadoop: 2.6.0
NameNode and ResourceManager IP: namenode/192.168.0.100
DataNodes, also acting as NodeManagers: data1/192.168.0.101, data2/192.168.0.102
I have configured all the XML files as in the official docs. When I execute the WordCount example program from Eclipse, I want to print which machine is running the map or reduce task, so here is my code snippet:
// get the host this task is running on
// (requires java.net.InetAddress; getLocalHost() throws UnknownHostException)
InetAddress mLocalHost = InetAddress.getLocalHost();
System.out.println("Task on " + mLocalHost);
The snippet above was put into both the map and reduce functions and run on Hadoop. Nevertheless, the console always shows:
Task on namenode/192.168.0.100
From my perspective, these tasks should run on data1 or data2. Can you explain the puzzle? What's wrong with my cluster?
What's more:
the job history server (namenode:19888) records nothing, and
the web app proxy (namenode:8088) just shows 2 active nodes, but no further information about the job.
Can you help me? Really appreciated.
Further info from the namenode; the jps command shows:
12647 Jps
11426 SecondaryNameNode
11217 NameNode
11585 ResourceManager
12033 JobHistoryServer
Where did you put that code? Is it in your driver class? You need to have it in your mapper or reducer so that you can see which node is processing.
Instead of that, you can have a look at the ResourceManager web UI at rmipaddress:8088, which will give you more details on which node is executing the mappers, along with other logs.
I have found what was wrong. "Run on Hadoop" in Eclipse just starts the job locally, so I should modify the MyHadoopXML.xml file under the Eclipse plugin's sub-directory. Otherwise, I just develop and debug the MapReduce job locally, export the project into a jar, and then run the jar with the "hadoop jar" command on the cluster to verify whether the job executes successfully.
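For example, the cluster-side run might look like this (a sketch; the jar name, main class, and HDFS paths are placeholders for whatever your project exports):
hadoop fs -put input.txt /user/hadoop/input               # hypothetical input path
hadoop jar wordcount.jar WordCount /user/hadoop/input /user/hadoop/output
yarn application -list                                    # or watch the job on namenode:8088
Submitted this way, the map and reduce tasks should report data1 or data2 rather than the machine running Eclipse.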

Hadoop Configuration

I have started configuring Hadoop 2.1.0-beta for a single node. I followed the steps in Michael Noll's tutorial (http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/#configuring-single-node-clusters-first). Everything was configured well: as a result of jps, I saw that the NameNode, DataNode, and Secondary NameNode started fine. Then I found out that there is no start-mapred.sh script, so I tried starting the JobTracker using hadoop-daemon.sh (hadoop-daemon.sh --config /home/nayan/dev/hadoop/etc/hadoop/ start jobtracker), and it failed with the message "Sorry, the jobtracker command is no longer supported. You may find similar functionality with the "yarn" shell command.".
I do not know what configuration changes (if any) I need to make. I made changes in the "yarn-site.xml" file, as suggested in Hadoop: The Definitive Guide, but could not proceed further. Where can I find out about YARN? I checked the Apache site, but could not figure it out.
You need to check your configuration XML files. Sometimes, if there is any problem in the XML, some daemons won't start.
Also try using ./start-all.sh and then jps.
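In particular, for MapReduce to run on YARN in 2.x, the configuration normally needs at least these two properties (a sketch; the usual files are etc/hadoop/mapred-site.xml and etc/hadoop/yarn-site.xml):
In mapred-site.xml:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
In yarn-site.xml:
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
(On the early 2.1.0-beta builds the aux-service value was still mapreduce.shuffle; it became mapreduce_shuffle in later 2.x releases.)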
You can use start-yarn.sh to start the ResourceManager and NodeManager daemons (in 2.x, YARN takes over the JobTracker's role).
I usually start everything using these two commands:
./start-dfs.sh
./start-yarn.sh
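After both scripts finish, jps on a healthy single-node 2.x setup should show roughly this set (the PIDs are only illustrative):
2915 NameNode
3076 DataNode
3312 SecondaryNameNode
3533 ResourceManager
3695 NodeManager
3901 Jps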
You should use start-dfs.sh for the HDFS daemons and start-yarn.sh for the ResourceManager and NodeManager daemons; both scripts are in Hadoop's sbin directory.
./start-dfs.sh (or start-dfs.sh) will start only the HDFS components, while ./start-yarn.sh (or start-yarn.sh) will start the YARN components such as the NodeManager and ResourceManager. If you don't want to start the two sets of components separately, try this command:
./start-all.sh or start-all.sh (this is a deprecated command, though).
To answer your question, use ./start-yarn.sh.
Cheers!
You first have to start the YARN daemons in a YARN (Hadoop 2.x) environment.
So start with this:
at /hadoop_installed_path/sbin$ ./start-yarn.sh
Once the YARN daemons have started, you can start the DFS daemons:
at /hadoop_installed_path/sbin$ ./start-dfs.sh
1. You should check all the steps in Hadoop: The Definitive Guide. If everything is proper, use start-all.sh and then run jps.
2. Sometimes you have to close the console for your changes to take effect, so close the console, reopen it, and then try jps again.
Hope this will help.

Task tracker not running, the job is scheduled but doesn't run. How to fix?

I have been running some benchmarks, and I am new to Hadoop and HDFS. I got the setup running and things were working fine. But now I am facing this issue: jps on the master shows
1. secondary name node
2. job tracker
but not the name node and task tracker.
Similarly, jps on the slave nodes shows only the name node, but the task tracker is not running.
I usually run the job as a regular user and not root, but I mistakenly ran it as root; when I exited and ran the job as the user again, I found the job doesn't start, and with jps I found the task tracker is not running.
I am new to HDFS and not sure how to debug and solve this. It would be great if you could give some pointers on this one; I did try Google and couldn't find relevant answers.
Edit: I tried clearing tmp files, killing obsolete Java processes, and restarting. I still get the same issue.
Thanks.
After stopping the cluster:
- Kill all leftover Java processes.
- Remove the Hadoop PID files under /tmp.
- Check for file permission errors; looking at the hadoop/logs/*.log files on the name node and data node gave me useful information for debugging the issue.
This link was helpful:
http://felixtechnique.blogspot.com/2010/09/no-namenode-to-stop-no-tasktracker-to.html
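Roughly, that sequence looks like this (a sketch; /tmp as the PID directory and hadoop/logs as the log directory are the defaults and may differ on your install):
bin/stop-all.sh                               # stop whatever is still responding
jps                                           # anything left over...
kill -9 <leftover-pid>                        # ...kill by hand (placeholder PID)
rm -f /tmp/hadoop-*.pid                       # stale PID files block the next start
bin/start-all.sh
tail -n 100 logs/hadoop-*-tasktracker-*.log   # permission errors show up here
If a run as root left PID, log, or tmp files owned by root, remove or chown them as root before restarting as the normal user.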

Resources