Task tracker not running, the job is scheduled but doesn't run. How to fix? - hadoop

I have been running some benchmarks; I am new to Hadoop and HDFS. I had the setup running and things were working fine. But now I am faced with this issue: jps on the master shows
1. secondary name node
2. job tracker
but not the name node and task tracker.
Similarly, jps on the slave nodes shows only the name node; the task tracker is not running.
I usually run the job as a regular user, not root, but I mistakenly ran it as root. When I exited and ran the job as the user again, I found the job doesn't start, and with jps I saw that the task tracker is not running.
I am new to HDFS and not sure how to debug and solve this. It would be great if you could give some pointers on this; I did try Google and couldn't find relevant answers.
Edit: I tried clearing the tmp files, killing obsolete java processes and restarting. I still get the same issue.
Thanks.

What worked for me, after stopping the cluster:
1. Kill all leftover java processes.
2. Remove the hadoop pid files under /tmp.
3. Check for file permission errors; looking at the hadoop/logs/*.log files on the name node and data node gave me the useful information for debugging the issue.
This link was helpful:
http://felixtechnique.blogspot.com/2010/09/no-namenode-to-stop-no-tasktracker-to.html
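For reference, here is a minimal sketch of that cleanup sequence on a Hadoop 1.x style install (assuming $HADOOP_HOME is set and the default pid location under /tmp; the pid file glob and <pid> are placeholders):
# stop the cluster first
$HADOOP_HOME/bin/stop-all.sh
# list leftover hadoop daemons and kill any stale ones
jps
kill -9 <pid>
# remove stale pid files written under /tmp, e.g. from the run as root
rm -f /tmp/hadoop-*.pid
# scan the daemon logs for permission errors before restarting
grep -i "permission" $HADOOP_HOME/logs/*.log
$HADOOP_HOME/bin/start-all.sh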

Related

Job tracker and Task tracker don't show up when running the start-all.sh command in Ubuntu for Hadoop
I do get the rest of the processes when I run the jps command in Unix.
Not sure why the job tracker and task tracker are not shown. I have been following a couple of links and couldn't get my problem sorted.
Steps done:
- Formatted the namenode multiple times.
- Deleted and recreated the tmp folder with appropriate permissions multiple times.
What could be the issue?
Any suggestions would really help me, as I am struggling to set up Hadoop on my laptop. I am new to it though.
Try starting the jobtracker and tasktracker separately.
From your hadoop HOME directory, run
. bin/../libexec/hadoop-config.sh
Then from the hadoop BIN directory, run
hadoop-daemon.sh --config $HADOOP_CONF_DIR start jobtracker
hadoop-daemon.sh --config $HADOOP_CONF_DIR start tasktracker
You must be using a Hadoop 2.x version, where the jobtracker is replaced with the YARN resource manager. Using jps (a JDK is needed) you can check whether the resource manager is running. If it is running, the default URL for it is (host-name):8088; you can check your nodes, jobs and configuration there. If it is not running, start the YARN daemons with sbin/start-yarn.sh.
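As a rough sketch of those checks (assuming a Hadoop 2.x install with $HADOOP_HOME pointing at it; the host name is a placeholder):
# check whether the YARN daemons are up; look for ResourceManager / NodeManager
jps
# if they are not running, start YARN
$HADOOP_HOME/sbin/start-yarn.sh
# then browse the resource manager UI at http://<host-name>:8088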

mapreduce tasks only run on namenode

I have built a Hadoop cluster on three machines; these are the characteristics:
OS: Ubuntu 14.04 LTS
Hadoop: 2.6.0
NameNode and ResourceManager IP: namenode/192.168.0.100
DataNodes, also acting as NodeManagers, IPs: data1/192.168.0.101, data2/192.168.0.102
I have configured all the XML files as in the official docs. When I execute the wordcount example program from Eclipse, I want to show which machine is running the map task or reduce task, so here is my code snippet.
// requires: import java.net.InetAddress;
// get the host this task is running on
InetAddress mLocalHost = InetAddress.getLocalHost();
System.out.println("Task on " + mLocalHost);
The snippet above was put into the map and reduce functions and the job was run on Hadoop. Nevertheless the console always shows:
Task on namenode/192.168.0.100
From my perspective, these tasks should run on data1 or data2. Can you explain the puzzle? What is wrong with my cluster?
What's more:
the jobHistory (namenode:19888) records nothing,
and the webAppProxy (namenode:8088) just shows active nodes: 2, but no more information about the job.
Can you help me? Really appreciated.
Further info from the namenode below; jps shows:
12647 Jps
11426 SecondaryNameNode
11217 NameNode
11585 ResourceManager
12033 JobHistoryServer
Where did you put that code? Is it in your Driver class? You need to have it in your mapper or reducer so that you can see which node is doing the processing.
Instead of that, you can have a look at the resource manager web UI at rmipaddress:8088, which will give you more details on which node is executing mappers, along with other logs.
I have found what was wrong. "Run on Hadoop" in Eclipse just starts the job locally, so I should modify the MyHadoopXML.xml file which is under the Eclipse plugin's sub-directory. Otherwise, I can just develop and debug the mapreduce job locally, export the project into a jar, and then run the jar with the "hadoop jar" command on the cluster to verify whether the job executes successfully.
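For illustration, a minimal sketch of that workflow; the jar name, main class and HDFS paths are hypothetical placeholders:
# copy the input to HDFS, run the exported jar on the cluster, inspect the output
hadoop fs -put input.txt /user/hadoop/input
hadoop jar wordcount.jar com.example.WordCount /user/hadoop/input /user/hadoop/output
hadoop fs -cat /user/hadoop/output/part-r-00000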

Jobtracker is not up

I have installed Hadoop on CentOS. I have changed the host name to slave2. I have also modified
core-site.xml and the mapred config files, but the job tracker, data node and task tracker are not starting. Please advise.
Check your daemon logs. They should point you to where the problem is.
When you start hadoop, you see messages like this:
starting tasktracker, logging to /home/username/hadoop/logs/hadoop-user-tasktracker-user-desktop.out
Open the corresponding .log file; it should give you a clue.
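For example, a quick way to inspect those logs (the paths follow the message above and assume a tarball install under /home/username/hadoop; adjust to your layout):
# tail the tasktracker and datanode logs and look for exceptions
tail -n 100 /home/username/hadoop/logs/hadoop-*-tasktracker-*.log
tail -n 100 /home/username/hadoop/logs/hadoop-*-datanode-*.log
grep -i "error\|exception" /home/username/hadoop/logs/*.log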

Need help adding multiple DataNodes in pseudo-distributed mode (one machine), using Hadoop-0.18.0

I am a student, interested in Hadoop and started to explore it recently.
I tried adding an additional DataNode in the pseudo-distributed mode but failed.
I am following the Yahoo developer tutorial and so the version of Hadoop I am using is hadoop-0.18.0
I tried to start up using 2 methods I found online:
Method 1 (link)
I have a problem with this line
bin/hadoop-daemon.sh --script bin/hdfs $1 datanode $DN_CONF_OPTS
--script bin/hdfs doesn't seem to be valid in the version I am using. I changed it to --config $HADOOP_HOME/conf2, with all the configuration files in that directory, but when the script is run it gives the error:
Usage: Java DataNode [-rollback]
Any idea what this error means? The log files are created but the DataNode did not start.
Method 2 (link)
Basically I duplicated the conf folder into a conf2 folder, making the necessary changes documented on the website to hadoop-site.xml and hadoop-env.sh. Then I ran the command
./hadoop-daemon.sh --config ..../conf2 start datanode
it gives the error:
datanode running as process 4190. stop it first.
So I guess this message refers to the 1st DataNode that was started, and the command failed to start another DataNode.
Is there anything I can do to start additional DataNode in the Yahoo VM Hadoop environment? Any help/advice would be greatly appreciated.
Hadoop start/stop scripts use /tmp as the default directory for storing the PIDs of already started daemons. In your situation, when you start the second datanode, the startup script finds the /tmp/hadoop-someuser-datanode.pid file from the first datanode and assumes that the datanode daemon is already started.
The plain solution is to set the HADOOP_PID_DIR env variable to something else (but not /tmp). Also do not forget to update all the network port numbers in conf2.
The smarter solution is to start a second VM with a hadoop environment and join the two in a single cluster. That is the way hadoop is intended to be used.
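A minimal sketch of the plain solution, assuming the second config lives in conf2; the pid directory path here is just an example:
# in conf2/hadoop-env.sh, give the second datanode its own pid directory
export HADOOP_PID_DIR=/home/hadoop/pids-dn2
# also change the datanode ports and data dirs in conf2/hadoop-site.xml
# then create the directory and start the second datanode
mkdir -p /home/hadoop/pids-dn2
./hadoop-daemon.sh --config $HADOOP_HOME/conf2 start datanode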

Hadoop tasks: "execvp: permission denied"

In a small Hadoop cluster set up on a number of developer workstations (i.e., they have different local configurations), I have one TaskTracker out of 6 that is problematic. Whenever it receives a task, that task immediately fails with a Child Error:
java.lang.Throwable: Child Error
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:242)
Caused by: java.io.IOException: Task process exit with nonzero status of 1.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:229)
When I look at the stdout and stderr logs for the task, the stdout log is empty, and the stderr log only has:
execvp: Permission denied
My jobs complete because the tasktracker eventually gets blacklisted and the tasks run on the other nodes, which have no problem running them. I am not able to get any tasks running on this one node, from any number of jobs, so this is a universal problem.
I have a DataNode running on this node with no issues.
I imagine there might be some sort of Java issue here where it is having a hard time spawning a JVM or something...
We had the same problem. We fixed it by adding the 'execute' permission to the file below:
$JAVA_HOME/jre/bin/java
because hadoop uses $JAVA_HOME/jre/bin/java to spawn the task program instead of $JAVA_HOME/bin/java.
If you still have this issue after changing the file mode, I suggest you use remote debugging to find the shell command that spawns the task; see debugging hadoop task.
Whatever it is trying to execvp does not have the executable bit set on it. You can set the executable bit using chmod from the command line.
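A quick sketch of that check and fix, assuming the JVM being spawned is the one under $JAVA_HOME:
# check the execute bits on the java binaries the tasktracker may call
ls -l $JAVA_HOME/bin/java $JAVA_HOME/jre/bin/java
# add the execute bit if it is missing (may need sudo)
chmod a+x $JAVA_HOME/jre/bin/java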
I have encountered the same problem. You can try switching the JDK from 32-bit to 64-bit, or from 64-bit to 32-bit.
