Hadoop web interface fails to show job history

I can access most functionality of the Hadoop admin site, like below:
But when I try to visit the history of each application, I have no luck:
Does anybody know what has happened to my environment? Where should I check?
BTW, when I run "netstat -a" on my VM, I find no records for port 8088 or 19888, which seems very strange to me, because 8088 leads to the Hadoop main page and works fine.

In this web interface you can see your jobs in real time while they are running, or their history:
Once a MapReduce job finishes, the ResourceManager no longer keeps track of it. That is the job of the JobHistoryServer.
Your JobHistoryServer (an optional part of Hadoop YARN) does not seem to be running.
It is this service that listens on port 19888.
You can launch it with the command: /etc/init.d/hadoop-mapreduce-historyserver start
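If your installation does not have that init script (for example a plain Apache tarball install), a rough equivalent, assuming $HADOOP_HOME points at your Hadoop directory, is to use the bundled daemon script and then verify that the server is up:
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
jps | grep JobHistoryServer
netstat -ntlp | grep 19888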

Related

Cannot start Ambari services

I installed the Hortonworks Sandbox via VirtualBox, and when I start Ambari every service is stopped, as you can see in this screenshot.
I tried to start each of the services manually, but nothing happens when I click the start button. On top of that, I have many errors in my notifications section.
Also, this is what my Ambari agent logs look like: log1 log2
Any idea how I can resolve this?
@emeric
You must first start HDFS. You may get warnings that require executing this command:
hdfs dfsadmin -safemode leave
Once HDFS is started, you can begin starting YARN and the other services. If you do not have enough resources in your environment you will not be able to start everything, so only start what you need.
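As a quick sanity check (these are standard HDFS commands, not specific to the Sandbox), you can first see whether HDFS is actually stuck in safe mode before forcing it out:
hdfs dfsadmin -safemode get
hdfs dfsadmin -safemode leave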

Calculate time taken by reducers in Hadoop

I am running a MapReduce job in Hadoop 2.7.3 in a single node cluster. How do I calculate the time taken by the map and reduce tasks of this job?
SOLVED
In case it helps anyone who views this question or faces a similar problem.
Thanks to @Shubham's answer and a little research I did:
The JobTracker has been removed in Hadoop 2; its role has been split between the ResourceManager and the ApplicationMaster.
To access the ResourceManager, type the URL "http://localhost:8088" into your browser.
To access the Job History Server (to view statistics about the applications and jobs that have been completed), type the URL "http://localhost:19888" into your browser.
You may encounter an error when trying to access the Job History Server: it may show that there is no history for the application. In that case, follow these steps:
1. Change the bashrc file
Steps:
i. In your terminal, type "nano ~/.bashrc"
ii. In this file, where the other Hadoop variables are defined, add the following line (a consolidated sketch of these exports follows step iv):
export HADOOP_CONFIG_DIR=/usr/local/hadoop/etc/hadoop
iii. Save the file and exit nano.
iv. Run the command "source ~/.bashrc"
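Pulling the steps above together, a typical Hadoop-related block in ~/.bashrc might look like the sketch below; the /usr/local/hadoop paths are only an assumption for a common single-node install, so adjust them to wherever Hadoop lives on your machine:
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONFIG_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin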
2. To start the job history server
Steps:
i. Run the command in your terminal
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh --config $HADOOP_CONFIG_DIR start historyserver
ii. Then run the command
jps
You should be able to see the "JobHistoryServer" in the list
iii. Now run the command
netstat -ntlp | grep 19888
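If nothing shows up on 19888, the history server probably failed to start; its log usually ends up under $HADOOP_HOME/logs (the exact file name varies with your user and host name), so a quick way to look for the reason is something like:
ls $HADOOP_HOME/logs | grep -i historyserver
tail -n 50 $HADOOP_HOME/logs/*historyserver*.log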
Hit the ResourceManager's web UI (http://rm_http_address_host:port/). Typically the web port is 8088, so you can hit http://resourcemanager_host:8088/ for this.
There you will find links to all the applications in their various states (STARTED, RUNNING, FAILED, SUCCEEDED, etc.).
Clicking on an application's link gives you all the statistics about that YARN job: the number of containers (mappers/reducers in the case of MapReduce), memory/vcores used, running time, and a lot more.
Many more stats are exposed by the ResourceManager REST APIs. Find them here: https://hadoop.apache.org/docs/r2.7.3/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html
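If you prefer to script this instead of clicking through the UI, the same numbers are available over HTTP. A minimal sketch, assuming the defaults of 8088 for the ResourceManager and 19888 for the JobHistoryServer on localhost:
# list applications known to the ResourceManager (elapsed time, state, etc.)
curl -s http://localhost:8088/ws/v1/cluster/apps
# per-task timings (map vs reduce) for a finished MapReduce job; replace <job_id> with your job's id
curl -s http://localhost:19888/ws/v1/history/mapreduce/jobs/<job_id>/tasks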
You can go to the JobTracker (which runs on port 50030 by default) and check the job details. It shows counters for map time and reduce time. Moreover, if you are interested in individual tasks, you can follow the "Analyse This Job" link, which shows the best- and worst-performing tasks.

Spark Shell stuck in YARN Accepted state

Running Spark 1.3.1 on YARN on EMR. When I run the spark-shell, everything looks normal until I start seeing messages like INFO yarn.Client: Application report for application_1439330624449_1561 (state: ACCEPTED). These messages are generated endlessly, once per second. Meanwhile, I am unable to use the Spark shell.
I don't understand why this is happening.
Seeing (near) endless Accepted messages from YARN has always been a sure sign that there were not enough cluster resources to allocate for my Spark jobs / shell. YARN will continue trying to schedule your Spark application, but will eventually time-out if not enough resources become available in a certain amount of time.
Are you providing any command line options to spark-shell that override the defaults provided? When I ask for too many executors/cores/memory YARN will accept my request but never transition to a Running ApplicationMaster.
Try running a spark-shell with no options (other than perhaps --master yarn) and see if it gets past Accepted.
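For example, something as bare as the first command below should be enough to see whether a minimal shell gets past ACCEPTED; if it does, add your resource flags back one at a time. The second invocation is only an illustration (the numbers are mine, not from the question) of the kind of request YARN may accept but never be able to schedule:
./bin/spark-shell --master yarn
./bin/spark-shell --master yarn --num-executors 50 --executor-memory 16g --executor-cores 8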
I realized there were a couple of streaming jobs I had killed in the terminal, but I guess they were somehow still running. I was able to find them in the UI showing all running applications on YARN (I wasn't able to execute Hive queries either). Once I killed the jobs using the command below, the spark-shell started as usual.
yarn application -kill application_1428487296152_25597
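If you are not sure which applications are still holding resources, you can list them from the command line and kill the stragglers the same way; a minimal sketch:
yarn application -list -appStates RUNNING,ACCEPTED
# then, for each stale application id:
yarn application -kill <application_id>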
My guess is that YARN does not have enough resources to run the jobs.
Please check
https://www.cloudera.com/documentation/enterprise/5-3-x/topics/cdh_ig_yarn_tuning.html
for calculating how many resources you can provide to YARN.
Also check the number of cores and the amount of RAM, which are controlled by the following properties:
yarn.nodemanager.resource.cpu-vcores
yarn.nodemanager.resource.memory-mb
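To see what your NodeManagers are currently allowed to hand out, you can grep those properties out of yarn-site.xml; the /etc/hadoop/conf path below is only an assumption (it is the usual location on CDH, which the linked guide targets), so substitute your own configuration directory:
grep -B1 -A2 'yarn.nodemanager.resource' /etc/hadoop/conf/yarn-site.xml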

JobTracker web UI - not working in pseudo-distributed mode v 2.7.1

I have installed Hadoop 2.7.1 in pseudo-distributed mode (all daemons on a single machine). It's up and running, I'm able to access HDFS through the command line, and I can run jobs and see their output.
I can access http://localhost:50070/dfshealth.html#tab-overview. It shows the version and cluster status, and I can browse the Hadoop file system.
I found one link and applied its accepted solution, but that does not work for me. When I try to access http://127.0.0.1:54310, I get the error message below:
It looks like you are making an HTTP request to a Hadoop IPC port. This is
not the correct port for the web interface on this daemon.
Any help is appreciated.
Thanks..
I am using MR2 and am not able to track my job on 8088. When I run a MapReduce job, it submits the job on http://localhost:8080 and that URL does not open to track the job.
Use port 50030 if you are using MRv1; for YARN, use port 8088 to access the ResourceManager.
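To confirm which web ports your daemons are actually bound to (rather than guessing between 8080, 8088 and 50030), something like the following helps; the property names are the standard YARN/MapReduce ones, and 8088/19888 are only their defaults:
sudo netstat -ntlp | grep java
# the ResourceManager web UI port comes from yarn.resourcemanager.webapp.address (default 8088)
# the JobHistoryServer web UI port comes from mapreduce.jobhistory.webapp.address (default 19888)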

Submitting jobs to Spark EC2 cluster remotely

I've set up an EC2 cluster with Spark. Everything works; all masters/slaves are up and running.
I'm trying to submit a sample job (SparkPi). When I ssh to the cluster and submit it from there, everything works fine. However, when the driver is created on a remote host (my laptop), it doesn't work. I've tried both modes for --deploy-mode:
--deploy-mode=client:
From my laptop:
./bin/spark-submit --master spark://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:7077 --class SparkPi ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar
Results in the following indefinite warnings/errors:
WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/02/22 18:30:45 ERROR SparkDeploySchedulerBackend: Asked to remove non-existent executor 0
15/02/22 18:30:45 ERROR SparkDeploySchedulerBackend: Asked to remove non-existent executor 1
...and failed drivers appear in the Spark web UI under "Completed Drivers" with "State=ERROR".
I've tried passing limits for cores and memory to the submit script, but it didn't help...
--deploy-mode=cluster:
From my laptop:
./bin/spark-submit --master spark://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:7077 --deploy-mode cluster --class SparkPi ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar
The result is:
.... Driver successfully submitted as driver-20150223023734-0007
... waiting before polling master for driver state
... polling master for driver state
State of driver-20150223023734-0007 is ERROR
Exception from cluster was: java.io.FileNotFoundException: File file:/home/oleg/spark/spark12/ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar does not exist.
java.io.FileNotFoundException: File file:/home/oleg/spark/spark12/ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:329)
    at org.apache.spark.deploy.worker.DriverRunner.org$apache$spark$deploy$worker$DriverRunner$$downloadUserJar(DriverRunner.scala:150)
    at org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:75)
So I'd appreciate any pointers on what is going wrong and some guidance on how to deploy jobs from a remote client. Thanks.
UPDATE:
So for the second issue, in cluster mode the file must be visible to every cluster node, so it has to be in a location they can all access. This solves the FileNotFoundException but leads to the same issue as in client mode.
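For example, one way to make the jar visible to every node (assuming HDFS is running on the cluster; the HDFS path below is a placeholder, not from the original setup) is to push it to HDFS and submit an hdfs:// URL instead of a local path:
hdfs dfs -put ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar /user/oleg/
./bin/spark-submit --master spark://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:7077 \
  --deploy-mode cluster --class SparkPi \
  hdfs:///user/oleg/ec2test_2.10-0.0.1.jar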
The documentation at:
http://spark.apache.org/docs/latest/security.html#configuring-ports-for-network-security
lists all the different communication channels used in a Spark cluster. As you can see, there are a bunch where the connection is made from the Executor(s) to the Driver. When you run with --deploy-mode=client, the driver runs on your laptop, so the executors will try to make a connection to your laptop. If the AWS security group that your executors run under blocks outbound traffic to your laptop (which the default security group created by the Spark EC2 scripts doesn't), or you are behind a router/firewall (more likely), they fail to connect and you get the errors you are seeing.
So to resolve it, you have to forward all the necessary ports to your laptop or reconfigure your firewall to allow connections on those ports. Since a number of the ports are chosen at random, this means opening up a wide range of ports, if not all of them. So using --deploy-mode=cluster, or running the client from the cluster, is probably less painful.
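If you do go the forwarding route, you can at least reduce the randomness by pinning the driver-side ports with Spark configuration, so that only a handful of ports need to be reachable from the executors. A sketch with arbitrary example port numbers (spark.driver.port and spark.blockManager.port are standard Spark properties; the choice of 7001/7003 is mine):
./bin/spark-submit --master spark://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:7077 \
  --conf spark.driver.port=7001 \
  --conf spark.blockManager.port=7003 \
  --class SparkPi ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar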
I advise against submitting Spark jobs remotely using the port-opening strategy, because it can create security problems and is, in my experience, more trouble than it's worth, especially when it comes to troubleshooting the communication layer.
Alternatives:
1) Livy - now an Apache project! http://livy.io or http://livy.incubator.apache.org/
2) Spark Job server - https://github.com/spark-jobserver/spark-jobserver
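As an illustration of the first alternative: once a Livy server is running next to the cluster, a remote client only needs plain HTTP to reach it. A minimal sketch against Livy's batch API (8998 is Livy's default port; the host name and jar path are placeholders):
curl -s -X POST -H 'Content-Type: application/json' \
  -d '{"file": "hdfs:///path/to/ec2test_2.10-0.0.1.jar", "className": "SparkPi"}' \
  http://livy-host:8998/batches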
