HDP 2.5: Spark History Server UI won't show incomplete applications - hadoop

I set up a new Hadoop cluster with Hortonworks Data Platform 2.5. In the "old" cluster (running HDP 2.4) I was able to see information about running Spark jobs via the History Server UI by clicking the link "Show incomplete applications":
In the new installation this link opens the page, but it always says "No incomplete applications found!" (even when an application is still running).
I just noticed that the YARN ResourceManager UI shows two different kinds of links in the "Tracking UI" column, depending on the status of the Spark application:
application running: Application Master
this link opens http://master_url:8088/proxy/application_1480327991583_0010/
application finished: History
this link opens http://master_url:18080/history/application_1480327991583_0009/jobs/
Via the YARN RM link I can see the running Spark app's info, but why can't I access it via the Spark History Server UI? Did something change from HDP 2.4 to 2.5?

I solved it; it was a network problem: some of the cluster hosts (Spark slaves) couldn't reach each other due to an incorrect switch configuration. I found this out by trying to ping each host from every other host.
Now that all hosts can ping each other, the problem is gone and I can see active and finished jobs in my Spark History Server UI again!
I hadn't noticed the problem because the ambari-agents were working on each host, and the ambari-server was also reachable from every cluster host. However, once ALL hosts could reach each other, the problem was solved!
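A quick way to run that check is a small shell loop like the sketch below. The host names are placeholders for your own cluster nodes, and the script should be run from every node in turn, since a one-way switch problem can make reachability asymmetric:

```shell
# Hypothetical host list -- substitute the FQDNs of your own cluster nodes.
HOSTS="master.example.com slave1.example.com slave2.example.com"

REACHABLE=0
UNREACHABLE=0
for h in $HOSTS; do
  # One ping per host with a 2-second timeout; adjust for slow networks.
  if ping -c 1 -W 2 "$h" > /dev/null 2>&1; then
    echo "OK    $h"
    REACHABLE=$((REACHABLE + 1))
  else
    echo "FAIL  $h"
    UNREACHABLE=$((UNREACHABLE + 1))
  fi
done
echo "$REACHABLE reachable, $UNREACHABLE unreachable"
```

Any FAIL line on any node is worth investigating, even when Ambari itself looks healthy.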

Related

How do I start Ambari Hortonworks services?

I just installed the Hortonworks Sandbox via VirtualBox, and when I started Ambari every service was red, as you can see in this screenshot. Have I missed something? I'm a beginner in Hadoop.
Actually, when we start the HDP Sandbox, all services go into the starting stage except Storm, Atlas, and HBase. (This can be checked via the gear icon at the top right, where you can see the reason behind any failed services.)
Try to manually start the services in the following order:
Zookeeper
HDFS
YARN
MapReduce
Hive
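If you prefer not to click through the UI, the same start requests can be issued through Ambari's REST API. This is only a sketch: the Ambari URL, the cluster name Sandbox, and the admin:admin credentials are assumptions matching sandbox defaults, and the service names are Ambari's internal IDs (note MAPREDUCE2 for MapReduce):

```shell
# Assumed sandbox defaults -- adjust URL, cluster name, and credentials.
AMBARI_URL="http://localhost:8080"
CLUSTER_NAME="Sandbox"
SERVICES="ZOOKEEPER HDFS YARN MAPREDUCE2 HIVE"

for svc in $SERVICES; do
  # Ask Ambari to move the service to the STARTED state.
  curl -s -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
    -d "{\"RequestInfo\":{\"context\":\"Start $svc\"},\"Body\":{\"ServiceInfo\":{\"state\":\"STARTED\"}}}" \
    "$AMBARI_URL/api/v1/clusters/$CLUSTER_NAME/services/$svc" || true
  echo "requested start of $svc"
done
```

Each PUT is asynchronous; Ambari queues a request whose progress you can watch in the UI's operations panel.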

Some automatically launched Hadoop YARN applications

I'm new to Apache Hadoop. I've installed a YARN cluster with one master and two slaves on AWS. When I start the YARN cluster, I can observe that some applications are launched automatically by user dr.who with app type YARN. It bothers me a lot. I hope someone can help me out with this. Thanks!
application_1531399885156_0041 dr.who hadoop YARN default Thu Jul 12 14:58:37 +0200 2018 N/A ACCEPTED UNDEFINED ApplicationMaster 0
This is a known bug in recent Hadoop releases, and a JIRA has been created for it. The apps are submitted by dr.who, and when the user kills all the jobs, the NodeManager goes down.
EDIT: Problem Resolution
PROBLEM: The customer was unable to see logs via the Resource Manager UI due to incorrect permissions for the default user dr.who.
RESOLUTION: The customer changed the following property in core-site.xml to resolve the issue. Other values such as hdfs or mapred also resolve it. If the cluster is managed by Ambari, this should be added under Ambari > HDFS > Configurations > Advanced core-site > Add Property:
hadoop.http.staticuser.user=yarn
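In raw core-site.xml form (for clusters not managed by Ambari), that property looks like this; yarn is the value from the resolution above, and hdfs or mapred work as well:

```xml
<!-- core-site.xml: static user identity for the unsecured Hadoop web UIs.
     "yarn" is the value from the resolution above; "hdfs" or "mapred" also work. -->
<property>
  <name>hadoop.http.staticuser.user</name>
  <value>yarn</value>
</property>
```

Restart HDFS and YARN after the change so the web UIs pick up the new static user.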
The same thread was posted on Hortonworks and was answered by Sandeep Nemuri, who wrote:
Stop further attacks:
a. Use firewall / iptables settings to allow access to the Resource Manager port (default 8088) only from whitelisted IP addresses. Do this on both Resource Managers in your HA setup. This only addresses the current attack; to permanently secure your clusters, all HDP endpoints (e.g. WebHDFS) must be blocked from open access outside of firewalls.
b. Make your cluster secure (Kerberized).
Clean up existing attacks:
a. If you already see the above problem in your clusters, please filter all applications named "MYYARN" and kill them after verifying that they were not legitimately submitted by your own users.
b. You will also need to manually log in to the cluster machines, check for any process with "z_2.sh" or "/tmp/java" or "/tmp/w.conf", and kill them.
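A sketch of that manual check, to be run on each cluster machine. The patterns are the artifact names from the answer above; any kill should only happen after you have verified that the process is actually malicious:

```shell
# Look for processes referencing the dropper artifacts named in the answer.
for pattern in "z_2.sh" "/tmp/java" "/tmp/w.conf"; do
  # grep -F treats the pattern literally; grep -v grep drops our own pipeline.
  matches=$(ps -eo pid,args | grep -F "$pattern" | grep -v grep || true)
  if [ -n "$matches" ]; then
    echo "suspicious processes matching $pattern:"
    echo "$matches"
    # After verifying, kill by PID, e.g.: kill -9 <pid>
  else
    echo "no processes matching $pattern"
  fi
done
```

Also check cron entries and /tmp for the same names, since such miners commonly reinstall themselves.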
The link to that thread is: dr.who

Spark History UI not working | Ambari | YARN

I have a Hadoop cluster set up using Ambari, with services like HDFS, YARN, and Spark running on the hosts.
When I run the sample Spark Pi in cluster mode with master yarn, the application executes successfully and I can view it in the ResourceManager logs.
But when I click on the history link, it does not show the Spark History UI. How can I enable/view it?
First, check whether your Spark History Server is already configured by looking for spark.yarn.historyServer.address in the spark-defaults.conf file.
If not configured, this link should help you configure the server: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.6/bk_installing_manually_book/content/ch19s04s01.html
If it is already configured, check that the history server host is accessible from all the nodes in the cluster, and also that the port is open.
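Both checks can be scripted; the config path and history server host below are assumptions based on a typical HDP layout (with the default history server port 18080), so adjust them to your cluster:

```shell
# Assumed HDP-style config location -- adjust for your layout.
SPARK_CONF="/etc/spark/conf/spark-defaults.conf"

if [ -f "$SPARK_CONF" ]; then
  grep "spark.yarn.historyServer.address" "$SPARK_CONF" || echo "property not set"
else
  echo "no spark-defaults.conf at $SPARK_CONF"
fi

# Placeholder host; default HDP history server port is 18080.
# Run this from every node to confirm the port is reachable.
HS_HOST="master.example.com"
HS_PORT=18080
if nc -z -w 2 "$HS_HOST" "$HS_PORT" 2>/dev/null; then
  echo "history server port open"
else
  echo "cannot reach $HS_HOST:$HS_PORT"
fi
```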

Make spark environment for cluster

I made a Spark application that analyzes file data. Since the input data could be big, it's not enough to run my application standalone. With one more physical machine, how should I design the architecture for it?
I'm considering using Mesos as the cluster manager, but I'm pretty much a newbie with HDFS. Is there any way to do it without HDFS (for sharing file data)?
Spark supports several cluster modes: YARN, Mesos, and Standalone. You may start with Standalone mode, which means you work on your cluster's file system.
If you are running on Amazon EC2, you may refer to the following article in order to use Spark's built-in scripts that load a Spark cluster automatically.
If you are running in an on-prem environment, the way to run in Standalone mode is as follows:
- Start a standalone master:
./sbin/start-master.sh
- The master will print out a spark://HOST:PORT URL for itself. For each worker (machine) in your cluster, use that URL in the following command:
./sbin/start-slave.sh <master-spark-URL>
- To validate that the worker was added to the cluster, open http://localhost:8080 on your master machine; the Spark UI shows more info about the cluster and its workers.
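As an alternative to the browser check, the standalone master's web UI also serves a JSON summary at /json, so you can verify registered workers from the command line (the URL below assumes you run this on the master itself):

```shell
# Query the standalone master's JSON status endpoint (web UI port, 8080 by default).
MASTER_WEB_UI="http://localhost:8080"
# The response lists registered workers; fall back to a message if unreachable.
curl -s "$MASTER_WEB_UI/json" | grep -o '"workers"' \
  || echo "master UI not reachable at $MASTER_WEB_UI"
```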
There are many more parameters to play with. For more info, please refer to the documentation.
Hope I have managed to help! :)

Spark EC2 deployment error: Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up

I have a question regarding deploying a Spark application on a standalone EC2 cluster. I followed the Spark tutorial and was able to successfully deploy a standalone EC2 cluster. I verified this by connecting to the cluster UI and making sure that everything is as it is supposed to be. I developed a simple application and tested it locally; everything works fine. But when I submit it to the cluster (just changing --master local[4] into --master spark://...), I get the following error: ERROR TaskSchedulerImpl: Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up. Does anyone know how to overcome this problem? My deploy-mode is client.
Make sure that you have provided the correct URL for the master.
Basically, the exact Spark master URL is displayed on the page when you connect to the web UI.
The URL on the page is something like: Spark Master at spark://IPAddress:port
Also, you may notice that the web UI port and the port Spark runs on may be different.
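Putting that together, a submission against the standalone master might look like the sketch below. The install path, master host, application class, and jar path are all placeholders; the spark:// URL must be copied exactly from the "Spark Master at ..." line of the master's web UI (note that the web UI port, 8080 by default, is not the master port, 7077 by default):

```shell
# Assumed install path -- honored only if SPARK_HOME is not already set.
SPARK_HOME="${SPARK_HOME:-/usr/hdp/current/spark-client}"
# Placeholder -- copy the exact spark:// URL from the master's web UI.
MASTER_URL="spark://ec2-xx-xx-xx-xx.compute-1.amazonaws.com:7077"

"$SPARK_HOME/bin/spark-submit" \
  --master "$MASTER_URL" \
  --deploy-mode client \
  --class com.example.MyApp \
  /path/to/my-app.jar \
  || echo "submit failed -- check SPARK_HOME and that $MASTER_URL matches the web UI"
```

In client deploy-mode the driver runs where you invoke spark-submit, so that machine must also be able to reach the master and workers over the network.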
