I have an Ambari cluster that manages my Hadoop/Spark jobs, and I want to schedule my workflows using an Oozie editor. Hue is the most popular and easiest to use. How do I install Hue on top of an existing Hadoop cluster managed by Ambari?
Thanks
Hue is a service created by Cloudera. You cannot install it using Ambari, but you can download a package with Hue and install it based on official documentation. You should check this article - Installing Hue 3.9 on HDP 2.3
Hue is commonly used to deploy Oozie jobs through its Oozie editor. What alternative is there when using Hortonworks Ambari? I want to deploy Oozie jobs but also avoid the Oozie CLI client.
Latest versions of Ambari support the Oozie Workflow Manager.
https://docs.hortonworks.com/HDPDocuments/Ambari-2.7.1.0/bk_workflow-management/content/ch_wfm_basics.html
Or, you could download and install/configure Hue on your cluster; Ambari doesn't need to be the central configuration component of the cluster.
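If the goal is mainly to avoid the Oozie CLI, another route is the Oozie web services API, which accepts a job configuration over HTTP. Below is a minimal Python sketch of how such a request could be assembled. The host name, user, and HDFS path are hypothetical placeholders, and the code only builds the request without sending it, so check the endpoint and properties against your own Oozie version before relying on it.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_job_config(properties):
    """Render a dict as a Hadoop-style XML configuration document (bytes)."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="utf-8")


def build_submit_request(oozie_url, properties):
    """Build (but do not send) a POST that submits and starts a workflow job."""
    return urllib.request.Request(
        url=oozie_url.rstrip("/") + "/v1/jobs?action=start",
        data=build_job_config(properties),
        headers={"Content-Type": "application/xml;charset=UTF-8"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical host, user, and workflow path; replace with your own values.
    request = build_submit_request(
        "http://oozie-host.example.com:11000/oozie",
        {
            "user.name": "alice",
            "oozie.wf.application.path": "hdfs:///user/alice/workflows/demo",
        },
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the resulting object to `urllib.request.urlopen` would send it. On a secured cluster you would also need whatever authentication your Oozie server expects, which this sketch does not cover.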
I'm trying to install Ambari server and agents, and I have a question about Ambari.
Whenever I install it, it ends up tied to Hortonworks.
I have my own Hadoop cluster running on Ubuntu 16.0. Does Ambari only work with HDP, or can it also work with custom-built clusters?
If possible, please share detailed documentation.
It's not clear where you downloaded Ambari from, but it sounds like you used the Hortonworks distribution rather than the one from https://ambari.apache.org directly.
Ambari works with the concept of stacks. Each stack has a set of services and components. HDP is one such stack, but there are others, and you can even define your own. So yes, you can manage your own Hadoop installation components, but the result would not be much different from what Hortonworks already provides.
Besides, the HDP services and components have been tested to work together more thoroughly than an off-the-shelf Hadoop installation.
If you don't want HDP components, there is also the Apache Bigtop project, which provides installation packages for many Hadoop-related services.
Ambari expects Java and Hadoop to be installed in a certain way. I'm not sure how easy it is to set up for an existing Hadoop install.
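To see which stacks a given Ambari server actually knows about, you can ask its REST API. Here is a rough Python sketch. The server URL and credentials are placeholders, and the sample response is a hand-written illustration of the layout I would expect from `/api/v1/stacks`, not captured output, so verify it against your own server.

```python
import base64
import json
import urllib.request


def build_stacks_request(ambari_url, user, password):
    """Build (but do not send) an authenticated GET for the stack list."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        ambari_url.rstrip("/") + "/api/v1/stacks",
        headers={"Authorization": "Basic " + token},
    )


def stack_names(response_body):
    """Extract stack names from a /api/v1/stacks JSON response body."""
    payload = json.loads(response_body)
    return [item["Stacks"]["stack_name"] for item in payload.get("items", [])]


# Illustrative response only (assumed shape, not real server output).
SAMPLE_RESPONSE = """
{
  "items": [
    {"Stacks": {"stack_name": "HDP"}},
    {"Stacks": {"stack_name": "BIGTOP"}}
  ]
}
"""

if __name__ == "__main__":
    # Hypothetical server and credentials; replace with your own.
    request = build_stacks_request("http://ambari-host.example.com:8080", "admin", "admin")
    print(request.full_url)
    print(stack_names(SAMPLE_RESPONSE))
```

If only HDP shows up in the list, that matches the suspicion that the Hortonworks build was installed.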
I recently heard that Apache Ambari supports Hue (which is a Cloudera component), but I'm not sure whether I can use it on HDP 2.5 or whether it will work on my cluster without problems (I use Ambari 2.5).
Does HDP support Hue?
We believe they officially stopped, but as Hue is compatible with any Hadoop distribution, people still install it on their clusters: http://gethue.com/hadoop-hue-3-on-hdp-installation-tutorial/
According to the HDP 2.5.0 Release Notes, it does. At the bottom of the page, you'll see:
Additional component versions:
Cascading 3.0.0
Hue 2.6.0
I have a 6-node cluster running Hortonworks HDP 2.5.3 and Ambari 2.4.2.0.
I want to install Apache NiFi on this cluster. When looking at the documentation, the following line jumps out at me:
1.1. Interoperability Requirements
You cannot install HDF on a system where HDP is already installed.
I wonder how I can install NiFi on my cluster. I would like to manage it with Ambari too, if possible.
Should I just go ahead and install the standalone version of NiFi and change the port to something other than 8080, which is in use by Ambari? The problem is that I'd have to install it on every node, and this process is not automated.
Currently you can only install one stack into a given Ambari instance, and there is an HDP stack which does not include NiFi, and an HDF stack which includes NiFi, Kafka, Storm, and Ranger. So you need a second Ambari instance where you can install the HDF stack. You also can't share nodes between two Ambaris because there can only be one Ambari agent running on a node.
There might be enhancements in future Ambari releases to improve this situation, but for now if you are limited to using your 6 HDP nodes then you would have to install/manage NiFi manually using the RPM or TAR.
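If you do go the manual route, the port change mentioned in the question comes down to editing `nifi.web.http.port` in NiFi's `conf/nifi.properties`. Since that has to be repeated on every node, a small script helps. This is only a sketch: the helper name `set_property` and the replacement port 9090 are my own choices for illustration, and it works on the file's text so you can decide how to read and write it on each host.

```python
def set_property(properties_text, key, value):
    """Return properties_text with `key` set to `value`, appending it if absent."""
    lines = properties_text.splitlines()
    found = False
    for index, line in enumerate(lines):
        stripped = line.strip()
        # Leave comments and lines without a key=value pair untouched.
        if stripped.startswith("#") or "=" not in stripped:
            continue
        name = stripped.split("=", 1)[0].strip()
        if name == key:
            lines[index] = f"{key}={value}"
            found = True
    if not found:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A tiny excerpt standing in for a real conf/nifi.properties file.
    original = "# web properties #\nnifi.web.http.host=\nnifi.web.http.port=8080\n"
    print(set_property(original, "nifi.web.http.port", "9090"), end="")
```

You would still need to copy the NiFi install to each node and restart it yourself; this only takes care of the port clash with Ambari.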
As of HDP 2.6.1 it is possible to install HDF components on an HDP cluster. See https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.0.1.1/bk_installing-hdf-and-hdp/content/ch_install-ambari.html
As of HDP 3.0, you can add HDF 3.2 to the cluster, so NiFi can run alongside HDP.
I successfully built a 5-node cluster of Hortonworks HDP 2.2 using Ambari.
However I don't see Apache Spark in the installed services list.
I did some research and found that Ambari does not install certain components such as Hue. (Spark was not in that list, but I guess it's not installed.)
How do I do a manual install of Apache Spark on my 5-node HDP 2.2 cluster?
Or should I delete my cluster and perform a fresh install without using Ambari?
Hortonworks support for Spark is arriving but not fully complete (details and blog).
Instructions for how to integrate Spark with HDP can be found here.
You could build your own Ambari Stack for Spark. I recently did just that, but I cannot share that code :(
What I can do is share a tutorial I wrote on how to build any stack for Ambari, including Spark. There are many interesting issues with Spark that need to be addressed and are not covered in the tutorial. Anyway, I hope it helps. http://bit.ly/1HDBgS6
There is also a guide from the Ambari people here: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=38571133.
1) Ambari 1.7x does not install Accumulo, Hue, Ranger, or Solr services for the HDP 2.2 Stack.
For installing Accumulo, Hue, Knox, Ranger, and Solr services, install HDP manually.
2) Apache Spark 1.2.0 on YARN with HDP 2.2: here.
3) Spark and Hadoop: Working Together:
Standalone deployment: With the standalone deployment one can statically allocate resources on all or a subset of machines in a Hadoop cluster and run Spark side by side with Hadoop MR. The user can then run arbitrary Spark jobs on her HDFS data. Its simplicity makes this the deployment of choice for many Hadoop 1.x users.
Hadoop Yarn deployment: Hadoop users who have already deployed or are planning to deploy Hadoop Yarn can simply run Spark on YARN without any pre-installation or administrative access required. This allows users to easily integrate Spark in their Hadoop stack and take advantage of the full power of Spark, as well as of other components running on top of Spark.
Spark In MapReduce : For the Hadoop users that are not running YARN yet, another option, in addition to the standalone deployment, is to use SIMR to launch Spark jobs inside MapReduce. With SIMR, users can start experimenting with Spark and use its shell within a couple of minutes after downloading it! This tremendously lowers the barrier of deployment, and lets virtually everyone play with Spark.
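For the YARN option, once the Spark binaries are on a node with the Hadoop client configuration, submitting a job is a single `spark-submit` call. Here is a small Python sketch that assembles the command. The jar path is a placeholder, and note the assumption about the master setting: Spark 1.x releases such as 1.2.0 used `--master yarn-cluster`, while later releases use `--master yarn --deploy-mode cluster`.

```python
import shlex


def build_spark_submit(app_jar, main_class, app_args=(), num_executors=2,
                       executor_memory="1g", legacy_master=True):
    """Assemble a spark-submit argument list for running an application on YARN."""
    if legacy_master:
        # Spark 1.x style master value.
        master_args = ["--master", "yarn-cluster"]
    else:
        # Style used by later Spark releases.
        master_args = ["--master", "yarn", "--deploy-mode", "cluster"]
    return [
        "spark-submit",
        "--class", main_class,
        *master_args,
        "--num-executors", str(num_executors),
        "--executor-memory", executor_memory,
        app_jar,
        *app_args,
    ]


if __name__ == "__main__":
    # Hypothetical jar location; point this at the examples jar in your Spark install.
    command = build_spark_submit(
        "/path/to/spark-examples.jar",
        "org.apache.spark.examples.SparkPi",
        app_args=["10"],
    )
    print(" ".join(shlex.quote(part) for part in command))
```

The list form can be handed to `subprocess.run` as is, provided `HADOOP_CONF_DIR` or `YARN_CONF_DIR` is set in the environment so Spark can find the cluster.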