how to upgrade components in ambari - hadoop

I would love to have hadoop and few other packages in newer versions that the current ambari with HDP2.6.3 allows.
Is there an option for this kind of single components version upgrades?

This feature will not be ready until Ambari 3.0. See AMBARI-18678 & AMBARI-14714
Depending on what you want to upgrade, though, I wouldn't suggest it.
For example, HBase, Hive and Spark do not yet support Hadoop 3.0. The streaming components of HDP like Spark, Kafka, NiFi seem to release versions more frequently, and there are ways outside of Ambari to upgrade those.
You don't need HDP, or Ambari to manage your Hadoop, but it does make a nice package and central management component for the cluster. If you upgrade pieces on your own, you risk incompatibility.
HDP is tested as a whole unit. The Hortonworks repos that you setup in Ambari limit what component versions that are available to you, but this does not stop you from using your own repositories plus Puppet/Chef from installing additional software into your Hadoop environment. The only thing you lose at that point, is management and configuration from Ambari.
You could try to define your own Ambari Mpacks to install additional software, but make sure you have the resources to maintain it.

Ambari upgrade steps are well documented in the Hortonworks documentation, You may follow the below link for upgrading Ambari + Hadoop components
https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.0.0/bk_ambari-upgrade/content/upgrading_ambari.html
https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.0.0/bk_ambari-upgrade/content/upgrading_hdp_stack.html
All 2.6 package urls are available in the below
https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.0.0/bk_ambari-installation/content/hdp_26_repositories.html
You can do single components (HADOOP -include both HDFS and YARN, HIVE, OOZIE etc, ) upgrades using yum or apt-get or other package managers, however Single component upgrade in the Hadoop cluster is not recommended due to dependency issues - services might fails sometimes, So it is better to have to complete HDP stack upgraded instead of upgrading individual components.
Also you need to check Amabari version compatibility in the hortonworks documents, If you planning to upgrade only Hadoop core packages without Ambari, other wide cluster monitoring might fails

Related

Apache ambari installation

I'm trying to install ambari server + agents.
I have a doubt regarding ambari.
I tried to install ambari.
It always gets link with hortonwork
My doubt is that I have hadoop cluster of my own in Ubunu 16.0.Will ambari only work with HDP or is it possible to also make it work with custom built clusters?
Or if possible please share me detailed descriptive documentation
It's not clear where you downloaded Ambari from, but it sounds like you used the Hortonworks version of it. Not directly from https://ambari.apache.org
Ambari works with the concept of stacks. Each stack has a set of services and components. HDP is such a stack, but there are others, or you can even define your own, so yes, you can manage your own Hadoop installation components, but that really would be not much different from what Hortonworks already provides.
Besides, the HDP services and components have been tested to work together more throughly than off the shelf Hadoop installation.
If you don't want HDP components, there is also the Apache Bigtop project that provides installation packs for many Hadoop related services
Ambari expects Java and Hadoop to be installed in a certain way. I'm not sure how easy it is to setup for an existing Hadoop install.

Adding Apache NiFi to existing Hortonworks HDP Cluster

I have a 6-node-cluster running Hortonworks HDP 2.5.3 and Ambari 2.4.2.0
I want to install Apache NiFi on this cluster. When looking in the documentation, the following line jumps to my eyes:
1.1. Interoperability Requirements
You cannot install HDF on a system where HDP is already installed.
I wonder how I can install NiFi on my cluster. I would like to manage it with Ambari too, if possible.
Should I just go ahead and install the standalone version of NiFi and changing the port to something else than 8080, which is in use by Ambari? The problem is that I'd have to install it on every node and this process is not automated.
Currently you can only install one stack into a given Ambari instance, and there is an HDP stack which does not include NiFi, and an HDF stack which includes NiFi, Kafka, Storm, and Ranger. So you need a second Ambari instance where you can install the HDF stack. You also can't share nodes between two Ambaris because there can only be one Ambari agent running on a node.
There might be enhancements in future Ambari releases to improve this situation, but for now if you are limited to using your 6 HDP nodes then you would have to install/manage NiFi manually using the RPM or TAR.
As of HDP 2.6.1 it is possible to install HDF components on an HDP cluster. See https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.0.1.1/bk_installing-hdf-and-hdp/content/ch_install-ambari.html
Since the latest HDP 3.0, it can add HDF 3.2 and work together with NiFi

How to intergrate hadoop using ambari without HDP?

I have a hadoop cluster with apache hadoop 2.0.7.
I want to know how to integrate Ambari with the apache hadoop without the HDP(HortonWorks).
Actually, If I use HDP the solution is easy. but , I don't want to use the in my situation.
Do you have an any Idea?
Ambari relies on 'Stack' definitions to describe what services the Hadoop cluster consists of. Hortonworks defined a custom Ambari stack, its called HDP.
You could define your own stack and use any services and respective versions that you wanted. See the ambari wiki for more information about defining stacks and services.
That being said, I don't think it's possible to use your pre-existing installation of Hadoop with Ambari. Ambari is used to provision and manage hadoop clusters. It keeps track of the state of each of its stacks services, and the states of each services components. Since your cluster is already provisioned it would be difficult (maybe impossible) to add it to an Ambari instance.

How to install Apache Spark on HortonWorks HDP 2.2 (built using Ambari)

I successfully built a 5 node cluster of HortonWorks HDP 2.2 using Ambari.
However I don't see Apache Spark in the installed services list.
I did some research and found that Ambari does not install certain components like hue etc. ( Spark was not in that list, but I guess its not installed).
How do I do a manual install of Apache spark on my 5 node HDP 2.2?
Or should I delete my cluster and perform a fresh install without using Ambari?
Hortonworks support for Spark is arriving but not fully complete (details and blog).
Instructions for how to integrate Spark with HDP can be found here.
You could build your own Ambari Stack for Spark. I recently did just that, but I cannot share that code :(
What I can do is share a tutorial I did on how to do any stack for Ambari, including Spark. There are many interesting issues with Spark that need to be addressed and are not covered through the tutorial. Anyways hope it helps. http://bit.ly/1HDBgS6
There is also a guide from the Ambari people here: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=38571133.
1) Ambari 1.7x does not install Accumulo, Hue, Ranger, or Solr services for the HDP 2.2 Stack.
For Installing Accumulo, Hue, Knox, Ranger, and Solr services, install
HDP Manually.
2) Apache Spark 1.2.0 on YARN with HDP 2.2 : here .
3)
Spark and Hadoop: Working Together :
Standalone deployment: With the standalone deployment one can statically allocate resources on all or a subset of machines in a Hadoop cluster and run Spark side by side with Hadoop MR. The user can then run arbitrary Spark jobs on her HDFS data. Its simplicity makes this the deployment of choice for many Hadoop 1.x users.
Hadoop Yarn deployment: Hadoop users who have already deployed or are planning to deploy Hadoop Yarn can simply run Spark on YARN without any pre-installation or administrative access required. This allows users to easily integrate Spark in their Hadoop stack and take advantage of the full power of Spark, as well as of other components running on top of Spark.
Spark In MapReduce : For the Hadoop users that are not running YARN yet, another option, in addition to the standalone deployment, is to use SIMR to launch Spark jobs inside MapReduce. With SIMR, users can start experimenting with Spark and use its shell within a couple of minutes after downloading it! This tremendously lowers the barrier of deployment, and lets virtually everyone play with Spark.

How to deploy ambari for an existing hadoop cluster

As I mention in this title, can I skip the step of install hadoop cluster for that cluster already exist and which in service?
Ambari relies on 'Stack' definitions to describe what services the Hadoop cluster consists of. Hortonworks defined a custom Ambari stack, its called HDP.
You could define your own stack and use any services and respective versions that you wanted. See the ambari wiki for more information about defining stacks and services.
That being said, I don't think it's possible to use your pre-existing installation of Hadoop with Ambari. Ambari is used to provision and manage hadoop clusters. It keeps track of the state of each of its stacks services, and the states of each services components. Since your cluster is already provisioned it would be difficult (maybe impossible) to add it to an Ambari instance.
One of the minimum requierments of installing Ambari is removing the pre-existing installations of tools mentioned here.It is not mentioned to remove any pre-existing hadoop installation.

Resources