Building Ambari HDP stacks from sources - hadoop

I am trying to set up Ambari + HDP from sources (since Cloudera closed off the Hortonworks package repos). Can anyone share experience or a howto on this? Documentation is very scarce in this regard.

#alfheim the documentation is here:
https://cwiki.apache.org/confluence/display/AMBARI/Installation+Guide+for+Ambari+2.7.5
And a post with all the details:
Ambari 2.7.5 installation failure on CentOS 7
Be sure to get the correct versions of npm, Maven, Node, etc. There are some manual changes you may need to make inside the source files. You can find quite a few posts solving those issues here on the ambari tag. Go back to page 2 or 3 to find the most recent posts on building Ambari from source, or just search for any errors you hit during the build.
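As a rough sketch of what that build looks like (the tarball URL, version string, and flags are taken from the Ambari 2.7.5 build guide and may need adjusting; the dist link may have moved to archive.apache.org):

    # prerequisites per the wiki: JDK 8, Maven 3.6+, rpm-build, and the pinned Node/npm versions
    wget https://www-eu.apache.org/dist/ambari/ambari-2.7.5/apache-ambari-2.7.5-src.tar.gz
    tar xzf apache-ambari-2.7.5-src.tar.gz && cd apache-ambari-2.7.5-src
    mvn versions:set -DnewVersion=2.7.5.0.0
    mvn -B clean install rpm:rpm -DnewVersion=2.7.5.0.0 -DskipTests -Dpython.ver="python >= 2.6"
    # the ambari-server and ambari-agent RPMs end up under the respective */target/rpm/ directories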

Related

How do we install Apache BigTop with Ambari?

I am trying to find out how to deploy a Hadoop cluster using Ambari by using Apache Bigtop.
According to the latest release, Bigtop 1.5:
https://blogs.apache.org/bigtop/
my understanding is that the Bigtop Mpack was added as a new feature that enables users to deploy Bigtop components via Apache Ambari.
I am able to install the Bigtop components via the command line, but cannot find any documentation on how to install these Bigtop Hadoop components via Ambari.
Can someone please point me to documentation that explains how to install the various Hadoop components (Bigtop packages) via Ambari?
Thanks,
I'm from the Bigtop community. I don't have a comprehensive answer, but the Bigtop user mailing list recently had a discussion with several technical details that may answer your question:
https://lists.apache.org/thread.html/r8c5d8dfdee9b7d72164504ff2f2ea641ce39aa02364d60917eaa9fa5%40%3Cuser.bigtop.apache.org%3E
OTOH, you are always welcome to join the mailing list and ask questions. Our community is active and happy to answer questions.
1. Build a repo of Bigtop.
2. To install that repo with Ambari, you have to register the stack/version. You will need to create a version file (see the sketch below); I found an example of one here.
3. Complete the installation as you would with a normal build.
This is highly theoretical (..I haven't done this before..). I have worked with a BIGTOP Mpack before that took care of some of this work, but it's not production ready yet and works with an old version of Ambari, not the newest (I was able to install/stop/start HDFS/Hive). The instructions above should work with any version of Ambari.
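For step 2, here is a minimal, hypothetical sketch of pointing a stack definition at your own Bigtop repo; the stack path, repo id, and URL are illustrative only:

    # illustrative example: make an existing stack's repoinfo.xml point at your locally hosted repo
    cat > /var/lib/ambari-server/resources/stacks/HDP/3.1/repos/repoinfo.xml <<'EOF'
    <reposinfo>
      <os family="redhat7">
        <repo>
          <baseurl>http://repo-host.example.com/bigtop/centos7</baseurl>
          <repoid>BIGTOP-1.5</repoid>
          <reponame>BIGTOP</reponame>
        </repo>
      </os>
    </reposinfo>
    EOF
    ambari-server restart   # so Ambari re-reads the stack and version definitions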
I have been able to test Matt Andruff's theory with a VM. Here was my process and where I stopped:
Built a repo of Apache BigTop 1.5.0
Built BigTop using Gradlew
Installed Apache Ambari 2.6.1 on my system
Enabled the BigInsights build version xml file and modified the package version numbers to match my Bigtop build
Note: You can also build your own version file if you want, as Matt mentioned
Set up a webserver to host the package repo
Pointed the repos in the xml version file at my local webserver for packages
From there you can complete the installation of your packages as you would normally.
I have only done this with a single VM thus far and will be trying to spin up a small cluster using AWS in the coming weeks.
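For reference, the repo-building and hosting steps above look roughly like the following; the component list, release tag, and Gradle task names are assumptions that vary between Bigtop releases, so check ./gradlew tasks first:

    # build selected component packages from the Bigtop sources
    git clone https://github.com/apache/bigtop.git && cd bigtop
    git checkout rel/1.5.0              # the 1.5.0 release tag (exact tag name may differ)
    ./gradlew hadoop-rpm zookeeper-rpm hive-rpm
    ./gradlew repo                      # assemble the built packages into a yum/apt repo under output/
    # host the repo over HTTP so the Ambari version file and agents can reach it
    cd output && python3 -m http.server 8080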

How to build deb/rpm repos from open source Hadoop or publicly available HDP source code to be installed by ambari

I am trying to install open source Hadoop, or build HDP from source, to be installed by Ambari. I can see that it is possible to build the Java packages for each component with the documentation available in the Apache repos, but how can I use those to build the rpm/deb packages that Hortonworks provides for the HDP distribution, so they can be installed by Ambari?
#ShivamKhandelwal Building Ambari From Source is a challenge, but one that can be accomplished with some persistence. In this post I have shared the commands I used recently to build Ambari 2.7.5 on CentOS:
Ambari 2.7.5 installation failure on CentOS 7
"Building HDP From Source" is a very big task, as it requires building each component separately and creating your own public/private repo containing all the component repos or rpms for each operating system flavor. This is a monumental task which was previously managed by many employees and component contributors at Hortonworks.
When you install Ambari from HDP, it comes out of the box with their repos, including their HDP stack (HDFS, YARN, MR, Hive, etc). When you install Ambari From Source, there is no stack. The only solution is to Build Your Own Stack, which is something I am an expert at doing.
I am currently building a DDP stack as an example to share with the public. I started this project by reverse engineering an HDF Management Pack, which includes the stack structure (files/folders) to roll out NiFi, Kafka, Zookeeper, and more. I have customized it to be my own stack with my own services and components (NiFi, Hue, Elasticsearch, etc).
My goal with DDP is to eventually make my own repos for the Components and Services I want, with the versions I want to install in my cluster. Next I will copy some HDP Components like HDFS, YARN, and Hive from the HDP stack directly into my DDP stack, using the last free public HDP Stack (HDP 3.1.5).
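To make the stack idea concrete, here is a minimal sketch of the directory layout Ambari expects for a custom stack; the DDP/1.0 and NIFI names are illustrative, and the actual metainfo.xml/repoinfo.xml contents depend on the services you define:

    # hypothetical skeleton of a custom stack under the Ambari server resources directory
    STACKS=/var/lib/ambari-server/resources/stacks
    mkdir -p $STACKS/DDP/1.0/repos
    mkdir -p $STACKS/DDP/1.0/services/NIFI/package/scripts
    # DDP/1.0/metainfo.xml           -> declares the stack version (and what, if anything, it extends)
    # DDP/1.0/repos/repoinfo.xml     -> maps each OS family to the yum/apt repo holding your packages
    # services/NIFI/metainfo.xml     -> service and component definitions
    # services/NIFI/package/scripts/ -> Python lifecycle scripts (install/start/stop/status)
    ambari-server restart            # pick up the new stack definition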

Installing Spark through Ambari

I've configured a cluster of VMs via Ambari.
Now I am trying to install Spark.
In all the tutorials (e.g. here) it's pretty simple; the Spark installation is similar to that of other services:
But it appears that in my Ambari instance there is simply no such entry.
How can I add Spark entry to Ambari services?
There should be a SPARK folder under the /var/lib/ambari-server/resources/stacks/HDP/2.2/services directory. Additionally, there should be spark folders, identified by their version number, under /var/lib/ambari-server/resources/common-services/SPARK. Either someone modified your environment or it's a bad and/or non-standard install of Ambari.
I would suggest re-installing, as it is hard to say exactly what you need to add and it's unclear what other things may be missing in the environment.
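As a quick check before re-installing (paths are the ones from the answer above; the HDP version directory on your server may differ):

    # look for the Spark service definition in the active stack and in common-services
    ls /var/lib/ambari-server/resources/stacks/HDP/2.2/services | grep -i spark
    ls /var/lib/ambari-server/resources/common-services | grep -i spark
    # if the folders exist but Spark still doesn't show up, restart the server so it
    # re-reads the stack definitions, then retry "Add Service"
    sudo ambari-server restart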

Hadoop 2.7.1 Eclipse plugin creation

After reading almost all the previous posts and web links, I feel the need to write this post. I am unable to find any directory named ${YOUR_HADOOP_HOME}/src/contrib/eclipse-plugin in my Windows-based build of Hadoop 2.7.1.
I have downloaded an already compiled build from a source, but as a matter of learning I want to build it myself. Is there any other way to get the source files for creating the Hadoop 2.7.1 Eclipse plugin? Or did I miss something when building my own Windows-based Hadoop? Please explain and, if possible, provide a source for a Windows 7 build environment.
Thanks

Where to put JDBC drivers when using CDH4+Cloudera Manager?

I am trying to get JDBC jars recognized by Sqoop2 (CDH 4.4.0), but no matter where I place them, they do not seem to be picked up.
I have followed the advice here and here, and asked a similar question here.
Can someone please provide a definitive answer to this?
I would strongly recommend following the official installation guide for your Hadoop distribution and its associated version. It seems that you are using CDH 4.4.0, but looking at the CDH 4.2.1 installation instructions. Whereas in CDH 4.2.1 the JDBC driver jar files were expected in /usr/lib/sqoop2, since CDH 4.3.0 they are expected in /var/lib/sqoop2 instead (documentation).
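A minimal sketch of what that looks like on CDH 4.3.0+, assuming a MySQL driver; the jar filename and service name are illustrative, so adjust them to your driver and to how Cloudera Manager runs Sqoop2:

    # copy the JDBC driver where CDH 4.3.0+ expects it, then restart Sqoop2 so it is picked up
    sudo cp mysql-connector-java-5.1.31-bin.jar /var/lib/sqoop2/
    sudo chmod 644 /var/lib/sqoop2/mysql-connector-java-5.1.31-bin.jar
    sudo service sqoop2-server restart   # or restart the Sqoop2 service from Cloudera Manager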
