Can Hortonworks Ambari manage multiple clusters - hadoop

I have been looking all over the web to see if Ambari can manage multiple clusters like Cloudera does. Is this possible in Ambari? If so, how? I have looked all over the Ambari web UI and only see options to add a new host or service, but nothing about adding a cluster.

It's on the roadmap. For now it is possible at the API level; from version 2.0 it should be possible to manage multiple clusters from the web UI.
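To illustrate what "at the API level" could look like, here is a minimal sketch that builds (but does not send) a request to register a second cluster. It assumes Ambari's REST endpoint `/api/v1/clusters/<name>` and the `X-Requested-By` header that Ambari expects on modifying calls; the host name, cluster name, stack version and credentials are hypothetical placeholders.

```python
import base64
import json
import urllib.request


def build_create_cluster_request(base_url, cluster_name, stack_version, user, password):
    """Build (without sending) a POST request asking Ambari to register a cluster."""
    url = f"{base_url.rstrip('/')}/api/v1/clusters/{cluster_name}"
    # Request body naming the stack the new cluster should use.
    body = json.dumps({"Clusters": {"version": stack_version}}).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")

    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Authorization", f"Basic {token}")
    # Ambari rejects modifying calls that lack this header.
    request.add_header("X-Requested-By", "ambari")
    return request


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    req = build_create_cluster_request(
        "http://ambari.example.com:8080", "cluster_b", "HDP-2.2", "admin", "admin"
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Registering the cluster is only the first step; hosts, services and components would still have to be added through further API calls.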

Related

Add Version Definition File URL in Ambari web UI

I'm trying to create a cluster in Ambari web UI.
Create cluster web UI (image)
I reach a point where I need to put Version Definition File URL.
Add version (image)
Where can I find this URL if I am not a Cloudera customer?
You would need to create a cluster services stack that defines what Ambari will install and manage.
https://cwiki.apache.org/confluence/display/AMBARI/How-To+Define+Stacks+and+Services
One popular open-source set of services that can be used with Ambari is Apache Bigtop.

How do I install components such as Apache Drill and Apache Hue in IBM Bluemix BigInsights Apache Hadoop

I am new to the IBM Bluemix platform and am exploring its BigInsights service. I can see preconfigured components such as Pig, Hive, HBase and others, but I want to know how I can install services like Drill or Hue, which are not configured by default. Also, SSH access to the cluster nodes is restricted, with no sudo rights in case one needs to run yum commands. Does Bluemix allow root access? I cannot see a way to get it. Thanks in advance.
As far as I know, it is not possible.
But you can use http://www.softlayer.com/ to build your own IOP (IBM Open Platform) Cluster in the cloud.
If you are interested in IBM's value-adds and just want to try them out, this is a nice tutorial on setting up your own cluster via Docker:
https://www.youtube.com/watch?v=4p7LDeu_qQQ
This tutorial should be still valid for Hue:
https://developer.ibm.com/hadoop/2015/06/02/deploying-hue-on-ibm-biginsights/
Installing Drill doesn't look complicated:
https://drill.apache.org/docs/installing-drill-in-distributed-mode/
In conclusion: you need to move away from Bluemix if you want a more customised BigInsights. But there are options: SoftLayer, AWS, ... or just your local computer (if you have sufficient resources, since some components like HBase need a minimum number of nodes).

Accessing Hadoop data using REST service

I am trying to update our HDP architecture so that data residing in Hive tables can be accessed through REST APIs. What are the best approaches for exposing data from HDP to other services?
This is my initial idea:
I am storing data in Hive tables and I want to expose some of the information through a REST API, so I thought that using HCatalog/WebHCat would be the best solution. However, I found out that it only allows querying metadata.
What are the options that I have here?
Thank you
You can use WebHDFS, which is basically a REST service over HDFS.
Please see documentation below:
https://hadoop.apache.org/docs/r1.0.4/webhdfs.html
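As a rough sketch of what a WebHDFS call looks like, the helper below builds URLs of the form `http://<host>:<port>/webhdfs/v1/<path>?op=<OPERATION>`. Note that WebHDFS serves the files that back a Hive table, not query results. The host name, the port (50070 is assumed here as the NameNode HTTP port) and the warehouse path are assumptions for illustration only.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, hdfs_path, op, **params):
    """Build a WebHDFS REST URL for an absolute HDFS path and an operation."""
    if not hdfs_path.startswith("/"):
        raise ValueError("hdfs_path must be absolute")
    # 'op' comes first, followed by any extra query parameters.
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(hdfs_path)}?{query}"


if __name__ == "__main__":
    # Hypothetical NameNode host and Hive warehouse directory.
    table_dir = "/apps/hive/warehouse/sales"
    print(webhdfs_url("namenode.example.com", 50070, table_dir, "LISTSTATUS"))
    print(webhdfs_url("namenode.example.com", 50070, table_dir + "/000000_0", "OPEN",
                      **{"user.name": "hive"}))
```

A plain HTTP GET on the first URL would list the files in the table directory; a GET on the second would return the contents of one file.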
The REST API gateway for the Apache Hadoop ecosystem is called Apache Knox.
I would check it before exploring any other options. In other words, do you have any reason to avoid using Knox?
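For orientation, Knox exposes cluster services behind a single HTTPS endpoint of the form `https://<gateway-host>:<port>/gateway/<topology>/<service-path>`. The sketch below builds such URLs; the gateway host, the port 8443 and the topology name `default` are assumptions that depend on how the gateway is configured.

```python
# Service path prefixes under a Knox topology (subset, for illustration).
KNOX_SERVICE_PATHS = {
    "webhdfs": "webhdfs/v1",
    "webhcat": "templeton/v1",
    "hive": "hive",
}


def knox_url(gateway_host, topology, service, resource="", port=8443):
    """Build the gateway URL for a service resource under a Knox topology."""
    prefix = KNOX_SERVICE_PATHS[service]
    base = f"https://{gateway_host}:{port}/gateway/{topology}/{prefix}"
    resource = resource.lstrip("/")
    return f"{base}/{resource}" if resource else base


if __name__ == "__main__":
    # Hypothetical gateway host and topology name.
    print(knox_url("knox.example.com", "default", "webhdfs", "/tmp?op=LISTSTATUS"))
    print(knox_url("knox.example.com", "default", "hive"))
```

Clients then talk only to the gateway, which handles authentication and forwards the call to the service inside the cluster.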
What version of HDP are you running?
The Knox component has been available for quite a while and is manageable via Ambari.
Can you get an instance of HiveServer2 running in HTTP mode?
This would give you SQL access through JDBC/ODBC drivers without requiring Hadoop configuration and binaries (other than those required for the drivers) on the client machines.
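As a small sketch of what clients would connect to, the helper below assembles a HiveServer2 JDBC URL for HTTP transport. It assumes the server is running with `hive.server2.transport.mode=http`; the port 10001 and the HTTP path `cliservice` are the usual defaults but may differ on your cluster, and the host name is a hypothetical placeholder.

```python
def hiveserver2_http_jdbc_url(host, port=10001, database="default", http_path="cliservice"):
    """Build a JDBC URL for a HiveServer2 instance running in HTTP transport mode."""
    return (
        f"jdbc:hive2://{host}:{port}/{database}"
        f";transportMode=http;httpPath={http_path}"
    )


if __name__ == "__main__":
    # Hypothetical HiveServer2 host.
    print(hiveserver2_http_jdbc_url("hs2.example.com"))
```

The resulting string is what you would pass to a JDBC client such as Beeline or a BI tool's Hive driver.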

How to integrate Hadoop using Ambari without HDP?

I have a Hadoop cluster running Apache Hadoop 2.0.7.
I want to know how to integrate Ambari with Apache Hadoop without HDP (Hortonworks).
If I used HDP the solution would be easy, but I don't want to use it in my situation.
Do you have any ideas?
Ambari relies on 'Stack' definitions to describe what services the Hadoop cluster consists of. Hortonworks defined a custom Ambari stack; it's called HDP.
You could define your own stack and use any services and respective versions that you want. See the Ambari wiki for more information about defining stacks and services.
That being said, I don't think it's possible to use your pre-existing installation of Hadoop with Ambari. Ambari is used to provision and manage Hadoop clusters. It keeps track of the state of each of its stack's services, and the state of each service's components. Since your cluster is already provisioned, it would be difficult (maybe impossible) to add it to an Ambari instance.
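To give a feel for what "defining your own stack" involves, the sketch below lists where the definition files for a custom stack would be expected to live on the Ambari server. It assumes the conventional stacks directory `/var/lib/ambari-server/resources/stacks`; the stack name `MYSTACK`, version `1.0` and the service list are hypothetical, and the contents of each `metainfo.xml` are described on the Ambari wiki.

```python
from pathlib import PurePosixPath

# Assumed default location of stack definitions on the Ambari server.
STACKS_ROOT = PurePosixPath("/var/lib/ambari-server/resources/stacks")


def stack_layout(stack_name, stack_version, services):
    """Return the metainfo.xml paths expected for a stack version and its services."""
    version_dir = STACKS_ROOT / stack_name / stack_version
    # One metainfo.xml describes the stack version itself...
    paths = [version_dir / "metainfo.xml"]
    # ...and each service has its own metainfo.xml under services/<SERVICE>/.
    for service in services:
        paths.append(version_dir / "services" / service / "metainfo.xml")
    return [str(path) for path in paths]


if __name__ == "__main__":
    # Hypothetical stack name, version and services.
    for path in stack_layout("MYSTACK", "1.0", ["HDFS", "YARN"]):
        print(path)
```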

How to deploy ambari for an existing hadoop cluster

As mentioned in the title: can I skip the step of installing a Hadoop cluster, given that the cluster already exists and is in service?
Ambari relies on 'Stack' definitions to describe what services the Hadoop cluster consists of. Hortonworks defined a custom Ambari stack; it's called HDP.
You could define your own stack and use any services and respective versions that you want. See the Ambari wiki for more information about defining stacks and services.
That being said, I don't think it's possible to use your pre-existing installation of Hadoop with Ambari. Ambari is used to provision and manage Hadoop clusters. It keeps track of the state of each of its stack's services, and the state of each service's components. Since your cluster is already provisioned, it would be difficult (maybe impossible) to add it to an Ambari instance.
One of the minimum requirements for installing Ambari is removing any pre-existing installations of the tools mentioned here. It does not say to remove any pre-existing Hadoop installation.
