Automating Cloudera Management Services - hadoop

I am using Cloudera Express with Cloudera Manager 5.12.0. I am trying to automate the bring-up of services such as HDFS and HBase. I can do this by specifying the necessary information for each service in a host template and pushing the template to Cloudera Manager with a curl command that calls the Cloudera Manager API. Now I want to automate the bring-up of the Cloudera Management Services: Host Monitor, Service Monitor, Event Server, Activity Monitor and Alert Publisher. I tried adding the corresponding role types and service types to the host template, but when I push the template with curl, Cloudera Manager reports that it could not find service type 'MGMT' with version CDH 5.12.0. Since the management services are different from cluster services like HDFS, YARN and HBase, how should I automate their bring-up? Is there a dedicated API for the management services?

Unfortunately, host templates apply only to clusters, not to Cloudera Manager (CM) itself. To configure the CM management services, look at:
https://cloudera.github.io/cm_api/
https://cloudera.github.io/cm_api/apidocs/v17/index.html (in particular the /cm/service/* APIs)
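As a rough illustration of the /cm/service/* endpoints, the following sketch builds a setup request for the management service and then starts it. It assumes that `PUT /cm/service` accepts a service setup body with a `roles` list and that `POST /cm/service/commands/start` starts the service, as described in the v17 API docs. The base URL, credentials, host ID and the service name `mgmt` are placeholders, and some roles (Activity Monitor, for example) may need database settings before they will start.

```python
import base64
import json
import urllib.request

# Role types of the Cloudera Management Service named in the question.
MGMT_ROLE_TYPES = [
    "HOSTMONITOR",
    "SERVICEMONITOR",
    "EVENTSERVER",
    "ACTIVITYMONITOR",
    "ALERTPUBLISHER",
]


def build_mgmt_setup_payload(host_id, role_types=MGMT_ROLE_TYPES):
    """Build the JSON body for PUT /cm/service, placing every role on one host."""
    return {
        "name": "mgmt",  # assumption: an arbitrary service name
        "type": "MGMT",
        "roles": [
            {"type": role_type, "hostRef": {"hostId": host_id}}
            for role_type in role_types
        ],
    }


def cm_request(base_url, method, path, user, password, body=None):
    """Send one JSON request with HTTP basic auth to the Cloudera Manager API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(base_url + path, data=data, method=method)
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", "Basic " + token)
    request.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def bring_up_mgmt(base_url, user, password, host_id):
    """Create the management service with its roles, then ask CM to start it."""
    cm_request(base_url, "PUT", "/cm/service", user, password,
               build_mgmt_setup_payload(host_id))
    return cm_request(base_url, "POST", "/cm/service/commands/start",
                      user, password)
```

A call might look like `bring_up_mgmt("http://cm-host:7180/api/v17", "admin", "admin", "<hostId>")`, where the host ID is the one CM reports for the target machine under `/hosts`. The start command runs asynchronously, so a real script would probably poll the returned command until it finishes.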

Related

IBM Cloud: How to open Analytics Engine port 7070?

I want to use big data services on the IBM Cloud. I found Analytics Engine (AE) and BigInsights, but BigInsights is being discontinued, so AE is my only option. However, unlike the AWS and GCP big data services, AE does not give users root permissions, so I cannot change some configurations on the cluster. I want to install Kylin on the cluster and need to open Kylin's port 7070. I found that Knox can map ports, but it looks like IBM has changed it. So how do I open port 7070 for external access? Can I get root permissions? Or is there any other big data service besides Analytics Engine and BigInsights?
The Analytics Engine service on IBM Cloud can be customized, and additional services and packages can be installed. However, there are restrictions. As you pointed out, for security reasons you cannot obtain root permissions or open ports.
For full flexibility and root permissions, you could deploy a Hadoop cluster to a Kubernetes or OpenShift cluster, or run it on virtual machines.

Started SQL Server Service on cluster remotely but cluster resource is offline

I have an issue when I start the SQL Server service remotely using PowerShell. The service is hosted on a cluster as a cluster instance. I start the service by its service name, not by the cluster resource name. I am able to bring the service online and connect to the instance, but Cluster Administrator shows the resource as offline until I bring it online there. Why does this happen? Why is the cluster not able to detect the service status?
I don't think starting the SQL Server service will bring the SQL cluster resource online, because you are only starting the SQL Server service locally. It is equivalent to starting the service through the Service Control Manager.
Instead, use Start-ClusterResource -Name <SQLClusterResourceName>

Kubernetes windows agent

Hey I'm running a Kubernetes cluster on Azure using ACS.
My question is if there is any way to add a Windows agent to the cluster without completely rebuilding the cluster?
I know this is possible for Linux distros, depending on what you use, but I wonder if anyone knows a way to do this for Windows agents?
If you deployed your cluster using the Azure portal, you can simply follow the instructions here: https://learn.microsoft.com/en-us/azure/container-service/container-service-scale
But if you deployed using the ACS engine and an ARM template, there is currently an issue: it does not create the ACS resource.

How does one install etcd in a cluster?

Newbie w/ etcd/zookeeper type services ...
I'm not quite sure how to handle cluster installation for etcd. Should the service be installed on each client or a group of independent servers? I ask because if I'm on a client, how would I query the cluster? Every tutorial I've read shows a curl command running against localhost.
For an etcd cluster installation, you can install the service on independent servers and form a cluster from them. You can query the cluster either by logging on to one of the machines and running curl against localhost, or remotely by specifying the IP address of one of the cluster member nodes.
For more information on how to set it up, follow this article
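To make the remote case concrete, here is a minimal sketch of querying a member from a client machine. It assumes the member listens for client traffic over plain HTTP on etcd's default client port 2379 and uses the `/version` endpoint. The IP address in the usage note is hypothetical.

```python
import json
import urllib.request


def etcd_url(member_ip, path="/version", port=2379):
    """Build a client URL for one cluster member (2379 is etcd's default client port)."""
    return f"http://{member_ip}:{port}{path}"


def query_etcd(member_ip, path="/version", port=2379, timeout=5):
    """GET a JSON endpoint from a remote member and return the decoded body."""
    url = etcd_url(member_ip, path, port)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

From a client you would call something like `query_etcd("10.0.0.11")`, which is the same request the tutorials make with curl against localhost, only pointed at a member's address. For this to work the member has to advertise a client URL that is reachable from the client machine, not just 127.0.0.1.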

What is the difference between apache Ambari Server and Agent

What is the difference between Apache Ambari Server and Agent?
What are the roles/tasks of the server vs. the agent?
The Ambari server collects information from all Ambari agents and sends operations to them (start/stop/restart a service, change the configuration of a service, ...).
The Ambari agent sends information about its machine and the services installed on that machine.
You have one Ambari server for your cluster and one Ambari agent per machine in your cluster.
If you need more details, the Ambari architecture is explained here
