Adding HBase service in a Kerberos-enabled CDH cluster - hadoop

I have a CDH cluster already running with Kerberos authentication.
I have a requirement to add HBase service to the running cluster.
I am looking for documentation on enabling the HBase service, since the cluster is Kerberos-enabled. Both command-line and GUI options are welcome.
It would also be good to have a testing method, such as steps for creating a small table.
Thanks in advance!

If you add it through the Cloudera Manager "Add Service" wizard, Cloudera Manager takes care of the Kerberos setup automatically (it creates and distributes the Kerberos keytabs and adds the service).
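For a quick smoke test once the service is up, a sketch like the one below obtains a ticket and drives the standard HBase shell to create, write to, scan, and drop a small table. The keytab path, principal, and table/column-family names are placeholders, not values from your cluster, and it assumes the HBase client is installed on the node where it runs.

# Minimal smoke-test sketch for a newly added HBase service on a Kerberized cluster.
# The keytab path, principal, and table name below are placeholders; adjust for your environment.
import subprocess

KEYTAB = "/etc/security/keytabs/smoketest.keytab"   # hypothetical keytab path
PRINCIPAL = "smoketest@EXAMPLE.COM"                  # hypothetical principal

# Obtain a Kerberos ticket first; without it the HBase shell cannot authenticate.
subprocess.run(["kinit", "-kt", KEYTAB, PRINCIPAL], check=True)

# Create a tiny table, write one cell, scan it back, then drop it.
hbase_commands = """
create 'smoke_test', 'cf'
put 'smoke_test', 'row1', 'cf:greeting', 'hello'
scan 'smoke_test'
disable 'smoke_test'
drop 'smoke_test'
"""

# 'hbase shell -n' runs non-interactively and returns a non-zero exit code on errors.
subprocess.run(["hbase", "shell", "-n"], input=hbase_commands, text=True, check=True)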

Related

Does Apache Kylin need Apache Derby or MySQL to run the sample cube?

I installed Java, Hadoop, HBase, Hive, Spark, and Kylin:
hadoop-3.0.3
hbase-1.2.6
apache-hive-2.3.3-bin
spark-2.2.2-bin-without-hadoop
apache-kylin-2.3.1-bin
I would be grateful if someone could help me with Kylin's installation and configuration.
http://kylin.apache.org/docs/ may help you. You can also send an email to dev@kylin.apache.org; questions sent there are discussed and answered on the mailing list. Some tips for the email: 1. provide the Kylin version; 2. provide log information; 3. provide the usage scenario. If you want a quick start, you can run Kylin in a Hadoop sandbox VM or in the cloud, for example by starting a small AWS EMR or Azure HDInsight cluster and installing Kylin on one of the nodes. When you use Kylin-2.3.1, I suggest you use Spark-2.1.2.

Installing Hue for an HDInsight HDP cluster

I am aware of installing Hue for an HDInsight HDP cluster by deploying it on an edge node of the cluster (using a script action, link). It works fine, but it asks for the cluster credentials first and then directs me to the Hue login page. Is there a way to get rid of those credentials?
Otherwise, is it possible to deploy Hue on a remote system and then point it to my HDInsight HDP cluster? If so, how do I go about it?
And which of the above two approaches is better?
Based on my understanding and experience, here are answers to your questions.
There is no way to get rid of those credentials, because they authenticate the Resource Manager template deployment, not only the cluster.
It's not possible to deploy Hue on a remote system, because "Hue consists of a web service that runs on a special node in your cluster," as the official Hue manual says here.
Hope it helps.

Accessing Hadoop data using REST service

I am trying to update my HDP architecture so that data residing in Hive tables can be accessed via REST APIs. What are the best approaches for exposing data from HDP to other services?
This is my initial idea:
I am storing data in Hive tables and I want to expose some of the information through a REST API, so I thought that using HCatalog/WebHCat would be the best solution. However, I found out that it only allows querying metadata.
What are the options that I have here?
Thank you
You can use WebHDFS, which is basically a REST service over HDFS.
Please see the documentation below:
https://hadoop.apache.org/docs/r1.0.4/webhdfs.html
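As a rough sketch of what a WebHDFS call looks like from a client, the snippet below lists a directory and reads a file; the NameNode address, the paths, and the use of the Python requests library are assumptions, and on a Kerberized cluster you would additionally need SPNEGO authentication (for example via requests-kerberos).

# Minimal WebHDFS sketch using the 'requests' library (an assumption, not part of Hadoop).
# The NameNode address and paths are placeholders; the WebHDFS port is typically 50070 (9870 on Hadoop 3).
import requests

NAMENODE = "http://namenode.example.com:50070"   # hypothetical NameNode web address

# List a directory: GET /webhdfs/v1/<path>?op=LISTSTATUS
resp = requests.get(f"{NAMENODE}/webhdfs/v1/user/hive/warehouse", params={"op": "LISTSTATUS"})
resp.raise_for_status()
for status in resp.json()["FileStatuses"]["FileStatus"]:
    print(status["pathSuffix"], status["type"])

# Read a file: GET /webhdfs/v1/<path>?op=OPEN (requests follows the redirect to a DataNode)
data = requests.get(f"{NAMENODE}/webhdfs/v1/tmp/example.txt", params={"op": "OPEN"})
print(data.text[:200])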
The REST API gateway for the Apache Hadoop ecosystem is called Knox.
I would check it before exploring any other options. In other words, do you have any reason to avoid using Knox?
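For illustration, the sketch below shows the same kind of WebHDFS call routed through a Knox gateway instead of hitting the NameNode directly; the gateway host, the 'default' topology name, and the demo-LDAP-style credentials are assumptions to be replaced with your own setup.

# Sketch of a WebHDFS call routed through an Apache Knox gateway.
# Gateway host, topology name ('default'), and credentials are placeholders.
import requests

KNOX = "https://knox.example.com:8443/gateway/default"   # hypothetical gateway URL
AUTH = ("guest", "guest-password")                        # Knox demo-LDAP-style credentials (assumption)

# Knox proxies the same WebHDFS REST API, so the query parameters are unchanged.
resp = requests.get(f"{KNOX}/webhdfs/v1/tmp", params={"op": "LISTSTATUS"},
                    auth=AUTH, verify=False)   # verify=False only for a demo self-signed certificate
resp.raise_for_status()
print(resp.json())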
What version of HDP are you running?
The Knox component has been available for quite a while and manageable via Ambari.
Can you get an instance of HiveServer2 running in HTTP mode?
This would give you SQL access through J/ODBC drivers without requiring Hadoop config and binaries (other than those required for the drivers) on the client machines.
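As a sketch of what such a client connection could look like, the snippet below uses PyHive with a Thrift HTTP transport; the host, port 10001 and the 'cliservice' path (the usual HTTP-mode defaults), the table name, and the absence of authentication are all assumptions.

# Sketch: query HiveServer2 running in HTTP transport mode from Python.
# Assumes the 'pyhive' and 'thrift' packages; host, port, path, and table name are placeholders.
from pyhive import hive
from thrift.transport import THttpClient

# 10001 and 'cliservice' are the usual defaults for HiveServer2 in HTTP mode.
transport = THttpClient.THttpClient("http://hiveserver2.example.com:10001/cliservice")
conn = hive.Connection(thrift_transport=transport)

cursor = conn.cursor()
cursor.execute("SELECT * FROM my_table LIMIT 10")   # 'my_table' is a placeholder
for row in cursor.fetchall():
    print(row)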

No Access Audit found in Ranger

I am working on Apache Ranger to enable data security across my Hadoop platform, which is working fine, but I am not able to see any Access Audits on the Ranger portal.
I have enabled Audit to DB, Audit to HDFS, and Audit Provider Summary for the respective components in Ambari.
Please help me get the Access Audits to show up on the Ranger portal.
Check the NameNode log (normally under /var/log/hadoop/hdfs/...-namenode.log) and see if your DB driver can be found or if an exception is thrown. If the latter is the case, add the driver JAR to e.g. /usr/share/java/ to make sure the driver class is available.
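For example, a quick way to scan the log for driver-related failures could look like the sketch below; the log file name and the exception strings are only assumptions to illustrate the check.

# Sketch: scan the NameNode log for JDBC-driver load failures as described above.
# The log file name and the patterns are assumptions; adjust them for your install and database.
LOG = "/var/log/hadoop/hdfs/hadoop-hdfs-namenode.log"   # hypothetical file name
PATTERNS = ("ClassNotFoundException", "No suitable driver", "SQLException")

with open(LOG, errors="replace") as log:
    for line in log:
        if any(pattern in line for pattern in PATTERNS):
            print(line.rstrip())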
I had met the same problem.
I followed every instruction, but the HDFS plugin didn't take effect.
It was solved by upgrading Hadoop from 2.6.3 to 2.7.2.
As the official Apache Ranger site says, Ranger 0.5 only works with Hadoop 2.7+.

Hiveserver2 Kerberos

If I want to configure my HiveServer2 (we use a self-built Hadoop environment) to use Kerberos authentication, does it require a secure Hadoop environment too?
What I mean:
After I install Kerberos I want to secure my HiveServer2, but do I first have to secure HDFS, the Hadoop core configuration, MapReduce, etc., or not?
I hope that makes sense, thank you for the help!
In order to Kerberize HiveServer2, you must Kerberize all the other Hadoop services (for example HDFS, YARN, etc.) as well.
I don't know what you mean by a self-built Hadoop environment. You'd better use Cloudera Manager (CDH) or Ambari (Hortonworks) to enable Kerberos for your Hadoop services.
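Once the services are Kerberized, a quick client-side check could look like the sketch below; the host, the default port 10000, and the use of PyHive with SASL/GSSAPI support installed are assumptions, and a valid ticket (kinit) is required first.

# Sketch: connect to a Kerberized HiveServer2 to verify authentication end to end.
# Assumes pyhive with SASL/GSSAPI support is installed and a valid Kerberos ticket exists (run kinit first).
from pyhive import hive

conn = hive.Connection(
    host="hiveserver2.example.com",   # placeholder host
    port=10000,                       # default binary-transport port
    auth="KERBEROS",
    kerberos_service_name="hive",     # must match the first component of the HiveServer2 service principal
)

cursor = conn.cursor()
cursor.execute("SHOW DATABASES")
print(cursor.fetchall())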
