Cloudera node /etc/krb5.conf replaced at every reboot - hadoop

Why are my Cloudera nodes replacing the file /etc/krb5.conf at every reboot? I'm trying to modify it, but whenever someone reboots a node the file is replaced by the old config file again.

Both CDH and HDP distros have an option to let their Hadoop cluster manager (Cloudera Manager vs. Ambari) also manage the Kerberos client config on all nodes.
Or rather, they have an option not to let it manage it for you...
From the CDH 6.3 documentation:
Choose whether Cloudera Manager should deploy and manage the krb5.conf on your cluster or not ...this page will let you configure the properties that will be emitted in it. In particular, the safety valves on this page can be used to configure cross-realm authentication.
From the HDP 3.1 documentation:
(Optional) To manage your Kerberos client krb5.conf manually (and not have Ambari manage the krb5.conf), expand the Advanced krb5-conf section and uncheck the "Manage" option.
(Optional) To not have Ambari install the Kerberos client libraries on all hosts, expand the Advanced kerberos-env section and uncheck the “Install OS-specific Kerberos client package(s)” option.
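To confirm that a manual edit really is being overwritten, one option is to record a checksum of the file right after editing and compare it after the next reboot. The following is only a minimal sketch; the helper names file_digest and was_replaced are made up for illustration and are not part of Cloudera Manager or Ambari.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def was_replaced(path, baseline_digest):
    """Return True if the file's current contents differ from the recorded baseline."""
    return file_digest(path) != baseline_digest
```

A possible workflow: call file_digest("/etc/krb5.conf") after the manual edit, store the result somewhere, and call was_replaced("/etc/krb5.conf", stored_digest) after the reboot. If it returns True, the cluster manager is most likely still managing the file.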

Related

How to set '--enable_orc_scanner' to true in Cloudera Manager 6.3 (my Impala version is 3.2)

In CDH 6.3.0, the Impala version is 3.2. Impala can read files in ORC format, but you need to set --enable_orc_scanner to true.
How do I set this property in the Cloudera manager console?
Based on the Impala documentation, you need to add that flag, possibly to the IMPALA_SERVER_ARGS environment variable on the Impala daemons.
I'm not sure Cloudera Manager exposes such settings.
Note: Just because you have Cloudera Manager, does not mean you can configure each little detail on the nodes; you might need to SSH to individual machines.
In Cloudera Manager, navigate to Clusters > Impala.
In the Configuration tab, set --enable_orc_scanner=true in the Impala Command Line Argument Advanced Configuration Snippet (Safety Valve) field.

Plain vanilla Hadoop installation vs Hadoop installation using Ambari

I recently downloaded the Hadoop distribution from Apache and got it up and running quite fast: download the Hadoop tarball, untar it at a location, and adjust a few configuration settings. With this setup I can see the various configuration files, such as yarn-site.xml and hdfs-site.xml, and I know the Hadoop home location.
Next, I installed Hadoop (HDP) using Ambari.
Here comes the confusing part. It seems Ambari installs HDP in /usr/hdp; however, the directory structure is totally different from that of plain vanilla Hadoop. I am not able to locate the configuration files, e.g. yarn-site.xml.
Can anyone help me demystify this?
All configuration changes must be made via the Ambari UI. There is no use for the configuration files, since Ambari persists the configuration in the Ambari database.
If you still need them, they are under /etc/hadoop/conf/.
It's true that configuration changes must be made via Ambari UI and that those configurations are stored in a database.
Why is it necessary to change these configuration properties in Ambari UI and not directly on disk?
Every time a service with a stale configuration is restarted, the ambari-agent writes the latest configuration to disk, under /etc/<service-name>/conf. If you were to change the configuration files directly on disk, your changes would be overwritten by this process.
However the configuration files found on disk DO still have a use...
The configuration files (on disk) are used by the various hadoop daemons when they're started/running.
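Since the on-disk files are what the daemons actually read, it can be useful to inspect them rather than edit them. The sketch below parses the usual Hadoop *-site.xml layout (a configuration element containing property elements with name and value children) using Python's standard library. The function name parse_site_xml and the sample hostname are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parse_site_xml(xml_data):
    """Map each <property> name to its value in a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_data)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            # A missing or empty <value> is treated as an empty string.
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


# Hypothetical sample content; a real file would live under /etc/hadoop/conf/.
SAMPLE = """<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>rm.example.com</value>
  </property>
</configuration>"""

print(parse_site_xml(SAMPLE))
```

On a cluster node, the same function could be fed the bytes of a real file, for example Path("/etc/hadoop/conf/yarn-site.xml").read_bytes() with pathlib. This only reads the file, so it does not conflict with Ambari rewriting it on restart.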
The main benefit of using the Ambari UI in a clustered Hadoop deployment is that it gives you a central management point.
For example, take a 10-machine Hadoop cluster.
Plain vanilla Hadoop:
Any configuration change must be made on all 10 machines.
Ambari UI:
Because the configuration is stored in a database, you make the change once in the management portal and it is applied to all nodes.

How do I install components such as Apache Drill and Apache Hue in IBM Bluemix BigInsights Apache Hadoop

I am new to the IBM Bluemix platform and am exploring its BigInsights service. I can see preconfigured components such as Pig, Hive, HBase and others, but how can I install services like Drill or Hue, which are not configured by default? Also, SSH access to the cluster nodes is restricted, with no sudo rights for cases where one needs to run yum commands. Does Bluemix allow root access? I cannot see a way to get it. Thanks in advance.
As far as I know, it is not possible.
But you can use http://www.softlayer.com/ to build your own IOP (IBM Open Platform) Cluster in the cloud.
If you are interested in IBM's value-adds and you just want to try out:
https://www.youtube.com/watch?v=4p7LDeu_qQQ it is a nice tutorial to set up your own cluster via Docker.
This tutorial should be still valid for Hue:
https://developer.ibm.com/hadoop/2015/06/02/deploying-hue-on-ibm-biginsights/
Installing Drill doesn't look complicated:
https://drill.apache.org/docs/installing-drill-in-distributed-mode/
In conclusion: you need to move away from Bluemix if you want a more customised BigInsights. But there are options: SoftLayer, AWS, ... or just your local computer (if you have sufficient resources, since some components like HBase need a minimum number of nodes).

HBase region servers going down when trying to configure Apache Phoenix

I'm using CDH 5.3.1 with HBase 0.98.6-cdh5.3.1 and trying to configure Apache Phoenix 4.4.0.
Following the Apache Phoenix installation documentation, I did the following:
Copied the phoenix-4.4.0-HBase-0.98-server.jar file into the lib directory (/opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/hbase/lib) of both the master and the region servers.
Restarted the HBase service from Cloudera Manager.
When I check the HBase instances, I see that the region servers are down, and I don't see any problem in the log files.
I even tried copying all the jars from the Phoenix folder and still face the same issue.
I have also tried to configure Phoenix 4.3.0 and 4.1.0, but still no luck.
Can someone point out what else I need to configure or do to resolve this issue?
I was able to configure Apache Phoenix using parcels. These are the steps to install Phoenix using Cloudera Manager:
In Cloudera Manager, go to Hosts, then Parcels.
Select Edit Settings.
Click the + sign next to an existing Remote Parcel Repository URL, and add the following URL: http://archive.cloudera.com/cloudera-labs/phoenix/parcels/1.0/. Click Save Changes.
Select Hosts, then Parcels.
In the list of Parcel Names, CLABS_PHOENIX is now available. Select it and choose Download.
The first cluster is selected by default. To choose a different cluster for distribution, select it. Find CLABS_PHOENIX in the list, and click Distribute.
If you plan to use secondary indexing, add the following to the hbase-site.xml advanced configuration snippet. Go to the HBase service, click Configuration, and choose HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml. Paste in the following XML, then save the changes.
<property>
<name>hbase.regionserver.wal.codec</name>
<value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
Whether you edited the HBase configuration or not, restart the HBase service. Click Actions > Restart.
For detailed installation steps and other details, refer to this link.
I don't think Phoenix 4.4.0 is compatible with the CDH version you are running. This discussion on the mailing list will help you: http://search-hadoop.com/m/9UY0h2n4MOg1IX6OR1

Cluster has stale Kerberos client configuration

I'm using Cloudera Manager 5.3.0. One of my clusters has this configuration issue; another cluster has no problem.
I can't find a clue to solve this issue, and hours of googling didn't help. Thanks!
cluster1
Cluster has stale Kerberos client configuration.
The /etc/krb5.conf file on your new node is probably different from the one present on your other nodes (or simply not there at all). Either run 'Deploy Kerberos Client Configuration' from your cluster menu in CM (this will create the krb5.conf file from CM's Kerberos configuration), or copy a working krb5.conf file from another node to the new one.
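Before copying a file over, it may help to see how the two krb5.conf files actually differ. Here is a small sketch using Python's difflib, assuming you have already fetched the contents of both files as strings; the function name config_diff, the file labels and the sample realm names are made up for illustration.

```python
import difflib


def config_diff(working_text, suspect_text):
    """Return unified-diff lines between two krb5.conf contents."""
    return list(difflib.unified_diff(
        working_text.splitlines(),
        suspect_text.splitlines(),
        fromfile="working-node/krb5.conf",
        tofile="new-node/krb5.conf",
        lineterm="",
    ))


# Hypothetical contents; in practice these would be read from each node.
working = "[libdefaults]\ndefault_realm = EXAMPLE.COM\n"
suspect = "[libdefaults]\ndefault_realm = OLD.EXAMPLE.COM\n"

for line in config_diff(working, suspect):
    print(line)
```

An empty result means the two files are identical, in which case the stale-configuration warning probably has a different cause.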

Resources