How to safely fix an AWOL ambari system user? - hadoop

I'm a student working on a test cluster consisting of around 25 hosts. We installed using Ambari and have FreeIPA running on one host as a DNS and LDAP server. The rest are typical Hadoop
infrastructure. Hive was failing, and I wondered whether the DB connection parameters used during the Ambari installation were incorrect, so I tried to find a way to re-run the DB connection setup. I didn't get anywhere, and since it was late I left it with the Ambari interface still working.
The next morning, the Ambari web UI seemed to be down. I thought that maybe the web server needed to be restarted, so I tried the following:
[akidd@dw ~]$ sudo ambari-server start
Using python /usr/bin/python
Starting ambari-server
ERROR: Exiting with exit code 1.
REASON: Unable to detect a system user for Ambari Server.
- If this is a new setup, then run the "ambari-server setup" command to create the user
- If this is an upgrade of an existing setup, run the "ambari-server upgrade" command.
Refer to the Ambari documentation for more information on setup and upgrade.
Can anyone help me to understand what could have happened?
If I run ambari-server setup, will the existing cluster be OK, assuming I configure everything like-for-like with how it was originally?
Thanks for your help!

@user3535074 You should try to start it with the user that installed it.
If you do run ambari-server setup as the current user, remember to answer No to the following options:
Customize user account for ambari-server daemon [y/n] (n)? n
Do you want to change Oracle JDK [y/n] (n)? n
Enter advanced database configuration [y/n] (n)? n
More info in the following post, including how to back up the Ambari database before running setup again:
https://community.cloudera.com/t5/Support-Questions/Ambari-server-failed-to-start-after-system-reboot-Below-is/td-p/203806
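For reference, a minimal sketch of that backup, assuming the default embedded PostgreSQL database (named 'ambari') rather than an external one; adjust the database name and paths to your setup:
sudo ambari-server stop
# dump the embedded PostgreSQL database Ambari uses (assumes the default 'ambari' database)
sudo -u postgres pg_dump ambari > /tmp/ambari-db-backup.sql
# keep a copy of the server properties as well
sudo cp /etc/ambari-server/conf/ambari.properties /tmp/ambari.properties.bak
sudo ambari-server setup
sudo ambari-server start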

Related

Apache Zeppelin configuration for connect to Hive on HDP Virtualbox

I've been struggling with the Apache Zeppelin notebook version 0.10.0 setup for a while.
The idea is to be able to connect it to a remote Hortonworks 2.6.5 server that runs locally on Virtualbox in Ubuntu 20.04.
I am using an image downloaded from:
https://www.cloudera.com/downloads/hortonworks-sandbox.html
Of course, the image has Zeppelin pre-installed, which works fine on port 9995, but this is an old 0.7.3 version that doesn't support the Helium plugins I would like to use. I know that HDP version 3.0.1 has an updated Zeppelin version 0.8 on board, but using it is impossible at the moment due to my hardware resources. Additionally, from what I remember, enabling the Leaflet Map plugin was a problem there as well.
My first thought was to update the notebook on the server, but after updating according to the instructions on the Cloudera forums (unfortunately they are not working at the moment, and I cannot provide a link or see any other solution), it failed to start correctly.
A simpler solution now seemed to be connecting a newer notebook version to the virtual server. Unfortunately, despite many attempts and solutions from threads here with various configurations, I was not able to connect to Hive via JDBC. I am also using Zeppelin with local Spark 3.0.3, but I have some geodata in Hive that I would like to visualize this way.
I used, among others, the description on the Zeppelin website:
https://zeppelin.apache.org/docs/latest/interpreter/jdbc.html#apache-hive
This is my current JDBC interpreter configuration:
hive.driver org.apache.hive.jdbc.HiveDriver
hive.url jdbc:hive2://sandbox-hdp.hortonworks.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
hive.user hive
Artifact org.apache.hive:hive-jdbc:3.1.2
Depending on the driver version, there were different errors, but this time after typing:
%jdbc(hive)
SELECT * FROM mydb.mytable;
I get the following error:
Could not open client transport for any of the Server URI's in
ZooKeeper: Could not establish connection to
jdbc:hive2://sandbox-hdp.hortonworks.com:10000/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;hive.server2.proxy.user=hive;?tez.application.tags=paragraph_1645270946147_194101954;mapreduce.job.tags=paragraph_1645270946147_194101954;:
Required field 'client_protocol' is unset!
Struct:TOpenSessionReq(client_protocol:null,
configuration:{set:hiveconf:mapreduce.job.tags=paragraph_1645270946147_194101954,
set:hiveconf:hive.server2.thrift.resultset.default.fetch.size=1000,
hive.server2.proxy.user=hive, use:database=default,
set:hiveconf:tez.application.tags=paragraph_1645270946147_194101954})
I will be very grateful to everyone for any help. Regards.
So, after many hours and trials, here's a working solution. First of all, the most important thing is to use drivers that match your version of Hadoop. You need jar files like 'hive-jdbc-standalone' and 'hadoop-common' in their respective versions, and to avoid adding all of them in the 'Artifact' field of the %jdbc interpreter in Zeppelin, it is best to use one complete file containing all required dependencies.
Thanks to Tim Veil, it is available in his GitHub repository below:
https://github.com/timveil/hive-jdbc-uber-jar/
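As a rough sketch, assuming you downloaded the release jar matching your HDP version from that repository's Releases page to ~/Downloads (illustrative path), place it where the interpreter can find it; the directory and file name below are the ones used in the settings that follow:
# adjust the source path to wherever you downloaded the jar
mkdir -p /opt/zeppelin/interpreter/jdbc/
cp ~/Downloads/hive-jdbc-uber-2.6.5.0-292.jar /opt/zeppelin/interpreter/jdbc/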
These are my complete Zeppelin %jdbc interpreter settings:
default.url jdbc:postgresql://localhost:5432/
default.user gpadmin
default.password
default.driver org.postgresql.Driver
default.completer.ttlInSeconds 120
default.completer.schemaFilters
default.precode
default.statementPrecode
common.max_count 1000
zeppelin.jdbc.auth.type SIMPLE
zeppelin.jdbc.auth.kerberos.proxy.enable false
zeppelin.jdbc.concurrent.use true
zeppelin.jdbc.concurrent.max_connection 10
zeppelin.jdbc.keytab.location
zeppelin.jdbc.principal
zeppelin.jdbc.interpolation false
zeppelin.jdbc.maxConnLifetime -1
zeppelin.jdbc.maxRows 1000
zeppelin.jdbc.hive.timeout.threshold 60000
zeppelin.jdbc.hive.monitor.query_interval 1000
hive.driver org.apache.hive.jdbc.HiveDriver
hive.password
hive.proxy.user.property hive.server2.proxy.user
hive.splitQueries true
hive.url jdbc:hive2://sandbox-hdp.hortonworks.com:10000/default
hive.user hive
Dependencies:
Artifact: /opt/zeppelin/interpreter/jdbc/hive-jdbc-uber-2.6.5.0-292.jar
The next step is to go to Ambari at http://localhost:8080/ and log in as admin. To do that, first you must log in to the Hadoop root account via SSH:
ssh root@127.0.0.1 -p 2222
root@127.0.0.1's password: hadoop
After a successful login you will be prompted to change your password immediately; please do that, and then set the Ambari admin password with the command:
[root@sandbox-hdp ~]# ambari-admin-password-reset
After that you can use the admin account in Ambari (log in and click the Hive link in the left panel):
Ambari -> Hive -> Configs -> Advanced -> Custom hive-site
Click Add Property
Insert the following into the window that opens:
hive.security.authorization.sqlstd.confwhitelist.append=tez.application.tags
After saving, restart all Hive services in Ambari. Everything should be working now if you set the proper Java path in 'zeppelin-env.sh' and the port in 'zeppelin-site.xml' (you must copy and rename 'zeppelin-env.sh.template' and 'zeppelin-site.xml.template' in the Zeppelin conf directory; please remember that Ambari also uses port 8080!).
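A minimal sketch of that configuration step, assuming Zeppelin lives in /opt/zeppelin and picking 9090 as an illustrative port to avoid the clash with Ambari on 8080:
cd /opt/zeppelin/conf
cp zeppelin-env.sh.template zeppelin-env.sh
cp zeppelin-site.xml.template zeppelin-site.xml
# point JAVA_HOME at your JDK (the path below is only an example)
echo 'export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk' >> zeppelin-env.sh
Then, in zeppelin-site.xml, change the zeppelin.server.port property:
<property>
<name>zeppelin.server.port</name>
<value>9090</value>
</property>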
In my case, the only thing left to do was to add or uncomment the fragment responsible for the Helium plugin repository (in 'zeppelin-site.xml'):
<property>
<name>zeppelin.helium.registry</name>
<value>helium,https://s3.amazonaws.com/helium-package/helium.json</value>
<description>Enable helium packages</description>
</property>
Now you can go to the Helium tab in the top right corner of the Zeppelin notebook and install the plugins of your choice; in my case it was the 'zeppelin-leaflet' visualization. And voilà! Here is a sample visualization of this Kaggle dataset in Hive:
https://www.kaggle.com/kartik2112/fraud-detection
Have a nice day!

cloudera host with bad health during install

I have tried again and again with all required steps completed, but during cluster installation, when installing the selected parcels, every host always shows bad health. The setup never completes fully.
I am installing CM 5.5 on CentOS 6.7 using VirtualBox.
The Error
Host is in bad health cm.feuni.edu
Host is in bad health dn1.feuni.edu
Host is in bad health dn2.feuni.edu
Host is in bad health nn1.feuni.edu
Host is in bad health nn2.feuni.edu
Host is in bad health rm.feuni.edu
The above errors are shown at step 6, where the setup says:
The selected parcels are being downloaded and installed on all the hosts in the cluster
In the previous step 5, all hosts completed with heartbeat checks at the end.
Memory distribution:
cm: 8 GB
all others: 1 GB
I could not find a proper answer anywhere else. What could be the reason for the bad health?
I don't know if it will help you...
For me, after a few days of struggling with it,
I found the log files (at )
which said there was a mismatch of the GUID,
so I uninstalled everything from both machines (using the script they provide, /usr/share/cmf/uninstall-cloudera-manager.sh, yum remove 'cloudera-manager-*', and deleting every directory related to Cloudera that I found...)
and then removed the GUID file:
rm /var/lib/cloudera-scm-agent/cm_guid
Afterwards I re-installed everything, and that fixed the issue for me...
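Roughly, the cleanup sequence described above looked like this on each affected host (a sketch based on the paths in this answer; double-check them against your own installation before deleting anything):
# stop the agent, run Cloudera's uninstall script, and remove the packages
sudo service cloudera-scm-agent stop
sudo /usr/share/cmf/uninstall-cloudera-manager.sh
sudo yum remove 'cloudera-manager-*'
# remove the stale GUID so the agent registers cleanly after reinstalling
sudo rm /var/lib/cloudera-scm-agent/cm_guid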
I read online that there can be issues with the hostname and things like that, but I guess that if you get to this part of the installation, you have already fixed all the domain/FQDN/hostname/hosts issues.
It saddens me that there is no real manual/FAQ for this product... :(
Good luck!
I faced the same problem. This is my solution:
First I edited config.ini:
$ nano /etc/cloudera-scm-agent/config.ini
so that the hostname was the same as what the command $ hostname returned.
Then I restarted the Cloudera agent and server:
$ service cloudera-scm-agent restart
$ service cloudera-scm-server restart
Then, in Cloudera Manager, I deleted the cluster and added it again. The wizard continued to run normally.
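For reference, a sketch of the hostname check behind that edit (the option names are the ones in a typical /etc/cloudera-scm-agent/config.ini and may differ in your CM version):
# the agent must report the same name the OS reports
hostname -f
# in /etc/cloudera-scm-agent/config.ini, under [General], make sure that
#   server_host points at the Cloudera Manager host, and
#   listening_hostname (if you set it) matches the output of hostname -f
grep -E '^(server_host|listening_hostname)' /etc/cloudera-scm-agent/config.ini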

Hadoop installation: what is "This is comment for WebHCat Service (sic)"

Using Ambari, “This is comment for WebHCat Service” is the final selection in the “Services Selection” step.
If I don't select this service, then the Customize Services step hangs indefinitely. It doesn't matter which other services are selected.
If I select it, then the Customize Services step functions normally, but the installation will stop on step four with the error message:
“org.apache.ambari.server.controller.spi.SystemException:
An internal system exception occurred:
Configuration with tag version1439256707212 exists for webhcat-site”
This is on a clean install, for a single node SLES 11 SP3 server.
What is the service “This is comment for WebHCat Service”, and why is it a comment instead of a service name?
If this is a fresh install, it's strange that you're getting "configuration already exists" errors. I would try to clean your Ambari Server instance by running:
sudo ambari-server reset
This will reset the PostgreSQL database that ambari-server uses, giving you a clean slate to retry the cluster install.

ProFTPD can't connect after install

Installed Webmin successfully on a Debian system.
Created a virtual server, added some users and a domain.
Installed ProFTPD via Webmin's unused modules.
Added a new user with a same-named group via System -> Users and Groups.
Tried to connect via ftp using my server's external ip and my new user's credentials.
This should work according to most tutorials but it doesn't.
I'm suspecting some other service handles FTP requests before ProFTPD.
Is there a way to monitor protocol handlers? Could it be something else?
Thanks in advance.
Webmin tries to start it as a daemon, but maybe (like me on Arch Linux) you need to start it as a system service... as root:
systemctl start proftpd.service
If you want to look at the error logs (if there are errors; but if the server was never started, there should not be any...), then use:
journalctl -xe (as root), or
systemctl --failed, or
systemctl status proftpd.service
(all of these commands as root or as a sudoer).
So first of all, check that the service is running:
systemctl status proftpd.service
Then check that Webmin's service configuration for ProFTPD uses the correct mechanism to call the service (systemd, for example) and the correct commands to start/stop it. Also check that it points to the config file of the current ProFTPD installation (which depends on your distribution or the way you installed it).
ProFTPD is not installed by Webmin; ProFTPD is installed separately, and then from Webmin you install a module that has to communicate with the already-installed ProFTPD application. If this module is configured correctly to point to the actual ProFTPD installation and to call the service correctly, then everything should work.
(Please, if this answer helps you, upvote it; without reputation from the people I help, I cannot help more because I am limited by the system. I hope you understand.)
Have a look at the server's logs, check the ProFTPD daemon status, and check the firewall.
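To check whether some other service is grabbing FTP before ProFTPD, a quick sketch (on Debian; the firewall tooling may differ on your system):
# which process, if any, is listening on the FTP control port?
sudo ss -tlnp | grep ':21'
# is the proftpd unit running, and what do its logs say?
sudo systemctl status proftpd.service
sudo journalctl -u proftpd.service --since today
# is the firewall letting port 21 (and the passive data ports) through?
sudo iptables -L -n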

Hadoop Ambari cannot confirm hosts

I tried to use Ambari to manage the installation and maintenance of the Hadoop cluster.
After I started the Ambari server, I used the web page to set up the Hadoop cluster.
But at the 3rd step (Confirm Hosts), the error shown below occurred.
When I checked the log at /var/log/ambari-server, I found:
INFO:root:BootStrapping hosts ['qiao'] using /usr/lib/python2.6/site-packages/ambari_server cluster primary OS: redhat6 with user 'root' sshKey File /var/run/ambari-server/bootstrap/1/sshKey password File null using tmp dir /var/run/ambari-server/bootstrap/1 ambari: master; server_port: 8080; ambari version: 1.4.1.25
INFO:root:Executing parallel bootstrap
ERROR:root:ERROR: Bootstrap of host qiao fails because previous action finished with non-zero exit code (1)
INFO:root:Finished parallel bootstrap
Did you provide the SSH RSA private key, or did you paste it?
Also, from the host you are installing from, make sure you can SSH to every host without typing a password.
If you still get the same error, try:
ambari-server reset
ambari-server setup
Then restart ambari-server:
ambari-server restart
and try accessing Ambari again. It should work.
Make sure you can ssh to every single host on the list, including all master hosts.
To do this, ensure that Ambari host's .ssh/id_rsa.pub entry is included in every hosts' .ssh/authorized_keys file. Then ssh from Ambari's host to every single server - and check if it is asking for your password. You can use a tutorial like http://www.tecmint.com/ssh-passwordless-login-using-ssh-keygen-in-5-easy-steps/ to check if everything has been done properly.
You need to do the same on the Ambari host itself, if you added it to the hosts list.
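A minimal sketch of that key distribution, run from the Ambari host (assumes the default key path and the root user; repeat the ssh-copy-id step for every host in the list, including the Ambari host itself; 'qiao' is the host name from the log above):
# generate a key pair if the Ambari host does not have one yet
ssh-keygen -t rsa
# copy the public key into the target host's ~/.ssh/authorized_keys
ssh-copy-id root@qiao
# verify: this must log in and run without prompting for a password
ssh root@qiao hostname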
