How to install Apache Phoenix on Ambari 1.7 with HBase? - hadoop

I'm new to Hadoop. I want to install Phoenix with HBase, but I installed my Hadoop cluster using Ambari 1.7 on Ubuntu, and I'm not able to find any tutorial for this setup.

If you build up your own Hadoop stack:
https://phoenix.apache.org/download.html
https://phoenix.apache.org/installation.html
If you use, for example, IBM Open Platform (which is free, by the way):
https://developer.ibm.com/hadoop/blog/2015/10/21/installing-apache-phoenix-ibm-open-platform-apache-hadoop-4-1/

HBase should be available as a service under the Add Service button on the Ambari home page.
To install Phoenix I used this link:
http://dev.hortonworks.com.s3.amazonaws.com/HDPDocuments/HDP2/HDP-2-trunk/bk_installing_manually_book/content/upgrade-22-7-a.html
Basically, run yum install phoenix on each node and then create soft links to the Phoenix server jar file.
Hope this helps.
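The soft-link step above can be sketched as a small shell function. This is only a sketch: the function name link_phoenix_jar is made up for illustration, and both paths are passed in by the caller because the answer does not say where the jar or the HBase lib directory live on a given install. The answer uses yum; on Ubuntu the package manager would be apt-get instead.

```shell
# Sketch: symlink a Phoenix server jar into an HBase lib directory.
# Both paths come from the caller; nothing here assumes a specific layout.
link_phoenix_jar() {
    jar="$1"      # full path to the Phoenix server jar (check your install)
    libdir="$2"   # HBase lib directory on this node (check your install)

    if [ ! -f "$jar" ]; then
        echo "Phoenix jar not found: $jar" >&2
        return 1
    fi
    if [ ! -d "$libdir" ]; then
        echo "HBase lib directory not found: $libdir" >&2
        return 1
    fi

    # -s: symbolic link, -f: replace an existing link of the same name
    ln -sf "$jar" "$libdir/$(basename "$jar")"
}
```

It would be called once per node, for example link_phoenix_jar /path/to/phoenix-server.jar /path/to/hbase/lib (placeholder paths), followed by an HBase restart so that the jar is picked up.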

Related

Install ambari on existing single node hadoop

I have a single-node Hadoop setup on Ubuntu (personal dev environment).
Now I want to install Ambari.
Question: Can we install Ambari on an existing Hadoop setup? If yes, please assist me.
No. Ambari requires that you first install the Ambari agents and then the Ambari server, which monitors the agents and installs and configures Hadoop.
In theory, if you installed Hadoop in the same way that Ambari expects, it might work, but adding any node to Ambari will throw lots of warnings if the node is running any process other than the agent.

HDP 2.1 to 2.2 upgrade RHEL6

I have a cluster with 1 NameNode and 4 DataNodes on Red Hat Enterprise Linux 6. My HDP version is 2.1. My Ambari version was 1.7, but I upgraded it to 2.1. I want to upgrade HDP to version 2.2. I have read that an HDP upgrade from 2.1 to 2.2 has to be done before upgrading Ambari to 2.1. When I upgrade HDP to 2.2, Ambari does not see any changes and nothing works. I am using this tutorial:
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.0/HDP_Man_Upgrade_v22/index.html#Item1
How can I do this? I tried to downgrade Ambari to 1.7, but I got many errors. What if I upgrade HDP to 2.2 now and then upgrade Ambari from 2.1 to 2.1.1? Will that work? The problem is that I have very little time.
Thank you in advance
I am upgrading from HDP-2.0/Hadoop-2.2 to HDP-2.2/Hadoop-2.6 (maybe temporarily, on the way to HDP-2.3) on a development/testing cluster. So far I have gotten the updated HDFS up and running. Not yet starting/stopping HDFS via Ambari, nor do I have YARN running yet.
Update: I got YARN, MapReduce, and Hive to run after finding HDP-2.2 documentation (current links added below).
Here are my rough notes on how I got this far:
upgrade ambari-1.4 to 1.7
update HDP yum repo file on each node and update with yum (cssh)
http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.2.6.0/hdp.repo
hdp-select
sudo su -l hdfs -c "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start namenode -upgrade"
/etc/hadoop/conf.empty/core-site.xml.rpmsave, hdfs-site.xml
sudo su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-datanode/../hadoop/sbin/hadoop-daemon.sh start datanode"
hadoop dfsadmin -finalizeUpgrade
[update]
upgrade ambari-1.7 to 2.1
configure yarn & mapreduce - yarn-site.xml, yarn-env.sh[?], container-executor.cfg, mapred-site.xml
sudo ln -s /usr/hdp/2.2.6.0-2800/hadoop/libexec/ /usr/lib/hadoop/ # ambari insisting /usr/lib/hadoop
stop-start resourcemanager, start nodemanagers
hive-site.xml in both conf.dist & conf.server; manual start
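As a reading aid, the three HDFS commands from the notes above can be collected into a dry-run helper that only prints them in order. The function name hdfs_upgrade_plan and the host annotations in the printed comments are my own assumptions, not part of the original notes; the commands themselves are copied from the notes.

```shell
# Print (do not run) the HDFS upgrade commands from the notes, in order.
hdfs_upgrade_plan() {
    hdp_current="${1:-/usr/hdp/current}"   # defaults to the path used in the notes

    echo "# 1. on the NameNode host:"
    echo "sudo su -l hdfs -c \"$hdp_current/hadoop-client/sbin/hadoop-daemon.sh start namenode -upgrade\""
    echo "# 2. on every DataNode host:"
    echo "sudo su -l hdfs -c \"$hdp_current/hadoop-hdfs-datanode/../hadoop/sbin/hadoop-daemon.sh start datanode\""
    echo "# 3. only after checking that HDFS is healthy (finalizing cannot be rolled back):"
    echo "hadoop dfsadmin -finalizeUpgrade"
}
```

Running hdfs_upgrade_plan prints the sequence for review. Nothing is executed, which seems the safer default for an upgrade that is still being worked out.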
The following resources were useful:
https://cwiki.apache.org/confluence/display/AMBARI/Install+Ambari+1.7.0+from+Public+Repositories
https://developer.ibm.com/hadoop/blog/2015/10/08/back-up-and-restore-ambari-server-postgresql-database/
http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.0.0/bk_Installing_HDP_AMB/content/_hdp_stack_repositories.html
http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.0.0/bk_upgrading_Ambari/content/_Upgrade_HDFS_mamiu.html
https://wiki.apache.org/hadoop/Hadoop_Upgrade
http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.0.0/bk_upgrading_Ambari/content/_complt_upgrd_21-23_upgrade_hdfs.html
http://solaimurugan.blogspot.com/2014/11/upgrade-hadoop-with-latest-version.html
https://cwiki.apache.org/confluence/display/AMBARI/Install+Ambari+2.0.2+from+Public+Repositories
[update]
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.0/bk_upgrading_hdp_manually/content/index.html
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.0/bk_upgrading_hdp_manually/content/configure-yarn-mr-21.html
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.0/bk_upgrading_hdp_manually/content/start-hive-hcat-21.html
https://brucebcampbell.wordpress.com/2014/12/11/hortonworks-fix-missing-jar-error-in-hive-after-upgrade-to-hdp-2-2/

How to find the CDH version of Hadoop

When connecting to a Hadoop cluster, how can I know which version of Hadoop the cluster is running? This is particularly important for configuring library versions correctly when compiling and packaging Hadoop Java jobs with Maven.
The simplest way, if you have ssh access to a Hadoop node, is to run the command
$ hadoop version
If you are looking for the CDH version, check /usr/lib/hadoop/cloudera/cdh_version.properties
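For the Maven use case in the question, the version string can be pulled out of the hadoop version output with sed. This is a sketch that assumes the first output line has the form "Hadoop <version>" and that CDH builds carry a "-cdh<release>" suffix (for example "Hadoop 2.6.0-cdh5.5.1"); both function names are made up for illustration.

```shell
# Read `hadoop version` output on stdin and print the full version string
# from the first line, e.g. "2.6.0-cdh5.5.1".
hadoop_version_string() {
    sed -n '1s/^Hadoop[[:space:]]\{1,\}//p'
}

# Read a version string on stdin and print only the CDH release (e.g. "5.5.1").
# Prints nothing if the string has no "-cdh" suffix.
cdh_release_from_version() {
    sed -n 's/^.*-cdh\([0-9][0-9.]*\).*$/\1/p'
}
```

On a cluster node this could be used as hadoop version | hadoop_version_string, optionally piped into cdh_release_from_version. The full string is usually the more useful one for a Maven dependency version.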
In the CDH cluster I am using, there is no cdh_version.properties file (or I couldn't find it).
If your cluster uses "Parcels", you can check which version of CDH is used by listing the parcels directory:
ls /opt/cloudera/parcels
The version appears in the name of the folder:
CDH-5.5.1-1.cdh5.5.1.p0.11
Note: I know that this is not a general rule for finding which CDH version is used. I am only showing an alternative that worked for me.
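If the folder name needs to be turned into a bare version number, a sed one-liner is enough. The sketch below assumes the naming pattern shown above ("CDH-<version>-<build>"); the function name cdh_version_from_parcel is hypothetical.

```shell
# Print the CDH version encoded in a parcel directory name.
# Example input: CDH-5.5.1-1.cdh5.5.1.p0.11 (a full path is also accepted).
cdh_version_from_parcel() {
    # Strip any leading directory, then take the field between "CDH-" and the next "-".
    basename "$1" | sed -n 's/^CDH-\([0-9][0-9.]*\)-.*$/\1/p'
}
```

A loop such as for d in /opt/cloudera/parcels/CDH-*; do cdh_version_from_parcel "$d"; done would then print one version per installed CDH parcel. Names that do not match the pattern produce no output.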
We can check the installed version with the following command:
cat /usr/lib/hadoop/cloudera/cdh_version.properties
Hope this may help you.

Cloudera installation error: can we install Cloudera Manager for a single-node Hadoop cluster on Ubuntu?

I am using Ubuntu 12.04 64-bit. I installed Hadoop on a single node and ran the sample programs successfully.
I am getting the following error while installing Cloudera Manager on Ubuntu:
Refreshing repository metadata failed. See
/var/log/cloudera-manager-installer/2.refresh-repo.log for details.
Click OK to revert this installation.
I want to know whether Cloudera Manager can be installed for a single-node Hadoop cluster on Ubuntu. Is that possible, or do I need to create multiple nodes to use Cloudera with my Hadoop setup?
Yes, CM can run on a single node.
This error occurs because CM cannot use apt-get install to fetch the packages. Which tutorial are you following?
However, you can manually add the Cloudera repo. See this thread.

How to install cloudera impala on EMR?

Is there any way I can install only Impala, without Cloudera Manager and without CDH? I will be using the Apache version of Hadoop.
Yes, it is absolutely possible. Add the repository to your sources.list file and then update the package index.
deb [arch=amd64] http://archive.cloudera.com/impala/ubuntu/precise/amd64/impala precise-impala1 contrib
deb-src http://archive.cloudera.com/impala/ubuntu/precise/amd64/impala precise-impala1 contrib
After that, it's merely:
sudo apt-get install impala (Binaries for daemons)
sudo apt-get install impala-server (Service start/stop script)
sudo apt-get install impala-state-store (Service start/stop script)
But do not forget to meet all the prerequisites. For detailed information you can go here
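The repository step from this answer can be scripted so that the two source entries always end up as two well-formed lines. This is a sketch: the function name impala_apt_sources is made up, the repository URL and suite are copied from the answer, and whether that repository is still served is not something the sketch checks.

```shell
# Print the two APT source lines given in the answer (Ubuntu precise, amd64).
impala_apt_sources() {
    base="http://archive.cloudera.com/impala/ubuntu/precise/amd64/impala"
    echo "deb [arch=amd64] $base precise-impala1 contrib"
    echo "deb-src $base precise-impala1 contrib"
}
```

One way to use it would be impala_apt_sources | sudo tee /etc/apt/sources.list.d/cloudera-impala.list (the file name is an arbitrary choice), followed by sudo apt-get update and the install commands above.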
You can view detailed instructions on how to install and use Impala with Amazon EMR here: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-impala.html
EMR is based on an Amazon Hadoop distribution that runs on top of Debian squeeze. So yes, it's possible using Cloudera's DEB repo.
You will need to SSH to your EMR master node; you can find its address in the EMR console.
You will also need to enable rules on the security group assigned to your EMR cluster if you intend to connect to Impala with a JDBC/ODBC client from the outside world.

Resources