zookeeper and hadoop 2.6 + hbase 0.98

With Hadoop 2.6 and HBase 0.98, do I need to install ZooKeeper explicitly? When I run Hadoop and HBase I see a process named "HQuorumPeer", and I think this is ZooKeeper. Does HBase include ZooKeeper, or should I install it separately?

https://www.quora.com/Do-I-have-to-install-Zookeeper-separately-even-for-HBase-Standalone-or-is-it-in-built-with-HBase-setup
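HBase bundles ZooKeeper, and the HQuorumPeer process is the ZooKeeper instance that HBase starts and manages itself. If you are in standalone mode, you don't need to install it separately; otherwise it can be better to install it on a different cluster.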
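To see which kind of ZooKeeper is running on a node, you can look at the JVM process names. The sketch below assumes `jps`-style output on stdin; the function name `classify_zk` is made up for illustration. `HQuorumPeer` is the ZooKeeper that HBase manages, while a separately installed ZooKeeper server normally shows up as `QuorumPeerMain`.

```shell
# Classify ZooKeeper-related JVM processes from jps-style output on stdin.
#   HQuorumPeer    -> ZooKeeper started and managed by HBase
#   QuorumPeerMain -> a separately installed ZooKeeper server
# classify_zk is a hypothetical helper name, not part of HBase or ZooKeeper.
classify_zk() {
  input=$(cat)
  case "$input" in
    *HQuorumPeer*)    echo "hbase-managed" ;;
    *QuorumPeerMain*) echo "standalone" ;;
    *)                echo "none" ;;
  esac
}

# Usage on a live node (jps ships with the JDK):
#   jps | classify_zk
```

If you later move to a separate ZooKeeper ensemble, the usual switch is `HBASE_MANAGES_ZK=false` in `conf/hbase-env.sh`, which tells HBase not to start its own instance.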

Related

presto + build a presto cluster that will join an existing hadoop cluster

We have a Hadoop cluster that contains all the relevant components/services:
HDFS
YARN
mapreduce
HIVE
Tez
pig
Zookeeper
The Hadoop cluster contains 3 master machines, 12 data node machines, and 3 Kafka machines.
Now we want to use Presto to run queries against our data sources (Hadoop cluster / Hive),
so we built a new Presto cluster as follows:
1 presto coordinator
8 presto workers
All Presto cluster machines run Red Hat 7.2.
Now we want to install Presto on all of the machines,
but we are not sure whether Presto can be installed immediately on a freshly installed Linux OS,
or whether we need to install something else after the OS and before Presto.
The only requirement for Presto is a Java Virtual Machine (JVM). We recommend installing the latest OpenJDK 11 version, currently 11.0.2. After that, follow the Presto deployment instructions.
Python is required for the launcher (the script that starts the JVM), but this is normally available on a typical Linux distribution.
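A quick way to confirm both prerequisites on each fresh node is to check that the commands are on the PATH. This is only a sketch: `check_prereqs` is a hypothetical helper, and the Python command may be called `python` or `python3` depending on the distribution.

```shell
# Report which of the given commands are missing from PATH.
# Returns 0 if all are present, 1 otherwise.
# check_prereqs is a hypothetical helper name used for illustration.
check_prereqs() {
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd"
      missing=1
    fi
  done
  return "$missing"
}

# Usage on a fresh node:
#   check_prereqs java python && java -version
```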

How to check the hadoop distribution used in my cluster?

How can I tell whether my cluster was set up using Hortonworks, Cloudera, or a plain installation of the Hadoop components?
Also, how can I find the port numbers of the various services?
It is difficult to identify the Hadoop distribution from port numbers, since the Apache, Hortonworks, and Cloudera distributions use different port numbers.
Another option is to check for cluster management service agents: Cloudera Manager has an agent startup script at /etc/init.d/cloudera-scm-agent, and Hortonworks has the Ambari agent startup script at /etc/init.d/ambari-agent. Vanilla Apache Hadoop will not have any such agent on the server.
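The agent check can be scripted roughly as follows. The two script paths come from the answer above; the function name `detect_agent` and its optional root-prefix argument are assumptions added so the check can be tried against a scratch directory.

```shell
# Guess the cluster manager from the agent init scripts.
# $1 is an optional root prefix (hypothetical, handy for testing);
# leave it empty to inspect the live filesystem.
detect_agent() {
  root="${1:-}"
  if [ -e "$root/etc/init.d/cloudera-scm-agent" ]; then
    echo "cloudera"
  elif [ -e "$root/etc/init.d/ambari-agent" ]; then
    echo "hortonworks"
  else
    echo "none"
  fi
}

# Usage on a cluster node:
#   detect_agent
```

A result of "none" only means no agent script was found at these paths; it does not prove the node runs vanilla Apache Hadoop.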
Another option is to check the Hadoop classpath. The command below prints it.
`hadoop classpath`
Most Hadoop distributions include the distro name in the classpath. If the classpath doesn't contain any of the keywords below, the setup is a plain Apache installation.
hdp - (Hortonworks)
cdh - (Cloudera)
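The keyword check can be wrapped in a small function. This is a sketch, with `detect_distro` as a made-up name; it does a plain substring match, so an unrelated path that happens to contain "cdh" or "hdp" would produce a false positive.

```shell
# Guess the Hadoop distribution from a classpath string passed as $1.
# detect_distro is a hypothetical helper name used for illustration.
detect_distro() {
  # Lowercase first so "CDH" and "cdh" both match.
  cp=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$cp" in
    *cdh*) echo "cloudera" ;;
    *hdp*) echo "hortonworks" ;;
    *)     echo "apache" ;;
  esac
}

# Usage on a cluster node:
#   detect_distro "$(hadoop classpath)"
```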
The simplest way is to run the hadoop version command. Its output shows which version of Hadoop you have, and also which distribution and distribution version you are running. If you find words like cdh or hdp in it, cdh stands for Cloudera and hdp for Hortonworks.
For example, here I am running Cloudera, and below is the output of the hadoop version command.
The first line shows the Hadoop version, followed by the Hadoop distribution and its version.
Hope this will help.
The command hdfs version will also give you the version of Hadoop and its distribution.

Can open source hbase work on Cloudera distribution of Hadoop

I have a Cloudera distribution installed as a 5-node cluster. I do not want to use the HBase parcel that comes with Cloudera;
instead I want to use only HDFS from the Cloudera setup together with an open-source version of HBase.
So my question is: will this work, or will I have to install the normal open-source version of Apache Hadoop for HDFS and then put the open-source version of Apache HBase on top of it?
As long as the version of Hadoop on the cluster matches the version of the Hadoop client used by your version of HBase, it should all work.
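One way to compare the two versions is to read the version out of the Hadoop jar names in the HBase lib directory and set it against the cluster's `hadoop version` output. The sketch below only covers the jar-name part; `jar_version` is a hypothetical helper, and `$HBASE_HOME` in the usage comment is an assumed install location.

```shell
# Extract the version from a Hadoop jar filename,
# e.g. /path/lib/hadoop-common-2.5.1.jar -> 2.5.1
# jar_version is a hypothetical helper name used for illustration.
jar_version() {
  basename "$1" .jar | sed 's/^hadoop-[a-z-]*-\([0-9].*\)$/\1/'
}

# Usage (assumes HBASE_HOME points at the HBase install):
#   hadoop version | head -n 1          # cluster side, first line is "Hadoop <version>"
#   for j in "$HBASE_HOME"/lib/hadoop-common-*.jar; do jar_version "$j"; done
```

Vendor builds carry a suffix in the jar name, so expect the two strings to differ in more than the base version number when mixing a vendor cluster with upstream HBase.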

How to install impala on an already running hadoop cluster

I have a 5-node Hadoop cluster that is already up and running. I want to install Impala on the HDFS cluster without Cloudera Manager. Can anyone guide me through the process or point me to a link that covers it?
Thanks.

Which Hadoop 0.23.8 jars are needed for HBase 0.94.8

I'm using Hadoop 0.23.8 pseudo distributed and HBase 0.94.8. My HBase master is failing with:
Server IPC version 5 cannot communicate with client version 4
I think this is because HBase is using hadoop-core-1.0.4.jar in its lib folder.
Now http://cloudfront.blogspot.in/2012/06/how-to-configure-habse-in-pseudo.html#.UYfPYkAW38s suggests I should replace this jar by copying:
the hadoop-core-*.jar from your HADOOP_HOME ...
but there are no hadoop-core-*.jars in 0.23.8.
Will this process work for 0.23.8, and if so, which jars should I be using?
TIA!
I gave up with this and am using hadoop 2.2.0 which works well (ish) with HBase.
