I've integrated my Hadoop 2 and HBase 0.98 with Phoenix, and the Phoenix shell starts when I type sqlline.py localhost. But when I try to run the Apache Phoenix example with this command:
psql.py /usr/local/phoenix/examples/WEB_STAT.sql /usr/local/phoenix/examples/WEB_STAT.csv /usr/local/phoenix/examples/WEB_STAT_QUERIES.sql
I get this error:
ERROR client.HConnectionManager$HConnectionImplementation: The node /hbase is not in ZooKeeper. It should have been written by the master. Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
I use Hadoop 2.6 in single-node mode and HBase 0.98 in pseudo-distributed mode. In addition, I didn't explicitly install ZooKeeper; is it required to install ZooKeeper explicitly?
my HBASE_HOME/conf/hbase-site.xml file contains :
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:54310/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hduser/hbase/zookeeper</value>
</property>
<property>
<name>zookeeper.znode.parent</name>
<value>/hbase</value>
</property>
<property>
<name>hbase.master</name>
<value>hadoop-master:60000</value>
</property>
</configuration>
and my running Java processes are:
7415 DataNode
7262 NameNode
9119 Jps
7605 SecondaryNameNode
7893 NodeManager
8704 HRegionServer
8544 HMaster
8475 HQuorumPeer
7763 ResourceManager
Simply add the address of your ZooKeeper server, here localhost, to your command. Notice that in the command you already ran, sqlline.py localhost, you did give the server address.
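In other words, assuming the same example files and ZooKeeper running on localhost, the psql.py call would look roughly like this (the only change is the added address up front):
psql.py localhost /usr/local/phoenix/examples/WEB_STAT.sql /usr/local/phoenix/examples/WEB_STAT.csv /usr/local/phoenix/examples/WEB_STAT_QUERIES.sql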
Are you using the HDP distribution? IIRC they use /hbase-unsecure for un-Kerberized clusters. I don't remember how that interacts with your config setting of /hbase.
Start the ZooKeeper CLI:
zkCli.sh (or perhaps some variant of the ZooKeeper shell)
Query the existing root nodes:
ls /
The HBase root node is probably named hbase-unsecure.
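Putting those steps together, a minimal sketch; the host, port, and the /hbase-unsecure name are assumptions, so use whatever ls / actually shows on your cluster:
zkCli.sh -server localhost:2181   # open the ZooKeeper CLI against your quorum
ls /                              # at the zk prompt, list the root znodes
If the HBase znode turns out to be /hbase-unsecure instead of /hbase, point the client at it by setting zookeeper.znode.parent in the client-side hbase-site.xml:
<property>
<name>zookeeper.znode.parent</name>
<value>/hbase-unsecure</value>
</property>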
I am trying to set up a multi-node Hadoop cluster between 2 Windows devices. I am using Hadoop 2.9.2.
How can I achieve that, please?
After a lot of trial and error, the following did the job for me.
Do the same configuration as in the previous answer by @AbsoluteBeginner.
Disable the Windows firewall on all machines (I think you could keep it on and just adjust the rules, but that's for you to find out).
Run hdfs namenode -format on all nodes (master and slaves).
Make sure that the datanode folder is empty on all 3 nodes (just Shift+Del).
On the master node, run start-all.cmd. All of the following should appear:
50436 NameNode
54696 NodeManager
54744 DataNode
60028 Jps
7340 ResourceManager
On the slave nodes, run start-all.cmd. All of the following should appear:
6116 DataNode
2408 Jps
3208 NodeManager
Note: the reason the NameNode and ResourceManager aren't appearing is that they are running on the master node and already occupy those ports; you only need the master's ResourceManager and NameNode running.
Note: if you have seen a Linux multi-node tutorial, the master node also shows SecondaryNameNode when running jps. I'm not really sure why it isn't appearing on Windows.
Go to master:50070 and navigate to Datanodes; you should see your data nodes listed there.
Go to master:8088 and navigate to Nodes; you should see your node managers listed there.
1. Install an OpenSSH server on both of your systems. Generating a new SSH public and private key pair on your local computer is the first step towards authenticating with a remote server without a password. Add the public key to authorized_keys and add your hostname to the list of known hosts. You can find guides on how to do this by searching the internet.
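A minimal sketch of the key setup from cmd; the host name is a placeholder from the hosts file below, and the exact paths depend on your OpenSSH install:
ssh-keygen -t rsa
:: append the contents of %USERPROFILE%\.ssh\id_rsa.pub to the other machine's
:: %USERPROFILE%\.ssh\authorized_keys, then check that this logs in without a password:
ssh hadoopMaster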
2. Add your Hadoop master and slave IPs to your hosts file. Open "C:\Windows\System32\drivers\etc\hosts"
and add
your-master-ip hadoopMaster
your-slave-ip hadoopSlave
You can then use these names in your configuration files.
Much like on Linux systems, these are the steps you have to follow in order to run a Hadoop cluster on Windows:
3. First you need to have Java installed on your system, and JAVA_HOME must be added to your environment variables. You can download Java from the Oracle website and install it.
Download the Hadoop binaries from the Apache website and extract them.
Note that you shouldn't have spaces in your folder names or you might encounter problems.
Next you have to add the Java and Hadoop home and bin folders to your environment variables. Just open the start menu, type "environment variable", and open the edit environment variables window from the control panel.
Add
HADOOP_HOME="root of your hadoop extracted folder\hadoop-2.9.2"
HADOOP_BIN="root of your hadoop extracted folder\hadoop-2.9.2\bin"
JAVA_HOME="root of your JDK installation"
Edit your "path" environment variable and add %JAVA_HOME%, %HADOOP_HOME%, %HADOOP_BIN%, %HADOOP_HOME%/sbin to your PATH one by one.
you can validate your additions by opening cmd and type in:
echo %HADOOP_HOME%
echo %HADOOP_BIN%
echo %PATH%
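If you prefer the command line over the GUI for the variables above, something like the following should also work; the install paths here are just placeholders, and setx only takes effect in newly opened cmd windows:
setx HADOOP_HOME "C:\hadoop\hadoop-2.9.2"
setx HADOOP_BIN "C:\hadoop\hadoop-2.9.2\bin"
setx JAVA_HOME "C:\Program Files\Java\jdk1.8.0"
The PATH entries are still easiest to add through the environment variables window described above.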
CONFIGURING HADOOP:
10. Open "your hadoop root\hadoop-2.9.2\etc\hadoop\hadoop-env.cmd" and add the following lines to the bottom of the file:
set HADOOP_PREFIX=%HADOOP_HOME%
set HADOOP_CONF_DIR=%HADOOP_PREFIX%\etc\hadoop
set YARN_CONF_DIR=%HADOOP_CONF_DIR%
set PATH=%PATH%;%HADOOP_PREFIX%\bin
11. Open "your-hadoop-root\hadoop-2.9.2\etc\hadoop\hdfs-site.xml" and add the below content:
<property>
<name>dfs.name.dir</name>
<value>your desired address</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>your desired address</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.datanode.use.datanode.hostname</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.http-address</name>
<value>hadoopMaster:50070</value>
<description>Your NameNode hostname for http access.</description>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoopMaster:50090</value>
<description>Your Secondary NameNode hostname for http access.</description>
</property>
12. Edit your core-site.xml and add:
<property>
<name>fs.default.name</name>
<value>hdfs://hadoopMaster:9000</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>your-temp-directory</value>
<description>A base for other temporary directories.</description>
</property>
Open "root to hadoop\hadoop-2.9.2\etc\hadoop\mapred-site.xml" and add below content within tags. If you don’t see mapred-site.xml then open mapred-site.xml.template file and rename it to mapred-site.xml
<property>
<name>mapred.job.tracker</name>
<value>hadoopMaster:9001</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
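As for the rename mentioned in step 13, a quick way to do it from cmd, run inside the etc\hadoop folder (using copy keeps the original template around):
copy mapred-site.xml.template mapred-site.xml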
14. Edit your yarn-site.xml and add:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
<description>Long running service which executes on Node Manager(s) and provides MapReduce Sort and Shuffle functionality.</description>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
<description>Enable log aggregation so application logs are moved onto hdfs and are viewable via web ui after the application completed. The default location on hdfs is '/log' and can be changed via yarn.nodemanager.remote-app-log-dir property</description>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoopMaster:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoopMaster:8031</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>hadoopMaster:8032</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>hadoopMaster:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoopMaster:8088</value>
</property>
15. In your slaves file in "root-hadoop-directory\hadoop-2.9.2\etc\hadoop" add
hadoopSlave
16. Do these steps on your slave nodes too.
17. Open cmd and cd to the sbin folder in your Hadoop directory.
18. Format your NameNode:
hadoop namenode -format
19. Run the following command:
start-dfs.cmd
then run:
start-yarn.cmd
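To double-check that everything came up, jps on each node should show the daemons listed earlier, and the dfsadmin report should list all of your datanodes:
jps
hdfs dfsadmin -report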
When I run start-hbase.sh, I get the following error:
localhost: starting zookeeper, logging to /usr/lib/HBase/bin/../logs/hbase-hduser-zookeeper-nkhl.out
localhost: java.io.FileNotFoundException: /home/hduser/zookeeperpropertydataDir/myid (Permission denied)
localhost: at java.io.FileOutputStream.open0(Native Method)
localhost: at java.io.FileOutputStream.open(FileOutputStream.java:270)
localhost: at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
localhost: at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
localhost: at java.io.PrintWriter.<init>(PrintWriter.java:263)
localhost: at org.apache.hadoop.hbase.zookeeper.HQuorumPeer.writeMyID(HQuorumPeer.java:162)
localhost: at org.apache.hadoop.hbase.zookeeper.HQuorumPeer.main(HQuorumPeer.java:70)
starting master, logging to /usr/lib/HBase/logs/hbase-hduser-master-nkhl.out
OpenJDK 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
regionserver running as process 25123. Stop it first.
After this, when I run hbase shell, it does open up, but when I run list it throws this error:
ERROR: Can't get master address from ZooKeeper; znode data == null
Here is some help for this command:
List all tables in hbase. Optional regular expression parameter could
be used to filter the output. Examples:
hbase> list
hbase> list 'abc.*'
hbase> list 'ns:abc.*'
hbase> list 'ns:.*'
This is the jps output:
25123 HRegionServer
23975 SecondaryNameNode
23767 DataNode
24168 ResourceManager
26456 HMaster
26665 Jps
24297 NodeManager
23613 NameNode
Zookeeper starts fine:
ZooKeeper JMX enabled by default Using config:
/usr/lib/zookeeper/conf/zoo.cfg Starting zookeeper ... STARTED
My hbase-site.xml configuration:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:54433/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hduser/zookeeperpropertydataDir</value>
</property>
<property >
<name>hbase.master.port</name>
<value>60010</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
<description> The port at which the clients will connect.</description>
</property>
</configuration>
This is my hbase-env.sh configuration:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/
export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers
export HBASE_MANAGES_ZK=true
Any help in this will be appreciated.
The HBase ZooKeeper daemon HQuorumPeer is not running. One of the reasons can be that the directory below doesn't exist or isn't writable, as shown in the logs:
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hduser/zookeeperpropertydataDir</value>
</property>
Make sure the file that ZooKeeper is using has been created and has the right privileges.
Use chmod to grant access to all users. This fixed my problem.
chmod -R 777 path/file_name
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hduser/zookeeperpropertydataDir</value>
</property>
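For the dataDir in that property, a concrete version of the chmod fix above would look something like this (777 is acceptable on a local test box, but note it is deliberately wide open):
mkdir -p /home/hduser/zookeeperpropertydataDir
chmod -R 777 /home/hduser/zookeeperpropertydataDir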
I have installed Hadoop (1.2.1) multi-node on 1 master and 2 slaves. Now I am trying to install HBase over it. The problem is that when I start HBase on the master, it only shows one regionserver (the master itself), while the slaves are not shown in the web browser. On the terminal each slave has its own regionserver, but that is not reflected in the browser. Can anyone tell me what the problem is?
I had the same problem; I solved it by adding the port number to hbase.rootdir.
Your hbase-site.xml should look like this:
<property>
<name>hbase.zookeeper.quorum</name>
<value>master-IP-Adress,slave1-IP,slave2-IP</value>
<description>Comma separated list of servers in the ZooKeeper quorum.
</description>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/ravi/zookeeper-data</value>
<description>Property from ZooKeeper config zoo.cfg.
The directory where the snapshot is stored.
</description>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://master-IP:50000/hbase</value>
<description>The directory shared by RegionServers.
</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed Zookeeper
true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
</description>
</property>
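If the slave RegionServers still don't show up after that, it's also worth making sure each slave's hostname is listed in HBase's conf/regionservers file (the HBase counterpart of Hadoop's slaves file); a sketch with placeholder host names:
# $HBASE_HOME/conf/regionservers -- one RegionServer host per line
slave1-hostname
slave2-hostname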
Is it normal that in the ResourceManager UI (nodemanager:8088/cluster/nodes) I can see only one node?
In my test environment I set up a two-node cluster, and the command bin/hdfs dfsadmin -report shows me two nodes.
Sorry, but I found the solution myself.
You need to add the following properties to your conf/yarn-site.xml file on all nodes:
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>resourcemanager_address:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>resourcemanager_address:8032</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>resourcemanager_address:8088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>resourcemanager_address:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>resourcemanager_address:8033</value>
</property>
That will overwrite the default settings for the ResourceManager address (the default is 0.0.0.0).
Hope this helps someone.
You can also simply set
<property>
<name>yarn.resourcemanager.hostname</name>
<value>resourcemanager_address</value>
</property>
... and the rest of the properties will be set correctly automatically.
To point out the obvious, make sure you start/restart the nodemanager as well.
$HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager
I run HBase in distributed mode. HBase starts the RegionServer Java processes on all nodes, but the web UI doesn't show them.
http://s1.ipicture.ru/uploads/20120517/16DXTnsU.png
Here is hbase-site.xml:
<configuration>
<property>
<name>hbase.zookeeper.quorum</name>
<value>10.3.6.44</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/hdfs/zookeeper</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://10.3.6.44:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
</configuration>
BTW, the Hadoop cluster is running normally and sees all the datanodes.
Thanks very much for your help.
The problem was with DNS and the hosts file.
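For anyone hitting the same thing: the usual fix is making sure every node resolves every other node's hostname to the right address, for example by keeping the hosts file identical on all machines. A sketch with placeholder entries (only 10.3.6.44 comes from the config above):
# /etc/hosts on every node in the cluster
10.3.6.44   hbase-master
10.3.6.45   hbase-slave1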
Add this property to your hbase-site.xml file and see if it works for you:
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>