Is it possible to run both HBase and an external ZooKeeper in standalone mode on a single machine?
It gets stuck with a clientPort conflict.
Please clarify.
Yes, it is possible. To do that, you will have to change the client port of the external ZooKeeper server. Go to the conf directory of the external ZooKeeper and open the zoo.cfg file. If it's not there and only zoo_sample.cfg is, run mv conf/zoo_sample.cfg conf/zoo.cfg to create it. In zoo.cfg, change the default clientPort=2181 to 2182, and change dataDir to a directory of your choice. For example, on my machine I use dataDir=/home/ckant/zookeeper1 and clientPort=2182.
Now run ./zkServer.sh start to start the server. To connect a client to this ZooKeeper server, run ./zkCli.sh -server 127.0.0.1:2182. Your client is now connected to the external ZooKeeper server running on port 2182. Any time you want to connect to the ZooKeeper started by HBase instead, just change the port number to 2181 in the above command.
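For reference, a minimal zoo.cfg for the external instance, followed by the start/connect commands, might look like this (the dataDir path and tickTime are example values):

# conf/zoo.cfg of the external ZooKeeper (example values)
tickTime=2000
dataDir=/home/ckant/zookeeper1
clientPort=2182

# start the external server, then connect a client to it
./zkServer.sh start
./zkCli.sh -server 127.0.0.1:2182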
I'm trying to use Hue as a file browser for HDFS. For that I have cloned the hue repository and built the app with the following commands given in the README.md of the hue repository.
git clone https://github.com/cloudera/hue.git
cd hue
make apps
build/env/bin/hue runserver
The Hue UI is accessible on the local machine at the default port using the URL http://localhost:8000, and everything works fine. But when I use my machine's IP address, http://x.x.x.x:8000, and try to access the Hue UI, it keeps processing and waiting.
Other observations:
I can ping the host machine from the remote machine.
There is no firewall blocking the ports. (checked with nmap port scanner)
Machines are in same network.
I can access other ports, e.g. for the Hadoop NameNode and DataNode UIs.
Changing the http_host in hue.ini doesn't affect the result.
The ideal setup for Hue is to configure a reverse proxy in front of it (Nginx or Apache HTTP Server, for example); a sketch follows the config block below.
However, to run the server externally, outside of 127.0.0.1, you should refer to the Configuration documentation:
[desktop]
# Webserver listens on this address and port
http_host=0.0.0.0
http_port=8888
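If you do go the reverse-proxy route, a minimal Nginx sketch might look like this (the server_name and the upstream port are assumptions; point proxy_pass at wherever Hue is actually listening):

server {
    listen 80;
    server_name hue.example.com;   # hypothetical hostname

    location / {
        # forward all requests to the Hue web server configured above
        proxy_pass http://127.0.0.1:8888;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}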
I was able to find a solution to the issue. Hue runs on a CherryPy web server, and starting it with build/env/bin/hue runserver launches the development server, which ignores the hue.ini configuration.
So the correct command to start the production server, after setting the correct configuration in hue.ini, is build/env/bin/hue runcpserver. Then I was able to access it from a remote host without any problem. You can also use supervisor to start the production server. More information about that can be found here.
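Putting the two answers together, a rough sketch of the steps (the hue.ini location depends on how Hue was built or installed):

# in hue.ini, under the [desktop] section
[desktop]
http_host=0.0.0.0
http_port=8888

# then start the production (CherryPy) server instead of the dev server
build/env/bin/hue runcpserver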
I am running Kafka inside a Docker container. Kafka requires a connection to Zookeeper, and so I am running Zookeeper in another container. I am running Docker on OSX and so my VM has the IP address: 192.168.99.99.
What I can't figure out is how to update my Kafka Docker installation to point to the instance of Zookeeper running inside its own separate Docker container, i.e. with the IP address 192.168.99.99 and port 2181.
Kafka has a config file called server.properties with a zookeeper.connect property which I can set, but I want this value to be overridden dynamically rather than hard-coding the IP there. How do I achieve this?
And, as an additional question, I want my Dockerfile to work across OSes, so whatever I do should work on Linux too.
You should not need to set an IP in that config file:
With docker-compose v2 (docker 1.10+), a bridge network is created, which means both containers are on that network and can see each other.
See more at "Networking in Compose".
If Zookeeper exposes its port 2181, the Kafka config file can simply reference Zookeeper by its container name.
And that will work with any Docker setup (boot2docker on Mac or native Docker on Linux).
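As a rough illustration (image names and environment variables are assumptions based on the commonly used wurstmeister images, which template KAFKA_ZOOKEEPER_CONNECT into server.properties as zookeeper.connect):

version: '2'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka
    ports:
      - "9092:9092"
    environment:
      # ends up as zookeeper.connect=zookeeper:2181 in server.properties
      KAFKA_ZOOKEEPER_CONNECT: "zookeeper:2181"
    depends_on:
      - zookeeper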
I'm using the Cloudera distribution of Hadoop and recently had to change the IP addresses of a few nodes in the cluster. After the change, on one of the nodes (old IP: 10.88.76.223, new IP: 10.88.69.31) the following error comes up when I try to start the datanode service.
Initialization failed for block pool Block pool BP-77624948-10.88.65.174-13492342342 (storage id DS-820323624-10.88.76.223-50010-142302323234) service to hadoop-name-node-01/10.88.65.174:6666
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException): Datanode denied communication with namenode: DatanodeRegistration(10.88.69.31, storageID=DS-820323624-10.88.76.223-50010-142302323234, infoPort=50075, ipcPort=50020, storageInfo=lv=-40;cid=cluster25;nsid=1486084428;c=0)
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:656)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:3593)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:899)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:91)
Has anyone had success with changing the IP address of a Hadoop datanode and joining it back to the cluster without data loss?
CHANGE HOST IP IN CLOUDERA MANAGER
Change the host IP on all nodes
sudo nano /etc/hosts
If the master node's IP changed, edit the IP in the Cloudera agent config.ini on all nodes
sudo nano /etc/cloudera-scm-agent/config.ini
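The entry to update is typically the server_host line pointing at the Cloudera Manager host (the value below is a placeholder):
# /etc/cloudera-scm-agent/config.ini
server_host=xxx.xxx.xxx.xxx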
Change IP in PostgreSQL Database
Look up the PostgreSQL password
cat /etc/cloudera-scm-server/db.properties
Find the password line
Example: com.cloudera.cmf.db.password=gUHHwvJdoE
Connect to PostgreSQL
psql -h localhost -p 7432 -U scm
Inspect the hosts table in PostgreSQL
select name,host_id,ip_address from hosts;
Update the IP in the hosts table
update hosts set ip_address = 'xxx.xxx.xxx.xxx' where host_id=x;
Exit the tool
\q
Restart the agent service on all nodes
service cloudera-scm-agent restart
Restart the server service on the master node
service cloudera-scm-server restart
It turns out it's better to:
Decommission the server from the cluster to ensure that all blocks are replicated to other nodes in the cluster.
Remove the server from the cluster
Connect to the server, change the IP address, and then restart the Cloudera agent.
Notice that Cloudera Manager now shows two entries for this server. Delete the entry with the old IP and the longest heartbeat time.
Add the server back to the required cluster and add the required roles back to it (e.g. HDFS DataNode, HBase RegionServer, YARN).
HDFS will read all data disks and recognize the block pool and cluster IDs, then register the datanode.
All data will be available and the process will be transparent to any client.
NOTE: If you run into name resolution errors from HDFS clients, the application has likely cached the old IP and will most likely need to be restarted. In particular, Java clients that previously referenced this server (e.g. HBase clients) must be restarted, because the JVM can cache IPs indefinitely; they will keep throwing connectivity errors against the old, cached IP until they are restarted.
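For reference, the JVM's DNS caching is governed by the networkaddress.cache.ttl security property; a hedged illustration of lowering it (the value, and whether you need this at all, depends on your JVM and security-manager settings):

# $JAVA_HOME/jre/lib/security/java.security (illustrative value, in seconds)
networkaddress.cache.ttl=60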
I have a problem starting RegionServers on the slave PCs. When I list only the master PC in conf/regionservers everything works fine, but when I add two more slaves to it, HBase does not start.
If I delete all HBase folders in the tmp folder on all PCs and then start the RegionServers (with 3 RegionServers listed), HBase starts, but when I try to create a table it fails again (gets stuck).
Can anyone help?
I am using Hadoop 0.20.0, which is working fine, and HBase 0.92.0.
I have 3 PCs in the cluster: one master and two slaves.
Also, is DNS (with both forward and reverse lookup working) necessary for HBase in my case?
Is there any way to replicate an HBase table to all RegionServers, i.e. I want to have a copy of the table on each PC and access it locally (when I execute map tasks they should use their local copy of the HBase table)?
Please help!
Thanks in advance.
Make your hosts file as follows:
127.0.0.1 localhost
For Hadoop
192.168.56.1 master
192.168.56.101 slave
and in hbase-site.xml (in the HBase conf directory) put the following entries:
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://master:9000/hbase</value>
</property>
<property>
  <name>hbase.master</name>
  <value>master:60000</value>
  <description>The host and port that the HBase master runs at.</description>
</property>
<property>
  <name>hbase.regionserver.port</name>
  <value>60020</value>
  <description>The port the HBase RegionServer binds to.</description>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.tmp.dir</name>
  <value>/home/cluster/Hadoop/hbase-0.90.4/temp</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>master</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
  <description>Property from ZooKeeper's config zoo.cfg. The port at which the clients will connect.</description>
</property>
If you are using localhost anywhere, remove it and replace it with "master", which is the name for the namenode in your hosts file.
One more thing you can do:
sudo gedit /etc/hostname
This will open the hostname file; by default "ubuntu" will be there, so change it to "master" and restart your system.
For HBase, in the "regionservers" file inside the conf dir, put these entries:
master
slave
and restart everything.
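For example, from the HBase installation directory (a sketch assuming the standard scripts shipped with HBase):

bin/stop-hbase.sh
bin/start-hbase.sh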
Ultra-noob here. I have a server machine running CDH3u1 in pseudo-distributed mode, and a client machine with a Java application using the CDH3u1 API.
How do I configure the client to talk to the server? I've been googling for hours and couldn't find where the "client configuration" file is. The "hdfs-default", "core-default" and "mapred-default" files and their "-site" counterparts all look like server (namenode and datanode) config to me.
Is it just "multipurpose client/server" config, and should I cherry-pick the attributes in these files that are appropriate to the client? Which are they? I'm probably missing something big here...
Thanks, Ido
Make sure that the client machine can reach the Hadoop server machine's IP. If you use VirtualBox for the Hadoop server (the CDH3 VM), add a "host-only" network interface (see details here: host-only networking with VirtualBox). I'm assuming that the static IP of your Hadoop server is 192.168.56.101 and that you're able to ping it from your client.
Configure a hostname for your Hadoop server machine on both the server and the client. If you want to name your Hadoop server "local-elephant", add the following line to /etc/hosts on both machines: 192.168.56.101 local-elephant.
On the server machine, go to /etc/hadoop/conf and change the values of the following properties from "localhost" to "local-elephant": in core-site.xml the value of fs.default.name, and in mapred-site.xml the value of mapred.job.tracker.
On the client machine, create core-site.xml and mapred-site.xml on the classpath of your Java application. In those files put only the fs.default.name and mapred.job.tracker properties.
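A minimal sketch of those two client-side files, assuming the hostname above and typical CDH3 ports (8020 for the NameNode and 8021 for the JobTracker; adjust to whatever your server actually uses):

core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://local-elephant:8020</value>
  </property>
</configuration>

mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>local-elephant:8021</value>
  </property>
</configuration>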