Slave could not connect to Master regionserver - connection refused - hadoop

I tried to set up HBase on my Hadoop installation, and the slave's logs show this RegionServer error:
2016-01-09 23:54:59,829 WARN [regionserver60020] regionserver.HRegionServer: error telling master we are up
com.google.protobuf.ServiceException: java.net.ConnectException: Connection refused
My /etc/hosts:
10.156.207.48 hadoop-master
10.156.207.31 hadoop-slave-1
My hbase-site.xml (on master):
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop-master:54310/hbase</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>hadoop-master, hadoop-slave-1</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
My hbase-site.xml (on slave):
<configuration>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop-master:54310/hbase</value>
</property>
</configuration>
How can I fix this? Any help is appreciated.
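One visible difference between the two files is that the slave's hbase-site.xml carries no ZooKeeper settings, so the slave falls back to HBase's defaults when it looks up the master. Below is a minimal sketch of the slave file with the master's ZooKeeper properties copied over. It assumes the mismatch is involved, which the logs shown do not confirm.

```xml
<!-- hbase-site.xml on the slave: sketch only, mirroring the master's settings -->
<configuration>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop-master:54310/hbase</value>
</property>
<!-- Same quorum and client port as on the master; written without a space after the comma -->
<property>
<name>hbase.zookeeper.quorum</name>
<value>hadoop-master,hadoop-slave-1</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
</configuration>
```

If the error persists, it may also be worth checking that hadoop-master does not resolve to a loopback address (127.0.0.1 or 127.0.1.1) in the master's own /etc/hosts, since the master would then advertise an address the slave cannot reach.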

Related

Fail to start HBase in Pseudo-Distributed mode throws "Failed construction RegionServer"

I am trying to run HBase in pseudo-distributed mode in an Ubuntu Docker image.
After running start-hbase.sh, HMaster and RegionServer don't start properly.
Both the RegionServer and Master logs show:
ERROR [main] regionserver.HRegionServer: Failed construction RegionServer
java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
at org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:261)
at org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:224)
at org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithClientProtocol(NameNodeProxiesClient.java:134)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:374)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:308)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:184)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3414)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:158)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3474)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3442)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:524)
at org.apache.hadoop.hbase.fs.HFileSystem.<init>(HFileSystem.java:91)
at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeFileSystem(HRegionServer.java:763)
at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:653)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
at org.apache.hadoop.hbase.regionserver.HRegionServer.constructRegionServer(HRegionServer.java:3155)
at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:63)
at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:87)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:149)
at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:3173)
jps shows:
31168 HQuorumPeer
14801 NodeManager
2049 Jps
12435 SecondaryNameNode
12105 NameNode
14699 ResourceManager
14141 DataNode
core-site.xml is:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/bigdata/hadoop/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
The hdfs-site.xml shows:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/hadoop/yarn_data/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hadoop/yarn_data/hdfs/datanode</value>
</property>
<property>
<name>dfs.namenode.http-address</name>
<value>localhost:50070</value>
</property>
<configuration>
<property>
<name>dfs.client.failover.proxy.provider.hdfscluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
</configuration>
</configuration>
Both config files are symlinked from hadoop/etc/hadoop/.
I don't know how to fix this issue based on the log. Thanks for the help!
Update:
After fixing the syntax error in hdfs-site.xml pointed out by majid, the error changed to:
"ERROR [main] regionserver.HRegionServer: Failed construction RegionServer
java.lang.IllegalArgumentException: java.net.UnknownHostException: hdfs
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:448)
at org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithClientProtocol(NameNodeProxiesClient.java:139)"
Your hdfs-site.xml is not well-formed: it has a second <configuration> element nested inside the first.
It should be:
<configuration>
<property>
<name>dfs.nameservices</name>
<value>hdfscluster</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/hadoop/yarn_data/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hadoop/yarn_data/hdfs/datanode</value>
</property>
<property>
<name>dfs.namenode.http-address</name>
<value>localhost:50070</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.hdfscluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
</configuration>
Alternatively, remove the dfs.client.failover.proxy.provider.hdfscluster property, so that hdfs-site.xml becomes:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/hadoop/yarn_data/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hadoop/yarn_data/hdfs/datanode</value>
</property>
<property>
<name>dfs.namenode.http-address</name>
<value>localhost:50070</value>
</property>
</configuration>
Make sure to format the NameNode before starting HBase.
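The UnknownHostException: hdfs in the update suggests that some HDFS URI is being parsed with "hdfs" as its host part. The question does not show hbase-site.xml, so the following is only a sketch under the assumption that hbase.rootdir is the value at fault. It points HBase at the same NameNode address that core-site.xml gives in fs.default.name.

```xml
<!-- hbase-site.xml: sketch only; the asker's actual file is not shown in the question -->
<configuration>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<!-- Host and port must match fs.default.name in core-site.xml (hdfs://localhost:9000) -->
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
</configuration>
```

If dfs.nameservices is kept in hdfs-site.xml, the URI would instead have to use the nameservice name, and a complete HA setup needs more properties than the ones shown in the answer.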

How to start datanode in hadoop slave machine?

I'm creating a Hadoop cluster with YARN on two VirtualBox VMs. When I run start-all.sh (start-dfs.sh and start-yarn.sh), jps shows the expected processes on both the master and the slave, but the web UI at master-ip:9870 shows no DataNode started.
core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop-master:9000</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoopuser/hadoop/data/nameNode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoopuser/hadoop/data/dataNode</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.acl.enable</name>
<value>0</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop-master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
workers
hadoop-slave1
/etc/hosts
master-ip hadoop-master
slave-ip hadoop-slave1
The configuration above is the same on both the master and the slave machine.
I also have JAVA_HOME, HADOOP_HOME and PDSH_RCMD_TYPE set in my .bashrc, and I created an SSH key on the master and added it to the slave's authorized keys to allow SSH connections.
[Screenshots: jps output on the master machine and on the slave machine]
The HDFS web UI shows 0 nodes, but I can see the slave node in the YARN web UI.
I deleted the Hadoop tmp files and the DataNode folders before formatting HDFS on the master and starting all processes. I'm using Hadoop 3.2.1.

Hadoop - failed to specify server's Kerberos principal name

Error - Failed to specify server's Kerberos principal name
I am trying to set up a Hadoop cluster using Kerberos. I managed to get the cluster working with Spark and YARN before starting the Kerberos configuration. Currently my master and three nodes are running, but I'm getting an error in the YARN logs.
Error:
java.io.IOException: Failed on local exception: java.io.IOException: java.lang.IllegalArgumentException : Failed to specify server's Kerberos principal name
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://hadoopmaster:9000</value>
</property>
<!--Kerberos configuration-->
<property>
<name>hadoop.security.authentication</name>
<value>kerberos</value>
</property>
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
</property>
<property>
<name>hadoop.security.auth_to_local</name>
<value>
RULE:[2:$1#$0](hdfs/.*#.*EXAMPLEREALM.COM)s/.*/hdfs/
RULE:[2:$1#$0](HTTP/.*#.*EXAMPLEREALM.COM)s/.*/hdfs/
RULE:[2:$1#$0](yarn/.*#.*EXAMPLEREALM.COM)s/.*/yarn/
DEFAULT
</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/data/datanode</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<!-- General HDFS security config -->
<property>
<name>dfs.block.access.token.enable</name>
<value>true</value>
</property>
<!-- NameNode security config -->
<property>
<name>dfs.namenode.keytab.file</name>
<value>/etc/security/keytabs/hdfs.service.keytab</value> <!-- path to the HDFS keytab -->
</property>
<property>
<name>dfs.namenode.kerberos.principal</name>
<value>hdfs/hadoopslave1.examplerealm.com#EXAMPLEREALM.COM</value>
</property>
<property>
<name>dfs.namenode.kerberos.internal.spnego.principal</name>
<value>HTTP/hadoopslave1.examplerealm.com#EXAMPLEREALM.COM</value>
</property>
<!-- Secondary NameNode security config -->
<property>
<name>dfs.secondary.namenode.keytab.file</name>
<value>/etc/security/keytabs/hdfs.service.keytab</value> <!-- path to the HDFS keytab -->
</property>
<property>
<name>dfs.secondary.namenode.kerberos.principal</name>
<value>hdfs/hadoopslave1.examplerealm.com#EXAMPLEREALM.COM</value>
</property>
<property>
<name>dfs.secondary.namenode.kerberos.internal.spnego.principal</name>
<value>HTTP/hadoopslave1.examplerealm.com#EXAMPLEREALM.COM</value>
</property>
<!-- DataNode security config -->
<property>
<name>dfs.datanode.data.dir.perm</name>
<value>700</value>
</property>
<property>
<name>dfs.datanode.address</name>
<value>0.0.0.0:1004</value>
</property>
<property>
<name>dfs.datanode.http.address</name>
<value>0.0.0.0:1006</value>
</property>
<property>
<name>dfs.datanode.keytab.file</name>
<value>/etc/security/keytabs/hdfs.service.keytab</value> <!-- path to the HDFS keytab -->
</property>
<property>
<name>dfs.datanode.kerberos.principal</name>
<value>hdfs/hadoopslave1.examplerealm.com#EXAMPLEREALM.COM</value>
</property>
<!-- Web Authentication config -->
<property>
<name>dfs.web.authentication.kerberos.principal</name>
<value>HTTP/hadoopslave1.examplerealm.com#EXAMPLEREALM.COM</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.acl.enable</name>
<value>0</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoopmaster</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.principal</name>
<value>yarn/hadoopslave1.examplerealm.com#EXAMPLEREALM.COM</value>
</property>
<property>
<name>yarn.nodemanager.keytab</name>
<value>/etc/security/keytabs/yarn.service.keytab</value>
</property>
</configuration>
Have you installed krb5-libs and krb5-workstation on all nodes?
On CentOS:
yum install krb5-server krb5-libs
yum install krb5-libs krb5-workstation
In that case, trying this might help you:
systemctl enable krb5kdc
systemctl start krb5kdc
systemctl enable kadmin
systemctl start kadmin
Also check: https://community.hortonworks.com/questions/176262/failed-to-specify-servers-kerberos-principal-name.html
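The yarn-site.xml in the question sets a principal and keytab for the NodeManager but none for the ResourceManager, and this error is typically raised when a client cannot find a configured principal for the server it is contacting. Below is a hedged sketch of the entries that appear to be missing. It assumes the keytab path from the question, assumes that keytab contains a yarn principal for each host, and writes principals in the usual primary/instance@REALM form. _HOST is Hadoop's placeholder for each node's own fully qualified hostname.

```xml
<!-- yarn-site.xml: sketch only; realm and keytab path are taken from the question -->
<property>
<name>yarn.resourcemanager.principal</name>
<value>yarn/_HOST@EXAMPLEREALM.COM</value>
</property>
<property>
<name>yarn.resourcemanager.keytab</name>
<value>/etc/security/keytabs/yarn.service.keytab</value>
</property>
<!-- Using _HOST here avoids hard-coding hadoopslave1 on every node -->
<property>
<name>yarn.nodemanager.principal</name>
<value>yarn/_HOST@EXAMPLEREALM.COM</value>
</property>
```

The same file would need to be present on every node and on any client that submits jobs, since the client reads the server's principal from its own configuration.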

Hadoop Resource Manager only show 1 node (cluster)

I have 2 Hadoop nodes and have started HDFS and YARN. The jps output on the master is:
12642 Jps
11271 NameNode
12075 NodeManager
11421 DataNode
11614 SecondaryNameNode
11775 ResourceManager
And the jps output on the slave is:
8445 DataNode
9469 Jps
8574 NodeManager
But the YARN cluster page shows only 1 live node,
at http://localhost:8088/cluster/nodes and also at http://localhost:50070/
My yarn-site.xml:
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>localhost:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:8032</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>localhost:8088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>localhost:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>localhost:8033</value>
</property>
</configuration>
And mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
</property>
<property>
<name>mapred.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Note: on the slave, localhost is replaced with the master's hostname.
Any idea what is missing in my configuration?
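With yarn.resourcemanager.hostname set to localhost on the master, the ResourceManager may end up listening only on the loopback interface, in which case the slave's NodeManager cannot register with it. Below is a minimal sketch that could be used unchanged on both machines. It assumes the master is reachable under a hostname such as hadoop-master, which is a hypothetical name since the question does not give the real one.

```xml
<!-- yarn-site.xml for master and slave: sketch only; hadoop-master is a placeholder hostname -->
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop-master</value>
</property>
<!-- The scheduler, resource-tracker, admin and webapp addresses default to
     ${yarn.resourcemanager.hostname} plus their standard ports, so they can be omitted -->
</configuration>
```

The same reasoning would apply to the HDFS address in core-site.xml if it also uses localhost on the master, since http://localhost:50070/ shows a single live node as well.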

HBase not connecting to ZooKeeper

I am struggling to get my HBase shell running.
It throws the exception in the subject line. I have checked that hbase-site.xml matches the Hadoop one.
Please help. I have been struggling for 2 days and have a project due. I am attaching the two XML files for Hadoop and HBase.
hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:54310/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hduser/zookeeper</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2222</value>
<description>Property from ZooKeeper's config zoo.cfg.
The port at which the clients will connect.
</description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
<description>Comma separated list of servers in the ZooKeeper Quorum.
</description>
</property>
</configuration>
core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system.</description>
</property>
</configuration>
Make sure ZooKeeper is running on port 2222; there should be an entry for it in zookeeper/conf/zoo.cfg:
# the port at which the clients will connect
clientPort=2222
Or change it to 2181, start ZooKeeper with ./zkServer.sh start,
and set the same port in hbase-site.xml:
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>

Resources