Unable to start NameNode while configuring Hadoop for Lustre

I'm trying to integrate Hadoop with Intel Lustre. I have added hadoop-lustre-plugin-3.1.0 to the hadoop-2.7.3/lib/native folder. Lustre is mounted at /mnt/lustre. I'm getting the following error when I start Hadoop using start-all.sh:
[root@master hadoop]# start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
17/04/06 17:36:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.
Starting namenodes on [ ]
...
core-site.xml:
<property>
<name>fs.defaultFS</name>
<value>lustre:///</value>
</property>
<property>
<name>fs.lustre.impl</name>
<value>org.apache.hadoop.fs.LustreFileSystem</value>
</property>
<property>
<name>fs.AbstractFileSystem.lustre.impl</name>
<value>org.apache.hadoop.fs.LustreFileSystemlustre</value>
</property>
<property>
<name>fs.lustrefs.mount</name>
<value>/mnt/lustre/hadoop</value>
<description>This is the directory on Lustre that acts as the root level for Hadoop services</description>
</property>
<property>
<name>lustre.stripe.count</name>
<value>1</value>
</property>
<property>
<name>lustre.stripe.size</name>
<value>4194304</value>
</property>
<property>
<name>fs.block.size</name>
<value>1073741824</value>
</property>
mapred-site.xml:
<property>
<name>mapreduce.job.map.output.collector.class</name>
<value>org.apache.hadoop.mapred.SharedFsPlugins$MapOutputBuffer</value>
</property>
<property>
<name>mapreduce.job.reduce.shuffle.consumer.plugin.class</name>
<value>org.apache.hadoop.mapred.SharedFsPlugins$Shuffle</value>
</property>
hdfs-site.xml
<property>
<name>dfs.name.dir</name>
<value>/mnt/lustre/hadoop/hadoop_tmp/namenode</value>
<description>true</description>
</property>
Is there any configuration that I have missed in the configuration files?

Since fs.defaultFS holds the Lustre-specific URI, the startup script is unable to determine the host on which the NameNode has to be started.
Add this property to hdfs-site.xml:
<property>
<name>dfs.namenode.rpc-address</name>
<value>namenode_host:port</value>
</property>
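For example, if the NameNode is meant to run on the host master (the host name and 8020, the conventional NameNode RPC port, are placeholders here; use your actual values):
<property>
<name>dfs.namenode.rpc-address</name>
<value>master:8020</value>
</property>
With this set, start-dfs.sh should list the host instead of the empty brackets in "Starting namenodes on [ ]".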

Related

HBase Region servers will not start in Hadoop HA environment

I've created an HBase cluster in a Hadoop HA cluster. My region servers are failing to start with the following exception in the logs:
2017-09-12 11:41:32,116 ERROR [regionserver/my.hostname.com/10.10.30.28:16020] regionserver.HRegionServer: Failed init
java.io.IOException: Failed on local exception: java.net.SocketException: Invalid argument; Host Details : local host is: "my.hostname.com/10.10.30.28"; destination host is: "0.0.0.1":8020;
I'm pretty sure the problem is caused by the Hadoop HA configuration.
I think HBase doesn't understand the nameservice and treats it as an IP address.
excerpt from core-site.xml:
<property>
<name>fs.defaultFS</name>
<value>hdfs://001</value>
<description>NameNode URI</description>
</property>
excerpt from hdfs-site.xml:
<property>
<name>dfs.nameservices</name>
<value>001</value>
</property>
my hbase-site.xml:
<configuration>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://001/hbase</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>zk1:2181,zk2:2181,zk3:2181</value>
</property>
</configuration>
Help?
It was a silly mistake. HBase was missing the path to the Hadoop configuration files. I simply added HADOOP_CONF_DIR to hbase-env.sh.
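For reference, the added line looks roughly like this (the path is only an example; point it at whatever directory holds your core-site.xml and hdfs-site.xml):
export HADOOP_CONF_DIR=/etc/hadoop/conf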

Connection refused in Hbase Shell while Connecting HBase to HDFS

I am trying to connect my HBase to HDFS. I have my HDFS namenode (bin/hdfs namenode) and datanode (bin/hdfs datanode) running. I can also start my HBase (sudo ./bin/start-hbase.sh) and local region servers (sudo ./bin/local-regionservers.sh start 1 2). But when I try to execute a command from the HBase shell, it gives the following error:
cis655stu@cis655stu-VirtualBox:/teaching/14f-cis655/proj-dtracing/hbase/hbase-0.99.0-SNAPSHOT$ ./bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.99.0-SNAPSHOT, rUnknown, Sat Aug 9 08:59:57 EDT 2014
hbase(main):001:0> list
TABLE
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/teaching/14f-cis655/proj-dtracing/hbase/hbase-0.99.0-SNAPSHOT/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/teaching/14f-cis655/proj-dtracing/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2015-01-19 13:33:07,179 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ERROR: Connection refused
Here is some help for this command:
List all tables in hbase. Optional regular expression parameter could
be used to filter the output. Examples:
hbase> list
hbase> list 'abc.*'
hbase> list 'ns:abc.*'
hbase> list 'ns:.*'
Below are my configuration files for HBase and Hadoop:
hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
<!--for pseudo-distributed execution-->
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.master.wait.on.regionservers.mintostart</name>
<value>1</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/teaching/14f-cis655/tmp/zk-deploy</value>
</property>
<!--for enabling collection of traces
-->
<property>
<name>hbase.trace.spanreceiver.classes</name>
<value>org.htrace.impl.LocalFileSpanReceiver</value>
</property>
<property>
<name>hbase.local-file-span-receiver.path</name>
<value>/teaching/14f-cis655/tmp/server-htrace.out</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/teaching/14f-cis655/proj-dtracing/hadoop-2.6.0/yarn/yarn_data/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/teaching/14f-cis655/proj-dtracing/hadoop-2.6.0/yarn/yarn_data/hdfs/datanode</value>
</property>
<property>
<name>hadoop.trace.spanreceiver.classes</name>
<value>org.htrace.impl.LocalFileSpanReceiver</value>
</property>
<property>
<name>hadoop.local-file-span-receiver.path</name>
<value>/teaching/14f-cis655/proj-dtracing/hadoop-2.6.0/logs/htrace.out</value>
</property>
</configuration>
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Please check whether your HDFS is available from the shell:
$ hdfs dfs -ls /hbase
Also make sure that you have set all the environment variables in your hdfs-env.sh file:
HADOOP_CONF_LIB_NATIVE_DIR="/hadoop/lib/native"
HADOOP_OPTS="-Djava.library.path=/hadoop/lib"
HADOOP_HOME=/hadoop
YARN_HOME=/hadoop
HBASE_HOME=/hbase
HADOOP_HDFS_HOME=/hadoop
HBASE_MANAGES_ZK=true
Do you run Hadoop and HBase as the same OS user? If you use separate users, please check whether the HBase user is allowed to access HDFS.
Make sure that you have a copy of (or a symlink to) the hdfs-site.xml and core-site.xml files in the ${HBASE_HOME}/conf directory.
Also, the fs.default.name option is deprecated (though it still works); consider using fs.defaultFS instead.
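For example, the core-site.xml shown above could be updated to:
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>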
Do you use ZooKeeper? You've specified the hbase.zookeeper.property.dataDir option, but hbase.zookeeper.quorum and other significant options are missing. Please read http://hbase.apache.org/book.html#zookeeper for more information.
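For a pseudo-distributed setup like this one, the missing ZooKeeper settings in hbase-site.xml would look roughly like the following (host and port are assumptions for a single local ZooKeeper; adjust to your deployment):
<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>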
Please add the following options to hdfs-site.xml to make HBase work correctly (replace the $HBASE_USER variable with the system user that runs HBase):
<property>
<name>hadoop.proxyuser.$HBASE_USER.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.$HBASE_USER.hosts</name>
<value>*</value>
</property>
<property>
<name>dfs.support.append</name>
<value>true</value>
</property>
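After restarting HDFS, a quick way to confirm that the HBase user can actually reach HDFS (again substituting $HBASE_USER with the real user name):
sudo -u $HBASE_USER hdfs dfs -ls /hbase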

distcp between nameservice1 and nameservice2

We have CDH 5.2 with Cloudera Manager 5.
We want to copy data from nameservice2 to nameservice1.
Both clusters are on the same CDH version.
When I tried hadoop distcp hdfs://nameservice2/foo/bar hdfs://nameservice1/bar/foo
I got the error:
java.lang.IllegalArgumentException: java.net.UnknownHostException: nameservice2
So I added the following nameservice2 config to nameservice1, under HDFS Client Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml in Cloudera Manager (Gateway Default Group):
<property>
<name>dfs.nameservices</name>
<value>nameservices2</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.nameservices2</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.namenodes.nameservices2</name>
<value>namenode36,namenode405</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nameservices2.namenode36</name>
<value>hnn001.prod.cc:8020</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.nameservices2.namenode36</name>
<value>hnn001.prod.com:54321</value>
</property>
<property>
<name>dfs.namenode.http-address.nameservices2.namenode36</name>
<value>hnn001.prod.com:50070</value>
</property>
<property>
<name>dfs.namenode.https-address.nameservices2.namenode36</name>
<value>hnn001.prod.com:50470</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nameservices2.namenode405</name>
<value>hnn002.prod.com:8020</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.nameservices2.namenode405</name>
<value>hnn002.prod.com:54321</value>
</property>
<property>
<name>dfs.namenode.http-address.nameservices2.namenode405</name>
<value>hnn002.prod.com:50070</value>
</property>
<property>
<name>dfs.namenode.https-address.nameservices2.namenode405</name>
<value>hnn002.prod.com:50470</value>
</property>
But I am still getting the same error.
Any workaround for this?
Thanks
In HA-enabled HDFS, nameservice1 and nameservice2 are logical NameNode names; you cannot use ports along with these logical names.
You have two methods.
The easy method is to find the active NameNodes and use active-namenode:port in the distcp command, as follows. The NameNode web UI can be used to find the active NameNode of each cluster.
hadoop distcp hdfs://hnn001.prod.cc:8020/foo/bar hdfs://<dest-cluster-active-nn-hostname>:8020/bar/foo
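Alternatively, you can usually query the HA state from the command line on a node whose client config knows the nameservice; for example, assuming the NameNode IDs from your configuration (namenode36 and namenode405):
hdfs haadmin -getServiceState namenode36
hdfs haadmin -getServiceState namenode405
Do the same on the destination cluster with its own NameNode IDs.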
The other method is to use the logical names of the two clusters, as follows. But before trying the command below, make sure you have properly configured nameservice1 and nameservice2 in your client hdfs-site.xml.
hadoop distcp hdfs://nameservice2/foo/bar hdfs://nameservice1/bar/foo
Configuring the remote cluster's nameservice in the local cluster:
It looks like nameservice2 is your local cluster and nameservice1 is the remote one. You need to keep all the associated properties of both nameservice1 and nameservice2 in the local cluster, i.e. your local cluster's client hdfs-site.xml should look as follows.
<configuration>
<!-- Available nameservices -->
<property>
<name>dfs.nameservices</name>
<value>nameservices1,nameservices2</value>
</property>
<!-- Local nameservice2 properties -->
<property>
<name>dfs.client.failover.proxy.provider.nameservices2</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.namenodes.nameservices2</name>
<value>namenode36,namenode405</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nameservices2.namenode36</name>
<value>hnn001.prod.cc:8020</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.nameservices2.namenode36</name>
<value>hnn001.prod.com:54321</value>
</property>
<property>
<name>dfs.namenode.http-address.nameservices2.namenode36</name>
<value>hnn001.prod.com:50070</value>
</property>
<property>
<name>dfs.namenode.https-address.nameservices2.namenode36</name>
<value>hnn001.prod.com:50470</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nameservices2.namenode405</name>
<value>hnn002.prod.com:8020</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.nameservices2.namenode405</name>
<value>hnn002.prod.com:54321</value>
</property>
<property>
<name>dfs.namenode.http-address.nameservices2.namenode405</name>
<value>hnn002.prod.com:50070</value>
</property>
<property>
<name>dfs.namenode.https-address.nameservices2.namenode405</name>
<value>hnn002.prod.com:50470</value>
</property>
<!-- Remote nameservice1 properties -->
<!-- You can find these properties in the remote machine's hdfs-site.xml file -->
<property>
<name>dfs.client.failover.proxy.provider.nameservices1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.namenodes.nameservices1</name>
<value>namenodeXX,namenodeYY</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nameservices1.namenodeXX</name>
<value><Remote-nn1>:8020</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.nameservices1.namenodeXX</name>
<value><Remote-nn1>:54321</value>
</property>
<property>
<name>dfs.namenode.http-address.nameservices1.namenodeXX</name>
<value><Remote-nn1>:50070</value>
</property>
<property>
<name>dfs.namenode.https-address.nameservices1.namenodeXX</name>
<value><Remote-nn1>:50470</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nameservices1.namenodeYY</name>
<value><Remote-nn2>:8020</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.nameservices1.namenodeYY</name>
<value><Remote-nn2>:54321</value>
</property>
<property>
<name>dfs.namenode.http-address.nameservices1.namenodeYY</name>
<value><Remote-nn2>:50070</value>
</property>
<property>
<name>dfs.namenode.https-address.nameservices1.namenodeYY</name>
<value><Remote-nn2>:50470</value>
</property>
<!-- Other properties -->
</configuration>
In the above configuration, replace all placeholders such as XX, YY, <Remote-nn1>, and <Remote-nn2> with the corresponding values from the remote cluster's hdfs-site.xml.
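Once the client configuration is in place, a quick sanity check before running distcp is to list the root of each nameservice from the local cluster (use exactly the names declared in dfs.nameservices; note that the snippets above declare nameservices1/nameservices2 while the distcp commands use nameservice1/nameservice2, so make sure the two agree):
hdfs dfs -ls hdfs://nameservice1/
hdfs dfs -ls hdfs://nameservice2/
If both listings succeed, the logical-name form of the distcp command should work.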

Hadoop not communicating with resourcemanager

Hi, I'm currently running Hadoop 2.4.1. I have created a simple Java program, DefaultMapperClass.java, using Eclipse and packaged it into ex1.jar.
When I try to invoke this program via the Hadoop shell using the command
hadoop jar /home/Maddy/ex1.jar DefaultMapperClass hdfs://localhost/users/root/input/Hadoop.txt hdfs://localhost/users/root/output
I get the output below in the Hadoop shell:
[root@localhost Maddy]# hadoop jar /home/Maddy/ex1.jar DefaultMapperClass hdfs://localhost/users/root/input/Hadoop.txt hdfs://localhost/users/root/output
14/09/05 19:26:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Job started: Fri Sep 05 19:26:35 CDT 2014
14/09/05 19:26:35 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8032
[root@localhost Maddy]#
It seems the Hadoop shell is trying to connect to the ResourceManager but is unsuccessful, yet there is no error message.
mapred-site.xml file:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>localhost:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>localhost:8031</value>
</property>
</configuration>
What is missing here? Why is execution terminated after attempting to connect to the ResourceManager?
I would suggest removing the following configurations from yarn-site.xml, as they are unnecessary:
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>localhost:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>localhost:8031</value>
</property>
You can access the ResourceManager web UI at localhost:8088.
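If the job still hangs at the RMProxy line, it is also worth verifying that the ResourceManager is up and that NodeManagers have registered, for example:
jps                 # should list a ResourceManager (and NodeManager) process
yarn node -list     # should show at least one RUNNING node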

Hadoop LZO & SnappyCodec error in Hadoop and Hive

I am using Ubuntu 12.04, Hadoop 1.0.2, and Hive 0.10.0.
While reading about 1 million records from Hive, I got the error below for the query
select * from raw_pos limit 10000;
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
So I installed Snappy for Hadoop in the $HADOOP_HOME/lib folder, which produces the files libsnappy.a, libsnappy.la, libsnappy.so, libsnappy.so.1, and libsnappy.so.1.1.4.
I also added hadoop-lzo-0.4.3.jar to $HADOOP_HOME/lib/ and made changes to core-site.xml and mapred-site.xml as follows.
core-site.xml:
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/apache/hadoop-1.0.4/hadoop_temp/</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>io.compression.codecs</name>
<value>
org.apache.hadoop.io.compress.SnappyCodec
</value>
</property>
mapred-site.xml:
<property>
<name>mapred.job.tracker</name>
<value>hdfs://localhost:54311</value>
</property>
<property>
<name>mapred.compress.map.output</name>
<value>true</value>
</property>
<property>
<name>mapred.map.output.compression.codec</name>
<value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
But when I start Hive and run show databases, it gives the error:
Failed with exception java.io.IOException:java.io.IOException: Cannot create an instance of InputFormat class org.apache.hadoop.mapred.TextInputFormat as specified in mapredWork!
Modify your core-site.xml to this and see if it helps:
<property>
<name>io.compression.codecs</name>
<value>com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
<property>
<name>io.compression.codec.lzo.class</name>
<value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
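Once the codecs are picked up, you can also enable Snappy compression per Hive session instead of cluster-wide; a minimal sketch using the Hadoop 1.x / Hive 0.10 era property names (adjust to your setup):
hive> SET hive.exec.compress.output=true;
hive> SET mapred.output.compress=true;
hive> SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
hive> SET mapred.map.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;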
