Hadoop previous.checkpoint location

I tried hadoop-1.0.3 as well as 1.0.4, both in pseudo-distributed mode.
My understanding is that the previous.checkpoint directory should be created under the SecondaryNameNode directory designated by "fs.checkpoint.dir". In every case, however, I find it under the NameNode directory designated by "dfs.name.dir". Is this something to do with pseudo-distributed mode, or is my understanding wrong? Can someone please help?
Below is my hdfs-site.xml file.
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/home/hadoop/lab/hdfs/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/hadoop/lab/hdfs/datanode</value>
</property>
<property>
<name>fs.checkpoint.dir</name>
<value>/home/hadoop/lab/hdfs/secnamenode</value>
</property>
</configuration>
Below is the high-level directory structure for the daemons:
hadoop@ubuntu:~/lab/hdfs$ ls -l
total 12
drwxr-xr-x 6 hadoop hadoop 4096 Mar 14 03:41 datanode
drwxrwxr-x 5 hadoop hadoop 4096 Mar 14 03:41 namenode
drwxrwxr-x 4 hadoop hadoop 4096 Mar 14 04:46 secnamenode
Below are the NameNode directory details:
hadoop@ubuntu:~/lab/hdfs$ ls -l namenode
total 12
drwxrwxr-x 2 hadoop hadoop 4096 Mar 14 04:46 current
drwxrwxr-x 2 hadoop hadoop 4096 Mar 14 03:13 image
-rw-rw-r-- 1 hadoop hadoop 0 Mar 14 03:41 in_use.lock
drwxrwxr-x 2 hadoop hadoop 4096 Mar 14 03:34 previous.checkpoint
Below are the SecondaryNameNode (SNN) directory details:
hadoop@ubuntu:~/lab/hdfs$ ls -l secnamenode
total 8
drwxrwxr-x 2 hadoop hadoop 4096 Mar 14 04:46 current
drwxrwxr-x 2 hadoop hadoop 4096 Mar 14 03:46 image
-rw-rw-r-- 1 hadoop hadoop 0 Mar 14 03:41 in_use.lock
If you need any further details please let me know.
Thanks
Rags

I searched extensively for this. There seems to be a long-pending bug, HDFS-1839, which removes the previous.checkpoint directory from the SecondaryNameNode. The same bug might be responsible for this directory being created under the NameNode.
In every Hadoop version I have checked so far, the previous.checkpoint directory is consistently created under the NameNode.
Hopefully this bug will be fixed soon, or Apache Hadoop will clarify why the directory is created under the NameNode.


Hadoop Edge HDFS points to local FS

I have set up my Hadoop cluster with 1 NameNode and 2 DataNodes, and everything works perfectly :)
Now I want to add a Hadoop Edge node (aka Hadoop Gateway). I followed the instructions here and finally executed:
hadoop fs -ls /
Unfortunately, I expected to see my HDFS content, but I see my local FS instead:
Found 22 items
-rw-r--r-- 1 root root 0 2017-03-30 16:44 /autorelabel
dr-xr-xr-x - root root 20480 2017-03-30 16:49 /bin
...
drwxr-xr-x - root root 20480 2016-07-08 17:31 /home
I think my core-site.xml is configured as needed, with this property:
<property>
<name>fs.default.name</name>
<value>hdfs://hadoopnodemaster1:8020/</value>
</property>
hadoopmaster1 is my NameNode and is reachable.
I don't understand why I see my local FS and not my HDFS. Thank you :)
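When hadoop fs -ls / lists the local root directory, the client is usually falling back to the built-in default file system (file:///), which suggests the property was not picked up at all. A minimal sketch of a complete core-site.xml for the edge node follows. It assumes a Hadoop 2.x client, where fs.defaultFS is the current name of the deprecated fs.default.name, and it assumes the file lives in the configuration directory that the hadoop command on the edge node actually reads. The hostname is copied from the snippet in the question; note that the prose of the question spells it differently, so it should be checked against the real NameNode host.

```xml
<?xml version="1.0"?>
<!-- core-site.xml on the edge node (sketch, not a verified fix) -->
<configuration>
  <!-- The property must sit inside the <configuration> root element,
       otherwise it is ignored and the client uses file:/// -->
  <property>
    <name>fs.defaultFS</name>
    <!-- Hostname and port taken from the question; adjust to the real NameNode -->
    <value>hdfs://hadoopnodemaster1:8020/</value>
  </property>
</configuration>
```

If the listing still shows the local file system with this file in place, the client is probably loading a different configuration directory than the one that was edited.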

Hadoop namenode is not starting CDH4.7

After a fresh install of Hadoop from CDH4.7 on Linux Mint 17, the NameNode does not start, but the secondary NameNode, TaskTracker, JobTracker and DataNode do.
Here is the related information:
/etc/hadoop/conf/hdfs-site.xml
<?xml version="1.0"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<!-- Immediately exit safemode as soon as one DataNode checks in.
On a multi-node cluster, these configurations must be removed. -->
<property>
<name>dfs.safemode.extension</name>
<value>0</value>
</property>
<property>
<name>dfs.safemode.min.datanodes</name>
<value>1</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/var/lib/hadoop-hdfs/cache/${user.name}</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///var/lib/hadoop-hdfs/cache/${user.name}/dfs/name</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>file:///var/lib/hadoop-hdfs/cache/${user.name}/dfs/namesecondary</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///var/lib/hadoop-hdfs/cache/${user.name}/dfs/data</value>
</property>
<property>
<name>dfs.datanode.max.xcievers</name>
<value>4096</value>
</property>
<property>
<name>dfs.client.read.shortcircuit</name>
<value>true</value>
</property>
<property>
<name>dfs.domain.socket.path</name>
<value>/var/run/hadoop-hdfs/dn._PORT</value>
</property>
<property>
<name>dfs.client.file-block-storage-locations.timeout.millis</name>
<value>10000</value>
</property>
<property>
<name>dfs.client.use.legacy.blockreader.local</name>
<value>true</value>
</property>
<property>
<name>dfs.datanode.data.dir.perm</name>
<value>750</value>
</property>
<property>
<name>dfs.block.local-path-access.user</name>
<value>impala</value>
</property>
<property>
<name>dfs.client.file-block-storage-locations.timeout.millis</name>
<value>10000</value>
</property>
<property>
<name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
<value>true</value>
</property>
</configuration>
/etc/hadoop/conf/core-site.xml
<?xml version="1.0"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:8020</value>
</property>
</configuration>
/etc/hadoop/conf/mapred-site.xml
<?xml version="1.0"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:8021</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>classic</value>
</property>
</configuration>
ls -l /etc/hadoop/conf
total 88
-rw-r--r-- 1 root root 2998 May 28 22:57 capacity-scheduler.xml
-rw-r--r-- 1 root hadoop 1335 May 28 22:57 configuration.xsl
-rw-r--r-- 1 root root 233 May 28 22:57 container-executor.cfg
-rw-r--r-- 1 root hadoop 1002 Sep 25 19:29 core-site.xml
-rw-r--r-- 1 root hadoop 1774 May 28 22:57 hadoop-metrics2.properties
-rw-r--r-- 1 root hadoop 2490 May 28 22:57 hadoop-metrics.properties
-rw-r--r-- 1 root hadoop 9196 May 28 22:57 hadoop-policy.xml
-rw-r--r-- 1 root hadoop 2802 Sep 27 18:20 hdfs-site.xml
-rw-r--r-- 1 root hadoop 8735 May 28 22:57 log4j.properties
-rw-r--r-- 1 root root 4113 May 28 22:57 mapred-queues.xml.template
-rw-r--r-- 1 root root 1097 Sep 25 19:34 mapred-site.xml
-rw-r--r-- 1 root root 178 May 28 22:57 mapred-site.xml.template
-rw-r--r-- 1 root hadoop 10 May 28 22:57 slaves
-rw-r--r-- 1 root hadoop 2316 May 28 22:57 ssl-client.xml.example
-rw-r--r-- 1 root hadoop 2251 May 28 22:57 ssl-server.xml.example
-rw-r--r-- 1 root root 2513 May 28 22:57 yarn-env.sh
-rw-r--r-- 1 root root 2262 May 28 22:57 yarn-site.xml
sudo hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
14/09/27 18:44:16 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = surendhar/127.0.1.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.0.0-cdh4.7.0
STARTUP_MSG: classpath = /etc/hadoop/conf:/usr/lib/hadoop/lib/jettison-1.1.jar:/usr/lib/hadoop/lib/jersey-core-1.8.jar:/usr/lib/hadoop/lib/paranamer-2.3.jar:/usr/lib/hadoop/lib/servlet-api-2.5.jar:/usr/lib/hadoop/lib/commons-logging-1.1.1.jar:/usr/lib/hadoop/lib/commons-compress-1.4.1.jar:/usr/lib/hadoop/lib/jaxb-impl-2.2.3-1.jar:/usr/lib/hadoop/lib/commons-beanutils-1.7.0.jar:/usr/lib/hadoop/lib/zookeeper-3.4.5-cdh4.7.0.jar:/usr/lib/hadoop/lib/junit-4.8.2.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.8.8.jar:/usr/lib/hadoop/lib/snappy-java-1.0.4.1.jar:/usr/lib/hadoop/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop/lib/slf4j-api-1.6.1.jar:/usr/lib/hadoop/lib/commons-collections-3.2.1.jar:/usr/lib/hadoop/lib/commons-math-2.1.jar:/usr/lib/hadoop/lib/jsch-0.1.42.jar:/usr/lib/hadoop/lib/commons-configuration-1.6.jar:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/cloudera-jets3t-2.0.0-cdh4.7.0.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/usr/lib/hadoop/lib/avro-1.7.4.jar:/usr/lib/hadoop/lib/guava-11.0.2.jar:/usr/lib/hadoop/lib/jersey-server-1.8.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar:/usr/lib/hadoop/lib/kfs-0.3.jar:/usr/lib/hadoop/lib/log4j-1.2.17.jar:/usr/lib/hadoop/lib/commons-io-2.1.jar:/usr/lib/hadoop/lib/jsr305-1.3.9.jar:/usr/lib/hadoop/lib/xz-1.0.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.23.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.23.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.8.8.jar:/usr/lib/hadoop/lib/stax-api-1.0.1.jar:/usr/lib/hadoop/lib/jsp-api-2.1.jar:/usr/lib/hadoop/lib/mockito-all-1.8.5.jar:/usr/lib/hadoop/lib/jaxb-api-2.2.2.jar:/usr/lib/hadoop/lib/jersey-json-1.8.jar:/usr/lib/hadoop/lib/jetty-util-6.1.26.cloudera.2.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/asm-3.2.jar:/usr/lib/hadoop/lib/jline-0.9.94.jar:/usr/lib/hadoop/lib/commons-beanutils-core-1.8.0.jar:/usr/lib/hadoop/lib/commons-net-3.1.jar:/usr/lib/hadoop/lib/protobuf-java-2.4.0a.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/jackson-jaxrs-1.8.8
.jar:/usr/lib/hadoop/lib/jetty-6.1.26.cloudera.2.jar:/usr/lib/hadoop/lib/jackson-xc-1.8.8.jar:/usr/lib/hadoop/lib/commons-lang-2.5.jar:/usr/lib/hadoop/lib/commons-digester-1.8.jar:/usr/lib/hadoop/lib/activation-1.1.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/.//parquet-avro-1.2.5-cdh4.7.0-sources.jar:/usr/lib/hadoop/.//parquet-generator-1.2.5-cdh4.7.0.jar:/usr/lib/hadoop/.//parquet-avro-1.2.5-cdh4.7.0-javadoc.jar:/usr/lib/hadoop/.//parquet-common-1.2.5-cdh4.7.0.jar:/usr/lib/hadoop/.//parquet-scrooge-1.2.5-cdh4.7.0.jar:/usr/lib/hadoop/.//parquet-thrift-1.2.5-cdh4.7.0.jar:/usr/lib/hadoop/.//hadoop-common.jar:/usr/lib/hadoop/.//hadoop-annotations.jar:/usr/lib/hadoop/.//parquet-test-hadoop2-1.2.5-cdh4.7.0.jar:/usr/lib/hadoop/.//hadoop-annotations-2.0.0-cdh4.7.0.jar:/usr/lib/hadoop/.//parquet-column-1.2.5-cdh4.7.0-javadoc.jar:/usr/lib/hadoop/.//parquet-format-1.0.0-cdh4.7.0-sources.jar:/usr/lib/hadoop/.//parquet-encoding-1.2.5-cdh4.7.0.jar:/usr/lib/hadoop/.//hadoop-common-2.0.0-cdh4.7.0.jar:/usr/lib/hadoop/.//parquet-format-1.0.0-cdh4.7.0-javadoc.jar:/usr/lib/hadoop/.//parquet-scrooge-1.2.5-cdh4.7.0-sources.jar:/usr/lib/hadoop/.//parquet-format-1.0.0-cdh4.7.0.jar:/usr/lib/hadoop/.//parquet-generator-1.2.5-cdh4.7.0-sources.jar:/usr/lib/hadoop/.//parquet-hadoop-1.2.5-cdh4.7.0-sources.jar:/usr/lib/hadoop/.//parquet-encoding-1.2.5-cdh4.7.0-sources.jar:/usr/lib/hadoop/.//parquet-hive-1.2.5-cdh4.7.0.jar:/usr/lib/hadoop/.//parquet-avro-1.2.5-cdh4.7.0.jar:/usr/lib/hadoop/.//parquet-scrooge-1.2.5-cdh4.7.0-javadoc.jar:/usr/lib/hadoop/.//parquet-pig-bundle-1.2.5-cdh4.7.0.jar:/usr/lib/hadoop/.//parquet-encoding-1.2.5-cdh4.7.0-javadoc.jar:/usr/lib/hadoop/.//parquet-pig-bundle-1.2.5-cdh4.7.0-sources.jar:/usr/lib/hadoop/.//parquet-hive-1.2.5-cdh4.7.0-javadoc.jar:/usr/lib/hadoop/.//parquet-pig-1.2.5-cdh4.7.0-javadoc.jar:/usr/lib/hadoop/.//parquet-pig-1.2.5-cdh4.7.0-sources.jar:/usr/lib/hadoop/.//parquet-hadoop-1.2.5-cdh4.7.0.jar:/usr/lib/hadoop/.//parquet-thrift-1.2.5-cd
h4.7.0-javadoc.jar:/usr/lib/hadoop/.//hadoop-auth.jar:/usr/lib/hadoop/.//hadoop-auth-2.0.0-cdh4.7.0.jar:/usr/lib/hadoop/.//parquet-column-1.2.5-cdh4.7.0-sources.jar:/usr/lib/hadoop/.//parquet-hive-1.2.5-cdh4.7.0-sources.jar:/usr/lib/hadoop/.//parquet-common-1.2.5-cdh4.7.0-javadoc.jar:/usr/lib/hadoop/.//parquet-common-1.2.5-cdh4.7.0-sources.jar:/usr/lib/hadoop/.//parquet-cascading-1.2.5-cdh4.7.0.jar:/usr/lib/hadoop/.//parquet-cascading-1.2.5-cdh4.7.0-sources.jar:/usr/lib/hadoop/.//parquet-cascading-1.2.5-cdh4.7.0-javadoc.jar:/usr/lib/hadoop/.//parquet-pig-1.2.5-cdh4.7.0.jar:/usr/lib/hadoop/.//parquet-thrift-1.2.5-cdh4.7.0-sources.jar:/usr/lib/hadoop/.//hadoop-common-2.0.0-cdh4.7.0-tests.jar:/usr/lib/hadoop/.//parquet-hadoop-1.2.5-cdh4.7.0-javadoc.jar:/usr/lib/hadoop/.//parquet-generator-1.2.5-cdh4.7.0-javadoc.jar:/usr/lib/hadoop/.//parquet-column-1.2.5-cdh4.7.0.jar:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/jersey-core-1.8.jar:/usr/lib/hadoop-hdfs/lib/servlet-api-2.5.jar:/usr/lib/hadoop-hdfs/lib/commons-logging-1.1.1.jar:/usr/lib/hadoop-hdfs/lib/zookeeper-3.4.5-cdh4.7.0.jar:/usr/lib/hadoop-hdfs/lib/jackson-core-asl-1.8.8.jar:/usr/lib/hadoop-hdfs/lib/xmlenc-0.52.jar:/usr/lib/hadoop-hdfs/lib/guava-11.0.2.jar:/usr/lib/hadoop-hdfs/lib/jersey-server-1.8.jar:/usr/lib/hadoop-hdfs/lib/log4j-1.2.17.jar:/usr/lib/hadoop-hdfs/lib/commons-io-2.1.jar:/usr/lib/hadoop-hdfs/lib/jsr305-1.3.9.jar:/usr/lib/hadoop-hdfs/lib/jasper-runtime-5.5.23.jar:/usr/lib/hadoop-hdfs/lib/commons-daemon-1.0.3.jar:/usr/lib/hadoop-hdfs/lib/jackson-mapper-asl-1.8.8.jar:/usr/lib/hadoop-hdfs/lib/jsp-api-2.1.jar:/usr/lib/hadoop-hdfs/lib/jetty-util-6.1.26.cloudera.2.jar:/usr/lib/hadoop-hdfs/lib/commons-el-1.0.jar:/usr/lib/hadoop-hdfs/lib/asm-3.2.jar:/usr/lib/hadoop-hdfs/lib/jline-0.9.94.jar:/usr/lib/hadoop-hdfs/lib/protobuf-java-2.4.0a.jar:/usr/lib/hadoop-hdfs/lib/commons-codec-1.4.jar:/usr/lib/hadoop-hdfs/lib/jetty-6.1.26.cloudera.2.jar:/usr/lib/hadoop-hdfs/lib/commons-lang-2.5.jar:/usr/lib/hadoop-hdfs/
lib/commons-cli-1.2.jar:/usr/lib/hadoop-hdfs/.//hadoop-hdfs-2.0.0-cdh4.7.0-tests.jar:/usr/lib/hadoop-hdfs/.//hadoop-hdfs.jar:/usr/lib/hadoop-hdfs/.//hadoop-hdfs-2.0.0-cdh4.7.0.jar:/usr/lib/hadoop-yarn/lib/jersey-core-1.8.jar:/usr/lib/hadoop-yarn/lib/paranamer-2.3.jar:/usr/lib/hadoop-yarn/lib/commons-compress-1.4.1.jar:/usr/lib/hadoop-yarn/lib/jackson-core-asl-1.8.8.jar:/usr/lib/hadoop-yarn/lib/snappy-java-1.0.4.1.jar:/usr/lib/hadoop-yarn/lib/jersey-guice-1.8.jar:/usr/lib/hadoop-yarn/lib/avro-1.7.4.jar:/usr/lib/hadoop-yarn/lib/jersey-server-1.8.jar:/usr/lib/hadoop-yarn/lib/guice-3.0.jar:/usr/lib/hadoop-yarn/lib/log4j-1.2.17.jar:/usr/lib/hadoop-yarn/lib/commons-io-2.1.jar:/usr/lib/hadoop-yarn/lib/xz-1.0.jar:/usr/lib/hadoop-yarn/lib/guice-servlet-3.0.jar:/usr/lib/hadoop-yarn/lib/jackson-mapper-asl-1.8.8.jar:/usr/lib/hadoop-yarn/lib/javax.inject-1.jar:/usr/lib/hadoop-yarn/lib/asm-3.2.jar:/usr/lib/hadoop-yarn/lib/protobuf-java-2.4.0a.jar:/usr/lib/hadoop-yarn/lib/netty-3.2.4.Final.jar:/usr/lib/hadoop-yarn/lib/aopalliance-1.0.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-applications-distributedshell-2.0.0-cdh4.7.0.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-nodemanager.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-client.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-tests-2.0.0-cdh4.7.0-tests.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-resourcemanager.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-common-2.0.0-cdh4.7.0.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-resourcemanager-2.0.0-cdh4.7.0.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-applications-unmanaged-am-launcher-2.0.0-cdh4.7.0.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-api.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-tests-2.0.0-cdh4.7.0.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-web-proxy.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-common.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-applications-distributedshell.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-web-proxy-2.0.0-cdh4.7.0.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-no
demanager-2.0.0-cdh4.7.0.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-common.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-common-2.0.0-cdh4.7.0.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-applications-unmanaged-am-launcher.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-client-2.0.0-cdh4.7.0.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-api-2.0.0-cdh4.7.0.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-site-2.0.0-cdh4.7.0.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-site.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-tests.jar:/usr/lib/hadoop-0.20-mapreduce/./:/usr/lib/hadoop-0.20-mapreduce/lib/jettison-1.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jersey-core-1.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/paranamer-2.3.jar:/usr/lib/hadoop-0.20-mapreduce/lib/servlet-api-2.5.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-logging-1.1.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-compress-1.4.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jaxb-impl-2.2.3-1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-beanutils-1.7.0.jar:/usr/lib/hadoop-0.20-mapreduce/lib/zookeeper-3.4.5-cdh4.7.0.jar:/usr/lib/hadoop-0.20-mapreduce/lib/junit-4.8.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-core-asl-1.8.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/snappy-java-1.0.4.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/slf4j-api-1.6.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-collections-3.2.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-math-2.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-configuration-1.6.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/cloudera-jets3t-2.0.0-cdh4.7.0.jar:/usr/lib/hadoop-0.20-mapreduce/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20-mapreduce/lib/avro-1.7.4.jar:/usr/lib/hadoop-0.20-mapreduce/lib/hadoop-fairscheduler.jar:/usr/lib/hadoop-0.20-mapreduce/lib/guava-11.0.2.jar:/usr/lib/hadoop-0.20-m
apreduce/lib/jersey-server-1.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/kfs-0.3.jar:/usr/lib/hadoop-0.20-mapreduce/lib/log4j-1.2.17.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-io-2.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jsr305-1.3.9.jar:/usr/lib/hadoop-0.20-mapreduce/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20-mapreduce/lib/xz-1.0.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jasper-runtime-5.5.23.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jasper-compiler-5.5.23.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-mapper-asl-1.8.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/stax-api-1.0.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jsp-api-2.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/mockito-all-1.8.5.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jaxb-api-2.2.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/avro-compiler-1.7.4.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jersey-json-1.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jetty-util-6.1.26.cloudera.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20-mapreduce/lib/asm-3.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jline-0.9.94.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-beanutils-core-1.8.0.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/protobuf-java-2.4.0a.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-jaxrs-1.8.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jetty-6.1.26.cloudera.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-xc-1.8.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-lang-2.5.jar:/usr/lib/hadoop-0.20-mapreduce/lib/hadoop-fairscheduler-2.0.0-mr1-cdh4.7.0.jar:/usr/lib/hadoop-0.20-mapreduce/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-digester-1.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/activation-1.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-core-2.0.0-mr1-cdh4.7.0.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-examples.jar:/usr/lib/hadoop-0.20-mapr
educe/.//hadoop-test-2.0.0-mr1-cdh4.7.0.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-tools-2.0.0-mr1-cdh4.7.0.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-examples-2.0.0-mr1-cdh4.7.0.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-ant-2.0.0-mr1-cdh4.7.0.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-ant.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-tools.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-test.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-core.jar
STARTUP_MSG: build = git://localhost/data/1/jenkins/workspace/generic-package-ubuntu64-10-04/CDH4.7.0-Packaging-Hadoop-2014-05-28_09-36-51/hadoop-2.0.0+1604-1.cdh4.7.0.p0.17~lucid/src/hadoop-common-project/hadoop-common -r 8e266e052e423af592871e2dfe09d54c03f6a0e8; compiled by 'jenkins' on Wed May 28 10:11:49 PDT 2014
STARTUP_MSG: java = 1.7.0_55
************************************************************/
14/09/27 18:44:16 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
Formatting using clusterid: CID-61d4b942-4b4f-4693-a4c5-6bc3cce2a408
14/09/27 18:44:17 INFO namenode.FSNamesystem: fsLock is fair:true
14/09/27 18:44:17 INFO blockmanagement.HeartbeatManager: Setting heartbeat recheck interval to 30000 since dfs.namenode.stale.datanode.interval is less than dfs.namenode.heartbeat.recheck-interval
14/09/27 18:44:17 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
14/09/27 18:44:17 INFO util.GSet: Computing capacity for map BlocksMap
14/09/27 18:44:17 INFO util.GSet: VM type = 64-bit
14/09/27 18:44:17 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB
14/09/27 18:44:17 INFO util.GSet: capacity = 2^21 = 2097152 entries
14/09/27 18:44:17 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
14/09/27 18:44:17 INFO blockmanagement.BlockManager: defaultReplication = 1
14/09/27 18:44:17 INFO blockmanagement.BlockManager: maxReplication = 512
14/09/27 18:44:17 INFO blockmanagement.BlockManager: minReplication = 1
14/09/27 18:44:17 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
14/09/27 18:44:17 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks = false
14/09/27 18:44:17 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
14/09/27 18:44:17 INFO blockmanagement.BlockManager: encryptDataTransfer = false
14/09/27 18:44:17 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000
14/09/27 18:44:17 INFO namenode.FSNamesystem: fsOwner = root (auth:SIMPLE)
14/09/27 18:44:17 INFO namenode.FSNamesystem: supergroup = supergroup
14/09/27 18:44:17 INFO namenode.FSNamesystem: isPermissionEnabled = true
14/09/27 18:44:17 INFO namenode.FSNamesystem: HA Enabled: false
14/09/27 18:44:17 INFO namenode.FSNamesystem: Append Enabled: true
14/09/27 18:44:17 INFO namenode.NameNode: Caching file names occuring more than 10 times
14/09/27 18:44:17 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
14/09/27 18:44:17 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
14/09/27 18:44:17 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension = 0
Re-format filesystem in Storage Directory /var/lib/hadoop-hdfs/cache/root/dfs/name ? (Y or N) Y
14/09/27 18:44:21 INFO namenode.NNStorage: Storage directory /var/lib/hadoop-hdfs/cache/root/dfs/name has been successfully formatted.
14/09/27 18:44:21 INFO namenode.FSImage: Saving image file /var/lib/hadoop-hdfs/cache/root/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
14/09/27 18:44:21 INFO namenode.FSImage: Image file of size 119 saved in 0 seconds.
14/09/27 18:44:21 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
14/09/27 18:44:21 INFO util.ExitUtil: Exiting with status 0
14/09/27 18:44:21 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at surendhar/127.0.1.1
************************************************************/
sudo service hadoop-hdfs-namenode start
* Starting Hadoop namenode:
starting namenode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-namenode-surendhar.out
sudo jps
3131 Bootstrap
6321 Jps
cat /var/log/hadoop-hdfs/hadoop-hdfs-namenode-surendhar.out
ulimit -a for user hdfs
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 30083
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 30083
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
sudo ls -lR /var/lib/hadoop-hdfs/cache
/var/lib/hadoop-hdfs/cache:
total 16
drwxrwxr-x 3 hdfs hdfs 4096 Sep 25 19:39 hdfs
drwxrwxr-x 3 mapred mapred 4096 Sep 25 19:39 mapred
drwxr-xr-x 3 root root 4096 Sep 25 19:44 root
drwxr-xr-x 3 surendhar surendhar 4096 Sep 25 19:35 surendhar
/var/lib/hadoop-hdfs/cache/hdfs:
total 4
drwxrwxr-x 4 hdfs hdfs 4096 Sep 25 19:39 dfs
/var/lib/hadoop-hdfs/cache/hdfs/dfs:
total 8
drwxr-x--- 2 hdfs hdfs 4096 Sep 25 19:39 data
drwxrwxr-x 2 hdfs hdfs 4096 Sep 27 18:18 namesecondary
/var/lib/hadoop-hdfs/cache/hdfs/dfs/data:
total 0
/var/lib/hadoop-hdfs/cache/hdfs/dfs/namesecondary:
total 0
/var/lib/hadoop-hdfs/cache/mapred:
total 4
drwxrwxr-x 3 mapred mapred 4096 Sep 25 19:39 mapred
/var/lib/hadoop-hdfs/cache/mapred/mapred:
total 4
drwxr-xr-x 7 mapred mapred 4096 Sep 27 18:12 local
/var/lib/hadoop-hdfs/cache/mapred/mapred/local:
total 20
drwxr-xr-x 2 mapred mapred 4096 Sep 27 18:12 taskTracker
drwxrwxr-x 2 mapred mapred 4096 Sep 27 18:12 toBeDeleted
drwxr-xr-x 2 mapred mapred 4096 Sep 27 18:12 tt_log_tmp
drwx------ 2 mapred mapred 4096 Sep 27 18:12 ttprivate
drwxr-xr-x 2 mapred mapred 4096 Sep 25 19:40 userlogs
/var/lib/hadoop-hdfs/cache/mapred/mapred/local/taskTracker:
total 0
/var/lib/hadoop-hdfs/cache/mapred/mapred/local/toBeDeleted:
total 0
/var/lib/hadoop-hdfs/cache/mapred/mapred/local/tt_log_tmp:
total 0
/var/lib/hadoop-hdfs/cache/mapred/mapred/local/ttprivate:
total 0
/var/lib/hadoop-hdfs/cache/mapred/mapred/local/userlogs:
total 0
/var/lib/hadoop-hdfs/cache/root:
total 4
drwxr-xr-x 3 root root 4096 Sep 25 19:44 dfs
/var/lib/hadoop-hdfs/cache/root/dfs:
total 4
drwxr-xr-x 3 root root 4096 Sep 27 18:44 name
/var/lib/hadoop-hdfs/cache/root/dfs/name:
total 4
drwxr-xr-x 2 root root 4096 Sep 27 18:44 current
/var/lib/hadoop-hdfs/cache/root/dfs/name/current:
total 16
-rw-r--r-- 1 root root 119 Sep 27 18:44 fsimage_0000000000000000000
-rw-r--r-- 1 root root 62 Sep 27 18:44 fsimage_0000000000000000000.md5
-rw-r--r-- 1 root root 2 Sep 27 18:44 seen_txid
-rw-r--r-- 1 root root 201 Sep 27 18:44 VERSION
/var/lib/hadoop-hdfs/cache/surendhar:
total 4
drwxr-xr-x 3 surendhar surendhar 4096 Sep 25 19:35 dfs
/var/lib/hadoop-hdfs/cache/surendhar/dfs:
total 4
drwxr-xr-x 3 surendhar surendhar 4096 Sep 25 19:35 name
/var/lib/hadoop-hdfs/cache/surendhar/dfs/name:
total 4
drwxr-xr-x 2 surendhar surendhar 4096 Sep 25 19:35 current
/var/lib/hadoop-hdfs/cache/surendhar/dfs/name/current:
total 16
-rw-r--r-- 1 surendhar surendhar 124 Sep 25 19:35 fsimage_0000000000000000000
-rw-r--r-- 1 surendhar surendhar 62 Sep 25 19:35 fsimage_0000000000000000000.md5
-rw-r--r-- 1 surendhar surendhar 2 Sep 25 19:35 seen_txid
-rw-r--r-- 1 surendhar surendhar 201 Sep 25 19:35 VERSION
The error here is not clear; it could be caused by permissions, XML validation, etc.
You are better off using the hadoop command to start the namenode instead of the service hadoop-hdfs-namenode start command. The advantage is that you get the error message on the console itself, so there is no need to check your namenode log files. Execute the command below and post the logs you get on the console.
sudo hadoop namenode
It looks like, because of some issue, you are not able to start the namenode using the hdfs command.
When you execute the /etc/init.d/hadoop-hdfs-namenode script, it internally invokes /usr/lib/hadoop/sbin/hadoop-daemon.sh to start the Hadoop daemons.
As a workaround, you might need to change line 151 ( nohup nice -n $HADOOP_NICENESS $hdfsScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null & ) in the file /usr/lib/hadoop/sbin/hadoop-daemon.sh as follows:
nohup nice -n $HADOOP_NICENESS $hadoopScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null &
The command above uses the $hadoopScript variable instead of $hdfsScript.
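The output in the question also hints at a second possible cause. The format step was run with sudo, so it wrote to /var/lib/hadoop-hdfs/cache/root/dfs/name, while the service reports "ulimit -a for user hdfs" and the listing of /var/lib/hadoop-hdfs/cache/hdfs/dfs shows no name directory. Because dfs.namenode.name.dir contains ${user.name}, each user resolves it to a different path. One way to rule this out, sketched below, is to pin the directory to a fixed path. The path shown is hypothetical and not part of the original configuration.

```xml
<!-- hdfs-site.xml fragment (sketch): a name directory that does not
     depend on which user runs the command -->
<property>
  <name>dfs.namenode.name.dir</name>
  <!-- Hypothetical path chosen for illustration; it must be writable
       by the user that runs the NameNode service -->
  <value>file:///var/lib/hadoop-hdfs/dfs/name</value>
</property>
```

With either the original or the pinned path, the directory has to be formatted by the same user that later runs the NameNode; running the format step as the hdfs user (for example via sudo -u hdfs) instead of root would be the alternative to changing the configuration.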

Namenode HA (UnknownHostException: nameservice1)

We enabled NameNode High Availability through Cloudera Manager, using
Cloudera Manager >> HDFS >> Action > Enable High Availability >> Selected Stand By Namenode & Journal Nodes
Then nameservice1
Once the whole process completed, we deployed the client configuration.
We tested from a client machine by listing HDFS directories (hadoop fs -ls /), then manually failing over to the standby NameNode and listing HDFS directories again (hadoop fs -ls /). This test worked perfectly.
But when I ran the Hadoop sleep job using the following command, it failed:
$ hadoop jar /opt/cloudera/parcels/CDH-4.6.0-1.cdh4.6.0.p0.26/lib/hadoop-0.20-mapreduce/hadoop-examples.jar sleep -m 1 -r 0
java.lang.IllegalArgumentException: java.net.UnknownHostException: nameservice1
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:414)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:164)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:448)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:410)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:128)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2308)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:87)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2342)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2324)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:351)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:103)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:980)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:974)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:974)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:948)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1410)
at org.apache.hadoop.examples.SleepJob.run(SleepJob.java:174)
at org.apache.hadoop.examples.SleepJob.run(SleepJob.java:237)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.examples.SleepJob.main(SleepJob.java:165)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.net.UnknownHostException: nameservice1
... 37 more
I don't know why it cannot resolve nameservice1 even after deploying the client configuration.
When I googled this issue, I found only one suggested solution:
add the entries below to the configuration to fix the issue.
dfs.nameservices=nameservice1
dfs.ha.namenodes.nameservice1=namenode1,namenode2
dfs.namenode.rpc-address.nameservice1.namenode1=ip-10-118-137-215.ec2.internal:8020
dfs.namenode.rpc-address.nameservice1.namenode2=ip-10-12-122-210.ec2.internal:8020
dfs.client.failover.proxy.provider.nameservice1=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
My impression was that Cloudera Manager takes care of this. I checked the client for this configuration, and it was there (/var/run/cloudera-scm-agent/process/1998-deploy-client-config/hadoop-conf/hdfs-site.xml).
Some more details about the config files:
[11:22:37 root@datasci01.dev:~]# ls -l /etc/hadoop/conf.cloudera.*
/etc/hadoop/conf.cloudera.hdfs:
total 16
-rw-r--r-- 1 root root 943 Jul 31 09:33 core-site.xml
-rw-r--r-- 1 root root 2546 Jul 31 09:33 hadoop-env.sh
-rw-r--r-- 1 root root 1577 Jul 31 09:33 hdfs-site.xml
-rw-r--r-- 1 root root 314 Jul 31 09:33 log4j.properties
/etc/hadoop/conf.cloudera.hdfs1:
total 20
-rwxr-xr-x 1 root root 233 Sep 5 2013 container-executor.cfg
-rw-r--r-- 1 root root 1890 May 21 15:48 core-site.xml
-rw-r--r-- 1 root root 2546 May 21 15:48 hadoop-env.sh
-rw-r--r-- 1 root root 1577 May 21 15:48 hdfs-site.xml
-rw-r--r-- 1 root root 314 May 21 15:48 log4j.properties
/etc/hadoop/conf.cloudera.mapreduce:
total 20
-rw-r--r-- 1 root root 1032 Jul 31 09:33 core-site.xml
-rw-r--r-- 1 root root 2775 Jul 31 09:33 hadoop-env.sh
-rw-r--r-- 1 root root 1450 Jul 31 09:33 hdfs-site.xml
-rw-r--r-- 1 root root 314 Jul 31 09:33 log4j.properties
-rw-r--r-- 1 root root 2446 Jul 31 09:33 mapred-site.xml
/etc/hadoop/conf.cloudera.mapreduce1:
total 24
-rwxr-xr-x 1 root root 233 Sep 5 2013 container-executor.cfg
-rw-r--r-- 1 root root 1979 May 16 12:20 core-site.xml
-rw-r--r-- 1 root root 2775 May 16 12:20 hadoop-env.sh
-rw-r--r-- 1 root root 1450 May 16 12:20 hdfs-site.xml
-rw-r--r-- 1 root root 314 May 16 12:20 log4j.properties
-rw-r--r-- 1 root root 2446 May 16 12:20 mapred-site.xml
[11:23:12 root@datasci01.dev:~]#
I suspect the issue is the old configuration in /etc/hadoop/conf.cloudera.hdfs1 and /etc/hadoop/conf.cloudera.mapreduce1, but I'm not sure.
It looks like /etc/hadoop/conf/* never got updated:
# ls -l /etc/hadoop/conf/
total 24
-rwxr-xr-x 1 root root 233 Sep 5 2013 container-executor.cfg
-rw-r--r-- 1 root root 1979 May 16 12:20 core-site.xml
-rw-r--r-- 1 root root 2775 May 16 12:20 hadoop-env.sh
-rw-r--r-- 1 root root 1450 May 16 12:20 hdfs-site.xml
-rw-r--r-- 1 root root 314 May 16 12:20 log4j.properties
-rw-r--r-- 1 root root 2446 May 16 12:20 mapred-site.xml
Does anyone have any idea about this issue?
It looks like you are using the wrong client configuration in the /etc/hadoop/conf directory. Sometimes the Cloudera Manager (CM) "Deploy Client Configuration" option does not work.
Since you have enabled NN HA, you need valid core-site.xml and hdfs-site.xml files in your Hadoop client configuration directory. To get valid site files, go to the HDFS service in CM and choose the Download Client Configuration option from the Actions button. You will get the configuration files as a zip archive; extract it and replace /etc/hadoop/conf/core-site.xml and /etc/hadoop/conf/hdfs-site.xml with the extracted core-site.xml and hdfs-site.xml files.
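Before replacing files, it can help to confirm whether the hdfs-site.xml the client actually reads contains the HA entries listed in the question. The following is only a minimal sketch: the function names are made up for illustration, and the list of required keys is simply the set quoted in the question, not an exhaustive statement of what Hadoop needs.

```python
import xml.etree.ElementTree as ET


def load_hadoop_properties(path):
    """Return the <name>/<value> pairs of a Hadoop *-site.xml file as a dict."""
    root = ET.parse(path).getroot()
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_ha_client_keys(props, nameservice):
    """List the HA client keys (as quoted in the question) that are absent or empty."""
    required = [
        "dfs.nameservices",
        "dfs.ha.namenodes." + nameservice,
        "dfs.client.failover.proxy.provider." + nameservice,
    ]
    # Each NameNode ID listed for the nameservice needs its own RPC address.
    for nn_id in props.get("dfs.ha.namenodes." + nameservice, "").split(","):
        if nn_id.strip():
            required.append(
                "dfs.namenode.rpc-address.%s.%s" % (nameservice, nn_id.strip())
            )
    return [key for key in required if not props.get(key)]
```

For the setup in the question, one would call missing_ha_client_keys(load_hadoop_properties("/etc/hadoop/conf/hdfs-site.xml"), "nameservice1"); a non-empty result suggests the client is reading a configuration without the HA settings.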
Got it resolved. The wrong config was linked: "/etc/hadoop/conf/" --> "/etc/alternatives/hadoop-conf/" --> "/etc/hadoop/conf.cloudera.mapreduce1"
It has to be "/etc/hadoop/conf/" --> "/etc/alternatives/hadoop-conf/" --> "/etc/hadoop/conf.cloudera.mapreduce"
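The link chain described in this answer can be inspected by following the symlinks one hop at a time. This is a rough sketch using only the Python standard library; symlink_chain is a hypothetical helper name, and the paths it would be run against are the ones quoted above.

```python
import os


def symlink_chain(path):
    """Follow symlinks one hop at a time and return every path visited."""
    # normpath drops a trailing slash, which would otherwise hide the link itself.
    path = os.path.normpath(path)
    chain = [path]
    seen = {path}
    while os.path.islink(path):
        target = os.readlink(path)
        # A relative link target is resolved against the link's own directory.
        if not os.path.isabs(target):
            target = os.path.join(os.path.dirname(path), target)
        path = os.path.normpath(target)
        if path in seen:  # guard against a symlink loop
            break
        seen.add(path)
        chain.append(path)
    return chain
```

Calling symlink_chain("/etc/hadoop/conf/") on the client should list each hop; if the last entry is a stale directory such as conf.cloudera.mapreduce1, the client is still reading the old configuration.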
The statement below resolved the problem in my code by specifying the host and port:
val dfs = sqlContext.read.json("hdfs://localhost:9000//user/arvindd/input/employee.json")
I resolved this issue by giving the complete URI when creating the RDD:
myfirstrdd = sc.textFile("hdfs://192.168.35.132:8020/BUPA.txt")
After that I was able to run other RDD transformations. Make sure you have read/write/execute permission on the file, or you can run chmod 777 on it.

Invalid URI for NameNode address

I'm trying to set up a Cloudera Hadoop cluster with a master node containing the namenode, secondarynamenode and jobtracker, and two other nodes containing the datanode and tasktracker. The Cloudera version is 4.6 and the OS is Ubuntu Precise x64. The cluster is being created from an AWS instance. Passwordless SSH has been set up as well, and the Java installation is Oracle 7.
Whenever I execute sudo service hadoop-hdfs-namenode start I get:
2014-05-14 05:08:38,023 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.lang.IllegalArgumentException: Invalid URI for NameNode address (check fs.defaultFS): file:/// has no authority.
at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:329)
at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:317)
at org.apache.hadoop.hdfs.server.namenode.NameNode.getRpcServerAddress(NameNode.java:370)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loginAsNameNodeUser(NameNode.java:422)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:442)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:621)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:606)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1177)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1241)
My core-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://<master-ip>:8020</value>
</property>
</configuration>
mapred-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>hdfs://<master-ip>:8021</value>
</property>
</configuration>
hdfs-site.xml:
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
I tried using public ip, private-ip, public dns and fqdn, but the result is the same.
The directory /etc/hadoop/conf.empty looks like:
-rw-r--r-- 1 root root 2998 Feb 26 10:21 capacity-scheduler.xml
-rw-r--r-- 1 root hadoop 1335 Feb 26 10:21 configuration.xsl
-rw-r--r-- 1 root root 233 Feb 26 10:21 container-executor.cfg
-rwxr-xr-x 1 root root 287 May 14 05:09 core-site.xml
-rwxr-xr-x 1 root root 2445 May 14 05:09 hadoop-env.sh
-rw-r--r-- 1 root hadoop 1774 Feb 26 10:21 hadoop-metrics2.properties
-rw-r--r-- 1 root hadoop 2490 Feb 26 10:21 hadoop-metrics.properties
-rw-r--r-- 1 root hadoop 9196 Feb 26 10:21 hadoop-policy.xml
-rwxr-xr-x 1 root root 332 May 14 05:09 hdfs-site.xml
-rw-r--r-- 1 root hadoop 8735 Feb 26 10:21 log4j.properties
-rw-r--r-- 1 root root 4113 Feb 26 10:21 mapred-queues.xml.template
-rwxr-xr-x 1 root root 290 May 14 05:09 mapred-site.xml
-rw-r--r-- 1 root root 178 Feb 26 10:21 mapred-site.xml.template
-rwxr-xr-x 1 root root 12 May 14 05:09 masters
-rwxr-xr-x 1 root root 29 May 14 05:09 slaves
-rw-r--r-- 1 root hadoop 2316 Feb 26 10:21 ssl-client.xml.example
-rw-r--r-- 1 root hadoop 2251 Feb 26 10:21 ssl-server.xml.example
-rw-r--r-- 1 root root 2513 Feb 26 10:21 yarn-env.sh
-rw-r--r-- 1 root root 2262 Feb 26 10:21 yarn-site.xml
and slaves lists the ip addresses of the two slave machines:
<slave1-ip>
<slave2-ip>
Checking the alternatives (the command, followed by its output):
update-alternatives --get-selections | grep hadoop
hadoop-conf auto /etc/hadoop/conf.empty
I've searched a lot but didn't find anything that could help me fix my problem. Could someone offer any clue about what's going on?
I was facing the same issue and fixed it by formatting the namenode (note that formatting erases any existing HDFS metadata). Below is the command:
hdfs namenode -format
The core-site.xml entry is:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
That will definitely solve the problem.
I ran into this same thing. I found I had to add a fs.defaultFS property to hdfs-site.xml to match the fs.defaultFS property in core-site.xml:
<property>
<name>fs.defaultFS</name>
<value>hdfs://<master-ip>:8020</value>
</property>
Once I added this, the secondary namenode started OK.
Make sure you have set the HADOOP_PREFIX variable correctly, as indicated in the link:
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
I faced the same issue as you, and it was fixed by setting this variable.
You may have used the wrong syntax for dfs.datanode.data.dir
or dfs.namenode.name.dir in hdfs-site.xml. If you leave out a / in the value, you will get this error.
Check the syntax of
file:///home/hadoop/hdfs/
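The "has no authority" message and the slash-counting advice above both come down to how a URI splits into scheme, authority and path. The sketch below uses Python's urllib.parse to illustrate that split; it does not reproduce Hadoop's own validation, both function names are hypothetical, and requiring the hdfs scheme for fs.defaultFS is an assumption made for the NameNode case.

```python
from urllib.parse import urlparse


def check_default_fs(uri):
    """fs.defaultFS for a NameNode should carry an authority (host[:port])."""
    parsed = urlparse(uri)
    # Assumption: the NameNode expects the hdfs scheme plus a non-empty authority.
    return parsed.scheme == "hdfs" and bool(parsed.netloc)


def check_local_dir_uri(uri):
    """A file: URI for a local directory should have an empty authority."""
    parsed = urlparse(uri)
    # With only two slashes, the first path segment is parsed as the authority.
    return (
        parsed.scheme == "file"
        and parsed.netloc == ""
        and parsed.path.startswith("/")
    )
```

With these checks, hdfs://localhost:9000 passes check_default_fs while file:/// does not, which matches the error in the question: the NameNode was seeing the built-in default rather than the value from core-site.xml. Likewise, file:///home/hadoop/hdfs/ passes check_local_dir_uri, whereas file://home/hadoop/hdfs/ does not because "home" is read as a host name.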

How can I troubleshoot this Hadoop filesystem installation error?

I'm trying to install Hadoop on a non-Cloudera Ubuntu test image. Everything seems to have been going well until I ran ./bin/start-all.sh. The name node never comes up so I can't even run a hadoop fs -ls to connect to the filesystem.
Here's the namenode log:
2011-03-24 11:38:00,256 INFO org.apache.hadoop.ipc.Server: Stopping server on 54310
2011-03-24 11:38:00,257 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /usr/local/hadoop-datastore/hadoop-hadoop/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:290)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:88)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:312)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:293)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:224)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:306)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1006)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1015)
2011-03-24 11:38:00,258 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at Brash/192.168.1.5
************************************************************/
I've run chmod -R 755 on the root directory and even made sure the directory exists by creating it with mkdir -p.
hadoop@Brash:/usr/lib/hadoop$ ls -la /usr/local/hadoop-datastore/hadoop-hadoop/dfs/
total 16
drwxr-xr-x 4 hadoop hadoop 4096 2011-03-24 11:41 .
drwxr-xr-x 4 hadoop hadoop 4096 2011-03-24 11:31 ..
drwxr-xr-x 2 hadoop hadoop 4096 2011-03-24 11:31 data
drwxr-xr-x 2 hadoop hadoop 4096 2011-03-24 11:41 name
Here's my /conf/hdfs-site.xml:
hadoop@Brash:/usr/lib/hadoop$ cat conf/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
</configuration>
You should never have to create the directory yourself; Hadoop creates it on its own. Did you forget to format the namenode? Delete the existing directory, then reformat the namenode (bin/hadoop namenode -format) and try again.
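To tell a missing directory apart from one that exists but was never formatted, a quick check like the following may help. It is a sketch under one assumption: that a formatted name directory contains a current/VERSION file. The function name and the status strings are invented for illustration, and the script should be run as the user that starts the namenode so the permission check is meaningful.

```python
import os


def name_dir_status(name_dir):
    """Classify a namenode storage directory as seen by the current user."""
    if not os.path.isdir(name_dir):
        return "missing"
    if not os.access(name_dir, os.R_OK | os.W_OK | os.X_OK):
        return "not accessible"
    # Assumption: formatting the namenode writes current/VERSION under the name dir.
    if not os.path.isfile(os.path.join(name_dir, "current", "VERSION")):
        return "not formatted"
    return "formatted"
```

For the directory in the question, name_dir_status("/usr/local/hadoop-datastore/hadoop-hadoop/dfs/name") returning anything other than "formatted" would point towards the reformat step suggested above.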

Resources