I checked the solutions on this site.
I went to (hadoop folder)/data/dfs/datanode to change the cluster ID,
but there is nothing in the datanode folder.
What can I do?
Thanks for reading.
I would appreciate any help.
PS
2017-04-11 20:24:05,507 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage directory [DISK]file:/tmp/hadoop-knu/dfs/data/
java.io.IOException: Incompatible clusterIDs in /tmp/hadoop-knu/dfs/data: namenode clusterID = CID-4491e2ea-b0dd-4e54-a37a-b18aaaf5383b; datanode clusterID = CID-13a3b8e1-2f8e-4dd2-bcf9-c602420c1d3d
2017-04-11 20:24:05,509 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool (Datanode Uuid unassigned) service to localhost/127.0.0.1:9010. Exiting.
java.io.IOException: All specified directories are failed to load.
2017-04-11 20:24:05,509 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool (Datanode Uuid unassigned) service to localhost/127.0.0.1:9010
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9010</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/knu/hadoop/hadoop-2.7.3/data/dfs/namenode</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>/home/knu/hadoop/hadoop-2.7.3/data/dfs/namesecondary</value>
</property>
<property>
<name>dfs.dataode.data.dir</name>
<value>/home/knu/hadoop/hadoop-2.7.3/data/dfs/datanode</value>
</property>
<property>
<name>dfs.http.address</name>
<value>localhost:50070</value>
</property>
<property>
<name>dfs.secondary.http.address</name>
<value>localhost:50090</value>
</property>
</configuration>
PS2
[knu@localhost ~]$ ls -l /home/knu/hadoop/hadoop-2.7.3/data/dfs/
drwxrwxr-x. 2 knu knu 6 Apr 11 21:28 datanode
drwxrwxr-x. 3 knu knu 40 Apr 11 22:15 namenode
drwxrwxr-x. 3 knu knu 40 Apr 11 22:15 namesecondary
The problem is the property name dfs.datanode.data.dir, which is misspelt as dfs.dataode.data.dir in your hdfs-site.xml. Hadoop does not recognise the misspelt name, so the datanode falls back to the default data directory, ${hadoop.tmp.dir}/dfs/data.
hadoop.tmp.dir defaults to /tmp/hadoop-${user.name}, which is why your log refers to /tmp/hadoop-knu/dfs/data and why the datanode folder you configured is empty. The directory under /tmp still holds the clusterID written before the namenode was last formatted (and on many systems /tmp is also cleared on reboot), so the two IDs no longer agree, hence Incompatible clusterIDs.
Correct the property name in hdfs-site.xml before formatting the namenode and starting the services.
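A typo like this is easy to miss by eye, so here is a rough sketch of how one could flag it automatically. It is written in Python and compares each property name in hdfs-site.xml against a short list of known names. The list KNOWN_PROPERTIES and the function find_suspect_names are made up for this example (the list is a hand-picked subset, not the full hdfs-default.xml), and the 0.85 similarity cutoff is an arbitrary choice.

```python
import difflib
import xml.etree.ElementTree as ET

# Hand-picked subset of HDFS property names (an assumption for this sketch,
# not the complete list from hdfs-default.xml). Extend it for your cluster.
KNOWN_PROPERTIES = [
    "dfs.replication",
    "dfs.namenode.name.dir",
    "dfs.namenode.checkpoint.dir",
    "dfs.datanode.data.dir",
    "dfs.namenode.http-address",
    "dfs.namenode.secondary.http-address",
]


def find_suspect_names(xml_text, known=KNOWN_PROPERTIES):
    """Return {unknown_name: closest_known_name} for names that look like typos."""
    suspects = {}
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        if name and name not in known:
            # Only report names that are very close to a known one.
            close = difflib.get_close_matches(name, known, n=1, cutoff=0.85)
            if close:
                suspects[name] = close[0]
    return suspects


SAMPLE = """<configuration>
  <property>
    <name>dfs.dataode.data.dir</name>
    <value>/home/knu/hadoop/hadoop-2.7.3/data/dfs/datanode</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for wrong, right in find_suspect_names(SAMPLE).items():
        print(wrong, "looks like a typo for", right)
```

Names that are simply absent from the list and not close to any entry are ignored, so this only catches near-misses such as the one in the question.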
My solution (note that this deletes the datanode's stored blocks):
rm -rf ./tmp/hadoop-${USER}/dfs/data/*
./bin/hadoop namenode -format
./sbin/hadoop-daemon.sh start datanode
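Try formatting the namenode and then restarting HDFS.
Alternatively, copy the datanode's clusterID from the log and, from the hadoop/bin directory, format the namenode with that ID (the ID below is only an example):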
./hdfs namenode -format -clusterId CID-a5a9348a-3890-4dce-94dc-0fec2ba999a9
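Before deleting or reformatting anything, it can help to confirm that the two IDs really differ. Each storage directory keeps a current/VERSION file with a clusterID= line, so a small Python sketch can read and compare them. The function names read_cluster_id, cluster_ids_match and write_version are made up for this example, and the demo uses throwaway temp directories rather than real Hadoop storage.

```python
import os
import tempfile


def read_cluster_id(version_path):
    """Return the clusterID value from a VERSION file, or None if absent."""
    try:
        with open(version_path, encoding="utf-8") as fh:
            for line in fh:
                key, sep, value = line.strip().partition("=")
                if sep and key == "clusterID":
                    return value
    except FileNotFoundError:
        return None
    return None


def cluster_ids_match(namenode_dir, datanode_dir):
    """Compare the clusterID in <dir>/current/VERSION of both directories."""
    nn = read_cluster_id(os.path.join(namenode_dir, "current", "VERSION"))
    dn = read_cluster_id(os.path.join(datanode_dir, "current", "VERSION"))
    return nn is not None and nn == dn


def write_version(storage_dir, cluster_id):
    """Demo helper only: create a fake current/VERSION file."""
    current = os.path.join(storage_dir, "current")
    os.makedirs(current, exist_ok=True)
    with open(os.path.join(current, "VERSION"), "w", encoding="utf-8") as fh:
        fh.write("#fake VERSION file for the demo\n")
        fh.write("clusterID=" + cluster_id + "\n")


if __name__ == "__main__":
    # The two IDs are the ones from the log in the question.
    with tempfile.TemporaryDirectory() as nn, tempfile.TemporaryDirectory() as dn:
        write_version(nn, "CID-4491e2ea-b0dd-4e54-a37a-b18aaaf5383b")
        write_version(dn, "CID-13a3b8e1-2f8e-4dd2-bcf9-c602420c1d3d")
        print("clusterIDs match:", cluster_ids_match(nn, dn))
```

On a real node you would call cluster_ids_match with the directories from dfs.namenode.name.dir and dfs.datanode.data.dir (or the /tmp fallback shown in the log).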
Related
I have 4 nodes, one master and 3 slaves.
master: *.*.*.18; slaves: *.*.*.12, *.*.*.104, *.*.*.36.
Configurations for Hadoop on Namenode:
core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/hduser/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/hduser/hadoop_store/hdfs/datanode</value>
</property>
</configuration>
hadoop-env.sh:
export JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64"
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
export HADOOP_PID_DIR=${HADOOP_PID_DIR} # defaults to /tmp
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}
export HADOOP_IDENT_STRING=$USER
mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
</property>
</configuration>
slaves:
10.0.3.12
10.0.3.36
10.0.3.104
yarn-site.xml:
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:8050</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
In the slave nodes the configurations for hadoop are:
yarn-site.xml:
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.address</name>
<value>10.0.3.18:8050</value>
</property>
<property>
<name>yarn.nodemanager.address</name>
<value>localhost:8035</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
The rest of the files are the same on all the slave nodes as on the master node. For the HBase configuration:
hbase-env.sh(in all):
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m -XX:ReservedCodeCacheSize=256m"
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m -XX:ReservedCodeCacheSize=256m"
export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers
export HBASE_MANAGES_ZK=true
hbase-site.xml(in all):
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>10.0.3.18,10.0.3.12,10.0.3.104,10.0.3.36</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hduser/Downloads/hbase/zookeeper</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>zookeeper.session.timeout</name>
<value>1200000</value>
</property>
<property>
<name>hbase.zookeeper.property.tickTime</name>
<value>6000</value>
</property>
</configuration>
The only difference is that on the slaves, localhost is changed to 10.0.3.18 (the address of the namenode).
regionservers:
10.0.3.12
10.0.3.104
10.0.3.36
I formatted the namenode. When I start HDFS and YARN with start-dfs.sh and start-yarn.sh, the output is as follows:
...successfully formatted namenode...
localhost: starting namenode, logging to /home/hduser/Downloads/hadoop/logs/hadoop-hduser-namenode-saichanda-OptiPlex-9020.out
10.0.3.12: starting datanode, logging to /home/hduser/Downloads/hadoop/logs/hadoop-hduser-datanode-aaron.out
10.0.3.36: starting datanode, logging to /home/hduser/Downloads/hadoop/logs/hadoop-hduser-datanode-dmacs-OptiPlex-9020.out
10.0.3.104: starting datanode, logging to /home/hduser/Downloads/hadoop/logs/hadoop-hduser-datanode-hadoop-104.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/hduser/Downloads/hadoop/logs/hadoop-hduser-secondarynamenode-saichanda-OptiPlex-9020.out
starting yarn daemons
starting resourcemanager, logging to /home/hduser/Downloads/hadoop/logs/yarn-hduser-resourcemanager-saichanda-OptiPlex-9020.out
10.0.3.12: starting nodemanager, logging to /home/hduser/Downloads/hadoop/logs/yarn-hduser-nodemanager-aaron.out
10.0.3.36: starting nodemanager, logging to /home/hduser/Downloads/hadoop/logs/yarn-hduser-nodemanager-dmacs-OptiPlex-9020.out
10.0.3.104: starting nodemanager, logging to /home/hduser/Downloads/hadoop/logs/yarn-hduser-nodemanager-hadoop-104.out
When I run jps on the master:
28032 SecondaryNameNode
28481 Jps
28198 ResourceManager
27720 NameNode
When I run jps on the slaves:
11303 DataNode
11595 Jps
11436 NodeManager
Then I started HBase with the command ./start-hbase.sh. The output is:
10.0.3.12: running zookeeper, logging to /home/hduser/Downloads/hbase/bin/../logs/hbase-hduser-zookeeper-aaron.out
10.0.3.36: running zookeeper, logging to /home/hduser/Downloads/hbase/bin/../logs/hbase-hduser-zookeeper-dmacs-OptiPlex-9020.out
10.0.3.104: running zookeeper, logging to /home/hduser/Downloads/hbase/bin/../logs/hbase-hduser-zookeeper-hadoop-104.out
10.0.3.18: running zookeeper, logging to /home/hduser/Downloads/hbase/bin/../logs/hbase-hduser-zookeeper-saichanda-OptiPlex-9020.out
running master, logging to /home/hduser/Downloads/hbase/logs/hbase-hduser-master-saichanda-OptiPlex-9020.out
OpenJDK 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
10.0.3.12: running regionserver, logging to /home/hduser/Downloads/hbase/bin/../logs/hbase-hduser-regionserver-aaron.out
10.0.3.36: running regionserver, logging to /home/hduser/Downloads/hbase/bin/../logs/hbase-hduser-regionserver-dmacs-OptiPlex-9020.out
10.0.3.104: running regionserver, logging to /home/hduser/Downloads/hbase/bin/../logs/hbase-hduser-regionserver-hadoop-104.out
10.0.3.12: OpenJDK 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
10.0.3.12: OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
10.0.3.36: OpenJDK 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
10.0.3.36: OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
10.0.3.104: OpenJDK 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
10.0.3.104: OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
When I run jps on the namenode:
28032 SecondaryNameNode
28821 HQuorumPeer
29126 Jps
28198 ResourceManager
27720 NameNode
When I run jps on the slaves:
11776 HRegionServer
11669 HQuorumPeer
11303 DataNode
11899 Jps
11436 NodeManager
What I observed is that HMaster is not running on the namenode. Can anyone help me understand why HMaster is crashing? After some time, NodeManager also crashes on the slaves. I also observed that when I shut down HBase, the HRegionServers on the slaves do not go down; they keep running even after I run stop-hbase.sh on the master node. The key warnings and errors in my logs are as follows.
hadoop-namenode.log: I get this exception multiple times:
java.io.IOException: File /hbase/.tmp/hbase.version could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
hadoop-secondary-namenode.log: I get this error multiple times:
ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint
java.io.IOException: Inconsistent checkpoint fields.
No error found in yarn-resourcemanager.log.
In the HBase logs, hbase-master.log shows:
FATAL [saichanda-OptiPlex-9020:16000.activeMasterManager] master.HMaster: Failed to become active master
File /hbase/.tmp/hbase.version could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
hbase-zookeeper.log contains no errors as such; I only see this line:
019-01-29 10:09:49,431 INFO [main] server.NIOServerCnxnFactory: binding to port 0.0.0.0/0.0.0.0:2181
On one of the slaves, regionserver.log shows:
client.ZooKeeperRegistry: ClusterId read in ZooKeeper is null
On one of the slaves, hadoop-datanode.log repeatedly gives the following warning:
WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: localhost/127.0.0.1:9000
Among all the warnings and errors above, the one in hbase-master.log seems the most critical, where it says: replicated to 0 nodes instead of minReplication (=1). Please help me solve this issue.
Also, when I finally run the hbase shell, I get this error:
ERROR: Can't get master address from ZooKeeper; znode data == null
Thank you.
I have just started learning Hadoop from the book Hadoop: The Definitive Guide.
I followed the tutorial for installing Hadoop in pseudo-distributed mode and enabled passwordless SSH login.
I formatted the HDFS filesystem before using it for the first time, and it started successfully.
After that I copied a text file to HDFS using copyFromLocal and everything went fine. But if I restart the system, start the daemons again and look at the web UI, only YARN has started successfully.
When I issue the stop-dfs.sh command I get
Stopping namenodes on [localhost]
localhost: no namenode to stop
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
If I format the HDFS filesystem again and then start the daemons, they all start successfully.
Here are my configuration files, exactly as given in Hadoop: The Definitive Guide.
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost/</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
This is the error in the namenode log file
WARN org.apache.hadoop.hdfs.server.common.Storage: Storage directory /tmp/hadoop/dfs/name does not exist
WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/hadoop/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:327)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:215)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:975)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:681)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:585)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:645)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:812)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:796)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1493)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1559)
This is from mapred log
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
... 33 more
I visited the Apache Hadoop page on Connection Refused, which says:
Check that there isn't an entry for your hostname mapped to 127.0.0.1 or 127.0.1.1 in /etc/hosts (Ubuntu is notorious for this).
I found such an entry in my /etc/hosts, but if I remove it, sudo breaks with the error sudo: unable to resolve host. What should I put in /etc/hosts instead of the hostname mapped to 127.0.1.1?
I cannot work out the root cause of this problem.
Your namenode log shows that the namenode storage directory is at its default location under /tmp/hadoop. On many Linux systems the /tmp directory is cleared on reboot, so that is most likely the problem.
You need to move the namenode and datanode directories out of /tmp by changing your hdfs-site.xml configuration file.
Add this to your hdfs-site.xml:
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/"your-user-name"/hadoop</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/"your-user-name"/datanode</value>
</property>
After this, format your namenode with the hdfs namenode -format command.
I think this will solve your problem.
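If you want to check whether a given hdfs-site.xml still leaves the storage directories under /tmp, something like the following Python sketch could do it. It assumes the Hadoop 2.x defaults (hadoop.tmp.dir = /tmp/hadoop-${user.name}, with the name and data directories under ${hadoop.tmp.dir}/dfs/). The function names are made up for this example, and it deliberately ignores ${...} variable expansion, comma-separated directory lists and any hadoop.tmp.dir override in core-site.xml.

```python
import getpass
import xml.etree.ElementTree as ET

# Assumed Hadoop 2.x defaults (core-default.xml / hdfs-default.xml).
DEFAULT_TMP_DIR = "/tmp/hadoop-{user}"
DEFAULT_DIRS = {
    "dfs.namenode.name.dir": "{tmp}/dfs/name",
    "dfs.datanode.data.dir": "{tmp}/dfs/data",
}


def read_properties(xml_text):
    """Parse a Hadoop *-site.xml string into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    return {
        (p.findtext("name") or "").strip(): (p.findtext("value") or "").strip()
        for p in root.findall("property")
    }


def storage_dirs_under_tmp(hdfs_site_xml, user=None, tmp_dir=None):
    """Return the storage properties whose effective path lies under /tmp."""
    user = user or getpass.getuser()
    tmp_dir = tmp_dir or DEFAULT_TMP_DIR.format(user=user)
    props = read_properties(hdfs_site_xml)
    risky = {}
    for key, default in DEFAULT_DIRS.items():
        # Fall back to the default location when the property is not set.
        path = props.get(key) or default.format(tmp=tmp_dir)
        if path.startswith("file://"):
            path = path[len("file://"):]
        if path == "/tmp" or path.startswith("/tmp/"):
            risky[key] = path
    return risky


# The hdfs-site.xml from the question: only dfs.replication is set.
QUESTION_CONFIG = """<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for key, path in storage_dirs_under_tmp(QUESTION_CONFIG).items():
        print(key, "resolves to", path, "(under /tmp)")
```

With the configuration from the question, both directories are reported, because neither property is set and the defaults apply.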
If the configuration file is not the problem, please try the following:
1. First delete all contents of the temporary folder:
rm -Rf <tmp dir> (mine was /usr/local/hadoop/tmp)
2. Format the namenode:
bin/hadoop namenode -format
3. Start all processes again:
bin/start-all.sh
I am getting the error below while starting Hadoop:
2015-09-04 08:49:05,648 ERROR org.apache.hadoop.hdfs.server.common.Storage: It appears that another node 854@ip-1-2-3-4 has already locked the storage directory: /mnt/xvdb/tmp/dfs/namesecondary
java.nio.channels.OverlappingFileLockException
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:712)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:678)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:499)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.recoverCreate(SecondaryNameNode.java:962)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:243)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:192)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:671)
2015-09-04 08:49:05,650 INFO org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage /mnt/xvdb/tmp/dfs/namesecondary. The directory is already locked
2015-09-04 08:49:05,650 FATAL org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Failed to start secondary namenode
java.io.IOException: Cannot lock storage /mnt/xvdb/tmp/dfs/namesecondary. The directory is already locked
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:683)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:499)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.recoverCreate(SecondaryNameNode.java:962)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:243)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:192)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:671)
2015-09-04 08:49:05,652 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2015-09-04 08:49:05,653 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down SecondaryNameNode at ip-#ip-1-2-3-4/#ip-1-2-3-4
************************************************************/
Hadoop version: 2.7.1 (3-node cluster)
hdfs-site.xml configuration file:
<configuration>
<property>
<name>dfs.data.dir</name>
<value>/mnt/xvdb/hadoop/dfs/data</value>
<final>true</final>
</property>
<property>
<name>dfs.name.dir</name>
<value>/mnt/xvdb/hadoop/dfs/name</value>
<final>true</final>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
</configuration>
I have tried formatting the namenode as well, but it didn't help. Can anyone help me with this?
I found a solution to the above problem here: http://misconfigurations.blogspot.in/2014/10/hadoop-initialization-failed-for-block.html
If there is any other solution, I would like to have a look.
P.S.: I deleted the directory pointed to by "dfs.datanode.data.dir". That erased all data on HDFS, but it fixed the issue. If you know of an alternative way to fix this issue, you may prefer to use it.
I have been trying to set up and run Hadoop in pseudo-distributed mode. But when I type
bin/hadoop fs -mkdir input
I get
mkdir: Call From h1/192.168.1.13 to h1:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Here are the details:
core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/grid/tmp</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://h1:9000</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>h1:9001</value>
</property>
<property>
<name>mapred.map.tasks</name>
<value>20</value>
</property>
<property>
<name>mapred.reduce.tasks</name>
<value>4</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobtracker.http.address</name>
<value>h1:50030</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>h1:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>h1:19888</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.http.address</name>
<value>h1:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address</name>
<value>h1:9001</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>h1:50090</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/grid/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
/etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.13 h1
192.168.1.14 h2
192.168.1.15 h3
After hadoop namenode -format and start-all.sh, jps shows:
1702 ResourceManager
1374 DataNode
1802 NodeManager
2331 Jps
1276 NameNode
1558 SecondaryNameNode
Then the problem occurs:
[grid@h1 hadoop-2.6.0]$ bin/hadoop fs -mkdir input
15/05/13 16:37:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
mkdir: Call From h1/192.168.1.13 to h1:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Where is the problem?
hadoop-grid-datanode-h1.log
2015-05-12 11:26:20,329 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = h1/192.168.1.13
STARTUP_MSG: args = []
STARTUP_MSG: version = 2.6.0
hadoop-grid-namenode-h1.log
2015-05-08 16:06:32,561 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = h1/192.168.1.13
STARTUP_MSG: args = []
STARTUP_MSG: version = 2.6.0
Why does port 9000 not work?
[grid@h1 ~]$ netstat -tnl |grep 9000
[grid@h1 ~]$ netstat -tnl |grep 9001
tcp 0 0 192.168.1.13:9001 0.0.0.0:* LISTEN
Please start dfs and yarn.
[hadoop@hadooplab sbin]$ ./start-dfs.sh
[hadoop@hadooplab sbin]$ ./start-yarn.sh
Now try running "bin/hadoop fs -mkdir input" again.
This issue usually appears when you install Hadoop in a VM and then shut the VM down. When the VM shuts down, dfs and yarn stop as well, so you need to start them again each time you restart the VM.
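To tell whether the NameNode is actually up and listening at the address clients use, you can try a plain TCP connection to the host and port from fs.defaultFS. Here is a minimal Python sketch. The function names are made up for this example, and the fallback port 8020 for URIs without a port is an assumption based on the usual Hadoop 2.x default.

```python
import socket
from urllib.parse import urlparse

# Assumption: port used when the fs.defaultFS URI does not name one.
DEFAULT_NAMENODE_PORT = 8020


def namenode_endpoint(default_fs):
    """Split an fs.defaultFS value such as hdfs://h1:9000 into (host, port)."""
    parsed = urlparse(default_fs)
    return parsed.hostname, parsed.port or DEFAULT_NAMENODE_PORT


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unresolvable hostnames.
        return False


if __name__ == "__main__":
    # Replace the URI with the fs.defaultFS value from your core-site.xml.
    host, port = namenode_endpoint("hdfs://localhost:9000")
    print(host, port, "listening" if is_listening(host, port) else "not reachable")
```

If this reports "not reachable" while jps shows a NameNode process, compare the port with the output of netstat -tnl: the NameNode may be listening on a different port than the one in fs.defaultFS.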
First try the command
bin/hadoop dfs -mkdir input
If you have followed the micheal-roll post properly, you should not have any issue. I suspect that passwordless ssh is not working in your configuration; recheck it.
The following procedure resolved the issue for me:
Stop all the services.
Delete namenode and datanode directories as specified in hdfs-site.xml.
Create new namenode and datanode directories and modify hdfs-site.xml accordingly.
In core-site.xml, make the following changes or add the following properties:
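<property>
<name>fs.defaultFS</name>
<value>hdfs://172.20.12.168/</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://172.20.12.168:8020</value>
</property>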
Make the following change in the hadoop-2.6.4/etc/hadoop/hadoop-env.sh file:
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_91.jdk/Contents/Home
Restart dfs, yarn and mr as follows:
start-dfs.sh
start-yarn.sh
mr-jobhistory-daemon.sh start historyserver
This command worked for me:
hadoop namenode -format
I just downloaded the hadoop-0.20 tar and extracted it, and I set JAVA_HOME and HADOOP_HOME. I modified core-site.xml, hdfs-site.xml and mapred-site.xml.
I started the services.
jps
jps
JobTracker
TaskTracker
I checked the logs. They say:
2015-02-11 18:07:52,278 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = scspn0022420004.lab.eng.btc.netapp.in/10.72.40.68
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.0
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20 -r 763504; compiled by 'ndaley' on Thu Apr 9 05:18:40 UTC 2009
************************************************************/
2015-02-11 18:07:52,341 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.NullPointerException
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:134)
at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:156)
at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:160)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:175)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:955)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:964)
2015-02-11 18:07:52,346 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at scspn0022420004.lab.eng.btc.netapp.in/10.72.40.68
************************************************************/
What am I doing wrong?
My conf files are as follows:
core-site
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:8020</value>
</property>
</configuration>
hdfs-site
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<!-- Immediately exit safemode as soon as one DataNode checks in.
On a multi-node cluster, these configurations must be removed. -->
<property>
<name>dfs.safemode.extension</name>
<value>0</value>
</property>
<property>
<name>dfs.safemode.min.datanodes</name>
<value>1</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/var/lib/hadoop-hdfs/cache/${user.name}</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///var/lib/hadoop-hdfs/cache/${user.name}/dfs/name</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>file:///var/lib/hadoop-hdfs/cache/${user.name}/dfs/namesecondary</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///var/lib/hadoop-hdfs/cache/${user.name}/dfs/data</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:8021</value>
</property>
</configuration>
Any ideas?
This is what I see in the console when running start-dfs.sh:
localhost: starting secondarynamenode, logging to /root/hadoop/hadoop-0.20.0/bin/../logs/hadoop-root-secondarynamenode- hostname.out
localhost: Exception in thread "main" java.lang.NullPointerException
localhost: at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:134)
localhost: at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:156)
localhost: at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:160)
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:131)
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:115)
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:469)
I guess that you didn't set up your Hadoop cluster correctly. Please follow these steps:
Step 1: begin by setting up .bashrc:
vi $HOME/.bashrc
Put the following lines at the end of the file (change the Hadoop home to yours):
# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/hadoop
# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
export JAVA_HOME=/usr/lib/jvm/java-6-sun
# Some convenient aliases and functions for running Hadoop-related commands
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"
# If you have LZO compression enabled in your Hadoop cluster and
# compress job outputs with LZOP (not covered in this tutorial):
# Conveniently inspect an LZOP compressed file from the command
# line; run via:
#
# $ lzohead /hdfs/path/to/lzop/compressed/file.lzo
#
# Requires installed 'lzop' command.
#
lzohead () {
hadoop fs -cat "$1" | lzop -dc | head -1000 | less
}
# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
Step 2: edit hadoop-env.sh as follows:
# The java implementation to use. Required.
export JAVA_HOME=/usr/lib/jvm/java-6-sun
Step 3: now create a directory and set the required ownerships and permissions:
$ sudo mkdir -p /app/hadoop/tmp
$ sudo chown hduser:hadoop /app/hadoop/tmp
# ...and if you want to tighten up security, chmod from 755 to 750...
$ sudo chmod 750 /app/hadoop/tmp
Step 4: edit core-site.xml:
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
</property>
Step 5: edit mapred-site.xml:
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
</property>
Step 6: edit hdfs-site.xml:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
Finally, format your HDFS (you need to do this the first time you set up a Hadoop cluster):
$ /usr/local/hadoop/bin/hadoop namenode -format
I hope this helps.
I don't use version 0.20.0, but are you sure the key in core-site.xml is fs.defaultFS?
In that version's core-default.xml it seems to be named fs.default.name.
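A quick way to see which of the two keys a core-site.xml actually sets is to parse it and look for both names. Below is a small Python sketch. The function name default_fs_settings is made up for this example, and the comments on which versions read which key reflect my understanding (fs.default.name in 0.20.x and 1.x, fs.defaultFS from 2.x on), so check them against your release.

```python
import xml.etree.ElementTree as ET

OLD_KEY = "fs.default.name"  # assumption: the key read by 0.20.x and 1.x
NEW_KEY = "fs.defaultFS"     # assumption: the key used from 2.x onwards


def default_fs_settings(core_site_xml):
    """Return {key: value} for whichever default-filesystem keys are set."""
    root = ET.fromstring(core_site_xml)
    found = {}
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        if name in (OLD_KEY, NEW_KEY):
            found[name] = (prop.findtext("value") or "").strip()
    return found


# The core-site.xml posted in the question.
QUESTION_CORE_SITE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    settings = default_fs_settings(QUESTION_CORE_SITE)
    print("keys set:", sorted(settings))
    if OLD_KEY not in settings:
        print(OLD_KEY, "is not set; an old release may see no NameNode address")
```

For the file in the question this finds only fs.defaultFS, which would fit with the NullPointerException in NetUtils.createSocketAddr if 0.20.0 looks up fs.default.name and gets nothing back.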