Hadoop: no node available for block blk_-5883966349607013512_1099

I am very new to Hadoop. I start Hadoop with the following command...
[gpadmin@BigData1-ahandler root]$ /usr/local/hadoop-0.20.1/bin/start-all.sh
starting namenode, logging to /usr/local/hadoop-0.20.1/logs/hadoop-gpadmin-namenode-BigData1-ahandler.out
localhost: starting datanode, logging to /usr/local/hadoop-0.20.1/logs/hadoop-gpadmin-datanode-BigData1-ahandler.out
localhost: starting secondarynamenode, logging to /usr/local/hadoop-0.20.1/logs/hadoop-gpadmin-secondarynamenode-BigData1-ahandler.out
starting jobtracker, logging to /usr/local/hadoop-0.20.1/logs/hadoop-gpadmin-jobtracker-BigData1-ahandler.out
localhost: starting tasktracker, logging to /usr/local/hadoop-0.20.1/logs/hadoop-gpadmin-tasktracker-BigData1-ahandler.out
When I try to -cat the output from the following directory, I get an error: "no node available". What does this error mean? How can I fix it? Or start debugging it?
[gpadmin@BigData1-ahandler root]$ hadoop fs -cat output/d*/part-*
13/11/13 15:33:09 INFO hdfs.DFSClient: No node available for block: blk_-5883966349607013512_1099 file=/user/gpadmin/output/d15795/part-00000
13/11/13 15:33:09 INFO hdfs.DFSClient: Could not obtain block blk_-5883966349607013512_1099 from any node: java.io.IOException: No live nodes contain current block

This happens when you start the datanodes before the namenode.
When the datanodes start before the namenode, the datanode services try to check in with the namenode and fail with "namenode not found". Then, once the namenode starts, it has no datanodes checked in, so it cannot find a node that holds the block being accessed.
You should go through the script start-all.sh and make sure that the namenode starts before the datanodes.
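If you want to bring the daemons up by hand in the right order, here is a minimal sketch for this Hadoop 0.20.x layout (assuming the same /usr/local/hadoop-0.20.1 install; adjust the path to yours):
/usr/local/hadoop-0.20.1/bin/stop-all.sh
/usr/local/hadoop-0.20.1/bin/hadoop-daemon.sh start namenode         # namenode first
/usr/local/hadoop-0.20.1/bin/hadoop-daemons.sh start datanode        # then the datanodes
/usr/local/hadoop-0.20.1/bin/hadoop-daemon.sh start secondarynamenode
/usr/local/hadoop-0.20.1/bin/start-mapred.sh                         # jobtracker + tasktrackers
Once the datanodes have checked in with the namenode, hadoop fs -cat should be able to locate the block again.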

Related

Failed to retrieve data from /webhdfs/v1/?op=LISTSTATUS: Server Error

vijay@ubuntu:~$ start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as vijay in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [localhost]
localhost: namenode is running as process 22733. Stop it first and ensure /tmp/hadoop-vijay-namenode.pid file is empty before retry.
Starting datanodes
localhost: datanode is running as process 22866. Stop it first and ensure /tmp/hadoop-vijay-datanode.pid file is empty before retry.
Starting secondary namenodes [ubuntu]
ubuntu: secondarynamenode is running as process 23072. Stop it first and ensure /tmp/hadoop-vijay-secondarynamenode.pid file is empty before retry.
Starting resourcemanager
Starting nodemanagers
vijay@ubuntu:~$ jps
23072 SecondaryNameNode
22866 DataNode
22733 NameNode
24447 Jps
I am facing a Hadoop web console error.
Currently installed: Java version "19.0.1" (2022-10-18) and Hadoop 3.3.4.

Error in loading hadoop distributed file system

I installed Hadoop 3.3.4 on Ubuntu 20. I ran the command for starting Hadoop, i.e.
samar@pc:~$ $HADOOP_HOME/sbin/start-all.sh
Then it showed the following output:
WARNING: Attempting to start all Apache Hadoop daemons as samar in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [pc]
Starting resourcemanager
Starting nodemanagers
But when I tried to access the HDFS with the command
samar@pc:~$ hdfs dfs -ls
It gave this message:
ls: Call From pc/127.0.1.1 to localhost:9000 failed on connection exception:
java.net.ConnectException: Connection refused; For more details see:
http://wiki.apache.org/hadoop/ConnectionRefused
and the output of jps was:
10485 Jps
10101 NodeManager
9946 ResourceManager
9739 SecondaryNameNode
9533 DataNode
The namenode did not start successfully (port 9000 is the namenode's service port).
Are there more logs?
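To dig further, here is a minimal sketch assuming a default Hadoop 3.x layout (the log file name includes your user and hostname, so the wildcards below are assumptions):
tail -n 100 $HADOOP_HOME/logs/hadoop-*-namenode-*.log          # shows why the namenode exited
grep -A1 fs.defaultFS $HADOOP_HOME/etc/hadoop/core-site.xml    # should point at hdfs://localhost:9000
$HADOOP_HOME/bin/hdfs namenode -format                         # only on a fresh install: an unformatted namenode refuses to start
After fixing the cause, rerun start-dfs.sh and confirm that NameNode appears in jps.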

How to install Hadoop on M1 Mac

I followed several tutorials, and every time I start Hadoop I get this:
feiyechen@FEIYEdeMac-mini ~ % start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as feiyechen in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [localhost]
Starting datanodes
localhost: datanode is running as process 55832. Stop it first and ensure /tmp/hadoop-feiyechen-datanode.pid file is empty before retry.
Starting secondary namenodes [FEIYEdeMac-mini.local]
FEIYEdeMac-mini.local: secondarynamenode is running as process 55966. Stop it first and ensure /tmp/hadoop-feiyechen-secondarynamenode.pid file is empty before retry.
2022-01-28 20:35:24,311 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting resourcemanager
Starting nodemanagers
feiyechen@FEIYEdeMac-mini ~ % jps
55832 DataNode
57838 Jps
55966 SecondaryNameNode
57247 NameNode
The tutorial said I should get these after running jps.
I only have 4 items: DataNode, Jps, SecondaryNameNode, NameNode. Does that mean I failed?
It means you have a running HDFS installation, but not YARN.
You should be able to run start-yarn.sh separately if you want the ResourceManager + NodeManager.
Otherwise, there are log files created for both the YARN processes that would include information about why they are failing.
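A minimal sketch of that, assuming the standard Hadoop 3.x scripts and log layout:
start-yarn.sh
jps                                                    # ResourceManager and NodeManager should now be listed
tail -n 100 $HADOOP_HOME/logs/*resourcemanager*.log    # if they are not, the YARN logs say why
tail -n 100 $HADOOP_HOME/logs/*nodemanager*.log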

Running Hadoop in fully-distributed mode on a 5-machine cluster takes more time than on a single machine

I am running Hadoop on a cluster of 5 machines (1 master and 4 slaves). I am running a map-reduce algorithm for friends-in-common recommendation, and I am using a file with 49995 lines (49995 people, each followed by their friends).
The problem is that it takes more time to execute the algorithm on the cluster than on one machine!
I don't know whether this is normal (the file is not big enough, so the extra time is just latency between machines) or whether I must change something to run the algorithm in parallel on the different nodes, but I think that is done automatically.
Typically, running the algorithm on one machine takes this:
real 3m10.044s
user 2m53.766s
sys 0m4.531s
While on the cluster it takes this time:
real 3m32.727s
user 3m10.229s
sys 0m5.545s
Here is the output when I execute the start-all.sh script on the master:
ubuntu@ip:/usr/local/hadoop-2.6.0$ sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
master: starting namenode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-namenode-ip-172-31-37-184.out
slave1: starting datanode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-datanode-slave1.out
slave2: starting datanode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-datanode-slave2.out
slave3: starting datanode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-datanode-slave3.out
slave4: starting datanode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-datanode-slave4.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-secondarynamenode-ip-172-31-37-184.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop-2.6.0/logs/yarn-ubuntu-resourcemanager-ip-172-31-37-184.out
slave4: starting nodemanager, logging to /usr/local/hadoop-2.6.0/logs/yarn-ubuntu-nodemanager-slave4.out
slave1: starting nodemanager, logging to /usr/local/hadoop-2.6.0/logs/yarn-ubuntu-nodemanager-slave1.out
slave3: starting nodemanager, logging to /usr/local/hadoop-2.6.0/logs/yarn-ubuntu-nodemanager-slave3.out
slave2: starting nodemanager, logging to /usr/local/hadoop-2.6.0/logs/yarn-ubuntu-nodemanager-slave2.out
And here is the output when I execute the stop-all.sh script:
ubuntu@ip:/usr/local/hadoop-2.6.0$ sbin/stop-all.sh
This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
Stopping namenodes on [master]
master: stopping namenode
slave4: no datanode to stop
slave3: stopping datanode
slave1: stopping datanode
slave2: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
stopping yarn daemons
stopping resourcemanager
slave2: no nodemanager to stop
slave3: no nodemanager to stop
slave4: no nodemanager to stop
slave1: no nodemanager to stop
no proxyserver to stop
Thank you in advance !
One possible reason is that your file is not uploaded to HDFS. In other words, it is stored on a single machine, and all the other running machines have to get their data from that machine.
Before you run your MapReduce program, you can do the following steps:
1- Make sure that the HDFS is up and running. Open the link:
master:50070
where master is the IP of the node running the namenode, and check on that link that all the nodes are live and running. So if you have 4 datanodes, you should see: datanodes (4 live).
2- Call:
hdfs dfs -put yourfile /someFolderOnHDFS/yourfile
That way you have uploaded your input file to the HDFS and the data is now distributed among multiple nodes.
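To confirm the blocks really are spread across the datanodes, here is a quick check (a sketch reusing the hypothetical /someFolderOnHDFS path from above):
hdfs dfsadmin -report                                             # lists every live datanode and its usage
hdfs fsck /someFolderOnHDFS/yourfile -files -blocks -locations    # shows which nodes hold each block of the input file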
Try running your program now and see if it is faster.
Best of luck!

Datanode does not start correctly

I am trying to install Hadoop 2.2.0 in pseudo-distributed mode. While I am trying to start the datanode service, it shows the following error. Can anyone please tell me how to resolve this?
2014-03-11 08:48:15,916 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool <registering> (storage id unknown) service to localhost/127.0.0.1:9000 starting to offer service
2014-03-11 08:48:15,922 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2014-03-11 08:48:15,922 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2014-03-11 08:48:16,406 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/in_use.lock acquired by nodename 3627#prassanna-Studio-1558
2014-03-11 08:48:16,426 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-611836968-127.0.1.1-1394507838610 (storage id DS-1960076343-127.0.1.1-50010-1394127604582) service to localhost/127.0.0.1:9000
java.io.IOException: Incompatible clusterIDs in /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode: namenode clusterID = CID-fb61aa70-4b15-470e-a1d0-12653e357a10; datanode clusterID = CID-8bf63244-0510-4db6-a949-8f74b50f2be9
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:391)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:191)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:219)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:837)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:808)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:280)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:222)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:664)
at java.lang.Thread.run(Thread.java:662)
2014-03-11 08:48:16,427 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-611836968-127.0.1.1-1394507838610 (storage id DS-1960076343-127.0.1.1-50010-1394127604582) service to localhost/127.0.0.1:9000
2014-03-11 08:48:16,532 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-611836968-127.0.1.1-1394507838610 (storage id DS-1960076343-127.0.1.1-50010-1394127604582)
2014-03-11 08:48:18,532 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2014-03-11 08:48:18,534 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2014-03-11 08:48:18,536 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
You can use the following method:
Copy the datanode clusterID (in your example, CID-8bf63244-0510-4db6-a949-8f74b50f2be9)
and run the following command from the HADOOP_HOME/bin directory:
./hdfs namenode -format -clusterId CID-8bf63244-0510-4db6-a949-8f74b50f2be9
This formats the namenode with the datanode's cluster ID.
You must do as follows:
bin/stop-all.sh
rm -Rf /home/prassanna/usr/local/hadoop/yarn_data/hdfs/*
bin/hadoop namenode -format
I had the same problem until I found an answer on this website.
Whenever you get the error below while trying to start a DN on a slave machine:
java.io.IOException: Incompatible clusterIDs in /home/hadoop/dfs/data: namenode clusterID= ****; datanode clusterID = ****
it is because, after you set up your cluster, you decided for whatever reason to reformat your NN. Your DNs on the slaves still bear a reference to the old NN.
To resolve this, simply delete and recreate the data folder on that machine in the local Linux FS, namely /home/hadoop/dfs/data.
Restarting that DN's daemon on that machine will recreate the data/ folder's contents and resolve the problem.
Do the following simple steps (the commands are sketched below):
Clear the data directory of Hadoop
Format the namenode again
Start the cluster
After this your cluster will start normally, provided you don't have any other configuration issue.
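As a sketch of those three steps in a pseudo-distributed setup (the storage path below is the one from the question; use whatever your own hdfs-site.xml points to):
sbin/stop-dfs.sh
rm -rf /home/prassanna/usr/local/hadoop/yarn_data/hdfs/*   # clear namenode and datanode storage
bin/hdfs namenode -format                                  # note: this erases existing HDFS data
sbin/start-dfs.sh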
The DataNode dies because its clusterID is incompatible with the NameNode's. To fix this problem you need to delete the directory /tmp/hadoop-[user]/hdfs/data and restart Hadoop.
rm -r /tmp/hadoop-[user]/hdfs/data
I got a similar issue in my pseudo-distributed environment. I stopped the cluster first, then I copied the cluster ID from the NameNode's VERSION file and put it in the DataNode's VERSION file; after restarting the cluster, it was all fine.
My data paths are /usr/local/hadoop/hadoop_store/hdfs/datanode and /usr/local/hadoop/hadoop_store/hdfs/namenode.
FYI: the VERSION file is under /usr/local/hadoop/hadoop_store/hdfs/datanode/current/; likewise for the NameNode.
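To confirm the mismatch before editing anything, a quick check against those same paths (a sketch):
grep clusterID /usr/local/hadoop/hadoop_store/hdfs/namenode/current/VERSION /usr/local/hadoop/hadoop_store/hdfs/datanode/current/VERSION
The two lines must show the same clusterID after the copy.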
Here, the datanode stops immediately because the clusterIDs of the datanode and namenode are different, so you have to format the namenode with the datanode's clusterID.
Copy the datanode clusterID (in your example, CID-8bf63244-0510-4db6-a949-8f74b50f2be9) and run the following command from your home directory. You can go to your home dir by just typing cd in your terminal.
From your home dir, now type the command:
hdfs namenode -format -clusterId CID-8bf63244-0510-4db6-a949-8f74b50f2be9
Delete the namenode and datanode directories as specified in core-site.xml. After that, create the new directories and restart DFS and YARN.
I also had a similar issue.
I deleted the namenode and datanode folders from all the nodes, and reran:
$HADOOP_HOME/bin> hdfs namenode -format -force
$HADOOP_HOME/sbin> ./start-dfs.sh
$HADOOP_HOME/sbin> ./start-yarn.sh
To check the health report from the command line (which I would recommend):
$HADOOP_HOME/bin> hdfs dfsadmin -report
and I got all the nodes working correctly.
I had the same issue with Hadoop 2.7.7.
I removed the namenode/current and datanode/current directories on the namenode and all the datanodes,
removed the files at /tmp/hadoop-ubuntu/*,
then formatted the namenode and datanode,
and restarted all the nodes.
Things now work fine.
Steps:
Stop all nodes/managers, then attempt the steps below:
rm -rf /tmp/hadoop-ubuntu/* (all nodes)
rm -r /usr/local/hadoop/data/hdfs/namenode/current (namenode: check hdfs-site.xml for path)
rm -r /usr/local/hadoop/data/hdfs/datanode/current (datanode:check hdfs-site.xml for path)
hdfs namenode -format (on namenode)
hdfs datanode -format (on namenode)
Restart the namenode and datanodes
There have been different solutions to this problem, but I tested another easy solution and it worked like a charm:
If someone gets the same error, you just need to replace the clusterID in the datanodes with the clusterID of the namenode in the VERSION file.
In your case, here's where you can change it on the datanode side:
namenode clusterID = CID-fb61aa70-4b15-470e-a1d0-12653e357a10; datanode clusterID = CID-8bf63244-0510-4db6-a949-8f74b50f2be9
Back up the current VERSION: cp /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/current/VERSION /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/current/VERSION.BK
vim /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/current/VERSION and change
clusterID=CID-8bf63244-0510-4db6-a949-8f74b50f2be9
to
clusterID=CID-fb61aa70-4b15-470e-a1d0-12653e357a10
Restart the datanode and it should work.
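If you prefer not to open an editor, the same change can be made with sed (a sketch against the path used above; GNU sed edits the file in place):
sed -i 's/^clusterID=.*/clusterID=CID-fb61aa70-4b15-470e-a1d0-12653e357a10/' /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/current/VERSION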
