Hadoop 2.2: Add a new DataNode to an existing Hadoop installation

I first installed Hadoop 2.2 on my machine (called Abhishek-PC) and everything worked fine: I am able to run the entire system (both NameNode and DataNode) successfully.
Now I have created a VM called hdclient1 and I want to add it as a DataNode.
Here are the steps I have followed.
I set up SSH successfully: I can ssh into hdclient1 without a password, and I can log in from hdclient1 to my main machine without a password.
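For reference, a passwordless SSH setup between the two machines typically looks something like this (a sketch; the user and host names are taken from the question, and the key type is an assumption):

# On Abhishek-PC: generate a key pair if one does not already exist
ssh-keygen -t rsa -P ""
# Copy the public key to the VM so start-dfs.sh can launch the remote DataNode
ssh-copy-id abhishek@hdclient1
# Verify that no password prompt appears
ssh abhishek@hdclient1 hostname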
I set up Hadoop 2.2 on this VM and modified the configuration files as per various tutorials on the web. Here are my configuration files:
Name Node configuration
https://drive.google.com/file/d/0B0dV2NMSGYPXdEM1WmRqVG5uYlU/edit?usp=sharing
Data Node configuration
https://drive.google.com/file/d/0B0dV2NMSGYPXRnh3YUo1X2Frams/edit?usp=sharing
Now when I run start-dfs.sh on my first machine, I can see that a DataNode starts successfully on hdclient1. Here is a screenshot from my Hadoop console:
https://drive.google.com/file/d/0B0dV2NMSGYPXOEJ3UV9SV1d5bjQ/edit?usp=sharing
As you can see, both machines appear in my cluster (main machine and data node), although both are called "localhost" for some strange reason.
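That "localhost" naming is usually a configuration symptom: if fs.defaultFS in core-site.xml or the slaves file still points at localhost, a DataNode registers against the wrong address. A quick check on both machines (a sketch; the paths assume a standard tarball install under $HADOOP_HOME, and the 9000 port is only an example):

grep -A1 fs.defaultFS $HADOOP_HOME/etc/hadoop/core-site.xml   # should name the namenode host, e.g. hdfs://Abhishek-PC:9000, on BOTH machines, not localhost
cat $HADOOP_HOME/etc/hadoop/slaves                            # on the namenode, should list hdclient1 (and Abhishek-PC if it also runs a DataNode)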
I can see that logs are being created on hdclient1, and in those logs there are no exceptions.
Here are the logs from the NameNode:
https://drive.google.com/file/d/0B0dV2NMSGYPXM0dZTWVRUWlGaDg/edit?usp=sharing
Here are the logs from the DataNode:
https://drive.google.com/file/d/0B0dV2NMSGYPXNV9wVmZEcUtKVXc/edit?usp=sharing
I can log in to the NameNode UI successfully at http://Abhishek-PC:50070, but there, under Live Nodes, the UI says only 1 live node and there is no mention of hdclient1.
https://drive.google.com/file/d/0B0dV2NMSGYPXZmMwM09YQlI4RzQ/edit?usp=sharing
I can create a directory in HDFS successfully with hadoop fs -mkdir /small.
From the datanode I can see that this directory has been created, using hadoop fs -ls /.
Now when I try to add a file to HDFS with
hadoop fs -copyFromLocal ~/Downloads/book/war_and_peace.txt /small
I get an error message:
abhishek@Abhishek-PC:~$ hadoop fs -copyFromLocal ~/Downloads/book/war_and_peace.txt /small
14/01/04 20:07:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/01/04 20:07:41 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /small/war_and_peace.txt.COPYING could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
So my question is: what am I doing wrong here? Why do I get this exception when I try to copy the file into HDFS?

We have a 3-node cluster (all physical boxes) that's been working great for a couple of months. This article helped me the most with the setup.
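Before changing anything, it is worth confirming how many DataNodes the NameNode has actually registered, since the replication error above means it sees only one usable node. A quick check with standard Hadoop 2.x commands (run on the NameNode):

hdfs dfsadmin -report          # lists every registered datanode with its hostname and capacity
hdfs dfsadmin -printTopology   # shows which host each datanode registered under

If hdclient1 does not show up in that report, the DataNode process is running but never registered with the NameNode, which points back at the fs.defaultFS / hostname configuration rather than at the copy command itself.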

Related

Hadoop HDFS start-up fails and requires formatting

I have a multi-node standalone Hadoop cluster for HDFS. I am able to load data into HDFS; however, every time I reboot my computer and start the cluster with start-dfs.sh, I don't see the dashboard until I perform hdfs namenode -format, which erases all my data.
How do I start the Hadoop cluster without having to go through hdfs namenode -format?
You need to shut down HDFS and the namenode cleanly (stop-dfs.sh) before you shut down your computer. Otherwise you can corrupt the namenode, forcing you to format to get back to a clean state.
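In practice that means stopping the daemons in this order before a reboot (a sketch using the standard Hadoop 2.x scripts):

stop-yarn.sh    # stop the ResourceManager / NodeManagers first, if YARN is running
stop-dfs.sh     # then stop the NameNode, SecondaryNameNode and DataNodes cleanly
# ... reboot ...
start-dfs.sh    # after the reboot, start HDFS again -- no namenode -format needed
start-yarn.sh

If a format is still required even after a clean stop, it is worth checking whether dfs.namenode.name.dir (or hadoop.tmp.dir) points somewhere under /tmp, which many systems clear on reboot.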

HDFS with Talend

I am trying to put a CSV file from my local Windows machine into HDFS using Talend 6.3.
I have a 4-node cluster (all Linux servers on Azure):
1 NameNode and 3 DataNodes.
I am getting the following error while running:
"Exception in component tHDFSPut_1
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/root/5/source.csv could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and 3 node(s) are excluded in this operation."
The file is getting created in HDFS, but it is an empty file.
Note: I am using my NameNode's IP address in the NameNode URL.
I tried the same with a sandbox and got the same error.
Update:
I created a completely new cluster (1 NameNode and 3 DataNodes) and I am still getting the same error.
Everything is running and shows green in Ambari, and I am able to put a file through hadoop fs -put. Might it be a problem in Talend?
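"3 datanode(s) running and 3 node(s) are excluded" combined with an empty file usually means the client (Talend on the Windows machine) can reach the NameNode but not the DataNodes' data-transfer port, which is common when the cluster sits on Azure and the NameNode hands back addresses the client cannot route to. A rough check from the Windows machine (a sketch; 8020 and 50010 are common defaults for an Ambari/HDP cluster, and <namenode-ip>/<datanode-ip> are placeholders):

telnet <namenode-ip> 8020     # NameNode RPC: this evidently works, since the empty file gets created
telnet <datanode-ip> 50010    # DataNode data transfer: if this fails, every put will exclude the node and leave a 0-byte file

If the DataNodes are only reachable by hostname, setting dfs.client.use.datanode.hostname=true on the client side is a commonly suggested workaround.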

Can't put files into HDFS

I'm trying to set up a hadoop multi-node cluster and I'm getting the following problem.
I have one node as master and another one as slave.
It seems that everything is all right, because when I execute jps I get these processes on the master:
29983 SecondaryNameNode
30596 Jps
29671 NameNode
30142 ResourceManager
And these on the slave:
18096 NodeManager
17847 DataNode
18197 Jps
Unfortunately, when I try the -put command, I get this error:
hduser@master:/usr/local/hadoop/bin$ ./hdfs dfs -put /home/hduser/Ejemplos/fichero /Ejemplos/
14/03/24 12:49:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/03/24 12:49:07 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /Ejemplos/fichero.COPYING could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
When I go to the WebUI, there are 0 live nodes and I don't know why!
I can't fix this error and I would appreciate some help!
File /Ejemplos/fichero.COPYING could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
The above error means the datanodes are either down or unable to communicate properly with the namenode. You can check the configuration you specified in hdfs-site.xml and core-site.xml.
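A concrete way to check that communication, assuming the default log location under $HADOOP_HOME/logs (adjust the path for your install):

tail -n 100 $HADOOP_HOME/logs/hadoop-*-datanode-*.log   # look for repeated "Retrying connect to server" lines or a clusterID mismatch
hdfs dfsadmin -report                                   # "Datanodes available: 0" here confirms the registration problem, matching the 0 live nodes in the web UI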
I had a similar issue and solved it like this (the same steps are condensed into commands below):
stop hadoop (stop-dfs.sh and stop-yarn.sh)
manually delete dfs/namenode and dfs/datanode directories
format namenode (hdfs namenode -format)
start hadoop (start-dfs.sh and start-yarn.sh)
There might be other issues, like lack of disk space or slaves not being configured in $HADOOP_HOME/etc/hadoop/slaves (that file contains only localhost by default).
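Condensed into commands (a sketch; note that this erases all HDFS data, and the namenode/datanode directory paths are placeholders for whatever dfs.namenode.name.dir and dfs.datanode.data.dir point to in your hdfs-site.xml):

stop-dfs.sh && stop-yarn.sh
rm -rf /path/to/dfs/namenode/* /path/to/dfs/datanode/*   # run on every node; replace with your configured directories
hdfs namenode -format
start-dfs.sh && start-yarn.sh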
You will want to check the log files of your data node (slave) for errors in your setup. If you run Cloudera CDH, you'll find these in /var/log/hadoop-hdfs, otherwise in the directory specified in your config.
The error "could only be replicated to 0 nodes" points to a problem there.
Also make sure that slave and master can connect via ssh with key authentication.
Just a quick question: did you format your namenode?
You are using:
./hdfs dfs -put....
Try
hadoop fs -put LocalFile.name /username/
or
hadoop fs -copyFromLocal LocalFile.name /username/

Cloudera CDH4 - how come I can't browse the hdfs filesystem from the nodes?

I installed my test cluster using Cloudera Manager free.
I can only browse the filesystem from the main NameNode; on the other nodes, running hadoop dfs -ls only shows the local folder.
jps shows Jps, TaskTracker, and DataNode on the nodes.
MapReduce tasks/jobs run fine on all the nodes as a cluster.
With my custom-setup Hadoop cluster (without Cloudera), I can easily browse and manipulate the HDFS filesystem (e.g. I can run hadoop dfs -mkdir test1 on all the nodes, but in CDH4 only on the NameNode).
Why is this?
Try using the command ./bin/hadoop fs -ls / for HDFS browsing.
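If hadoop fs -ls / on a worker node still shows the local filesystem, the client configuration on that node is probably not pointing at HDFS. A quick check (a sketch; CDH4 normally keeps client configs under /etc/hadoop/conf, and the 8020 port is the usual CDH default):

grep -A1 fs.default /etc/hadoop/conf/core-site.xml   # should name the namenode, e.g. hdfs://namenode-host:8020, not file:///
hadoop fs -ls hdfs://namenode-host:8020/             # bypasses the local default and talks to HDFS directly (namenode-host is a placeholder)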

How do you establish a single-node Hadoop instance on AWS using Apache Whirr?

I am attempting to run a single-node instance of Hadoop on Amazon Web Services using Apache Whirr. I set whirr.instance-templates equal to 1 jt+nn+dn+tt. The instance starts up fine. I am able to create directories, but when I try to put files, I get a "File could only be replicated to 0 nodes, instead of 1" error. When I do a hadoop fsck / I get an Exception in thread "main" java.net.ConnectException: Connection refused error. Does anyone know what is wrong with my configuration?
In my experience, Whirr does not always start all services reliably. It sounds like the namenode started (the namenode is responsible for storing directory information) but the datanode did not (the datanode stores the actual data).
Try running
hadoop dfsadmin -report
to see if a datanode is available.
If not, it often helps to restart the cluster.
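With Whirr, restarting usually means destroying and relaunching the cluster from the same recipe (a sketch; hadoop.properties stands in for whichever recipe file was used for the original launch):

whirr destroy-cluster --config hadoop.properties
whirr launch-cluster --config hadoop.properties
hadoop dfsadmin -report    # check again once the new instance reports in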
