Backup hdfs directory from full-distributed to a local directory? - hadoop

I'm trying to back up a directory from HDFS to a local directory. I have a Hadoop/HBase cluster running on EC2. I managed to do what I want in pseudo-distributed mode on my local machine, but now that I'm fully distributed the same steps are failing. Here is what worked for pseudo-distributed:
hadoop distcp hdfs://localhost:8020/hbase file:///Users/robocode/Desktop/
Here is what I'm trying on the hadoop namenode (hbase master) on ec2
ec2-user@ip-10-35-53-16:~$ hadoop distcp hdfs://10.35.53.16:8020/hbase file:///~/hbase
The errors I'm getting are below
13/04/19 09:07:40 INFO tools.DistCp: srcPaths=[hdfs://10.35.53.16:8020/hbase]
13/04/19 09:07:40 INFO tools.DistCp: destPath=file:/~/hbase
13/04/19 09:07:41 INFO tools.DistCp: file:/~/hbase does not exist.
With failures, global counters are inaccurate; consider running with -i
Copy failed: java.io.IOException: Failed to createfile:/~/hbase
at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1171)
at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)

The ~ character is expanded to your home directory by the shell, not by Java, and it is not expanded inside a file:/// URI, so Hadoop looks for a directory literally named ~. Change to a fully qualified path, e.g.:
file:///home/user1/hbase
But I think you're going to run into problems in a fully distributed environment: the distcp command runs a MapReduce job, so the destination path will be interpreted as local to each cluster node.
If you want to pull data down from HDFS to a local directory, you'll need to use the -get or -copyToLocal options of the hadoop fs command.
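A minimal sketch of the -get route, assuming it is run on the machine where the files should land. The shell expands $HOME before hadoop sees the path, so no literal ~ reaches Java. The local_backup_dir helper and the hbase-backup directory name are made up for illustration; the HDFS URI is the one from the question.

```shell
#!/bin/sh
# Build an absolute destination path. "$HOME" is expanded by the shell,
# so Hadoop never receives a literal "~".
# local_backup_dir and the default name "hbase-backup" are hypothetical.
local_backup_dir() {
    printf '%s/%s\n' "$HOME" "${1:-hbase-backup}"
}

dest=$(local_backup_dir)

# hadoop fs -get runs in the client process, so the files are written on
# the machine where this command is executed.
if command -v hadoop >/dev/null 2>&1; then
    hadoop fs -get hdfs://10.35.53.16:8020/hbase "$dest"
else
    echo "hadoop not on PATH; would run: hadoop fs -get hdfs://10.35.53.16:8020/hbase $dest"
fi
```

The guard around the hadoop call is only there so the snippet can be tried on a machine without Hadoop installed.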

Related

Hadoop errorcode -1000, No space available in any of the local directories

I'm using Windows 7 with Hadoop 2.10.1 installed as shown here: https://exitcondition.com/install-hadoop-windows/ and I get an error when running my job:
INFO mapreduce.Job:
Job job_1605374051781_0001 failed with state FAILED due to:
Application application_1605374051781_0001 failed 2 times
due to AM Container for appattempt_1605374051781_0001_000002 exited with
exitCode: -1000 Failing this attempt.Diagnostics:
[2020-11-14 18:17:54.217]No space available in any of the local directories.
The expected output is several lines of text and my disks are nowhere near full (at least 10GB free). The code is some generic mapreduce job that I cannot post here because it's the intellectual property of the university.
Any tips on how to solve the "No space available" error?
For clarification I'm using only my PC, I'm not connected to other machines.
PS: I've solved it. As user "banu reddy" (https://stackoverflow.com/users/4249076/banu-reddy) said in "Hadoop map reduce example stuck on Running job", the free HDD space needs to be at least 10% of the disk.
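The 10% figure lines up with the NodeManager disk health checker: as far as I know, yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage defaults to 90, and a local directory on a fuller disk is treated as unusable. A rough sketch for checking how full a disk is from a POSIX shell (on Windows this would need something like Git Bash or Cygwin). The used_pct helper and the threshold variable are assumptions for illustration.

```shell
#!/bin/sh
# used_pct PATH: print the used-space percentage (integer, no % sign)
# of the filesystem that holds PATH. Hypothetical helper.
used_pct() {
    # -P forces the portable one-line-per-filesystem format;
    # column 5 is the capacity column, e.g. "42%".
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

threshold=90   # assumed default of the YARN disk health checker
pct=$(used_pct /)

if [ "$pct" -gt "$threshold" ]; then
    echo "disk is ${pct}% full: above ${threshold}%, YARN may reject it"
else
    echo "disk is ${pct}% full: within the ${threshold}% limit"
fi
```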
Hadoop jobs read their input from and write their output to the framework's distributed filesystem (HDFS), which is managed separately from the local filesystem (even when everything runs on just one machine, as you clarified).
That suggests the error may refer to the disk space available in the HDFS rather than on your hard drives in general. To check whether the HDFS has enough disk space to run the job, you can execute the following command in the terminal:
hdfs dfs -df -h
Its output reports the total size, the used space, and the available space of the HDFS.
If the output on your system indicates that the available disk space is low or non-existent, you can delete individual directories from the HDFS, first checking what directories and files are stored:
hadoop fs -ls
And then deleting each directory from the HDFS:
hadoop fs -rm -r name_of_the_folder
Or file from the HDFS:
hadoop fs -rm name_of_the_file
Alternatively, you can empty everything stored in the HDFS to be sure that you will not hit the disk space limit again any time soon. To do that, first stop the YARN and HDFS daemons:
stop-all.sh
Then start only the HDFS daemons:
start-dfs.sh
Then format the namenode (this wipes the HDFS on your system, not your local files, of course):
hadoop namenode -format
And finally start the YARN and HDFS daemons:
start-all.sh
Remember to re-run the hdfs dfs -df -h command after deleting data, to confirm that the HDFS now has free space.

Cannot start running on browser the namenode for Hadoop

It is my first time installing Hadoop on Linux (Fedora), running in a VM (using Parallels on my Mac). I followed every step of this video, including the textual version of it. Then when I open localhost (or the equivalent value from hostname) on port 50070, I get the following message.
...can't establish a connection to the server at localhost:50070
By the way, when I run the jps command I don't see the DataNode and NameNode processes that the textual version of the tutorial shows at the end.
Mine shows only the following processes running:
6021 NodeManager
3947 SecondaryNameNode
5788 ResourceManager
8941 Jps
When I run the hadoop namenode command I have some of the following [redacted] error:
Cannot access storage directory /usr/local/hadoop_store/hdfs/namenode
16/10/11 21:52:45 WARN namenode.FSNamesystem: Encountered exception loading fsimage
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /usr/local/hadoop_store/hdfs/namenode is in an inconsistent state: storage directory does not exist or is not accessible.
By the way, I checked the directory mentioned above, and it exists.
Any hint for this newbie? ;-)
You need to give read and write permission on the directory /usr/local/hadoop_store/hdfs/namenode to the user that runs the services.
Once done, format the namenode using hadoop namenode -format
Then try to start your services.
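A directory can exist and still be unusable by the account that starts the NameNode, which would match the "does not exist or is not accessible" message. A small sketch that checks this for the current user. dir_usable is a hypothetical helper, and the commented chown line assumes the services run as a user named hduser in group hadoop, which may differ on your system.

```shell
#!/bin/sh
# dir_usable DIR: succeed only if DIR exists and the current user can
# read, write and traverse it. Hypothetical helper name.
dir_usable() {
    [ -d "$1" ] && [ -r "$1" ] && [ -w "$1" ] && [ -x "$1" ]
}

nn_dir=/usr/local/hadoop_store/hdfs/namenode

if dir_usable "$nn_dir"; then
    echo "$nn_dir is readable and writable by $(id -un)"
else
    echo "$nn_dir is missing or not accessible to $(id -un)"
    # One possible fix, assuming the daemons run as hduser:hadoop:
    #   sudo chown -R hduser:hadoop /usr/local/hadoop_store
fi
```

Run it as the same user that starts the Hadoop daemons; checking as a different user (or as root) can hide the problem.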
Delete the files under /app/hadoop/tmp/*
then format the namenode again and run start-dfs.sh and start-yarn.sh

spark with Hadoop 2.3.0 on Mesos 0.21.0 with error "sh: 1: hadoop: not found" on slave

I am setting up Spark with Hadoop 2.3.0 on Mesos 0.21.0. When I try Spark on the master, I get these error messages from the stderr of the Mesos slave:
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1229 12:34:45.923665 8571 fetcher.cpp:76] Fetching URI
'hdfs://10.170.207.41/spark/spark-1.2.0.tar.gz'
I1229 12:34:45.925240 8571 fetcher.cpp:105] Downloading resource from
'hdfs://10.170.207.41/spark/spark-1.2.0.tar.gz' to
'/tmp/mesos/slaves/20141226-161203-701475338-5050-6942-S0/frameworks/20141229-111020-701475338-5050-985-0001/executors/20141226-161203-701475338-5050-6942-S0/runs/8ef30e72-d8cf-4218-8a62-bccdf673b5aa/spark-1.2.0.tar.gz'
E1229 12:34:45.927089 8571 fetcher.cpp:109] HDFS copyToLocal failed:
hadoop fs -copyToLocal 'hdfs://10.170.207.41/spark/spark-1.2.0.tar.gz'
'/tmp/mesos/slaves/20141226-161203-701475338-5050-6942-S0/frameworks/20141229-111020-701475338-5050-985-0001/executors/20141226-161203-701475338-5050-6942-S0/runs/8ef30e72-d8cf-4218-8a62-bccdf673b5aa/spark-1.2.0.tar.gz'
sh: 1: hadoop: not found
Failed to fetch: hdfs://10.170.207.41/spark/spark-1.2.0.tar.gz
Failed to synchronize with slave (it's probably exited)
The interesting thing is that when I switch to the slave node and run the same command
hadoop fs -copyToLocal 'hdfs://10.170.207.41/spark/spark-1.2.0.tar.gz'
'/tmp/mesos/slaves/20141226-161203-701475338-5050-6942-S0/frameworks/20141229-111020-701475338-5050-985-0001/executors/20141226-161203-701475338-5050-6942-S0/runs/8ef30e72-d8cf-4218-8a62-bccdf673b5aa/spark-1.2.0.tar.gz'
it works fine.
When starting the Mesos slave, you have to specify the path to your Hadoop installation through the following parameter:
--hadoop_home=/path/to/hadoop
Without that it just didn't work for me, even though I had the HADOOP_HOME environment variable set up.
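The "sh: 1: hadoop: not found" line suggests the fetcher launches hadoop through sh using the slave daemon's environment, which need not match a login shell where the command works. A sketch that tests whether hadoop resolves under a given PATH. The helper name hadoop_visible_in and the example PATH values are assumptions, and /path/to/hadoop is the same placeholder used in the answer.

```shell
#!/bin/sh
# hadoop_visible_in PATH_VALUE: succeed if a non-interactive /bin/sh
# started with only that PATH can resolve the "hadoop" command.
hadoop_visible_in() {
    env -i PATH="$1" /bin/sh -c 'command -v hadoop' >/dev/null 2>&1
}

# Compare a bare system PATH with one that includes Hadoop's bin directory.
for p in "/usr/bin:/bin" "/path/to/hadoop/bin:/usr/bin:/bin"; do
    if hadoop_visible_in "$p"; then
        echo "hadoop found with PATH=$p"
    else
        echo "hadoop NOT found with PATH=$p"
    fi
done
```

If hadoop only resolves with the longer PATH, passing --hadoop_home when starting the slave should let the fetcher locate the binary without relying on PATH.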

Hadoop Namenode not starting

I am getting java.io.IOException: Failed to load an FSImage file while starting Namenode
ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/opt1/dfs/nn/current/fsimage_0000000000023479779, cpktTxId=0000000000023479779)
java.io.IOException: Unexpected block size: -1945969516689645797
java.io.IOException: Failed to load an FSImage file!
The namenode does not start because of this.
I have 1 namenode, 1 secondary namenode and 3 datanodes in my cluster.
Can someone help me recover my cluster?
Try formatting the namenode (this erases the existing HDFS metadata):
hadoop namenode -format
Note that to have permission to format, you must execute the command as the hdfs user.
To switch, for example, from the cloudera user to hdfs, do the following:
sudo bash
su - hdfs
If you still get errors regarding the datanode, you probably need to clear the datanode folder:
Try deleting the datanode folder first (you can find its location in the configuration file under dfs.data.dir or dfs.datanode.data.dir).
If you have a cloudera-quickstart-vm the location is /var/lib/hadoop-hdfs/cache/hdfs/dfs/data
Stop all data nodes and the secondary name node, then format the name node using the command:
sudo -u hdfs hdfs namenode -format
Then restart the name node and the data nodes.
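Because formatting replaces the NameNode metadata, it may be worth archiving the current metadata directory first so the damaged fsimage can still be inspected later. A sketch, assuming the directory is /opt1/dfs/nn as in the error message from the question. backup_dir and the /tmp destination are hypothetical.

```shell
#!/bin/sh
# backup_dir SRC DEST_DIR: archive directory SRC into a timestamped
# tar.gz under DEST_DIR and print the archive path. Hypothetical helper.
backup_dir() {
    src=$1
    dest_dir=$2
    archive="$dest_dir/$(basename "$src")-$(date +%Y%m%d%H%M%S).tar.gz"
    # -C keeps the paths in the archive relative to SRC's parent directory.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    printf '%s\n' "$archive"
}

nn_dir=/opt1/dfs/nn          # taken from the error message; adjust as needed
if [ -d "$nn_dir" ]; then
    backup_dir "$nn_dir" /tmp   # /tmp is only an example destination
else
    echo "$nn_dir not found on this machine; nothing archived"
fi
```

Run it with the NameNode stopped and as a user that can read the metadata directory (for example via sudo -u hdfs).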

Hadoop 2.2 Add new Datanode to an existing hadoop installation

I first installed hadoop 2.2 on my machine (called Abhishek-PC) and everything worked fine. I am able to run the entire system successfully. (both namenode and datanode).
Now I created 1 VM hdclient1 and I want to add this VM as a data node.
Here are the steps which I have followed
I set up SSH successfully: I can ssh into hdclient1 without a password, and I can log in from hdclient1 to my main machine without a password.
I set up hadoop 2.2 on this VM and modified the configuration files as per many tutorials on the web. Here are my configuration files
Name Node configuration
https://drive.google.com/file/d/0B0dV2NMSGYPXdEM1WmRqVG5uYlU/edit?usp=sharing
Data Node configuration
https://drive.google.com/file/d/0B0dV2NMSGYPXRnh3YUo1X2Frams/edit?usp=sharing
Now when I run start-dfs.sh on my first machine, I can see that the DataNode starts successfully on hdclient1. Here is a screenshot from my hadoop console.
https://drive.google.com/file/d/0B0dV2NMSGYPXOEJ3UV9SV1d5bjQ/edit?usp=sharing
As you can see, both machines appear in my cluster (main machine and data node).
Although both are called "localhost" for some strange reason.
I can see that logs are being created on hdclient1, and in those logs there are no exceptions.
here are the logs from the name node
https://drive.google.com/file/d/0B0dV2NMSGYPXM0dZTWVRUWlGaDg/edit?usp=sharing
Here are the logs from the data node
https://drive.google.com/file/d/0B0dV2NMSGYPXNV9wVmZEcUtKVXc/edit?usp=sharing
I can log in to the namenode UI successfully at http://Abhishek-PC:50070
but under live nodes the UI shows only 1 live node, and there is no mention of hdclient1.
https://drive.google.com/file/d/0B0dV2NMSGYPXZmMwM09YQlI4RzQ/edit?usp=sharing
I can create a directory in hdfs successfully hadoop fs -mkdir /small
From the datanode I can see that this directory has been created by using this command hadoop fs -ls /
Now when I try to add a file to my HDFS and I say
hadoop fs -copyFromLocal ~/Downloads/book/war_and_peace.txt /small
I get an error message
abhishek@Abhishek-PC:~$ hadoop fs -copyFromLocal ~/Downloads/book/war_and_peace.txt /small
14/01/04 20:07:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/01/04 20:07:41 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /small/war_and_peace.txt.COPYING could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
So my question is: what am I doing wrong here? Why do I get this exception when I try to copy the file into HDFS?
We have a 3-node cluster (all physical boxes) that's been working great for a couple of months. This article helped me the most to set it up.

Resources