Hadoop HDFS: input/output error when creating user folder

I've followed the instructions in Hadoop: The Definitive Guide, 4th edition, Appendix A to configure Hadoop in pseudo-distributed mode. Everything works well, except when I try to make a directory:
hadoop fs -mkdir -p /user/$USER
The command returns the following message: mkdir: '/user/my_user_name': Input/output error.
However, when I first switch to the root account with sudo -s and then run the hadoop fs -mkdir -p /user/$USER command, the directory /user/root is created (along with all directories in the path).
I think I'm having Hadoop permission issues.
Any help would be really appreciated,
Thanks.

It means that you have a mistake in the core-site.xml file. For instance, I had an error in the first property's name element, in which I wrote 'fa.defaultFS' instead of 'fs.defaultFS'.
After that, you have to run the stop-all.sh script to stop Hadoop. At this point you will probably also have to clear the namenode data and reformat it with the commands rm -Rf /app/tmp/your-username/* and hdfs namenode -format. Next, start Hadoop again with the start-all.sh script.
You may also have to reboot the system after executing the stop script.
After these steps, I could run that command again.
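A minimal sketch of that sequence, assuming the temporary directory is /app/tmp/your-username as above:
stop-all.sh
rm -Rf /app/tmp/your-username/*   # destructive: wipes the old HDFS data under the tmp dir
hdfs namenode -format
start-all.sh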

I corrected the core-site.xml file based on the standard configuration and it works fine now.
<property>
<name>hadoop.tmp.dir</name>
<value>/home/your_user_name/hadooptmpdata</value>
<description>Where Hadoop will place all of its working files</description>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
<description>Where HDFS NameNode can be found on the network</description>
</property>

Related

Access hdfs from outside the cluster

I have a Hadoop cluster on AWS and I am trying to access it from outside the cluster through a Hadoop client. I can successfully run hdfs dfs -ls and see all the contents, but when I try to put or get a file I get this error:
Exception in thread "main" java.lang.NullPointerException
at org.apache.hadoop.fs.FsShell.displayError(FsShell.java:304)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:289)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
I have Hadoop 2.6.0 installed on both my cluster and my local machine. I have copied the cluster's conf files to the local machine and have these options in hdfs-site.xml (along with some other options).
<property>
<name>dfs.client.use.datanode.hostname</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions.enable</name>
<value>false</value>
</property>
My core-site.xml contains a single property in both the cluster and the client:
<property>
<name>fs.defaultFS</name>
<value>hdfs://public-dns:9000</value>
<description>NameNode URI</description>
</property>
I found similar questions but wasn't able to find a solution to this.
How about you SSH into that machine?
I know this is a very bad idea, but to get the work done you can first copy the file to that machine using scp, then SSH into the cluster/master and run hdfs dfs -put on the copied local file.
You can also automate this via a script, but again, this is just to get the work done for now.
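A rough sketch of that workaround (the hostname, user, and paths here are placeholders, not real values):
scp ./localfile.txt myuser@master-public-dns:/tmp/localfile.txt                   # copy the file to the master
ssh myuser@master-public-dns 'hdfs dfs -put /tmp/localfile.txt /user/myuser/'     # put it into HDFS from there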
Wait for someone else to answer to know the proper way!
I had a similar issue with my cluster when running hadoop fs -get, and I was able to resolve it. Check whether all your data nodes are resolvable via FQDN (Fully Qualified Domain Name) from your local host. In my case the nc command succeeded using the data nodes' IP addresses but not their hostnames.
Run the command below (50010 is the default datanode port):
for i in $(cat /<host list file>); do nc -vz $i 50010; done
When you run any hadoop command, it tries to connect to the data nodes using their FQDNs, and that is where it throws this odd NPE.
Set the export below and run your hadoop command again:
export HADOOP_ROOT_LOGGER=DEBUG,console
You will see that the NPE occurs when it tries to connect to a datanode for the data transfer.
I had Java code that did the equivalent of hadoop fs -get via the APIs, and there the exception was clearer:
java.lang.Exception: java.nio.channels.UnresolvedAddressException
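If the hostnames turn out not to resolve, one common workaround (an assumption on my part, not something the setup above specifies) is to map them in /etc/hosts on the client, for example with hypothetical addresses and names:
echo '10.0.0.11 datanode1.cluster.internal' | sudo tee -a /etc/hosts   # hypothetical datanode entry
echo '10.0.0.12 datanode2.cluster.internal' | sudo tee -a /etc/hosts   # repeat for each datanode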
Let me know if this helps you.

Hadoop permission issue

I installed Hadoop via Homebrew, but I'm now having permission problems when running
hadoop namenode -format and the ./start-all.sh command.
I think it's because of the settings I put in core-site.xml: I set hadoop.tmp.dir to /tmp/${name}.
Now namenode -format gives me an error: can't create folder, permission denied.
Even when I sudo that command, start-all.sh still reports many permission-denied errors. I tried sudo start-all.sh with my admin password (the only one I use on my Mac), but that was denied as well.
I think it's because of permission issues. Is there any way I can fix it?
Thanks!
On your local system, it looks like you do not have the hduser user created.
As part of a typical setup, it is good practice to create a hadoop group and an hduser user added to that group.
You can do that with the root/super user account with the following command:
$ sudo adduser --ingroup hadoop hduser
This assumes you have the hadoop group setup. If that is not setup, you can create a group with:
$ sudo addgroup hadoop
When you run Hadoop, it stores things in the data, name, and tmp dirs that you configure in the hdfs-site.xml file. If you don't set these, they will point to ${hadoop.tmp.dir}/dfs/data and similar paths, which in your case is under the /tmp dir. This is not where you want your data stored. You will first need to add these to your hdfs config file, among other settings.
On master:
<property>
<name>dfs.data.dir</name>
<value>/app/hadoop/data</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/app/hadoop/name</value>
</property>
On slaves:
<property>
<name>dfs.data.dir</name>
<value>/app/hadoop/data</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>master:/app/hadoop/name</value>
</property>
Now, once this is done, you must actually make those directories. Create the following dirs on master:
/app/hadoop/name, /app/hadoop/data, and /app/hadoop/tmp.
Create the same on slaves except the name dir.
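A rough sketch of creating those directories (run with sudo or as root):
sudo mkdir -p /app/hadoop/name /app/hadoop/data /app/hadoop/tmp   # on the master
sudo mkdir -p /app/hadoop/data /app/hadoop/tmp                    # on each slave (no name dir)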
Now you need to set the permissions so that the directories can be used by Hadoop (the second command below is just to be sure):
sudo chown <hadoop user>:<hadoop user> /app/hadoop/name /app/hadoop/data /app/hadoop/tmp
sudo chmod 0777 /app/hadoop/name /app/hadoop/data /app/hadoop/tmp
Try that, see if it works. I can answer questions if it's not the whole answer.

Hadoop keeps writing mapred intermediate output to the /tmp directory

I have limited capacity in /tmp, so I want to move all of mapred's intermediate output to a bigger partition, say /home/hdfs/tmp_data.
If I understand correctly, I just need to set
<property>
<name>mapred.child.tmp</name>
<value>/home/hdfs/tmp_data</value>
</property>
in mapred-site.xml.
I restart the cluster through Ambari and check that everything is written to the conf file; however, when I run a Pig script, it keeps writing to:
/tmp/hadoop-hdfs/mapred/local/taskTracker/hdfs/jobcache/job_localXXX/attempt_YY/output
I have also modified hadoop.tmp.dir in core-site.xml to be /home/hdfs/tmp_data, but nothing changes.
Is there any parameter that overrides my settings?
Try overriding the following property in the mapred-site.xml file on the tasktracker nodes and restart them.
<property>
<name>mapred.local.dir</name>
<value>/home/hdfs/tmp_data</value>
</property>
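If you manage the Hadoop daemons by hand rather than through Ambari, one way to restart the TaskTracker on a node is shown below (Hadoop 1.x / MRv1; a rough sketch, adjust the path to your install):
$HADOOP_HOME/bin/hadoop-daemon.sh stop tasktracker
$HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker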

Why do we need to format HDFS every time we restart the machine?

I have installed Hadoop in pseudo-distributed mode on my laptop; the OS is Ubuntu.
I have changed the paths where Hadoop stores its data (by default Hadoop stores data in the /tmp folder).
The hdfs-site.xml file looks like this:
<property>
<name>dfs.data.dir</name>
<value>/HADOOP_CLUSTER_DATA/data</value>
</property>
Now, whenever I restart the machine and try to start the Hadoop cluster using the start-all.sh script, the data node never starts. I confirmed that the data node does not start by checking the logs and by using the jps command.
Then I:
1) stopped the cluster using the stop-all.sh script,
2) formatted HDFS using the hadoop namenode -format command,
3) started the cluster using the start-all.sh script.
Now everything works fine, even if I stop and start the cluster again. The problem occurs only when I restart the machine and then try to start the cluster.
Has anyone encountered a similar problem? Why is this happening, and how can we solve it?
By changing dfs.datanode.data.dir away from /tmp you indeed made the data (the blocks) survive across a reboot. However, there is more to HDFS than just blocks. You need to make sure all the relevant dirs point away from /tmp, most notably dfs.namenode.name.dir (I can't tell which other dirs you have to change, as it depends on your config, but the namenode dir is mandatory and may also be sufficient).
I would also recommend using a more recent Hadoop distribution. BTW, the 1.1 namenode dir setting is dfs.name.dir.
For those who use Hadoop 2.0 or above, the configuration names may be different.
As this answer points out, go to the /etc/hadoop directory of your hadoop installation.
Open the file hdfs-site.xml. This user configuration will override the default Hadoop configuration, which is loaded by the Java classloader first.
Add the dfs.namenode.name.dir property and set a new namenode dir (the default is file://${hadoop.tmp.dir}/dfs/name).
Do the same for the dfs.datanode.data.dir property (the default is file://${hadoop.tmp.dir}/dfs/data).
For example:
<property>
<name>dfs.namenode.name.dir</name>
<value>/Users/samuel/Documents/hadoop_data/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/Users/samuel/Documents/hadoop_data/data</value>
</property>
Another property where a tmp dir appears is dfs.namenode.checkpoint.dir. Its default value is file://${hadoop.tmp.dir}/dfs/namesecondary.
If you want, you can easily add this property as well:
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>/Users/samuel/Documents/hadoop_data/namesecondary</value>
</property>
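If you are starting fresh, you will also need to create the new directories and reformat the namenode; a rough sketch using the example paths above (note that formatting erases any existing HDFS metadata):
mkdir -p /Users/samuel/Documents/hadoop_data/name /Users/samuel/Documents/hadoop_data/data /Users/samuel/Documents/hadoop_data/namesecondary
hdfs namenode -format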

Hadoop - namenode is not starting up

I am trying to run Hadoop as the root user. I executed the namenode format command, hadoop namenode -format, while the Hadoop file system was running.
After this, when I try to start the namenode server, it shows an error like the one below:
13/05/23 04:11:37 ERROR namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.IOException: NameNode is not formatted.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:330)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:100)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:411)
I searched for a solution but could not find a clear one.
Can anyone suggest anything?
Thanks.
DFS needs to be formatted. Just issue the following command after stopping everything, and then restart.
hadoop namenode -format
Cool, I have found the solution.
Stop all running servers:
1) stop-all.sh
Edit the file /usr/local/hadoop/conf/hdfs-site.xml and add the configuration below if it is missing:
<property>
<name>dfs.data.dir</name>
<value>/app/hadoop/tmp/dfs/name/data</value>
<final>true</final>
</property>
<property>
<name>dfs.name.dir</name>
<value>/app/hadoop/tmp/dfs/name</value>
<final>true</final>
</property>
Start both the HDFS and MapReduce daemons:
2) start-dfs.sh
3) start-mapred.sh
Now run the rest of the steps for the MapReduce sample given in this link.
Note: you should run the command bin/start-all.sh if the direct command does not work.
Format HDFS while the namenode is stopped (just like the top answer).
I'll add some more details.
The format command checks for or creates path/dfs/name and initializes or reinitializes it.
Running start-dfs.sh then starts the namenode, the datanodes, and then the secondary namenode.
When the namenode finds that path/dfs/name does not exist or is not initialized, it raises a fatal error and exits.
That's why the namenode does not start up.
For more details, check HADOOP_COMMON/logs/XXX.namenode.log.
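For example, assuming a default install where the logs live under $HADOOP_HOME/logs (the exact file name depends on your user and host):
tail -n 100 $HADOOP_HOME/logs/hadoop-*-namenode-*.log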
Make sure the directory you've specified for your namenode is completely empty. Something like a "lost+found" folder in said directory will trigger this error.
Your value in hdfs-site.xml is wrong. You entered the wrong folder, and that is why the namenode is not starting.
First mkdir [folder], then set it in hdfs-site.xml, then format.
Make sure that the name (dfs.name.dir) and data (dfs.data.dir) directories are correctly listed in hdfs-site.xml.
Formatting the namenode worked for me:
bin/hadoop namenode -format
