Hadoop: FileNotFoundException when getting file from DistributedCache

I have a 2-node cluster (v1.04), a master and a slave. On the master, in Tool.run(), we add two files to the DistributedCache using addCacheFile(). The files do exist in HDFS.
In Mapper.setup() we want to retrieve those files from the cache using:
FSDataInputStream fs = FileSystem.get( context.getConfiguration() ).open( path ).
The problem is that for one file a FileNotFoundException is thrown, although the file exists on the slave node:
attempt_201211211227_0020_m_000000_2: java.io.FileNotFoundException: File does not exist: /somedir/hdp.tmp.dir/mapred/local/taskTracker/distcache/-7769715304990780/master/tmp/analytics/1.csv
ls -l on the slave:
[hduser@slave ~]$ ll /somedir/hdp.tmp.dir/mapred/local/taskTracker/distcache/-7769715304990780/master/tmp/analytics/1.csv
-rwxr-xr-x 1 hduser hadoop 42701 Nov 22 10:18 /somedir/hdp.tmp.dir/mapred/local/taskTracker/distcache/-7769715304990780/master/tmp/analytics/1.csv
My questions are:
Shouldn't all files exist on all nodes?
What should be done to fix that?
Thanks.

Solved - this should have been used instead:
FileSystem.getLocal( conf )
Thanks to Harsh J from the Hadoop mailing list.
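For reference, a minimal sketch of what the corrected setup() could look like, assuming the files were added with DistributedCache.addCacheFile() in the driver; the class name and read loop are illustrative only:

import java.io.IOException;

import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CacheAwareMapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // The cached files were localized onto the task node's disk, so they must be
        // opened through the local file system, not through HDFS.
        FileSystem localFs = FileSystem.getLocal(context.getConfiguration());

        Path[] cached = DistributedCache.getLocalCacheFiles(context.getConfiguration());
        if (cached != null) {
            for (Path p : cached) {
                try (FSDataInputStream in = localFs.open(p)) {
                    // ... read the side file here ...
                }
            }
        }
    }
}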

Related

Hadoop Nodemanager failing with error Can't get group information

I have a Kerberos-configured Apache Hadoop (2.8.5) installation. The NameNode, DataNode and ResourceManager are running fine, but the NodeManager is failing to start with the error:
Can't get group information for hadoop#configured value of yarn.nodemanager.linux-container-executor.group - Success.
file permissions:
container-executor.cfg: -rw------- 1 root hadoop
container-executor: ---Sr-s--- 1 root hadoop
container-executor.cfg
yarn.nodemanager.local-dirs=/hadoop/data/yarn/local
yarn.nodemanager.linux-container-executor.group=hadoop#configured value of yarn.nodemanager.linux-container-executor.group
banned.users=hdfs,yarn,mapred,bin,root#comma separated list of users who can not run applications
min.user.id=1000#Prevent other super-users
Simply remove the inline comment (#configured value of yarn.nodemanager.linux-container-executor.group) from the yarn.nodemanager.linux-container-executor.group line in the container-executor.cfg file.
It should look like this:
yarn.nodemanager.local-dirs=/hadoop/data/yarn/local
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=hdfs,yarn,mapred,bin,root
min.user.id=1000
This configuration file has had historical problems with spaces, comments, etc.

DataNode is Not Starting in singlenode hadoop 2.6.0

I installed Hadoop 2.6.0 on my laptop running Ubuntu 14.04 LTS. I successfully started the Hadoop daemons by running start-all.sh and ran a WordCount example successfully. Then I tried to run a jar example that didn't work for me, so I decided to format with hadoop namenode -format and start all over again. But when I start all daemons using start-dfs.sh && start-yarn.sh and then run jps, all daemons run except the datanode, as shown below:
hdferas@feras-Latitude-E4310:/usr/local/hadoop$ jps
12628 NodeManager
12110 NameNode
12533 ResourceManager
13335 Jps
12376 SecondaryNameNode
How to solve that?
I have faced this issue and it is very easy to solve. Your datanode is not starting because you formatted the namenode again after the namenode and datanode were already running. That means you have cleared the metadata from the namenode. The files you stored for running the word count are still in the datanode, but since you formatted the namenode the datanode no longer knows where to send its block reports, so it will not start.
Here are the things you need to do to fix it.
Stop all the Hadoop services (stop-all.sh) and close any active ssh connections.
cat /usr/local/hadoop/etc/hadoop/hdfs-site.xml
This step is important: see where the datanode's data is stored. It is the value of dfs.datanode.data.dir. For me it is /usr/local/hadoop/hadoop_data/hdfs/datanode. Open your terminal, navigate to the above directory, and delete the directory named current inside it. Make sure you are only deleting the "current" directory.
sudo rm -r /usr/local/hadoop/hadoop_data/hdfs/datanode/current
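As an optional sanity check (not part of the original answer), you can also read the configured value programmatically; the config path below is an assumption and may differ on your installation:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class PrintDataDir {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Load the same site file inspected in the step above.
        conf.addResource(new Path("/usr/local/hadoop/etc/hadoop/hdfs-site.xml"));

        // Prints the configured value, or null if hdfs-site.xml does not set it
        // (in that case the DataNode falls back to its built-in default).
        System.out.println("dfs.datanode.data.dir = " + conf.get("dfs.datanode.data.dir"));
    }
}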
Now format the namenode and check whether everything is fine.
hadoop namenode -format
Say yes if it asks you for anything.
jps
Hope my answer solves the issue. If it doesn't, let me know.
A little advice: don't format your namenode. Without the namenode there is no way to reconstruct the data. If your wordcount is not running, that is some other problem.
I had this issue when formatting the namenode too. What I did to solve it was:
Find your dfs.name.dir location. Consider, for example, that your dfs.name.dir is /home/hadoop/hdfs.
(a) Now go to, /home/hadoop/hdfs/current.
(b) Search for the file VERSION. Open it using a text editor.
(c) There will be a line namespaceID=122684525 (122684525 is my ID, yours will be different). Note the ID down.
Now find your hadoop.tmp.dir location. Mine is /home/hadoop/temp.
(a) Go to /home/hadoop/temp/dfs/data/current.
(b) Search for the file VERSION and open it using a text editor.
(c) There will be a line namespaceID=. The namespaceID in this file and the previous one must be the same.
(d) This was the main reason my datanode was not starting. I made them both the same, and now the datanode starts fine.
Note: copy the namespaceID from /home/hadoop/hdfs/current/VERSION to
/home/hadoop/temp/dfs/data/current/VERSION. Don't do it in reverse.
Now do start-dfs.sh && start-yarn.sh. Datanode will be started.
You just need to remove all the contents of the DataNode folder and then reformat the NameNode using the following command:
hadoop namenode -format
I had the same issue as well; I checked the log and found the error below:
Exception - Datanode log
FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.io.IOException: All directories in dfs.datanode.data.dir are invalid: "/usr/local/hadoop_store/hdfs/datanode/
I ran the command below to resolve the issue:
sudo chown -R hduser:hadoop /usr/local/hadoop_store
Note: I created the namenode and datanode directories under the path /usr/local/hadoop_store.
The above problem occurs when you format the namenode (hadoop namenode -format) without stopping the dfs and yarn daemons. While formatting the namenode, the question below appears and you press the Y key for it.
Re-format filesystem in Storage Directory /tmp/hadoop-root/dfs/name ? (Y or N)
Solution:
You need to delete the files within the current directory of dfs.name.dir, which you specify in hdfs-site.xml. On my system the current directory is /tmp/hadoop-root/dfs/name/current.
rm -r /tmp/hadoop-root/dfs/name/current
Using the above command, I removed the files inside the current directory. Make sure you are only deleting the "current" directory. Then format the namenode again after stopping the dfs and yarn daemons (stop-dfs.sh & stop-yarn.sh). Now the datanode will start normally!
In core-site.xml, check the absolute path of the temp directory. If it does not point to a valid location, or the directory has not been created (mkdir), the DataNode cannot start.
Add the properties below in yarn-site.xml:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
Not the right way to do it, but it surely works.
Remove the files from your datanode, namenode and tmp folders. Any files/folders created inside these are owned by hadoop and may hold references to the last run's datanode details, which may have failed or been locked, which is why the datanode does not start on the next attempt.
I got the same issue (DataNode & TaskTracker would not come up).
RESOLUTION:
Delete every "current" sub-directory under data, name, and namesecondary to resolve the DataNode/TaskTracker not showing when you run start-all.sh and then jps.
(My dfs directories are: /home/training/hadoop-temp/dfs/data/current; /home/training/hadoop-temp/dfs/name/current; /home/training/hadoop-temp/dfs/namesecondary/current.)
Make sure you stop services: stop-all.sh
1. Go to each "current" sub-directory under data, name, namesecondary and remove/delete (example: rm -r name/current)
2. Then format: hadoop namenode -format
3. mkdir a current directory under /home/training/hadoop-temp/dfs/data (i.e. recreate /home/training/hadoop-temp/dfs/data/current)
4. Take the directory and contents from /home/training/hadoop-temp/dfs/name/current and copy them into the /data/current directory
EXAMPLE: files under:
/home/training/hadoop-temp/dfs/name/current
[training@CentOS current]$ ls -l
-rw-rw-r--. 1 training training 9901 Sep 25 01:50 edits
-rw-rw-r--. 1 training training 582 Sep 25 01:50 fsimage
-rw-rw-r--. 1 training training 8 Sep 25 01:50 fstime
-rw-rw-r--. 1 training training 101 Sep 25 01:50 VERSION
5. Change the storageType=NAME_NODE in VERSION to storageType=DATA_NODE in the data/current/VERSION that you just copied over.
BEFORE:
[training@CentOS dfs]$ cat data/current/VERSION
namespaceID=1018374124
cTime=0
storageType=NAME_NODE
layoutVersion=-32
AFTER:
[training@CentOS dfs]$ cat data/current/VERSION
namespaceID=1018374124
cTime=0
storageType=DATA_NODE
layoutVersion=-32
6. Make sure each subdirectory below (data, name, namesecondary) has the same files that name/current has.
[training@CentOS dfs]$ pwd
/home/training/hadoop-temp/dfs/
[training@CentOS dfs]$ ls -l
total 12
drwxr-xr-x. 5 training training 4096 Sep 25 01:29 data
drwxrwxr-x. 5 training training 4096 Sep 25 01:19 name
drwxrwxr-x. 5 training training 4096 Sep 25 01:29 namesecondary
7. Now start the services: start-all.sh
You should see all 5 services when you type: jps
I am using hadoop-2.6.0. I resolved it by:
1. Deleting all files within /usr/local/hadoop_store/hdfs
command: sudo rm -r /usr/local/hadoop_store/hdfs/*
2. Formatting the hadoop namenode
command: hadoop namenode -format
3. Going to the sbin directory (cd /usr/local/hadoop/sbin) and running
start-all.sh
Then use the command: hduser@abc-3551:/$ jps
The following services will be started now:
19088 Jps
18707 ResourceManager
19043 NodeManager
18535 SecondaryNameNode
18329 DataNode
18159 NameNode
When I had this same issue, the 'Current' folder wasn't even being created in my hadoop/data/datanode folder. If this is the case for you too:
Copy the contents of 'Current' from the namenode and paste it into the datanode folder.
Then open VERSION for the datanode and change storageType=NAME_NODE to storageType=DATA_NODE.
Run jps to see that the datanode continues to run.

hadoop fs -ls results in "no such file or directory"

I have installed and configured Hadoop 2.5.2 on a 10-node cluster. One node is acting as the master node and the others as slave nodes.
I have a problem executing hadoop fs commands. The hadoop fs -ls command works fine with an HDFS URI, but it gives the message "ls: `.': No such file or directory" when used without an HDFS URI:
ubuntu@101-master:~$ hadoop fs -ls
15/01/30 17:03:49 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
ls: `.': No such file or directory
ubuntu@101-master:~$
Whereas executing the same command with an HDFS URI works:
ubuntu@101-master:~$ hadoop fs -ls hdfs://101-master:50000/
15/01/30 17:14:31 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
Found 3 items
drwxr-xr-x - ubuntu supergroup 0 2015-01-28 12:07 hdfs://101-master:50000/hvision-data
-rw-r--r-- 2 ubuntu supergroup 15512587 2015-01-28 11:50 hdfs://101-master:50000/testimage.seq
drwxr-xr-x - ubuntu supergroup 0 2015-01-30 17:03 hdfs://101-master:50000/wrodcount-in
ubuntu@101-master:~$
I am getting an exception in my MapReduce program due to this behavior: jarlib is referring to the HDFS file location, whereas I want jarlib to refer to the jar files stored on the local file system of the Hadoop nodes.
The behaviour you are seeing is expected. Let me explain what's going on when you work with hadoop fs commands.
The command's syntax is this: hadoop fs -ls [path]
By default, when you don't specify [path] for the above command, hadoop expands the path to /user/[username] in HDFS, where [username] is replaced with the Linux username executing the command.
So, when you execute this command:
ubuntu@101-master:~$ hadoop fs -ls
the reason you are seeing the error ls: `.': No such file or directory is that hadoop is looking for the path /user/ubuntu, and it seems this path doesn't exist in HDFS.
The reason why this command:
ubuntu@101-master:~$ hadoop fs -ls hdfs://101-master:50000/
works is that you have explicitly specified [path], and it is the root of HDFS. You can also do the same using this:
ubuntu@101-master:~$ hadoop fs -ls /
which is automatically evaluated as the root of HDFS.
Hopefully this clarifies the behaviour you are seeing while executing the hadoop fs -ls command.
Hence, if you want to specify a local file system path, use the file:/// URL scheme.
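To make those resolution rules concrete, here is a small sketch (not from the original answer; the cluster URI is simply the one from the question) that prints what an unqualified path resolves to and shows how file:/// forces the local file system:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PathResolutionDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://101-master:50000"); // URI taken from the question

        FileSystem hdfs = FileSystem.get(conf);
        // Unqualified paths (like the implicit "." of `hadoop fs -ls`) resolve against
        // the HDFS working directory, which defaults to /user/<username>.
        System.out.println("HDFS home directory: " + hdfs.getHomeDirectory());
        System.out.println("'.' resolves to:     " + hdfs.getWorkingDirectory());

        // Prefixing a path with file:/// forces resolution on the local file system.
        FileSystem localFs = FileSystem.get(URI.create("file:///"), conf);
        System.out.println("file:///tmp exists:  " + localFs.exists(new Path("file:///tmp")));
    }
}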
This has to do with the missing home directory for the user. Once I created the home directory under HDFS for the logged-in user, it worked like a charm:
hdfs dfs -mkdir /user
hdfs dfs -mkdir /user/{loggedin user}
hdfs dfs -ls
This method fixed my problem.
The user directory in Hadoop is (in HDFS)
/user/<your operating system user>
If you get this error message it may be because you have not yet created your user directory within HDFS.
Use
hadoop fs -mkdir -p /user/<current operating system user>
To see what your current operating system user is, use:
id -un
Then hadoop fs -ls should start working...
There are a couple things at work here; based on "jarlib is referring to the HDFS file location", it sounds like you indeed have an HDFS path set as your fs.default.name, which is indeed the typical setup. So, when you type hadoop fs -ls, this is indeed trying to look inside HDFS, except it's looking in your current working directory, which should be something like hdfs://101-master:50000/user/ubuntu. The error message is unfortunately somewhat confusing since it doesn't tell you that . was interpreted to be that full path. If you hadoop fs -mkdir /user/ubuntu then hadoop fs -ls should start working.
This problem is unrelated to your "jarlib" problem; whenever you want to refer to files stored on the local filesystem, but where the path goes through Hadoop's Path resolution, you simply need to add file:/// to force Hadoop to refer to the local filesystem. For example:
hadoop fs -ls file:///tmp
Try passing your jar file paths as file:///path/to/your/jarfile and it should work.
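A minimal sketch of that idea in code, assuming the jar paths go through Hadoop's Path machinery; the jar location below is purely illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LocalJarPathExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // fs.defaultFS may point at HDFS

        // Without a scheme this path would resolve against fs.defaultFS (i.e. HDFS);
        // the explicit file:/// prefix forces Hadoop's Path machinery onto the local disk.
        Path localJar = new Path("file:///opt/myapp/lib/jarlib.jar"); // illustrative path
        FileSystem fs = localJar.getFileSystem(conf);

        System.out.println("Resolved scheme: " + fs.getUri().getScheme()); // prints "file"
        System.out.println("Exists locally:  " + fs.exists(localJar));
    }
}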
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your
platform... using builtin-java classes where applicable
This warning can be removed by adding this line to your .bashrc file:
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=/usr/local/hadoop/lib/native"
(/usr/local/hadoop is the location where Hadoop is installed.)

Why I cannot see the block file in the path specified by dfs.data.dir?

Just now I wrote a 90M file into HDFS and executed the fsck command below. The output follows:
xuhang@master:~$ hadoop fsck /home/xuhang/hadoopinput/0501/baidu_hadoop.flv -files -blocks -locations
/home/xuhang/hadoopinput/0501/baidu_hadoop.flv 103737775 bytes, 2 block(s)
.......................
0. blk_-7625024667897507616_12224 len=67108864 repl=2 [node1:50010, node2:50010]
1. blk_2225876293125688018_12224 len=36628911 repl=2 [node1:50010, node2:50010]
.................
.................
FSCK ended at Sun Sep 22 11:55:51 CST 2013 in 25 milliseconds
I have configured the same property in hdfs-site.xml on both datanodes, as shown below.
<name>dfs.name.dir</name>
<value>/home/xuhang/hadoop-1.2.1/name1,/home/xuhang/hadoop-1.2.1/name2</value>
But I find nothing in /home/xuhang/hadoop-1.2.1/name1 and /home/xuhang/hadoop-1.2.1/name2 on the two datanodes. Why? I am sure I wrote the 90M file into HDFS successfully, because I can read it back with the hadoop command or a Java client.
I see those blocks are in hosts node1 and node2. Have you been looking at node1 and node2?
Please check the hdfs-site.xml on both node1 and node2 too. It's likely that dfs.data.dir is set to something different on those nodes. You should find the blk_ files inside a directory named current, which is inside the directories pointed to by dfs.data.dir.
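If you prefer to confirm from code where the blocks ended up (rather than rerunning fsck), a small sketch along these lines should work; it queries the NameNode for the block locations of the file from the question:

import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocationsDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path file = new Path("/home/xuhang/hadoopinput/0501/baidu_hadoop.flv");
        FileStatus status = fs.getFileStatus(file);

        // Each BlockLocation names the datanodes holding a replica of that block;
        // the blk_* files themselves live under dfs.data.dir/current on those hosts.
        for (BlockLocation block : fs.getFileBlockLocations(status, 0, status.getLen())) {
            System.out.println("offset=" + block.getOffset()
                    + " length=" + block.getLength()
                    + " hosts=" + Arrays.toString(block.getHosts()));
        }
    }
}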

Hadoop dfs -ls returns list of files in my hadoop/ dir

I've set up a single-node Hadoop configuration running via Cygwin under Win7. After starting Hadoop by bin/start-all.sh I run bin/hadoop dfs -ls, which returns a list of files in my hadoop directory. Then I run bin/hadoop datanode -format and bin/hadoop namenode -format, but -ls still returns the contents of my hadoop directory. As far as I understand, it should return nothing (an empty folder). What am I doing wrong?
Did you edit core-site.xml and mapred-site.xml under the conf folder?
It seems like your hadoop cluster is in local mode.
I know this question is quite old, but the directory structure in Hadoop has changed a bit (version 2.5). Jeroen's current version would be:
hdfs dfs -ls hdfs://localhost:9000/users/smalldata
Also, just for information: the use of start-all.sh and stop-all.sh has been deprecated; instead one should use start-dfs.sh and start-yarn.sh.
I had the same problem and solved it by explicitly specifying the URL to the NameNode.
To list all directories in the root of your hdfs space do the following:
./bin/hadoop dfs -ls hdfs://<ip-of-your-server>:9000/
The documentation says something about a default HDFS entry point in the configuration, but I cannot find it. If someone knows what they mean, please enlighten us.
This is where I got the info: http://hadoop.apache.org/common/docs/r0.20.0/hdfs_shell.html#Overview
Or you could just do:
Run stop-all.sh.
Remove the dfs data and name directories.
Run hadoop namenode -format.
Run start-all.sh
