Why doesn't the DataNode process run in WSL? - hadoop

I have set up and configured Hadoop in WSL, but when I start the DataNode it doesn't work.
This is its log:
2020-03-24 23:47:08,788 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: registered UNIX signal handlers for [TERM, HUP, INT]
2020-03-24 23:47:09,809 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2020-03-24 23:47:10,199 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid dfs.datanode.data.dir /mnt/d/hadoop/hadoop-2.7.1/tmp/dfs/data :
ExitCodeException exitCode=1: chmod: changing permissions of '/mnt/d/hadoop/hadoop-2.7.1/tmp/dfs/data': Operation not permitted
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:815)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:798)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:728)
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:502)
at org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:140)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:156)
at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2344)
at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2386)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2368)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2260)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2307)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2484)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2508)
2020-03-24 23:47:10,207 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.io.IOException: All directories in dfs.datanode.data.dir are invalid: "/mnt/d/hadoop/hadoop-2.7.1/tmp/dfs/data/"
at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2395)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2368)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2260)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2307)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2484)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2508)
2020-03-24 23:47:10,209 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2020-03-24 23:47:10,213 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at DESKTOP-U1EOV4J.localdomain/127.0.1.1
************************************************************/
I want to keep the DataNode and NameNode directories in my Windows file system (for example /mnt/d/hadoop/hadoop-2.7.1/tmp), but that failed.
I tried chmod -R 777 /mnt/d/hadoop, but it still can't start. I also tried deleting the tmp directory and reformatting the NameNode.
But when I change the directory to /home/hadoop/tmp (inside my WSL file system), the DataNode starts normally.
According to the log, I think this is a permissions problem, but I don't know why. How can I fix it?

I have solved this problem; it is caused by Windows file permissions.
By default you can't change a file's permissions under /mnt in WSL: when you use chown or chmod, they don't actually work.
You need to run the following commands first:
sudo umount /mnt/d
sudo mount -t drvfs D: /mnt/d -o metadata
Then you can change file permissions under /mnt in WSL.
You can get more information here: https://devblogs.microsoft.com/commandline/chmod-chown-wsl-improvements/
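If you don't want to remount by hand after every restart, the same metadata option can be set in /etc/wsl.conf so Windows drives are mounted with it automatically. A minimal sketch, following the options described in the linked post:

# /etc/wsl.conf -- applied the next time the WSL instance starts
[automount]
enabled = true
options = "metadata"

After saving the file, fully restart WSL (for example with wsl --shutdown from a Windows prompt), and chmod/chown under /mnt/d should then stick.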

Related

Datanode is Not Starting in Hadoop after Kerberos Auth

I have given permission to /app/hadoop/tmp/dfs/data.
WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid dfs.datanode.data.dir /app/hadoop/tmp/dfs/data :
EPERM: Operation not permitted
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:727)
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:502)
at org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:140)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:156)
at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2341)
at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2383)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2365)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2257)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2304)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2481)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2505)
2017-03-14 20:10:51,169 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.io.IOException: All directories in dfs.datanode.data.dir are invalid: "/app/hadoop/tmp/dfs/data/"
at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2392)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2365)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2257)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2304)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2481)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2505)
2017-03-14 20:10:51,172 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2017-03-14 20:10:51,174 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
Does the owner of that directory on the local file system match the service user of the Kerberos principal the DataNode is now using? So if it is hdfs/, then the directory (and everything under it) should be owned by hdfs.
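For example, a quick check-and-fix sketch (assuming the DataNode's service user is hdfs in group hadoop; adjust to your principal):

# See who currently owns the data directory
ls -ld /app/hadoop/tmp/dfs/data
# Hand the tree to the DataNode's service user
sudo chown -R hdfs:hadoop /app/hadoop/tmp/dfs/data
# The DataNode enforces restrictive permissions (dfs.datanode.data.dir.perm commonly defaults to 700)
sudo chmod 700 /app/hadoop/tmp/dfs/data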

Hadoop 2.6.1 Single Node Setup: Data Node is not up

I tried to set up Hadoop 2.6.1 based on instructions from here.
But my DataNode is not up. When I run jps, I get only the processes below:
▶ jps
8406 ResourceManager
7744 NameNode
8527 NodeManager
8074 SecondaryNameNode
9121 Jps
DataNode Log:
2015-10-07 13:02:24,144 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid dfs.datanode.data.dir /home/vinod/.hadoopdata/hdfs/datanode :
EPERM: Operation not permitted
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:652)
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:490)
at org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:140)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:156)
at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2299)
at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2341)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2323)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2215)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2262)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2438)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2462)
2015-10-07 13:02:24,147 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.io.IOException: All directories in dfs.datanode.data.dir are invalid: "/home/vinod/.hadoopdata/hdfs/datanode/"
at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2350)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2323)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2215)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2262)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2438)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2462)
2015-10-07 13:02:24,148 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2015-10-07 13:02:24,150 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at BBDSK0201/127.0.1.1
************************************************************/
Please help; what could I be missing?
1) Make sure the directory has the right owner and permissions.
$ sudo chown -R hduser:hadoop /home/vinod/.hadoopdata/hdfs/datanode
2) Delete the contents of the tmp directory. It is the directory given by the hadoop.tmp.dir parameter.
3) Format the NameNode.
Start all the processes again (a sketch of these steps follows). Hope this helps...
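As a shell sketch of those three steps (the hadoop.tmp.dir path below is an assumption; check the value in your core-site.xml):

# 1) Give the DataNode directory to the user who runs the daemons
sudo chown -R hduser:hadoop /home/vinod/.hadoopdata/hdfs/datanode
# 2) Clear the hadoop.tmp.dir contents (hypothetical path; check core-site.xml)
rm -rf /tmp/hadoop-hduser/*
# 3) Reformat the NameNode; this wipes existing HDFS metadata
hdfs namenode -format
# Restart the daemons and confirm the DataNode shows up
start-dfs.sh
jps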

Data node not starting... here is the log file

java.io.FileNotFoundException: File file:/hadoopuser/hdfs/datanode does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:534)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:747)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:524)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
at org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:139)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:156)
at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2239)
at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2281)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2263)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2155)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2402)
2015-06-03 03:36:22,656 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.io.IOException: All directories in dfs.datanode.data.dir are invalid: "/hadoopuser/hdfs/datanode"
at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2290)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2263)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2155)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2402)
2015-06-03 03:36:22,659 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2015-06-03 03:36:22,674 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at ubuntu/127.0.1.1
************************************************************
STEP 1: Create a new directory for the DataNode, then change its ownership and permissions.
su - hadoop
Enter the password for the hadoop user.
sudo mkdir /usr/local/hdfs/datanode
sudo chmod -R 777 /usr/local/hdfs/datanode
sudo chown -R hadoop:hadoop /usr/local/hdfs/datanode
STEP 2: Change the value of dfs.datanode.data.dir in hdfs-site.xml from
/hadoopuser/hdfs/datanode
to
/usr/local/hdfs/datanode
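In hdfs-site.xml the property would look something like this (a sketch; the file:// scheme is the usual way to spell a local path in Hadoop 2.x):

<!-- hdfs-site.xml: point the DataNode at the new storage directory -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///usr/local/hdfs/datanode</value>
</property>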
Now, restart your hadoop processes.
NOTE: In "su - hadoop", hadoop is my Hadoop username. If hadoop is not your username, change it to the username you use to start and stop the Hadoop processes.
Your JobTracker and TaskTracker have not started either; how can you get their log files?

hadoop's datanode is not starting

I am using Ubuntu 14.04 LTS, Java 8, and Hadoop 2.5.1. I followed this guide to install all the components. Sorry for not using Michael Noll's.
Now the problem I face is that when I run start-dfs.sh I get the following message:
oroborus@Saras-Dell-System-XPS-L502X:~$ start-dfs.sh
14/11/12 16:12:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-oroborus-namenode-Saras-Dell-System-XPS-L502X.out
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-oroborus-datanode-Saras-Dell-System-XPS-L502X.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-oroborus-secondarynamenode-Saras-Dell-System-XPS-L502X.out
14/11/12 16:12:48 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Now after running start-yarn.sh (which seems to work fine) and jps, I get the following output:
oroborus@Saras-Dell-System-XPS-L502X:~$ jps
9090 NodeManager
5107 JobHistoryServer
8952 ResourceManager
12442 Jps
11981 NameNode
The ideal output should have DataNode in it, but it is not there. After some googling and searching SO, I found that the error is logged, so here are the logs for the DataNode. (Only the error part; if you need more, let me know.)
2014-11-08 23:30:32,709 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: registered UNIX signal handlers for [TERM, HUP, INT]
2014-11-08 23:30:33,132 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid dfs.datanode.data.dir /usr/local/hadoop_store/hdfs/datanode :
EPERM: Operation not permitted
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:226)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:642)
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:472)
at org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:126)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:142)
at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:1866)
at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:1908)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1890)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1782)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1829)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2005)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2029)
2014-11-08 23:30:33,134 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.io.IOException: All directories in dfs.datanode.data.dir are invalid: "/usr/local/hadoop_store/hdfs/datanode/"
at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:1917)
Now my question is how to make it valid.
Help is appreciated.
P.S. I tried a lot of forums and SO posts; none of them could give me a solution to this problem. Hence the question.
First delete all contents from the HDFS folder, i.e. the value of <name>hadoop.tmp.dir</name>:
rm -rf /usr/local/hadoop_store
Make sure that the /usr/local/hadoop_store directory has the right owner and permissions:
hduser@localhost$ sudo chown hduser:hadoop -R /usr/local/hadoop_store
hduser@localhost$ sudo chmod 777 -R /usr/local/hadoop_store
Format the namenode:
hduser@localhost$ hadoop namenode -format
Start all the processes again.
I also got the same problem, and I fixed it by changing the owner of those working directories. Although you have 777 permissions on these two directories, the framework will not be able to use them unless you change the owner to hduser.
$ sudo chown -R hduser:hadoop /usr/local/hadoop/yarn_data/hdfs/namenode
$ sudo chown -R hduser:hadoop /usr/local/hadoop/yarn_data/hdfs/datanode
After this, start your cluster again and you should see the DataNode running.
1. First stop all the entities like the NameNode, DataNode etc. (you will have some script or command to do that).
2. Format the tmp directory: go to /var/cache/hadoop-hdfs/hdfs/dfs/ and delete all the contents of the directory manually.
3. Now format your NameNode again.
4. Start all the entities, then use the jps command to confirm that the DataNode has started.
Now run whichever application you may like or have. Hope this helps; a command-level sketch follows.
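A minimal sketch of that sequence (the daemon scripts and storage path follow this answer; adjust to your install):

# Stop the daemons
stop-yarn.sh
stop-dfs.sh
# Manually clear the DFS storage directory named above
rm -rf /var/cache/hadoop-hdfs/hdfs/dfs/*
# Reformat the NameNode (destroys existing HDFS metadata)
hdfs namenode -format
# Start everything again and verify the DataNode is listed
start-dfs.sh
start-yarn.sh
jps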
When using Cloudera, I had the same problem.
* chown hdfs:hadoop all_hdfs_directories
  (or chown 997:996 all_hdfs_directories if the users don't exist yet)
* chmod 0755 is enough
* delete the contents of the directories if the problem persists

hadoop namenode not getting detected

I am trying to configure a pseudo-distributed single-node Hadoop cluster on my Ubuntu machine. However, when I give the following command
bin/start-all.sh
it starts all the daemons, but when I run jps it does not show the NameNode, and when I go to the NameNode logs
the following message is displayed:
2013-04-26 11:59:09,927 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.IOException: Incomplete HDFS URI, no host: hdfs://vikasXXX.XX.XX.XX:X000
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:85)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
at org.apache.hadoop.fs.Trash.<init>(Trash.java:62)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startTrashEmptier(NameNode.java:314)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:310)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:496)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1288)
2013-04-26 11:59:09,929 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at vikas/XXX.XX.XX.XX
************************************************************/
What could be the reason?
Thanks
I assume that you have masked your URL before sharing it:
hdfs://vikasXXX.XX.XX.XX:X000
I think it is not recognizing your machine by name. Try using localhost and check if it works:
hdfs://localhost:8020
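In a Hadoop 1.x core-site.xml that would look something like this (a sketch; fs.default.name is the Hadoop 1.x property name, and 8020 is only the conventional NameNode RPC port, so use whatever port you configured):

<!-- core-site.xml: address clients and daemons use to reach the NameNode -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:8020</value>
</property>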
