I am confused about how Hadoop uses the configuration in core-site.xml and hdfs-site.xml. It looks as if the start-dfs.sh script does not actually use these settings. I formatted the NameNode successfully as the hdfs user, but running start-dfs.sh fails to start the HDFS daemons. Can anyone help? Here is the error message:
[hdfs@I26C ~]$ start-dfs.sh
Starting namenodes on [I26C]
I26C: mkdir: cannot create directory ‘/hdfs’: Permission denied
I26C: chown: cannot access ‘/hdfs/hdfs’: No such file or directory
I26C: starting namenode, logging to /hdfs/hdfs/hadoop-hdfs-namenode-I26C.out
I26C: /edw/hadoop-2.7.2/sbin/hadoop-daemon.sh: line 159: /hdfs/hdfs/hadoop-hdfs-namenode-I26C.out: No such file or directory
I26C: head: cannot open ‘/hdfs/hdfs/hadoop-hdfs-namenode-I26C.out’ for reading: No such file or directory
I26C: /edw/hadoop-2.7.2/sbin/hadoop-daemon.sh: line 177: /hdfs/hdfs/hadoop-hdfs-namenode-I26C.out: No such file or directory
I26C: /edw/hadoop-2.7.2/sbin/hadoop-daemon.sh: line 178: /hdfs/hdfs/hadoop-hdfs-namenode-I26C.out: No such file or directory
10.1.226.15: mkdir: cannot create directory ‘/hdfs’: Permission denied
10.1.226.15: chown: cannot access ‘/hdfs/hdfs’: No such file or directory
10.1.226.15: starting datanode, logging to /hdfs/hdfs/hadoop-hdfs-datanode-I26C.out
10.1.226.15: /edw/hadoop-2.7.2/sbin/hadoop-daemon.sh: line 159: /hdfs/hdfs/hadoop-hdfs-datanode-I26C.out: No such file or directory
10.1.226.16: mkdir: cannot create directory ‘/edw/hadoop-2.7.2/logs’: Permission denied
10.1.226.16: chown: cannot access ‘/edw/hadoop-2.7.2/logs’: No such file or directory
10.1.226.16: starting datanode, logging to /edw/hadoop-2.7.2/logs/hadoop-hdfs-datanode-I26D.out
10.1.226.16: /edw/hadoop-2.7.2/sbin/hadoop-daemon.sh: line 159: /edw/hadoop-2.7.2/logs/hadoop-hdfs-datanode-I26D.out: No such file or directory
10.1.226.15: head: cannot open ‘/hdfs/hdfs/hadoop-hdfs-datanode-I26C.out’ for reading: No such file or directory
10.1.226.15: /edw/hadoop-2.7.2/sbin/hadoop-daemon.sh: line 177: /hdfs/hdfs/hadoop-hdfs-datanode-I26C.out: No such file or directory
10.1.226.15: /edw/hadoop-2.7.2/sbin/hadoop-daemon.sh: line 178: /hdfs/hdfs/hadoop-hdfs-datanode-I26C.out: No such file or directory
10.1.226.16: head: cannot open ‘/edw/hadoop-2.7.2/logs/hadoop-hdfs-datanode-I26D.out’ for reading: No such file or directory
10.1.226.16: /edw/hadoop-2.7.2/sbin/hadoop-daemon.sh: line 177: /edw/hadoop-2.7.2/logs/hadoop-hdfs-datanode-I26D.out: No such file or directory
10.1.226.16: /edw/hadoop-2.7.2/sbin/hadoop-daemon.sh: line 178: /edw/hadoop-2.7.2/logs/hadoop-hdfs-datanode-I26D.out: No such file or directory
Starting secondary namenodes [0.0.0.0]
0.0.0.0: mkdir: cannot create directory ‘/hdfs’: Permission denied
0.0.0.0: chown: cannot access ‘/hdfs/hdfs’: No such file or directory
0.0.0.0: starting secondarynamenode, logging to /hdfs/hdfs/hadoop-hdfs-secondarynamenode-I26C.out
0.0.0.0: /edw/hadoop-2.7.2/sbin/hadoop-daemon.sh: line 159: /hdfs/hdfs/hadoop-hdfs-secondarynamenode-I26C.out: No such file or directory
0.0.0.0: head: cannot open ‘/hdfs/hdfs/hadoop-hdfs-secondarynamenode-I26C.out’ for reading: No such file or directory
0.0.0.0: /edw/hadoop-2.7.2/sbin/hadoop-daemon.sh: line 177: /hdfs/hdfs/hadoop-hdfs-secondarynamenode-I26C.out: No such file or directory
0.0.0.0: /edw/hadoop-2.7.2/sbin/hadoop-daemon.sh: line 178: /hdfs/hdfs/hadoop-hdfs-secondarynamenode-I26C.out: No such file or directory
Here is the info about my deployment
master:
hostname: I26C
IP:10.1.226.15
Slave:
hostname:I26D
IP:10.1.226.16
Hadoop version: 2.7.2
OS: CentOS 7
JAVA: 1.8
I have created four users:
groupadd hadoop
useradd -g hadoop hadoop
useradd -g hadoop hdfs
useradd -g hadoop mapred
useradd -g hadoop yarn
The permissions on the HDFS NameNode and DataNode directory:
drwxrwxr-x. 3 hadoop hadoop 4.0K Apr 26 15:40 hadoop-data
The core-site.xml setting:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/edw/hadoop-data/</value>
<description>Temporary Directory.</description>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://10.1.226.15:54310</value>
</property>
</configuration>
The hdfs-site.xml setting:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///edw/hadoop-data/dfs/namenode</value>
<description>Determines where on the local filesystem the DFS name node should store the name table(fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.
</description>
</property>
<property>
<name>dfs.blocksize</name>
<value>67108864</value>
</property>
<property>
<name>dfs.namenode.handler.count</name>
<value>100</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///edw/hadoop-data/dfs/datanode</value>
<description>Determines where on the local filesystem an DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.
</description>
</property>
</configuration>
The hdfs user doesn't have permission on the Hadoop directories.
Say you run the Hadoop setup as the hdfs user in the hadoop group. Then you need to run the following command:
sudo chown -R hdfs:hadoop <directory-name>
Make sure the logged-in user has read, write and execute permission on that directory.
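Before changing ownership, it can help to confirm which directories the daemon user really cannot write. A minimal sketch, assuming a POSIX shell; check_writable is a made-up helper name, not something Hadoop ships:

```shell
#!/bin/sh
# check_writable DIR...: report whether the current user can create files
# in each directory (hypothetical helper, not part of Hadoop).
check_writable() {
    rc=0
    for dir in "$@"; do
        # A directory is usable only if it exists and is writable and searchable
        if [ -d "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]; then
            echo "ok: $dir"
        else
            echo "NOT writable: $dir"
            rc=1
        fi
    done
    return $rc
}
```

Run it as the user that starts the daemons, for example check_writable /edw/hadoop-data /edw/hadoop-2.7.2/logs. Every "NOT writable" line points at a directory that needs chown or chmod.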
I have fixed the problem, thank you all.
The HADOOP_LOG_DIR value I set in /etc/profile is not visible when hadoop-env.sh runs, so HADOOP_LOG_DIR is empty at that point and start-dfs.sh uses whatever hadoop-env.sh computes:
export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER
Because I run start-dfs.sh as the hdfs user, this expands to /hdfs, a directory directly under / that the hdfs user has no permission to create.
My solution is to edit ${HADOOP_HOME}/etc/hadoop/hadoop-env.sh and set HADOOP_LOG_DIR explicitly:
HADOOP_LOG_DIR="/var/log/hadoop"
export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER
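The expansion can be reproduced outside Hadoop. A small sketch, assuming a POSIX shell; expand_log_dir is a hypothetical helper that only mirrors the string concatenation done by the export line:

```shell
#!/bin/sh
# expand_log_dir BASE USER: same string expansion as
#   export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER
# (illustration only; BASE stands in for the incoming HADOOP_LOG_DIR)
expand_log_dir() {
    echo "$1/$2"
}

expand_log_dir "" hdfs                 # base empty: prints /hdfs
expand_log_dir /var/log/hadoop hdfs    # base set:   prints /var/log/hadoop/hdfs
```

The resulting directory still has to exist, or be creatable, by the user that starts the daemons.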
I installed Hadoop in my Ubuntu 12.04 by following the procedure in the below link.
http://www.bogotobogo.com/Hadoop/BigData_hadoop_Install_on_ubuntu_single_node_cluster.php
Everything is installed successfully and when I run the start-all.sh only some of the services are running.
wanderer@wanderer-Lenovo-IdeaPad-S510p:~$ su - hduse
Password:
hduse@wanderer-Lenovo-IdeaPad-S510p:~$ cd /usr/local/hadoop/sbin
hduse@wanderer-Lenovo-IdeaPad-S510p:/usr/local/hadoop/sbin$ start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [localhost]
hduse@localhost's password:
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hduse-namenode-wanderer-Lenovo-IdeaPad-S510p.out
hduse@localhost's password:
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hduse-datanode-wanderer-Lenovo-IdeaPad-S510p.out
Starting secondary namenodes [0.0.0.0]
hduse@0.0.0.0's password:
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hduse-secondarynamenode-wanderer-Lenovo-IdeaPad-S510p.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hduse-resourcemanager-wanderer-Lenovo-IdeaPad-S510p.out
hduse@localhost's password:
localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hduse-nodemanager-wanderer-Lenovo-IdeaPad-S510p.out
hduse@wanderer-Lenovo-IdeaPad-S510p:/usr/local/hadoop/sbin$ jps
7940 Jps
7545 ResourceManager
7885 NodeManager
When I stop the services by running the stop-all.sh script, I get:
hduse@wanderer-Lenovo-IdeaPad-S510p:/usr/local/hadoop/sbin$ stop-all.sh
This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
Stopping namenodes on [localhost]
hduse@localhost's password:
localhost: no namenode to stop
hduse@localhost's password:
localhost: no datanode to stop
Stopping secondary namenodes [0.0.0.0]
hduse@0.0.0.0's password:
0.0.0.0: no secondarynamenode to stop
stopping yarn daemons
stopping resourcemanager
hduse@localhost's password:
localhost: stopping nodemanager
no proxyserver to stop
My configuration files
Editing bashrc file
vi ~/.bashrc
#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-8-oracle/
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
#HADOOP VARIABLES END
hdfs-site.xml
vi /usr/local/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
</configuration>
hadoop-env.sh
vi /usr/local/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-8-oracle/
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
if [ "$HADOOP_CLASSPATH" ]; then
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
else
export HADOOP_CLASSPATH=$f
fi
done
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_NFS3_OPTS="$HADOOP_NFS3_OPTS"
export HADOOP_PORTMAP_OPTS="-Xmx512m $HADOOP_PORTMAP_OPTS"
# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER}
export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}
export HADOOP_PID_DIR=${HADOOP_PID_DIR}
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}
# A string representing this instance of hadoop. $USER by default.
export HADOOP_IDENT_STRING=$USER
core-site.xml
vi /usr/local/hadoop/etc/hadoop/core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>
mapred-site.xml
vi /usr/local/hadoop/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
</configuration>
$ javac -version
javac 1.8.0_66
$ java -version
java version "1.8.0_66"
Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)
I am new to Hadoop and could not find the issue. Where can I find the log files for Jobtracker and NameNode in order to track the services?
If it is not an ssh issue, do the following:
Delete the temporary directory with rm -Rf /app/hadoop/tmp, then format the namenode with bin/hadoop namenode -format.
Start the namenode and datanode with sbin/start-dfs.sh.
Type jps on the command line to check whether the nodes are running.
Check whether hduser has write access to the hadoop_store/hdfs/namenode and datanode directories with ls -ld directory.
You can change the permissions with sudo chmod +777 /usr/local/hadoop_store/hdfs/namenode/
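The jps check can be made less error-prone by comparing its output against the daemons you expect. A rough sketch, assuming a POSIX shell with awk; missing_daemons is a hypothetical helper and the daemon list is my assumption for a single-node Hadoop 2.x setup:

```shell
#!/bin/sh
# missing_daemons: read `jps` output on stdin and print every expected
# daemon that is absent (hypothetical helper; the list is an assumption).
missing_daemons() {
    jps_out=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <class name>"; compare the second field exactly
        if ! printf '%s\n' "$jps_out" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}
```

Usage: jps | missing_daemons. With the jps output from the question it would list NameNode, DataNode and SecondaryNameNode, which are the ones whose logs need checking.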
If you take a closer look at the output of start-all.sh, you can easily see the log file paths. Each service reports the file it logs to as it starts:
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hduse-namenode-wanderer-Lenovo-IdeaPad-S510p.out
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hduse-datanode-wanderer-Lenovo-IdeaPad-S510p.out
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hduse-secondarynamenode-wanderer-Lenovo-IdeaPad-S510p.out
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hduse-resourcemanager-wanderer-Lenovo-IdeaPad-S510p.out
localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hduse-nodemanager-wanderer-Lenovo-IdeaPad-S510p.out
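The .out files hold only the daemon's stdout and stderr. The detailed log4j output, where startup exceptions normally end up, is usually in a .log file with the same base name next to it. A tiny sketch, assuming that naming holds for your installation; log_for_out is a hypothetical helper:

```shell
#!/bin/sh
# log_for_out PATH: turn a ".out" path printed by the start scripts into
# the matching ".log" path (hypothetical helper).
log_for_out() {
    echo "${1%.out}.log"
}
```

For example, tail -n 50 "$(log_for_out /usr/local/hadoop/logs/hadoop-hduse-namenode-wanderer-Lenovo-IdeaPad-S510p.out)" shows the last lines the NameNode wrote before it stopped.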
You have to set up passwordless authentication for ssh. The hduse user should be able to log in to localhost over ssh without a password.
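One way to do that is to add the user's own public key to its authorized_keys file. A minimal sketch, assuming a POSIX shell and an existing public key file; authorize_key is a hypothetical helper, not an OpenSSH tool:

```shell
#!/bin/sh
# authorize_key PUBKEY AUTH_FILE: append a public key to an authorized_keys
# file unless it is already listed (hypothetical helper).
authorize_key() {
    pub=$1
    auth=$2
    mkdir -p "$(dirname "$auth")"
    touch "$auth"
    # -x matches whole lines, -F treats the key as a fixed string
    if ! grep -qxF -- "$(cat "$pub")" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    # sshd with StrictModes rejects key files that others can write to
    chmod 700 "$(dirname "$auth")"
    chmod 600 "$auth"
}
```

As hduse, generate a key with ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa, then run authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys. After that, ssh localhost should no longer ask for a password.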
The namenode is not showing
After running jps, the namenode is not listed although the datanode is. To solve the problem, follow the steps below.
They worked for me with Hadoop 2.7.6.
Step 1: Stop Hadoop
/usr/local/hadoop/sbin$ stop-dfs.sh
Step 2: Remove the tmp directory
/usr/local/hadoop/sbin$ sudo rm -rf /app/hadoop/tmp/
Step 3: Create a new tmp directory
/usr/local/hadoop/sbin$ sudo mkdir -p /app/hadoop/tmp
/usr/local/hadoop/sbin$ sudo chown hduser:hadoop /app/hadoop/tmp
/usr/local/hadoop/sbin$ chmod 750 /app/hadoop/tmp
Step 4: Format the namenode
/usr/local/hadoop/sbin$ hdfs namenode -format
Step 5: Start Hadoop
/usr/local/hadoop/sbin$ start-all.sh
/usr/local/hadoop/sbin$ jps
The namenode is now showing
I learned that I have to configure the NameNode and DataNode dir in hdfs-site.xml. So that's my hdfs-site.xml configuration on the NameNode:
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file://usr/local/hadoop-2.6.0/hadoop_data/hdfs/namenode</value>
</property>
<property>
<name>dfs.block.size</name>
<value>134217728</value>
</property>
</configuration>
I did almost the same on my DataNode and changed dfs.namenode to dfs.datanode.
Then I formatted the filesystem via
hadoop namenode -format
Everything seems to be finished without an error.
Then I wanted to create a directory in my HDFS filesystem by using:
hdfs dfs -mkdir test
And I got an error:
mkdir: `test': No such file or directory
What did I miss or what's the common process from formatting to creating files/directories with HDFS?
Well, it was easy: the path needs a leading slash.
hdfs dfs -mkdir /test
created the directory successfully, and
hdfs dfs -put myFile /test/myFile
works as well.
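The reason is that a path without a leading slash is resolved against the user's HDFS home directory, /user/<username>, which does not exist on a freshly formatted filesystem. A small sketch of that rule, assuming a POSIX shell; resolve_hdfs_path is only an illustration, not Hadoop code:

```shell
#!/bin/sh
# resolve_hdfs_path PATH USER: mimic how an HDFS path argument is resolved.
# Absolute paths are used as-is; relative ones are taken relative to the
# user's home directory /user/USER. Illustration only.
resolve_hdfs_path() {
    case $1 in
        /*) echo "$1" ;;
        *)  echo "/user/$2/$1" ;;
    esac
}
```

So hdfs dfs -mkdir test tries to create /user/<username>/test. Creating the home directory once with hdfs dfs -mkdir -p /user/$USER should make the relative form work too.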
Create a directory:
hdfs dfs -mkdir directoryName
Create a new, empty file in the directory:
hdfs dfs -touchz directoryName/fileName
To write content, edit a local copy (nano cannot open files inside HDFS):
nano fileName
Save it with Ctrl+X, then Y, and upload it over the empty file:
hdfs dfs -put -f fileName directoryName/fileName
Read the file back from HDFS:
hdfs dfs -cat directoryName/fileName
HDFS is not a POSIX-compliant file system, so you can't edit files directly inside HDFS. However, you can copy a file from your local system to HDFS using the following command:
hdfs dfs -put /path/in/source/system/filename /path/in/HDFS/system/destination
If you want to create nested sub-directories in one step, also use the -p flag:
hdfs dfs -mkdir -p /test/another_test/one_more_test
I am trying to run Hadoop in Pseudo-Distributed mode. For this I am trying to follow this tutorial http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
I can ssh to my localhost and Format the filesystem. However, I can't start NameNode daemon and DataNode daemon by this command :
sbin/start-dfs.sh
When I execute it with sudo I get:
ubuntu@ip-172-31-42-67:/usr/local/hadoop-2.6.0$ sudo sbin/start-dfs.sh
Starting namenodes on [localhost]
localhost: Permission denied (publickey).
localhost: Permission denied (publickey).
Starting secondary namenodes [0.0.0.0]
0.0.0.0: Permission denied (publickey).
and when executed without sudo:
ubuntu@ip-172-31-42-67:/usr/local/hadoop-2.6.0$ sbin/start-dfs.sh
Starting namenodes on [localhost]
localhost: mkdir: cannot create directory ‘/usr/local/hadoop-2.6.0/logs’: Permission denied
localhost: chown: cannot access ‘/usr/local/hadoop-2.6.0/logs’: No such file or directory
localhost: starting namenode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-namenode-ip-172-31-42-67.out
localhost: /usr/local/hadoop-2.6.0/sbin/hadoop-daemon.sh: line 159: /usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-namenode-ip-172-31-42-67.out: No such file or directory
localhost: head: cannot open ‘/usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-namenode-ip-172-31-42-67.out’ for reading: No such file or directory
localhost: /usr/local/hadoop-2.6.0/sbin/hadoop-daemon.sh: line 177: /usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-namenode-ip-172-31-42-67.out: No such file or directory
localhost: /usr/local/hadoop-2.6.0/sbin/hadoop-daemon.sh: line 178: /usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-namenode-ip-172-31-42-67.out: No such file or directory
localhost: mkdir: cannot create directory ‘/usr/local/hadoop-2.6.0/logs’: Permission denied
localhost: chown: cannot access ‘/usr/local/hadoop-2.6.0/logs’: No such file or directory
localhost: starting datanode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-datanode-ip-172-31-42-67.out
localhost: /usr/local/hadoop-2.6.0/sbin/hadoop-daemon.sh: line 159: /usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-datanode-ip-172-31-42-67.out: No such file or directory
localhost: head: cannot open ‘/usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-datanode-ip-172-31-42-67.out’ for reading: No such file or directory
localhost: /usr/local/hadoop-2.6.0/sbin/hadoop-daemon.sh: line 177: /usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-datanode-ip-172-31-42-67.out: No such file or directory
localhost: /usr/local/hadoop-2.6.0/sbin/hadoop-daemon.sh: line 178: /usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-datanode-ip-172-31-42-67.out: No such file or directory
Starting secondary namenodes [0.0.0.0]
0.0.0.0: mkdir: cannot create directory ‘/usr/local/hadoop-2.6.0/logs’: Permission denied
0.0.0.0: chown: cannot access ‘/usr/local/hadoop-2.6.0/logs’: No such file or directory
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-secondarynamenode-ip-172-31-42-67.out
0.0.0.0: /usr/local/hadoop-2.6.0/sbin/hadoop-daemon.sh: line 159: /usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-secondarynamenode-ip-172-31-42-67.out: No such file or directory
0.0.0.0: head: cannot open ‘/usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-secondarynamenode-ip-172-31-42-67.out’ for reading: No such file or directory
0.0.0.0: /usr/local/hadoop-2.6.0/sbin/hadoop-daemon.sh: line 177: /usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-secondarynamenode-ip-172-31-42-67.out: No such file or directory
0.0.0.0: /usr/local/hadoop-2.6.0/sbin/hadoop-daemon.sh: line 178: /usr/local/hadoop-2.6.0/logs/hadoop-ubuntu-secondarynamenode-ip-172-31-42-67.out: No such file or directory
I also notice now that listing the contents of HDFS directories fails, for example:
ubuntu@ip-172-31-42-67:~/dir$ hdfs dfs -ls output/
ls: Call From ip-172-31-42-67.us-west-2.compute.internal/172.31.42.67 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Can anyone tell me what could be the problem ?
I had the same problem and the only solution I found was:
https://anuragsoni.wordpress.com/2015/07/05/hadoop-start-dfs-sh-localhost-permission-denied-how-to-fix/
which suggests generating a new RSA ssh key.
The errors above suggest a permissions problem.
You have to make sure that the hadoop user has the proper privileges to /usr/local/hadoop.
For this purpose you can try:
sudo chown -R hadoop /usr/local/hadoop/
Or
sudo chmod 777 /usr/local/hadoop/
Please make sure that you do the following configuration correctly. You need to edit four .xml files:
Edit the file hadoop-2.6.0/etc/hadoop/core-site.xml and, between <configuration> and </configuration>, put in:
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
Edit the file hadoop-2.6.0/etc/hadoop/hdfs-site.xml and, between <configuration> and </configuration>, put in:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
Edit the file hadoop-2.6.0/etc/hadoop/mapred-site.xml and, between <configuration> and </configuration>, paste the following and save:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
Edit the file hadoop-2.6.0/etc/hadoop/yarn-site.xml and, between <configuration> and </configuration>, paste the following and save:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
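Since the ls error in the question shows the client connecting to localhost:9000, it is worth confirming what fs.defaultFS says in the file Hadoop actually reads. A rough sketch, assuming a POSIX shell with awk and a config file that keeps each name and value tag on a single line; get_prop is a hypothetical helper and not a real XML parser:

```shell
#!/bin/sh
# get_prop FILE NAME: print the <value> that follows <name>NAME</name> in a
# Hadoop *-site.xml file (hypothetical helper; line-based, not a real parser).
get_prop() {
    awk -v want="$2" '
        /<name>/ {
            line = $0
            sub(/.*<name>[ \t]*/, "", line)
            sub(/[ \t]*<\/name>.*/, "", line)
            found = (line == want)
        }
        found && /<value>/ {
            line = $0
            sub(/.*<value>[ \t]*/, "", line)
            sub(/[ \t]*<\/value>.*/, "", line)
            print line
            exit
        }
    ' "$1"
}
```

For example, get_prop hadoop-2.6.0/etc/hadoop/core-site.xml fs.defaultFS. The value that the Hadoop client itself resolves can be printed with hdfs getconf -confKey fs.defaultFS.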
I have set up Hadoop on my local Mac. When I run start-dfs.sh as a separate hadoop user, I get the following error in the terminal:
0.0.0.0: mkdir: /usr/local/Cellar/hadoop/2.3.0/libexec/logs: Permission denied
Does anyone know how I can change the log directory for Hadoop? I installed Hadoop using Homebrew.
bash-3.2$ start-dfs.sh
14/03/31 09:04:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: mkdir: /usr/local/Cellar/hadoop/2.3.0/libexec/logs: Permission denied
localhost: chown: /usr/local/Cellar/hadoop/2.3.0/libexec/logs: No such file or directory
localhost: starting namenode, logging to /usr/local/Cellar/hadoop/2.3.0/libexec/logs/hadoop-hadoop-namenode-mymac.local.out
localhost: /usr/local/Cellar/hadoop/2.3.0/libexec/sbin/hadoop-daemon.sh: line 151: /usr/local/Cellar/hadoop/2.3.0/libexec/logs/hadoop-hadoop-namenode-mymac.local.out: No such file or directory
localhost: head: /usr/local/Cellar/hadoop/2.3.0/libexec/logs/hadoop-hadoop-namenode-mymac.local.out: No such file or directory
localhost: /usr/local/Cellar/hadoop/2.3.0/libexec/sbin/hadoop-daemon.sh: line 166: /usr/local/Cellar/hadoop/2.3.0/libexec/logs/hadoop-hadoop-namenode-mymac.local.out: No such file or directory
localhost: /usr/local/Cellar/hadoop/2.3.0/libexec/sbin/hadoop-daemon.sh: line 167: /usr/local/Cellar/hadoop/2.3.0/libexec/logs/hadoop-hadoop-namenode-mymac.local.out: No such file or directory
localhost: mkdir: /usr/local/Cellar/hadoop/2.3.0/libexec/logs: Permission denied
localhost: chown: /usr/local/Cellar/hadoop/2.3.0/libexec/logs: No such file or directory
The error indicates a permissions problem. The hadoop user needs the proper privileges to the hadoop folder. Try running the following in Terminal:
sudo chown -R hadoop /usr/local/Cellar/hadoop/2.3.0/
I am trying to install Hadoop 2.2.0 and I get the following error while starting the datanode service. Please help me resolve this issue. Thanks in advance.
2014-03-11 08:48:16,406 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/in_use.lock acquired by nodename 3627@prassanna-Studio-1558
2014-03-11 08:48:16,426 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-611836968-127.0.1.1-1394507838610 (storage id DS-1960076343-127.0.1.1-50010-1394127604582) service to localhost/127.0.0.1:9000
java.io.IOException: Incompatible clusterIDs in /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode: namenode clusterID = CID-fb61aa70-4b15-470e-a1d0-12653e357a10; datanode clusterID = CID-8bf63244-0510-4db6-a949-8f74b50f2be9
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:391)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:191)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:219)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:837)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:808)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:280)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:222)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:664)
at java.lang.Thread.run(Thread.java:662)
2014-03-11 08:48:16,427 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-611836968-127.0.1.1-1394507838610 (storage id DS-1960076343-127.0.1.1-50010-1394127604582) service to localhost/127.0.0.1:9000
2014-03-11 08:48:16,532 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-611836968-127.0.1.1-1394507838610 (storage id DS-1960076343-127.0.1.1-50010-1394127604582)
2014-03-11 08:48:18,532 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2014-03-11 08:48:18,534 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2014-03-11 08:48:18,536 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/**********************************
SHUTDOWN_MSG: Shutting down DataNode at prassanna-Studio-1558/127.0.1.1
Make sure you are ready with correct configuration and right path.
This is a link for Running Hadoop on ubuntu.
I have used this link to setup hadoop in my machine and it works fine.
That simply shows that the datanode tried to start up but hit an exception and died.
Please check the datanode log under the logs folder in the hadoop installation folder (unless you changed that config) for exceptions. It usually points to a configuration issue of some kind, esp. network settings (/etc/hosts) related but there are quite a few possibilities.
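In the log from the question, the FATAL line reports "Incompatible clusterIDs". That typically happens when the NameNode was formatted again while the DataNode directory still holds the ID from the earlier format. Both IDs are recorded in the current/VERSION file of each storage directory, so they can be compared directly. A minimal sketch, assuming a POSIX shell; cluster_id and same_cluster are hypothetical helpers:

```shell
#!/bin/sh
# cluster_id VERSION_FILE: print the clusterID recorded in a storage
# directory's current/VERSION file (hypothetical helper).
cluster_id() {
    grep '^clusterID=' "$1" | cut -d= -f2
}

# same_cluster NN_VERSION DN_VERSION: succeed only if both IDs match.
same_cluster() {
    [ "$(cluster_id "$1")" = "$(cluster_id "$2")" ]
}
```

Point the first argument at <dfs.namenode.name.dir>/current/VERSION and the second at <dfs.datanode.data.dir>/current/VERSION. If they differ and the DataNode holds no data you need, emptying the DataNode directory and restarting should let it pick up the NameNode's ID.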
Refer to the following steps.
1. Check JAVA_HOME:
readlink -f $(which java)
/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
2. If Java is not available, install it with the command
sudo apt-get install default-jdk
then run step 1 again and check in the terminal:
java -version
javac -version
3. Configure SSH
Hadoop requires SSH access to manage its nodes, i.e. remote machines plus your local machine if you want to use Hadoop on it (which is what we want to do in this short tutorial). For our single-node setup of Hadoop, we therefore need to configure SSH access to localhost for the hadoop user.
sudo apt-get install ssh
sudo su hadoop
ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
ssh localhost
Download and extract hadoop-2.7.3 (choose a directory where you have read and write permission)
Set Environment Variable
sudo gedit .bashrc
source .bashrc
Setup Configuration Files
The following files will have to be modified to complete the Hadoop setup:
~/.bashrc (Already done)
(PATH)/etc/hadoop/hadoop-env.sh
(PATH)/etc/hadoop/core-site.xml
(PATH)/etc/hadoop/mapred-site.xml.template
(PATH)/etc/hadoop/hdfs-site.xml
gedit (PATH)/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
gedit (PATH)/etc/hadoop/core-site.xml:
The (PATH)/etc/hadoop/core-site.xml file contains configuration properties that Hadoop uses when starting up.
This file can be used to override the default settings that Hadoop starts with.
Create the temporary directory first: sudo mkdir -p /app/hadoop/tmp
Open the file and enter the following in between the <configuration></configuration> tags:
gedit /usr/local/hadoop/etc/hadoop/core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>
(PATH)/etc/hadoop/mapred-site.xml
By default, the (PATH)/etc/hadoop/ folder contains (PATH)/etc/hadoop/mapred-site.xml.template file which has to be renamed/copied with the name mapred-site.xml:
cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
The mapred-site.xml file is used to specify which framework is being used for MapReduce.
We need to enter the following content in between the <configuration></configuration> tag:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
</configuration>
(PATH)/etc/hadoop/hdfs-site.xml
The (PATH)/etc/hadoop/hdfs-site.xml file needs to be configured for each host in the cluster that is being used.
It is used to specify the directories which will be used as the namenode and the datanode on that host.
Before editing this file, we need to create two directories which will contain the namenode and the datanode for this Hadoop installation.
This can be done using the following commands:
sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
Open the file and enter the following content in between the <configuration></configuration> tag:
gedit (PATH)/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
</configuration>
Format the New Hadoop Filesystem
Now the Hadoop file system needs to be formatted so that we can start to use it. The format command must be run by a user with write permission, since it creates a current directory under the /usr/local/hadoop_store/ folder:
bin/hadoop namenode -format
or
bin/hdfs namenode -format
HADOOP SETUP IS DONE
Now start the hdfs
start-dfs.sh
start-yarn.sh
CHECK URL: http://localhost:50070/
FOR STOPPING HDFS
stop-dfs.sh
stop-yarn.sh