Hadoop 3.1.1 Mac OS Namenode Issues

I had Hadoop running on my machine, but I was running into some compiler issues, so I deleted it and started fresh.
I was following this setup: https://www.guru99.com/how-to-install-hadoop.html
When I run $HADOOP_HOME/bin/hdfs namenode -format, the terminal doesn't return anything.
Thanks in advance.

I had the same issue and fixed it as follows:
Go to the etc directory and adjust the files hadoop-env.sh, core-site.xml, mapred-site.xml, and yarn-site.xml.
Go to sbin to run the format and launch the HDFS services.
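A rough sketch of that sequence, assuming a standard tarball install under $HADOOP_HOME (adjust paths to your layout):
cd $HADOOP_HOME/etc/hadoop                # edit hadoop-env.sh, core-site.xml, mapred-site.xml, yarn-site.xml here
$HADOOP_HOME/bin/hdfs namenode -format    # format once the configuration is in place
$HADOOP_HOME/sbin/start-dfs.sh            # launch the HDFS daemons (NameNode, DataNode, SecondaryNameNode)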

Running it with sudo will get you a result, for example:
sudo hdfs namenode -format

You need to make sure that you have already adjusted the files hadoop-env.sh, core-site.xml, mapred-site.xml, and yarn-site.xml before proceeding to the namenode format.
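For reference, a minimal core-site.xml along these lines is commonly used for a single-node setup. This is only a sketch: hdfs://localhost:9000 is a typical pseudo-distributed default, not a requirement, so check your tutorial's values before overwriting an existing file.
cat > $HADOOP_HOME/etc/hadoop/core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF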

Related

Hadoop Namenode Error

I have successfully installed Hadoop on my Ubuntu system.
But when I run the command start-all.sh, all the daemons start except the namenode.
Please help me. I have attached an image showing the problem.
Try formatting the namenode; it might work for you:
bin/hadoop namenode -format
and then try
start-all.sh
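If the namenode still does not come up, its log usually explains why. A quick way to check (log file names follow Hadoop's hadoop-<user>-namenode-<host>.log convention; the exact log directory depends on your install):
tail -n 50 $HADOOP_HOME/logs/hadoop-*-namenode-*.log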

Hadoop: Cannot delete a directory. Name node is in safe mode

When I am trying to delete a directory in the HDFS file system, I am getting the following error:
Cannot delete directory. Name node is in safe mode.
How do I solve this issue? Please advise.
If you see that error, it means the namenode is in safe mode, which is almost equivalent to read-only mode.
To take the namenode out of safe mode, run the command below:
$ hadoop dfsadmin -safemode leave
If you are using Hadoop 2.9.0 or higher, use
hdfs dfsadmin -safemode leave
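You can also confirm the current state before and after with the same dfsadmin tool:
hdfs dfsadmin -safemode get    # prints whether safe mode is ON or OFF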
In my case, hadoop dfsadmin -safemode leave cancelled safe mode, but as soon as I tried to delete the old directory, the system returned to safe mode.
I deleted all the tmp folders I could find related to Hadoop installations, but the old directory did not disappear and could not be deleted.
Finally I used:
ps aux | grep -i namenode
and discovered a running process that was using parameters from an older Hadoop installation (a different version). I killed that process with kill <pid>, and this finally allowed the old directory to be removed.
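A sketch of that diagnosis, with <pid> standing in for whatever process ID the listing shows:
ps aux | grep -i [n]amenode    # the [n] keeps the grep command itself out of the output
kill <pid>                     # send the default TERM signal first
kill -9 <pid>                  # only if the process ignores TERM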

Hadoop Installation: Format Namenode

I'm struggling to install Hadoop 2.2.0 on Mac OS X 10.9.3. I essentially followed this tutorial:
http://www.alexjf.net/blog/distributed-systems/hadoop-yarn-installation-definitive-guide
When I run $HADOOP_PREFIX/bin/hdfs namenode -format to format the namenode, I get the message:
SHUTDOWN_MSG: Shutting down NameNode at Macintosh.local/192.168.0.103. I believe this is preventing me from successfully running the test:
$HADOOP_PREFIX/bin/hadoop jar \
  $HADOOP_PREFIX/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar \
  org.apache.hadoop.yarn.applications.distributedshell.Client \
  --jar $HADOOP_PREFIX/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar \
  --shell_command date --num_containers 2 --master_memory 1024
Does anyone know how to correctly format the namenode?
(Regarding the test command above, someone mentioned to me that it could be related to the HDFS file system not functioning properly, in case that is relevant.)
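On that last point, one way to check whether HDFS itself is healthy before rerunning the test (a diagnostic sketch, not from the original thread):
$HADOOP_PREFIX/bin/hdfs dfsadmin -report    # shows live datanodes and capacity
$HADOOP_PREFIX/bin/hdfs dfs -ls /           # fails quickly if the namenode is down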

Need help adding multiple DataNodes in pseudo-distributed mode (one machine), using Hadoop-0.18.0

I am a student interested in Hadoop, and I started exploring it recently.
I tried adding an additional DataNode in pseudo-distributed mode but failed.
I am following the Yahoo developer tutorial, so the version of Hadoop I am using is hadoop-0.18.0.
I tried to start up using 2 methods I found online:
Method 1 (link)
I have a problem with this line
bin/hadoop-daemon.sh --script bin/hdfs $1 datanode $DN_CONF_OPTS
--script bin/hdfs doesn't seem to be valid in the version I am using. I changed it to --config $HADOOP_HOME/conf2, with all the configuration files in that directory, but when the script is run it gives the error:
Usage: Java DataNode [-rollback]
Any idea what this error means? The log files are created, but the DataNode does not start.
Method 2 (link)
Basically, I duplicated the conf folder to a conf2 folder, making the necessary changes documented on the website to hadoop-site.xml and hadoop-env.sh. Then I ran the command:
./hadoop-daemon.sh --config ..../conf2 start datanode
It gives the error:
datanode running as process 4190. stop it first.
So I guess this refers to the first DataNode that was started, and the command failed to start another DataNode.
Is there anything I can do to start an additional DataNode in the Yahoo VM Hadoop environment? Any help/advice would be greatly appreciated.
Hadoop start/stop scripts use /tmp as the default directory for storing the PIDs of already started daemons. In your situation, when you start a second datanode, the startup script finds the /tmp/hadoop-someuser-datanode.pid file from the first datanode and assumes that a datanode daemon is already running.
The plain solution is to set the HADOOP_PID_DIR environment variable to something other than /tmp. Also, do not forget to update all the network port numbers in conf2.
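A sketch of the plain solution (the PID directory path here is just an example; any writable directory outside /tmp works):
export HADOOP_PID_DIR=$HOME/hadoop-pids2    # example path, not a required location
mkdir -p $HADOOP_PID_DIR
$HADOOP_HOME/bin/hadoop-daemon.sh --config $HADOOP_HOME/conf2 start datanode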
The smarter solution is to start a second VM with a Hadoop environment and join the two machines in a single cluster; that is how Hadoop is intended to be used.

HADOOP_HOME and hadoop streaming

Hi, I am trying to run Hadoop on a server that has Hadoop installed, but I have no idea which directory Hadoop resides in. The server was configured by the server admin.
In order to load Hadoop, I use the use command from the dotkit package.
There may be several solutions, but I wanted to know where the Hadoop package is installed, how to set up the $HADOOP_HOME variable, and how to properly run a Hadoop streaming job, such as $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/mapred/contrib/streaming/hadoop-streaming.jar (see http://wiki.apache.org/hadoop/HadoopStreaming).
Thanks! Any help would be greatly appreciated!
If you're using a Cloudera distribution, then it's most probably in /usr/lib/hadoop; otherwise it could be anywhere (at the discretion of your system admin).
There are some tricks you can use to try and locate it:
locate hadoop-env.sh (assuming that locate has been installed and updatedb has been run recently)
If the machine you're running this on is running a Hadoop service (such as a data node, job tracker, task tracker, or name node), then you can perform a process listing and grep for the hadoop command: ps axww | grep hadoop
Failing the above two, look for the hadoop root directory in some common locations such as: /usr/lib, /usr/local, /opt
Failing all this, and assuming your current user has the permissions: find / -name hadoop-env.sh
If you installed with rpm, then it's most probably in /etc/hadoop.
Why don't you try:
echo $HADOOP_HOME
Obviously, the above environment variable has to be set before you can run Hadoop executables from anywhere on the box.
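Once you have located the install, a sketch of tying it together (the install path and HDFS paths below are placeholders; the streaming jar path is taken from the question and varies between Hadoop versions):
export HADOOP_HOME=/usr/lib/hadoop          # substitute whatever directory the search above found
export PATH=$PATH:$HADOOP_HOME/bin
$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/mapred/contrib/streaming/hadoop-streaming.jar \
  -input /user/me/input -output /user/me/output \
  -mapper /bin/cat -reducer /usr/bin/wc     # classic identity-mapper / wc-reducer streaming example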
