hadoop 2.9.1 failed to start DataNode - hadoop

I'm new to Apache Hadoop and I'm trying to install Apache Hadoop 2.9.1 in pseudo-distributed mode on Alpine (in a Docker container), but I get this error when I run start-dfs.sh:
localhost: /usr/local/hadoop/sbin/hadoop-daemon.sh: line 131: 883 Aborted (core dumped) nohup nice -n $HADOOP_NICENESS $hdfsScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null
The NameNode and SecondaryNameNode start successfully, but the DataNode does not.

I had the exact same problem, and every version > 2.9.1 also ended in a quick core dump of the DataNode in Docker.
The comment from @OneCricketeer actually led me in the right direction, and it should have been an accepted answer - so here is a quick heads-up for future users:
Apparently, components of the DataNode won't work well with Alpine/musl, and switching to e.g. an Ubuntu- or Debian-based parent image like 8-jdk solves this problem.
Here is a link to the Dockerfile I am currently using.
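For reference, a minimal sketch of such a Dockerfile; the base image tag, download URL and paths are my own choices here, not necessarily the exact file linked above:
FROM openjdk:8-jdk
# Debian-based image with glibc, so the DataNode does not hit the Alpine/musl problem
ENV HADOOP_HOME=/usr/local/hadoop
RUN curl -fsSL https://archive.apache.org/dist/hadoop/common/hadoop-2.9.1/hadoop-2.9.1.tar.gz \
      | tar -xz -C /usr/local \
 && mv /usr/local/hadoop-2.9.1 "$HADOOP_HOME"
ENV PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
# copy your core-site.xml / hdfs-site.xml into $HADOOP_HOME/etc/hadoop here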

Related

Why does the Yarn ResourceManager always shut down when I submit a job?

I am now learning how to build a Hadoop cluster, and the first step is to try a pseudo-distributed cluster following the guide at https://hadoop.apache.org/docs/r3.3.1/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation. I succeeded in starting YARN by calling $HADOOP_HOME/sbin/start-dfs.sh and $HADOOP_HOME/sbin/start-yarn.sh. The output of jps is:
However, when I submit a job, even one that does nothing, the ResourceManager shuts down immediately.
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar wordcount input output
The console output is:
and the log is:
The result of running strace on the ResourceManager is:
+++ killed by SIGKILL +++
I have struggled for days and have not figured it out. Any insight or advice would be welcome.
Oh, I forgot to mention the versions:
Hadoop 3.3.1
WSL: 2, Ubuntu 20.04
Windows 11: 22518.1000

Preferred way to start a hadoop datanode

Functionally, I was wondering whether there is a difference between starting a Hadoop datanode with the command:
$HADOOP_HOME/sbin/hadoop-daemons.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
and:
hdfs datanode
The first command gives me an error stating that the datanode cannot ssh into itself (this is running in a Docker container), while the second command seems to run without that issue. The official Hadoop documentation for this version (2.9.1) doesn't mention "hdfs datanode" as a way to start a datanode.
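For what it's worth, here is a rough comparison of the three ways to launch it, assuming HADOOP_HOME and HADOOP_CONF_DIR are already set; the comments describe the usual 2.x behaviour:
# hadoop-daemons.sh (plural) ssh-es into every host listed in etc/hadoop/slaves and runs
# hadoop-daemon.sh there -- which is why it fails when the container cannot ssh to itself
$HADOOP_HOME/sbin/hadoop-daemons.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
# hadoop-daemon.sh (singular) starts the daemon on the local machine only,
# puts it in the background and writes a pid file under HADOOP_PID_DIR
$HADOOP_HOME/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
# hdfs datanode runs the same DataNode class, but in the foreground of the current
# shell, which tends to be the more convenient form for a container entrypoint
$HADOOP_HOME/bin/hdfs datanode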

Zookeeper startup issues/confusion

Apart from the issue I am already having, I installed ZooKeeper BEFORE I installed HBase (which is still not installed), after I saw a video on it. While installing it I faced numerous issues, which I've now overcome, but I am left with one challenging one; probably the only one left. So, the installation part has gone through well. I start ZooKeeper with the following command: sudo /home/hduser/zookeeper/bin/zkServer.sh start and (I am OK with it because) this is the result:
ZooKeeper JMX enabled by default
Using config: /home/hduser/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
YES! IT'S STARTED (after almost 50 minutes of digging on the internet). But nevertheless, when I run jps, this is what I get:
8499 SecondaryNameNode
8162 NameNode
8983 NodeManager
9370 Jps
8313 DataNode
8672 ResourceManager
Exactly!! No QuorumPeerMain! But wait... when I run sudo jps, I get this:
8499 -- process information unavailable
9243 QuorumPeerMain
8162 -- process information unavailable
8983 -- process information unavailable
9429 Jps
8313 -- process information unavailable
8672 -- process information unavailable
You see there? There's the QuorumPeerMain (never mind that it says "process information unavailable" against the otherwise perfectly recognisable processes), running as process 9243.
Can you tell me why that's happening?
Also, because of this discrepancy (or inconvenience), do you think HBase installation will be an issue?
I don't think it should matter, but this is a Linux Mint (Sarah) machine.
Thanks in advance!
The QuorumPeerMain process is only visible with the sudo jps command because you are running ZooKeeper with sudo /home/hduser/zookeeper/bin/zkServer.sh. If you run ZooKeeper without sudo, it will show up in the plain jps output.
Because you started ZooKeeper with sudo, the files in the ZooKeeper directory are owned by root, so you have to change the owner of those directories before you can run it as a normal user.
Once you make the above changes, the HBase installation will not cause any problems.
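Something along these lines should do it; the paths are taken from the question, and the hduser:hduser owner is an assumption:
# give the ZooKeeper files back to the normal user
sudo chown -R hduser:hduser /home/hduser/zookeeper
# start it again without sudo ...
/home/hduser/zookeeper/bin/zkServer.sh start
# ... and QuorumPeerMain should now show up in a plain jps
jps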

Could not find and execute start-all.sh and stop-all.sh on Cloudera VM for Hadoop

How to start / stop services from the command line on CDH4 - I am new to Hadoop. I installed the VM from Cloudera and could not find start-all.sh and stop-all.sh. How do I stop or start the TaskTracker or DataNode if I want to? It is a single-node cluster which I am using on CentOS. I haven't done any modifications.
Moreover, I see there are changes in the directory structure across all the flavours, and I could not locate these .sh files on the VM for my installation.
[cloudera@localhost ~]$ stop-all.sh
bash: stop-all.sh: command not found
Highly appreciate your support.
Use sudo su hdfs to start; to stop, just type exit and it will stop all the services.
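For what it's worth, on a package-based CDH install the daemons are normally managed as system services rather than with start-all.sh; the service names below are an assumption and may differ on your VM, so check /etc/init.d first:
# see which Hadoop init scripts are actually installed
ls /etc/init.d/ | grep -i hadoop
# then start/stop individual daemons through the service wrapper, for example:
sudo service hadoop-hdfs-datanode stop
sudo service hadoop-hdfs-datanode start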

Need help adding multiple DataNodes in pseudo-distributed mode (one machine), using Hadoop-0.18.0

I am a student interested in Hadoop and I started to explore it recently.
I tried adding an additional DataNode in pseudo-distributed mode but failed.
I am following the Yahoo developer tutorial, so the version of Hadoop I am using is hadoop-0.18.0.
I tried to start up using 2 methods I found online:
Method 1 (link)
I have a problem with this line
bin/hadoop-daemon.sh --script bin/hdfs $1 datanode $DN_CONF_OPTS
--script bin/hdfs doesn't seem to be valid in the version I am using. I changed it to --config $HADOOP_HOME/conf2 with all the configuration files in that directory, but when the script is run it gives the error:
Usage: Java DataNode [-rollback]
Any idea what the error means? The log files are created, but the DataNode did not start.
Method 2 (link)
Basically I duplicated the conf folder to a conf2 folder, making the necessary changes documented on the website to hadoop-site.xml and hadoop-env.sh. Then I ran the command
./hadoop-daemon.sh --config ..../conf2 start datanode
it gives the error:
datanode running as process 4190. stop it first.
So I guess this is the 1st DataNode that was started, and the command failed to start another DataNode.
Is there anything I can do to start additional DataNode in the Yahoo VM Hadoop environment? Any help/advice would be greatly appreciated.
Hadoop start/stop scripts use /tmp as the default directory for storing the PIDs of already started daemons. In your situation, when you start the second datanode, the startup script finds the /tmp/hadoop-someuser-datanode.pid file from the first datanode and assumes that the datanode daemon is already started.
The plain solution is to set the HADOOP_PID_DIR environment variable to something other than /tmp, as sketched below. Also, do not forget to update all the network port numbers in conf2.
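A rough sketch of what that could look like for the second datanode; the PID directory and the conf2 location are placeholders:
# keep the second datanode's pid file out of /tmp so it does not clash with the first one
export HADOOP_PID_DIR=$HOME/hadoop-dn2-pids
mkdir -p "$HADOOP_PID_DIR"
# conf2 must also point at its own dfs.data.dir and use its own datanode ports
$HADOOP_HOME/bin/hadoop-daemon.sh --config $HADOOP_HOME/conf2 start datanode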
The smart solution is to start a second VM with a Hadoop environment and join the two into a single cluster. That is the way Hadoop is intended to be used.
