hortonworks: start datanode failed - hadoop

I have installed a new HDP 2.3 cluster using Ambari 2.2. The problem is that the NameNode service can't be started, and each time I try I get the following error. When I tried to track down the problem, I found another, more explicit error: port 50070 is already in use (and I think the NameNode uses this port). Has anyone solved this problem before? Thanks.
resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh
su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ;
/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config
/usr/hdp/current/hadoop-client/conf start namenode'' returned 1.
starting namenode, logging to
/var/log/hadoop/hdfs/hadoop-hdfs-namenode-ip-10-8-23-175.eu-west-2.compute.internal.out
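Since the more explicit error says port 50070 (the NameNode's default HTTP port) is already in use, a quick way to see what is holding it is a check like the one below, run on the NameNode host (a sketch; it assumes netstat/lsof are installed):
# Show the process currently bound to port 50070
sudo netstat -tlnp | grep 50070
# or equivalently
sudo lsof -i :50070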

In order to install a Hortonworks cluster, Ambari tries to set the core file size limit to unlimited if it is not set initially. It seems the Linux user installing the cluster doesn't have the privileges to set ulimits.
Just set the core file size to unlimited in
/etc/security/limits.conf and the NameNode should come up:
* soft core unlimited
* hard core unlimited
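To confirm the new limit actually applies to the account the daemons run as (assumed to be hdfs here), a quick check like this should print "unlimited" before you retry the start:
# A fresh login session is needed so PAM re-reads /etc/security/limits.conf
su - hdfs -s /bin/bash -c 'ulimit -c'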

Related

Datanode is in dead state as DFS used is 100 percent

I have a standalone setup of Apache Hadoop, with the NameNode and DataNode running on the same machine.
I am currently running Apache Hadoop 2.6 (I cannot upgrade it) on Ubuntu 16.04.
Although my system has more than 400 GB of hard disk space left, my Hadoop dashboard shows DFS usage at 100%.
Why is Apache Hadoop not consuming the rest of the disk space available to it? Can anybody help me figure out a solution?
There can be several reasons for this.
You can try the following steps:
Go to $HADOOP_HOME/sbin (in Hadoop 2.x the daemon scripts are under sbin, not bin):
./hadoop-daemon.sh --config $HADOOP_HOME/etc/hadoop start datanode
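As a quick sanity check after the restart (a minimal sketch, assuming the Hadoop 2.6 binaries are on the PATH):
jps                      # the DataNode process should now be listed
hdfs dfsadmin -report    # shows capacity and "DFS Used%" per DataNode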
Then try the following (a consolidated sketch of these commands follows below):
If any directory other than your NameNode and DataNode directories is taking up too much space, you can start cleaning it up.
You can also run hadoop fs -du -s -h /user/hadoop to see the usage of the directories.
Identify all the unnecessary directories and start cleaning up by running hadoop fs -rm -R /user/hadoop/raw_data (-rm deletes, -R deletes recursively; be careful when using -R).
Run hadoop fs -expunge to empty the trash immediately (sometimes you need to run it multiple times).
Run hadoop fs -du -s -h / to see the HDFS usage of the entire file system, or run hdfs dfsadmin -report, to confirm whether the storage has been reclaimed.
Many times the report will also show missing blocks (with replication factor 1).
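Put together, a minimal cleanup pass might look like the sketch below; /user/hadoop/raw_data is just the example path from above, so substitute your own unnecessary directories:
# Inspect usage, remove an unneeded directory, empty the trash, and re-check
hadoop fs -du -s -h /user/hadoop
hadoop fs -rm -R /user/hadoop/raw_data   # careful: recursive delete
hadoop fs -expunge
hdfs dfsadmin -report                    # confirm that "DFS Used%" has dropped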

H2O: unable to connect to h2o cluster through python

I have a 5-node Hadoop cluster running HDP 2.3.0. I set up an H2O cluster on YARN as described here.
On running the following command
hadoop jar h2odriver_hdp2.2.jar water.hadoop.h2odriver -libjars ../h2o.jar -mapperXmx 512m -nodes 3 -output /user/hdfs/H2OTestClusterOutput
I get the following output:
H2O cluster (3 nodes) is up
(Note: Use the -disown option to exit the driver after cluster formation)
(Press Ctrl-C to kill the cluster)
Blocking until the H2O cluster shuts down...
When I try to execute the command
h2o.init(ip="10.113.57.98", port=54321)
the process remains stuck at this stage. On trying to connect to the web UI at ip:54321, the browser endlessly tries to load the H2O admin page, but nothing ever displays.
On forcefully terminating the init process, I get the following error:
No instance found at ip and port: 10.113.57.98:54321. Trying to start local jar...
However, if I use H2O from Python without setting up an H2O cluster, everything runs fine.
I executed all commands as the root user, which has permission to read and write to the /user/hdfs HDFS directory.
I'm not sure whether this is a permissions error or whether the port is simply not accessible.
Any help would be greatly appreciated.
It looks like you are using H2O2 (H2O Classic). I recommend upgrading your H2O to the latest version (H2O 3). There is a build specifically for HDP 2.3 here: http://www.h2o.ai/download/h2o/hadoop
Running H2O3 is a little cleaner too:
hadoop jar h2odriver.jar -nodes 1 -mapperXmx 6g -output hdfsOutputDirName
Also, 512 MB per node is tiny; what is your use case? I would give the nodes some more memory.
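To rule out a simple connectivity problem before digging into permissions, a quick reachability check from the machine running Python can help (a sketch; 10.113.57.98:54321 is the address from the question):
# If this times out or the connection is refused, the H2O port is not reachable
# from this host (firewall, wrong node IP, or the cluster is not actually up)
curl -v --connect-timeout 5 http://10.113.57.98:54321/ -o /dev/null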

How to start Datanode? (Cannot find start-dfs.sh script)

We are setting up automated deployments on a headless system, so using the GUI is not an option here.
Where is the start-dfs.sh script for HDFS in the Hortonworks Data Platform? CDH / Cloudera packages those files under the hadoop/sbin directory, but when we search for those scripts under HDP they are not found:
$ pwd
/usr/hdp/current
Which scripts exist in HDP?
[stack#s1-639016 current]$ find -L . -name \*.sh
./hadoop-hdfs-client/sbin/refresh-namenodes.sh
./hadoop-hdfs-client/sbin/distribute-exclude.sh
./hadoop-hdfs-datanode/sbin/refresh-namenodes.sh
./hadoop-hdfs-datanode/sbin/distribute-exclude.sh
./hadoop-hdfs-nfs3/sbin/refresh-namenodes.sh
./hadoop-hdfs-nfs3/sbin/distribute-exclude.sh
./hadoop-hdfs-secondarynamenode/sbin/refresh-namenodes.sh
./hadoop-hdfs-secondarynamenode/sbin/distribute-exclude.sh
./hadoop-hdfs-namenode/sbin/refresh-namenodes.sh
./hadoop-hdfs-namenode/sbin/distribute-exclude.sh
./hadoop-hdfs-journalnode/sbin/refresh-namenodes.sh
./hadoop-hdfs-journalnode/sbin/distribute-exclude.sh
./hadoop-hdfs-portmap/sbin/refresh-namenodes.sh
./hadoop-hdfs-portmap/sbin/distribute-exclude.sh
./hadoop-client/sbin/hadoop-daemon.sh
./hadoop-client/sbin/slaves.sh
./hadoop-client/sbin/hadoop-daemons.sh
./hadoop-client/etc/hadoop/hadoop-env.sh
./hadoop-client/etc/hadoop/kms-env.sh
./hadoop-client/etc/hadoop/mapred-env.sh
./hadoop-client/conf/hadoop-env.sh
./hadoop-client/conf/kms-env.sh
./hadoop-client/conf/mapred-env.sh
./hadoop-client/libexec/kms-config.sh
./hadoop-client/libexec/init-hdfs.sh
./hadoop-client/libexec/hadoop-layout.sh
./hadoop-client/libexec/hadoop-config.sh
./hadoop-client/libexec/hdfs-config.sh
./zookeeper-client/conf/zookeeper-env.sh
./zookeeper-client/bin/zkCli.sh
./zookeeper-client/bin/zkCleanup.sh
./zookeeper-client/bin/zkServer-initialize.sh
./zookeeper-client/bin/zkEnv.sh
./zookeeper-client/bin/zkServer.sh
Notice: there are ZERO start/stop .sh scripts.
In particular, I am interested in the start-dfs.sh script that starts the NameNode(s), JournalNodes, and DataNodes.
How to start the DataNode:
su - hdfs -c "/usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode";
Github - Hortonworks Start Scripts
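The same hadoop-daemon.sh pattern works for the other HDFS daemons; a minimal sketch, assuming the standard HDP layout shown in the find output above and the hdfs service user:
su - hdfs -c "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start namenode"
su - hdfs -c "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start journalnode"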
Update
Decided to hunt for it myself.
Spun up a single node with Ambari and installed HDP 2.2 (a) and HDP 2.3 (b), then ran:
sudo find / -name \*.sh | grep start
Found:
(a) /usr/hdp/2.2.8.0-3150/hadoop/src/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/start-dfs.sh
Weird that it doesn't exist in /usr/hdp/current, which should be symlinked.
(b) /hadoop/yarn/local/filecache/10/mapreduce.tar.gz/hadoop/sbin/start-dfs.sh
The recommended way to administer your Hadoop cluster would be via the administration panel. Since you are working with the Hortonworks distribution, it makes more sense for you to use Ambari instead.
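For fully scripted deployments, Ambari also exposes a REST API that can start services without the web UI; a sketch (admin:admin, ambari-host, and CLUSTER_NAME are placeholders to substitute):
# Ask Ambari to start every component of the HDFS service
curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
  -d '{"RequestInfo":{"context":"Start HDFS"},"Body":{"ServiceInfo":{"state":"STARTED"}}}' \
  http://ambari-host:8080/api/v1/clusters/CLUSTER_NAME/services/HDFS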

Cloudera installation dfs.datanode.max.locked.memory issue on LXC

I have created a VirtualBox Ubuntu 14.04 LTS environment on my Mac machine.
Inside the Ubuntu VM, I've created a cluster of three LXC containers: one for the master and the other two as slaves.
On the master, I started the installation of CDH5 using the following link: http://archive.cloudera.com/cm5/installer/latest/cloudera-manager-installer.bin
I have also made the necessary changes in /etc/hosts, including FQDNs and hostnames, and created a passwordless user named "ubuntu".
While setting up CDH5, I'm constantly facing the following error on the datanodes during installation: "Max locked memory size: dfs.datanode.max.locked.memory of 922746880 bytes is more than the datanode's available RLIMIT_MEMLOCK ulimit of 65536 bytes."
Exception in secureMain: java.lang.RuntimeException: Cannot start datanode because the configured max locked memory size (dfs.datanode.max.locked.memory) of 922746880 bytes is more than the datanode's available RLIMIT_MEMLOCK ulimit of 65536 bytes.
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1050)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:411)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2297)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2184)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2231)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2407)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2431)
Krunal,
This solution will probably be too late for you, but maybe it can help somebody else, so here it is. Make sure your ulimit is set correctly. But in case it's a config issue:
go to:
/run/cloudera-scm-agent/process/
find the latest config dir,
in this case:
1016-hdfs-DATANODE
search for the parameter in this dir:
grep -rnw . -e "dfs.datanode.max.locked.memory"
./hdfs-site.xml:163: <name>dfs.datanode.max.locked.memory</name>
and edit the value to the one the DataNode expects; in your case, 65536 (or lower).
I solved it by opening a separate tab in Cloudera Manager and setting the value from there.
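If you would rather keep the configured cache size and raise the OS limit instead (the "make sure your ulimit is set correctly" part above), here is a sketch of that route, assuming the DataNode runs as the hdfs user; note that inside LXC containers or under Cloudera Manager supervision the effective limit may be capped elsewhere:
# Raise RLIMIT_MEMLOCK for the hdfs user so it covers the 922746880 bytes above
echo 'hdfs  soft  memlock  unlimited' | sudo tee -a /etc/security/limits.conf
echo 'hdfs  hard  memlock  unlimited' | sudo tee -a /etc/security/limits.conf
# Verify in a fresh session, then restart the DataNode role
su - hdfs -s /bin/bash -c 'ulimit -l'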

Hadoop CDH3 ERROR. Could not start Hadoop datanode daemon

I'm deploying Hadoop CDH3 in pseudo-distributed mode on a VPS.
So I have installed CDH3 and then executed
sudo apt-get install hadoop-0.20-conf-pseudo
but if I try to start all the daemons with
for service in /etc/init.d/hadoop-0.20-*; do sudo $service start; done
it throws
ERROR. Could not start Hadoop datanode daemon
The same installation and start commands work on my notebook.
I don't understand the cause. In fact, the log file is empty. The available RAM is about 900 MB, with 98 GB of available disk space.
What could the cause be, and how can I find it? I'm ruling out an error in the configuration files.
Consider using Cloudera Manager; it could save you some time (especially if you use multiple nodes). There is a nice video on YouTube which shows the deployment process.
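Before switching tools, it may also be worth running the DataNode in the foreground to surface the real error, since the log file is empty; a sketch, assuming CDH3's hadoop-0.20 packaging and the hdfs service user:
# Check the usual log directory for .log/.out files first
ls -l /var/log/hadoop-0.20/
# Run the DataNode in the foreground so the failure prints to the terminal
sudo -u hdfs hadoop datanode
# Make sure the DataNode ports are not already taken
sudo netstat -tlnp | grep -E '50010|50020|50075'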
