Hadoop: start-dfs.sh Connection refused

I have a Vagrant box on debian/stretch64 and I am trying to install Hadoop 3 following the documentation at
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.htm
When I run start-dfs.sh I get this message:
vagrant@stretch:/opt/hadoop$ sudo sbin/start-dfs.sh
Starting namenodes on [localhost]
pdsh@stretch: localhost: connect: Connection refused
Starting datanodes
pdsh@stretch: localhost: connect: Connection refused
Starting secondary namenodes [stretch]
pdsh@stretch: stretch: connect: Connection refused
vagrant@stretch:/opt/hadoop$
Of course I tried updating my hadoop-env.sh with:
export HADOOP_SSH_OPTS="-p 22"
ssh localhost works (without a password).
I have no idea what to change to solve this problem.

There is a problem with the way pdsh works by default (see the edit below), but Hadoop can run without it. Hadoop checks whether pdsh exists at /usr/bin/pdsh and uses it if so. An easy way to avoid pdsh is to edit $HADOOP_HOME/libexec/hadoop-functions.sh
and replace the line
if [[ -e '/usr/bin/pdsh' ]]; then
with
if [[ ! -e '/usr/bin/pdsh' ]]; then
Hadoop then falls back to plain ssh and everything works.
EDIT:
A better solution is to keep pdsh but make it use ssh instead of rsh, as explained here. Replace this line in $HADOOP_HOME/libexec/hadoop-functions.sh:
PDSH_SSH_ARGS_APPEND="${HADOOP_SSH_OPTS}" pdsh \
with
PDSH_RCMD_TYPE=ssh PDSH_SSH_ARGS_APPEND="${HADOOP_SSH_OPTS}" pdsh \
Note: only doing export PDSH_RCMD_TYPE=ssh, as I mentioned in the comment, doesn't work; I don't know why.
I've also opened an issue and submitted a patch for this problem: HADOOP-15219
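For reference, a minimal sketch of applying that edit non-interactively with sed, assuming Hadoop is installed under /opt/hadoop as in the question and that the line appears exactly as quoted above (back up the file first; the command simply prepends PDSH_RCMD_TYPE=ssh to the pdsh invocation line):
# keep a backup, then prepend PDSH_RCMD_TYPE=ssh to the pdsh call in hadoop-functions.sh
sudo cp /opt/hadoop/libexec/hadoop-functions.sh /opt/hadoop/libexec/hadoop-functions.sh.bak
sudo sed -i 's/PDSH_SSH_ARGS_APPEND="${HADOOP_SSH_OPTS}" pdsh/PDSH_RCMD_TYPE=ssh PDSH_SSH_ARGS_APPEND="${HADOOP_SSH_OPTS}" pdsh/' /opt/hadoop/libexec/hadoop-functions.sh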

I fixed this problem for Hadoop 3.1.0 by adding
PDSH_RCMD_TYPE=ssh
to my .bashrc as well as to $HADOOP_HOME/etc/hadoop/hadoop-env.sh.
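For example (a minimal sketch; the export keyword matters in .bashrc so that child processes such as start-dfs.sh see the variable):
# in ~/.bashrc and in $HADOOP_HOME/etc/hadoop/hadoop-env.sh
export PDSH_RCMD_TYPE=ssh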

Check whether your /etc/hosts file contains mappings for both the hostname stretch and localhost. My /etc/hosts file looks like the sketch below.
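A hypothetical /etc/hosts for the box in the question (the addresses are assumptions; 127.0.1.1 is the usual Debian convention for the machine's own hostname, so substitute your real hostname and IP):
127.0.0.1    localhost
127.0.1.1    stretch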

Go to your Hadoop home directory:
~$ cd libexec
~$ nano hadoop-functions.sh
Edit this line:
if [[ -e '/usr/bin/pdsh' ]]; then
replacing it with:
if [[ ! -e '/usr/bin/pdsh' ]]; then

"Additionally, it is recommended that pdsh also be installed for better ssh resource management." (Hadoop: Setting up a Single Node Cluster)
Even though the documentation recommends it, we can remove pdsh to solve this problem:
apt-get remove pdsh

Check whether a firewall is running on your Vagrant box:
chkconfig iptables off
/etc/init.d/iptables stop
If not, have a look at the underlying logs under /var/log/...
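Note that chkconfig and the iptables init script are RHEL/CentOS-style commands; on a Debian stretch box like the one in the question, a rough equivalent check would be:
# list current iptables rules; an empty ruleset with ACCEPT policies means no firewall is blocking the connection
sudo iptables -L -n
# flush all rules (temporarily, for testing only)
sudo iptables -F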

I was dealing with my colleague's problem.
He had configured ssh using the hostname from the hosts file, but specified the IP address in the workers file.
After I rewrote the workers file to use the hostname, everything worked.
~/hosts file:
10.0.0.1 slave01
# ssh-copy-id hadoop@slave01
~/hadoop/etc/workers:
slave01

I added export PDSH_RCMD_TYPE=ssh to my .bashrc file, logged out and back in, and it worked.
For some reason, simply exporting it and running right away did not work for me.

Related

pdsh not working with IPs in the file

I have a text file like this:
cat hed.txt
10.21.23.12
10.23.12.12
I can ssh to each IP without being prompted for key verification.
I want to run a command on each of these IPs, so I was using pdsh. I tried multiple options, but I am getting the following error:
pdsh -w ^hed uptime
00f12e86-cfcc-4239-9dfc-006b65a319c3: ssh: Could not resolve hostname 00f12e86-cfcc-4239-9dfc-006b65a319c3: nodename nor servname provided, or not known
pdsh@saurabh: 00f12e86-cfcc-4239-9dfc-006b65a319c3: ssh exited with exit code 255
As mentioned here, I tried the following as well, but it gave the same error:
PDSH_SSH_ARGS_APPEND="-o StrictHostKeyChecking=no" pdsh -R ssh -w ^hed uptime
I also tried the comment from here, but it didn't help:
PDSH_SSH_ARGS_APPEND="-o StrictHostKeyChecking=no" pdsh -R ssh ^hed uptime
pdsh@saurabh: no remote hosts specified
I am able to connect to these hosts via csshX --host hed.txt, which works, but pdsh suits my work better and it is the one that is not working.
Ah, this worked:
pdsh -w '^hed.txt' uptime
For my colleagues it also works without the quotes, with the same version of pdsh, which is weird.
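For completeness, a sketch combining that host-file syntax with the ssh options tried above (the ^file form reads one host per line from hed.txt):
PDSH_SSH_ARGS_APPEND="-o StrictHostKeyChecking=no" pdsh -R ssh -w '^hed.txt' uptime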

Hadoop - asking for network LAN password when starting the cluster

I can't understand what password Hadoop expects.
I configured it according to the tutorial. I do:
sudo su
# bash start-dfs.sh
And now it expects something like the LAN network password. I have no idea what I should enter.
As you can see, I run the script as root. Of course the master (from which I run the script) can ssh to the slaves as root without a password (I configured and tested that).
Disclaimer: it is possible that I am using an incorrect name (for example for the script name) because I don't fully understand it yet. However, I am sure it was asking for something like a LAN network password.
Please help me: which password is it asking for?
Edit: I was using http://backtobazics.com/big-data/setup-multi-node-hadoop-2-6-0-cluster-with-yarn/
It seems you may not have set up passwordless ssh. Passwordless ssh is required to run the Hadoop services (daemons), so set it up among the nodes:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
Then try ssh user@hostname.
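As an aside, recent OpenSSH versions (7.0 and later) disable DSA public-key authentication by default, so the dsa key above may be rejected; an equivalent sketch with an RSA key:
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys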

How to copy files from one machine to another machine

I want to copy /home/cmind012/m.sh from one system to another (both systems are Linux) using a shell script.
Command:
$ scp /home/cmind012/m.sh cmind013:/home/cmind013/tanu
I get the message:
ssh: cmind013: Name or service not known
lost connection
It seems that cmind013 is not being resolved. I would first try
nslookup cmind013
and see why it doesn't resolve.
It seems that you are missing the IP address/domain of the remote host. The format should be user@host:[directory]
You could do the following:
scp -r [directory/files] [remote host]:[destination directory]
ex: scp -r /var/www/html/* root@192.168.1.0:/var/www/html/
Try the following command:
scp /home/cmind012/m.sh denil@172.22.192.105:/home/denil/

Setup Pseudo Distributed / Single Node Setup Apache Hadoop 2.2

I have installed Apache Hadoop 2.2 as a single-node cluster. When I try to execute the Giraph example, it fails with the error "LocalJobRunner, you cannot run in split master/worker mode since there is only 1 task at a time".
Going through the forums, I found that I can update mapred-site.xml to have 4 mappers. I tried that, but it still didn't help. I also came across a forum post saying that changing the single-node setup to behave as pseudo-distributed mode resolved the issue.
Can someone please let me know which config files I need to change to make a single-node setup behave as pseudo-distributed mode?
Adding to renZzz's answer, you also need to check that you can ssh to localhost without a passphrase:
$ ssh localhost
If you cannot ssh to localhost without a passphrase, execute the following commands:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
The following link can help you: https://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-common/SingleNodeSetup.html
For my first setup I followed several manuals, but surely the best one for a single-node setup was the PDF Apache Hadoop YARN_sample. I recommend you follow that manual step by step.
First, ensure that the number of workers is one. Then configure Giraph not to split workers and master via:
giraph.SplitMasterWorker=false
You can either set it in giraph-site.xml or pass it as a command-line option:
-ca giraph.SplitMasterWorker=false
Ref:
https://www.mail-archive.com/user@giraph.apache.org/msg01631.html
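If you go the giraph-site.xml route, a minimal sketch of the entry (giraph-site.xml uses the standard Hadoop configuration XML format; the surrounding <configuration> element is assumed to already exist in your file):
<property>
  <name>giraph.SplitMasterWorker</name>
  <value>false</value>
</property>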

Messed-up sed syntax in Hadoop startup script after reinstalling the JVM

I'm trying to run a 3-node Hadoop cluster on the Windows Azure cloud. I've gone through configuration and a test launch, and everything looked fine. However, since I had been using OpenJDK, which is not recommended as the VM for Hadoop according to what I've read, I decided to replace it with the Oracle Server JVM. I removed the old Java installation with yum, along with all Java folders in /usr/lib, installed the most recent version of the Oracle JVM, and updated the PATH and JAVA_HOME variables. However, on launch I now get the following messages:
sed: -e expression #1, char 6: unknown option to `s'
64-Bit: ssh: Could not resolve hostname 64-Bit: Name or service not known
HotSpot(TM): ssh: Could not resolve hostname HotSpot(TM): Name or service not known
Server: ssh: Could not resolve hostname Server: Name or service not known
VM: ssh: Could not resolve hostname VM: Name or service not known
etc. (in total about 20-30 lines containing words that should have nothing to do with hostnames)
To me it looks like part of the code is being passed as a hostname because of incorrect usage of sed in the startup script:
if [ "$HADOOP_SLAVE_NAMES" != '' ] ; then
SLAVE_NAMES=$HADOOP_SLAVE_NAMES
else
SLAVE_FILE=${HADOOP_SLAVES:-${HADOOP_CONF_DIR}/slaves}
SLAVE_NAMES=$(cat "$SLAVE_FILE" | sed 's/#.*$//;/^$/d')
fi
# start the daemons
for slave in $SLAVE_NAMES ; do
ssh $HADOOP_SSH_OPTS $slave $"${#// /\\ }" \
2>&1 | sed "s/^/$slave: /" &
if [ "$HADOOP_SLAVE_SLEEP" != "" ]; then
sleep $HADOOP_SLAVE_SLEEP
fi
done
This looks unchanged, so the question is: how could changing the JVM affect sed? And how can I fix it?
So I found an answer to this question: my guess was wrong, and everything with sed is fine. The problem was in how the Oracle JVM works with external libraries compared to OpenJDK. It threw an exception where the script was not expecting it, and that ruined the whole sed input.
You can fix it by adding the following system variables:
HADOOP_COMMON_LIB_NATIVE_DIR, which should point to the /lib/native folder of your Hadoop installation, and -Djava.library.path=/opt/hadoop/lib added to whatever options you already have in the HADOOP_OPTS variable (note that /opt/hadoop is my installation folder; you might need to change it for things to work properly).
I personally add the export commands to the hadoop-env.sh script, but adding them to your .bash file or to start-all.sh should work as well.
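A minimal sketch of those settings in hadoop-env.sh, assuming the /opt/hadoop install path used above (adjust to your own layout):
export HADOOP_COMMON_LIB_NATIVE_DIR="/opt/hadoop/lib/native"
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=/opt/hadoop/lib"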
