Must the username of NameNode be equal to DataNode's? - hadoop

I run the NameNode as hduser@master; the DataNodes run as user1@slave1 and user1@slave2. Setting up SSH keys works fine, and I can ssh from the master to my DataNode machines.
However, when I try to run hadoop-daemons.sh for my datanodes, it fails because it tries to ssh with the wrong user:
hduser@master:~$ hadoop-daemons.sh start datanode
hduser@slave3's password: hduser@slave1's password: hduser@slave2's password:
slave1: Permission denied (publickey,password).
slave2: Permission denied (publickey,password).
slave3: Permission denied (publickey,password).
I tried resetting the public and private keys on my master and copying the public key to the data nodes:
$ ssh-keygen -t rsa -P ""
$ ssh-copy-id -i $HOME/.ssh/id_rsa.pub user1@slave1
But it gives me the same error.
Does the user on the NameNode need to be the same as for the DataNodes?

Answer: After resetting the VMs, creating the same user, and installing Hadoop on the DataNodes with the same user as on the NameNode, it worked. So I guess the answer is yes...
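(For what it's worth, the start scripts simply ssh to each host as the current user, so in principle a per-host username could be supplied via ~/.ssh/config. A minimal, untested sketch using the host and user names from the question; matching usernames everywhere, as above, remains the safer path:)
# ~/.ssh/config on the master (hypothetical workaround)
Host slave1
    User user1
    IdentityFile ~/.ssh/id_rsa
Host slave2
    User user1
    IdentityFile ~/.ssh/id_rsa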

Related

I keep getting "Permission Denied" in Google Cloud terminal when trying to open Hadoop

I am trying to run Hadoop on GCP. Whenever I type in the command
start-dfs.sh && start-yarn.sh
I get the following:
localhost: chuckpryorjr@localhost: Permission denied (publickey).
localhost: chuckpryorjr@localhost: Permission denied (publickey).
Starting secondary namenodes [0.0.0.0]
0.0.0.0: chuckpryorjr@0.0.0.0: Permission denied (publickey).
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop-ecosystem/hadoop-2.9.2/logs/yarn-chuckpryorjr-resourcemanager-hadoopmasters.out
localhost: chuckpryorjr@localhost: Permission denied (publickey).
I don't get it. Before, it used to prompt me for a password (which I don't recall ever setting); now it's just outright denying me. How can I make this passwordless? Also, the very first time I installed Hadoop on GCP and ran it, it worked fine. Sometimes I can get through and complete my work, sometimes I can't.
How can I make this passwordless?
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
Then update your local authorized keys file for localhost
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
And if you have other servers, you can use ssh-copy-id to place the key onto those, for example:
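(A sketch; the host name otherserver is a placeholder for whatever your workers file actually lists:)
$ ssh-copy-id -i ~/.ssh/id_rsa.pub chuckpryorjr@otherserver
$ ssh chuckpryorjr@otherserver echo ok   # should print "ok" without a password prompt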
In my case, when I added my Hadoop user to sudoers, it worked fine:
sudo adduser hadoop sudo

Hadoop : Permission denied (publickey,password, keyboard-interactive)

While installing Hadoop I got many errors, but this one just doesn't go away. No matter what I do, it keeps popping up again and again. As soon as I start Hadoop with the command ./start-all.sh, I get the error:
localhost: rajneeshsahai@localhost: Permission denied (publickey,password,keyboard-interactive)
Error logs:
Starting namenodes on [localhost]
localhost: rajneeshsahai@localhost: Permission denied (publickey,password,keyboard-interactive).
Starting datanodes
localhost: rajneeshsahai@localhost: Permission denied (publickey,password,keyboard-interactive).
Starting secondary namenodes [MacBook-Air.local]
MacBook-Air.local: rajneeshsahai@macbook-air.local: Permission denied (publickey,password,keyboard-interactive).
2020-05-29 18:42:06,106 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting resourcemanager
resourcemanager is running as process 2937. Stop it first.
Starting nodemanagers
localhost: rajneeshsahai@localhost: Permission denied (publickey,password,keyboard-interactive).
I already tried the following things:
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
I think repeating this process has created multiple keys in my system.
sudo passwd
Configured /etc/ssh/sshd_config
(i) Changed PermitRootLogin prohibit-password to PermitRootLogin yes
(ii) Changed PasswordAuthentication no to PasswordAuthentication yes
I do have one doubt: do I have to remove the hash (#) from those lines?
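(For reference: directives in sshd_config are comments until the leading # is removed, so the intended end state of the two edits above would look like this:)
# in /etc/ssh/sshd_config, uncommented so the directives take effect
PermitRootLogin yes
PasswordAuthentication yes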
I am using macOS Catalina.
You can try the following:
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
On a Windows WSL2 Ubuntu container you have to restart the ssh service to make it available for Hadoop. You could also try running Hadoop in a Docker container; see https://github.com/big-data-europe/docker-hadoop.
In the Ubuntu 20.04 container, I restart the ssh service each time before I start Hadoop:
sudo service ssh restart
For more details see the following tutorial https://dev.to/samujjwaal/hadoop-installation-on-windows-10-using-wsl-2ck1.
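Putting the two steps together, a minimal pre-start routine might look like this (assuming the stock start scripts are on your PATH):
sudo service ssh restart        # sshd does not start automatically inside WSL2
start-dfs.sh && start-yarn.sh   # then bring up HDFS and YARN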

Hadoop nodes do not ask for passwords during start

When I try to ssh into localhost, I am prompted for a password. See below.
ssh connection to localhost:
[hadoop@mftrhel74 sbin]$ ssh localhost
hadoop@localhost's password:
Last login: Fri Aug 23 15:44:08 2019 from mah
---The above means passwordless connection is not set up---
But when I try to start the Hadoop nodes as below, it doesn't prompt for a password, and the nodes are not starting; I see the message below.
I think it should prompt me to enter the password for the user, just as when an SSH connection is being established.
[hadoop#mftrhel74 ~]$ start-dfs.sh
Starting namenodes on [mftrhel74]
mftrhel74: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Starting datanodes
localhost: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Starting secondary namenodes [mftrhel74]
mftrhel74: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
**** I DO NOT WANT A PASSWORDLESS CONNECTION ****
I suspect you are able to log in to one of the nodes with SSH; however, you probably have not set up passwordless ssh between the nodes, so the steps the script tries to execute from that node will fail.
Here is some documentation explaining that you need to set up passwordless ssh, or otherwise install an Ambari client (assuming you work on HDP):
https://ambari.apache.org/1.2.2/installing-hadoop-using-ambari/content/ambari-chap1-5-2.html
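A quick way to check whether passwordless SSH is actually in place (host and user names taken from the logs above):
$ ssh hadoop@mftrhel74 hostname   # should print the hostname without a password prompt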

unable to start start-dfs.sh in Hadoop Multinode cluster

I have created a Hadoop multinode cluster and configured SSH on both the master and slave nodes. I can now connect to the slave without a password from the master node.
But when I try to run start-dfs.sh on the master node, I'm unable to connect to the slave node; execution stops at the line below.
log:
HNname@master:~$ start-all.sh
starting namenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-HNname-namenode-master.out
HDnode@slave's password: master: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-HNname-datanode-master.out
I pressed Enter
slave: Connection closed by 192.168.0.2
master: starting secondarynamenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-HNname-secondarynamenode-master.out
jobtracker running as process 10396. Stop it first.
HDnode@slave's password: master: starting tasktracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-HNname-tasktracker-master.out
slave: Permission denied, please try again.
HDnode@slave's password:
After entering the slave's password, the connection is closed.
I have tried the following, with no results:
formatted the namenode on both the master & slave nodes
created a new ssh key and configured it on both nodes
overrode the default HADOOP_LOG_DIR (from this post)
I think you missed this step: "Add the SSH Public Key to the authorized_keys file on your target hosts".
Just redo the password-less ssh setup correctly. Follow the steps below (a consolidated sketch appears after the list):
Generate public and private SSH keys
ssh-keygen
Copy the SSH Public Key (id_rsa.pub) to the root account on your target hosts
.ssh/id_rsa
.ssh/id_rsa.pub
Add the SSH Public Key to the authorized_keys file on your target hosts
cat id_rsa.pub >> authorized_keys
Depending on your version of SSH, you may need to set permissions on the .ssh directory (to 700) and the authorized_keys file in that directory (to 600) on the target hosts.
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
Check the connection:
ssh root@<remote.target.host>
where <remote.target.host> has the value of each host name in your cluster.
If the following warning message displays during your first connection: "Are you sure you want to continue connecting (yes/no)?", enter yes.
Refer: Set Up Password-less SSH
Note: a password will not be asked if your passwordless ssh is set up properly.
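A consolidated sketch of the steps above, run as the Hadoop user on the master (user@slave1 is a placeholder for your actual user and host names):
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa          # generate the key pair
ssh-copy-id -i ~/.ssh/id_rsa.pub user@slave1      # append the key to the target's authorized_keys
chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys
ssh user@slave1 hostname                          # should log in without a password prompt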
Make sure to start the Hadoop services with a new user called hadoop.
Then make sure to add the public key to the slaves for that new user.
If this doesn't work, check your firewall or iptables rules.
I hope this helps.
That means you haven't created the public key properly.
Follow the sequence below (sketched as a script afterwards):
Create a user
Give all required permissions to that user
Generate a public key as that same user
Format the NameNode
Start the Hadoop services
Now it should not ask for a password.
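A rough sketch of that sequence on a Debian-like system (the user name hadoop and the install path /usr/local/hadoop are assumptions; adapt them to your setup):
sudo adduser hadoop                              # 1. create the user
sudo chown -R hadoop:hadoop /usr/local/hadoop    # 2. grant permissions on the install dir
su - hadoop                                      # switch to the new user, then:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa         # 3. generate the key pair
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
hdfs namenode -format                            # 4. format the NameNode
start-dfs.sh && start-yarn.sh                    # 5. start the Hadoop services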

how to restart hadoop cluster on emr

I have a Hadoop installation on Amazon Elastic MapReduce; whenever I try to restart the cluster I get the following error:
/stop-all.sh
no jobtracker to stop
The authenticity of host 'localhost (::1)' can't be established. RSA key fingerprint is
Are you sure you want to continue connecting (yes/no)? yes
localhost: Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
localhost: Permission denied (publickey).
no namenode to stop
localhost: Permission denied (publickey).
localhost: Permission denied (publickey).
Any idea on how to restart hadoop?
The following hack worked for me:
I replaced the "ssh" command in sbin/slaves.sh & sbin/hadoop-daemon.sh with "ssh -i ~/.ssh/keyname".
I'm using Hadoop version 2.4, and this worked for me:
export HADOOP_SSH_OPTS="-i /home/hadoop/mykey.pem"
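HADOOP_SSH_OPTS is passed to every ssh call the start/stop scripts make, so a persistent place to set it is hadoop-env.sh (the exact path varies by install, e.g. etc/hadoop/hadoop-env.sh):
export HADOOP_SSH_OPTS="-i /home/hadoop/mykey.pem"   # applied whenever the scripts ssh to a node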
For the stop-all.sh script to work, you probably need to have the same user on all the machines as the user executing the stop-all.sh script.
Moreover, it seems you do not have passwordless ssh set up from the machine where you execute stop-all.sh to the rest of the machines; that would spare you from manually entering the password for each machine separately. Passwords might be different for the same user on different machines; don't forget that.
