"Cannot set priority of datanode process" Unable to start datanode due to keytab password issue - hadoop

I am attempting to start a datanode on a Kerberised Hadoop cluster; however, when I run the command:
sudo ./hdfs --config $HADOOP_HOME/etc/hadoop --daemon start datanode
I am met with the error "ERROR: Cannot set priority of datanode process". In the logs I can see that the error is:
org.apache.hadoop.security.KerberosAuthException: failure to login: for principal: hdfs/hadoopworker1.securerealm.com@SECUREREALM.COM from keytab /etc/security/keytabs/hdfs.service.keytab javax.security.auth.login.LoginException: Unable to obtain password from user
I can confirm that the keytab definitely exists at the specified location and that the specified principal exists. The principal and keytab match the way they are referenced in hdfs-site.xml and in yarn-site.xml. My master node references its keytabs the same way without issue, but the keytabs for my workers seem to be causing problems.
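As a rough check, assuming the datanode is meant to run as the hdfs service user and using the paths from the error above (adjust user, path, and principal to your setup), you can verify that the keytab is readable and actually usable for a login:
# Check ownership and permissions on the keytab (the service user needs read access)
ls -l /etc/security/keytabs/hdfs.service.keytab
# List the principals stored in the keytab
sudo -u hdfs klist -kt /etc/security/keytabs/hdfs.service.keytab
# Try to obtain a ticket from the keytab as the service user
sudo -u hdfs kinit -kt /etc/security/keytabs/hdfs.service.keytab hdfs/hadoopworker1.securerealm.com@SECUREREALM.COM
If the kinit step fails with the same "Unable to obtain password" message, the problem is usually file permissions or a principal/keytab mismatch rather than the Hadoop config.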

Related

NameNode Format error "failure to login for principal: X from keytab Y: Unable to obtain password from user" with Kerberos in a Hadoop cluster

I've been setting up Kerberos with my Hadoop cluster on Ubuntu 20.04.1 LTS, and when I try to reformat the namenode on the command line after changing all the config files and setting everything up (including principals and keytabs), I'm met with the error:
Exiting with status 1: org.apache.hadoop.security.KerberosAuthException: failure to login: for principal: hdfs/hadoopmaster.406bigdata.com@406BIGDATA.COM from keytab /etc/security/keytabs/hdfs.service.keytab javax.security.auth.login.LoginException: Unable to obtain password from user
This is taking place on my master node, with host name "hadoopmaster". The keytabs are stored in /etc/security/keytabs, and when checking them with klist -t -k -e, the keytab has the correct principal "hdfs/hadoopmaster.406bigdata.com@406BIGDATA".
My hdfs-site.xml file contains the following properties (it includes more, but they are omitted below as they shouldn't be relevant to the error):
<property>
  <name>dfs.namenode.keytab.file</name>
  <value>/etc/security/keytabs/hdfs.service.keytab</value>
</property>
<property>
  <name>dfs.namenode.kerberos.principal</name>
  <value>hdfs/hadoopmaster.406bigdata.com@406BIGDATA.COM</value>
</property>
I also have YARN set up with keytabs and principals; it starts up fine (I've checked the log files and there are no errors) and I can access the web UI.
I've tried moving the keytab files outside of the root directory and double-checked the /etc/hosts file; the keytab has the correct permissions and ownership, but nothing has fixed the issue.
What happens when you su hdfs and try to use the keytab? Does the hdfs user have permission to access the file?
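For example, a sketch of that check, assuming an hdfs user exists and using the paths and principal from the question:
# Switch to the hdfs user and try the keytab directly
su - hdfs
# Can the hdfs user read the keytab at all?
klist -kt /etc/security/keytabs/hdfs.service.keytab
# Can it actually log in with it?
kinit -kt /etc/security/keytabs/hdfs.service.keytab hdfs/hadoopmaster.406bigdata.com@406BIGDATA.COM
# If the login worked, a ticket for the principal should now be listed
klist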

Cannot start running on browser the namenode for Hadoop

It is my first time installing Hadoop on Linux (Fedora) running in a VM (using Parallels on my Mac). I followed every step in this video, including the textual version of it. Then, when I open localhost (or the equivalent value from hostname) on port 50070, I get the following message:
...can't establish a connection to the server at localhost:50070
When I run the jps command, by the way, I don't have the DataNode and NameNode, unlike at the end of the textual version of the tutorial, which has the following:
While mine has only the following processes running:
6021 NodeManager
3947 SecondaryNameNode
5788 ResourceManager
8941 Jps
When I run the hadoop namenode command, I get some of the following [redacted] errors:
Cannot access storage directory /usr/local/hadoop_store/hdfs/namenode
16/10/11 21:52:45 WARN namenode.FSNamesystem: Encountered exception loading fsimage
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /usr/local/hadoop_store/hdfs/namenode is in an inconsistent state: storage directory does not exist or is not accessible.
By the way, I tried to access the above-mentioned directory, and it does exist.
Any hint for this newbie? ;-)
You need to give read and write permissions on the directory /usr/local/hadoop_store/hdfs/namenode to the user with which you are running the services.
Once that is done, run the format command: hadoop namenode -format
Then try to start your services.
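A sketch of those steps, assuming the services run as the current user and using the directory from the error (adjust the user and path to your setup):
# Give the user running the services ownership of the namenode storage directory
sudo chown -R $USER:$USER /usr/local/hadoop_store/hdfs/namenode
# Then re-format the namenode and start the services as usual
hadoop namenode -format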
Delete the files under /app/hadoop/tmp/*, then try formatting the namenode again and run start-dfs.sh and start-yarn.sh.
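In other words, something along these lines (assuming /app/hadoop/tmp is your hadoop.tmp.dir and the scripts are on your PATH):
# Clear the old temporary/storage files
rm -rf /app/hadoop/tmp/*
# Re-format the namenode and restart the daemons
hadoop namenode -format
start-dfs.sh
start-yarn.sh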

Ambari show namenode is stop but actually namenode is still working

We are using HDP 2.7.1.2.3 with Ambari 2.1.2
After finishing setup, every node's status was correct.
But one day Ambari suddenly showed the namenode as stopped (we didn't change any config of Ambari or the namenode).
However, we can still use HBase and run MapReduce,
so we think the namenode's status should be normal.
We tried restarting the namenode and checked the ambari-server log.
It shows:
ServiceComponentHostImpl:949 - Host role transitioned to a new state, serviceComponentName=NAMENODE, oldState=STARTING, currentState=STARTED
HeartBeatHandler:657 - State of service component NAMENODE of service HDFS of cluster wae has changed from STARTED to INSTALLED
We don't understand why its status changed from "STARTED" to "INSTALLED".
On the namenode side, we checked ambari-agent.log.
It shows one warning:
[Alert][namenode_directory_status] HA nameservice value is present but there are no aliases for {{hdfs-site/dfs.ha.namenodes.{{ha-nameservice}}}}
We think it is irrelevant.
What's the reason that Ambari thinks the namenode is stopped?
Is there any way that we can fix this issue?
Run the command ambari-server restart from the Linux terminal on the Ambari server node.
Run the command ambari-agent restart from the Linux terminal on all the nodes in the cluster.
You can run the command hdfs dfsadmin -report from the terminal as the hdfs user to confirm that all the nodes are up and running.
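Put together, the sequence looks roughly like this (run as a user with the necessary privileges):
# On the Ambari server node
sudo ambari-server restart
# On every node in the cluster
sudo ambari-agent restart
# As the hdfs user, confirm all datanodes report in
sudo -u hdfs hdfs dfsadmin -report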

Hadoop on Google Compute Engine

I am trying to set up a Hadoop cluster on Google Compute Engine through the "Launch click-to-deploy software" feature. I have created 1 master and 1 slave node and tried to start the cluster using the start-all.sh script from the master node, and I got the error "permission denied (publickey)".
I have generated public and private keys on both the master and slave nodes.
Currently I am logged into the master with my own username. Is it mandatory to log into the master as the "hadoop" user? If so, what is the password for that user ID?
Please let me know how to overcome this problem.
The deployment creates a user hadoop which owns Hadoop-specific SSH keys that were generated dynamically at deployment time; since start-all.sh uses SSH under the hood, this means you must do the following:
sudo su hadoop
/home/hadoop/hadoop-install/bin/start-all.sh
Otherwise, your "normal" username doesn't have SSH keys properly set up so you won't be able to launch the Hadoop daemons, as you saw.
Another thing to note is that the deployment should have already started all the Hadoop daemons automatically, so you shouldn't need to manually run start-all.sh unless you're rebooting the daemons after some manual configuration updates. If the daemons weren't running after the deployment ran, you may have encountered some unexpected error during initialization.
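A quick, non-authoritative way to check whether the daemons are already up is to list the Java processes as the hadoop user, for example:
sudo su hadoop
# jps ships with the JDK and lists running Java processes such as NameNode and DataNode
jps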

Multi-Node Hadoop: NameNode and DataNode not working

I am a new student of Hadoop clusters, and I built a multi-node cluster in the lab,
but I cannot start the NameNode or DataNode.
After I execute start-all.sh, jps only shows JobTracker, TaskTracker, SecondaryNameNode, and Jps on the master. But the slaves work fine, with DataNode and TaskTracker running.
And when I execute stop-all.sh,
it says there is no tasktracker to stop, even though the TaskTracker did show up in jps.
And this is from the NameNode log file:
1.Cannot access storage directory /app/hadoop/tmp/dfs/name
2.ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
3.org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /app/hadoop/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
4.org.apache.hadoop.hdfs.server.namenode.NameNode: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /app/hadoop/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
And I did try the namenode -format, yet it doesn't work.
Could somebody show me the way, and tell me why this happens?
Lots of thanks ahead.
PS: I am using hadoop1.0.3 + java1.7.0_51
I think you did not give permissions on the data directory (tmp.data.dir).
Try the command below to give permissions, then try start-all.sh again:
sudo chown $USER /(DIR NAME)
And try this command:
hadoop namenode -format
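Concretely, for the directory that appears in the logs above (assuming /app/hadoop/tmp is your hadoop.tmp.dir and you run the daemons as the current user), that might look like:
# Give the current user ownership of the Hadoop storage directory
sudo chown -R $USER /app/hadoop/tmp
# Re-format the namenode and bring the daemons back up (Hadoop 1.x)
hadoop namenode -format
start-all.sh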
