Hadoop NFS: unable to start the Hadoop NFS gateway - hadoop

I am trying to install the NFS gateway on a Hadoop cluster.
Unfortunately I am not able to start the NFS gateway; it fails with the error below.
I have also tried to get more debugging information by setting the level to DEBUG in the log4j file, but the change does not seem to affect the output, so I also need to know how to increase the logging level.
************************************************************/
14/05/22 10:59:43 INFO nfs3.Nfs3Base: registered UNIX signal handlers for [TERM, HUP, INT]
Exception in thread "main" java.lang.IllegalArgumentException: value already present: sshd
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
at com.google.common.collect.AbstractBiMap.putInBothMaps(AbstractBiMap.java:112)
at com.google.common.collect.AbstractBiMap.put(AbstractBiMap.java:96)
at com.google.common.collect.HashBiMap.put(HashBiMap.java:85)
at org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMapInternal(IdUserGroup.java:85)
at org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMaps(IdUserGroup.java:110)
at org.apache.hadoop.nfs.nfs3.IdUserGroup.<init>(IdUserGroup.java:54)
at org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.<init>(RpcProgramNfs3.java:172)
at org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.<init>(RpcProgramNfs3.java:164)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.<init>(Nfs3.java:41)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:52)
14/05/22 10:59:45 INFO nfs3.Nfs3Base: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down Nfs3 at
************************************************************/
I suspect it is related to the following issue, https://issues.apache.org/jira/browse/HDFS-5587, but I do not understand from it what action I need to take.

This is documented in the following ticket, with the workaround below:
https://issues.apache.org/jira/browse/HDFS-5587
The issue in my case was that sshd and some other users existed in both LDAP and the local box, but the UIDs did not match.
The NFS gateway can't start with a duplicated name or id on the host system. This is because HDFS (on a non-Kerberos cluster) uses the name as the only way to identify a user or group. A host system with a duplicated user/group name or id might work fine most of the time by itself. However, when the NFS gateway talks to HDFS, HDFS accepts only user and group names, so the same name means the same user or the same group. To find the duplicated names/ids, one can run a short check, one variant on Linux systems and one on MacOS.
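The exact commands from the ticket are not reproduced above. As a rough Linux equivalent (a sketch, assuming getent merges local files and LDAP via nsswitch), duplicated names and UIDs can be listed with:

# user names that appear more than once across local files and LDAP
getent passwd | cut -d: -f1 | sort | uniq -d

# UIDs that are assigned to more than one account
getent passwd | cut -d: -f3 | sort -n | uniq -d

Any name or UID printed by these commands has to be made unique (or the UIDs aligned, as in the sshd case above) before the gateway will start.

On the logging side question, one approach that is not specific to the NFS gateway (an assumption, based on the packages visible in the stack trace) is to raise the level for a single run with the HADOOP_ROOT_LOGGER environment variable, or to target the NFS packages in log4j.properties:

# one-off: run the gateway command with DEBUG logging to the console
export HADOOP_ROOT_LOGGER=DEBUG,console

# or, in log4j.properties, raise the level for the NFS classes only
log4j.logger.org.apache.hadoop.hdfs.nfs=DEBUG
log4j.logger.org.apache.hadoop.nfs=DEBUG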

Related

How to specify the address of ResourceManager to bin/yarn-session.sh?

I am a newbie in Flink.
I'm confused about how to specify the address of the ResourceManager when running bin/yarn-session.sh.
When you start a Flink YARN session via bin/yarn-session.sh, it creates a .yarn-properties-USER file in your tmp directory. This file contains the connection information for the Flink cluster. When you submit a job via bin/flink run <JOB_JAR>, the client uses the connection information from this file.
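A minimal end-to-end sketch (the flags assume an older Flink release where yarn-session.sh still took -n/-tm; the example jar path is hypothetical). The ResourceManager address itself is not passed as a flag; Flink reads it from the YARN configuration that HADOOP_CONF_DIR (or YARN_CONF_DIR) points at:

# assumption: typical location of yarn-site.xml with the RM address
export HADOOP_CONF_DIR=/etc/hadoop/conf

# start a YARN session; this writes the .yarn-properties-<user> file
./bin/yarn-session.sh -n 2 -tm 1024

# submit a job; the client picks the session up from that properties file
./bin/flink run ./examples/batch/WordCount.jar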

Cannot start the Hadoop namenode or reach it in the browser

It is my first time installing Hadoop on Linux (a Fedora distro) running in a VM (using Parallels on my Mac). I followed every step in this video, including the textual version of it. Then, when I open localhost (or the equivalent value from hostname) on port 50070, I get the following message.
...can't establish a connection to the server at localhost:50070
By the way, when I run the jps command I don't have the DataNode and NameNode, unlike at the end of the textual version of the tutorial. Mine has only the following processes running:
6021 NodeManager
3947 SecondaryNameNode
5788 ResourceManager
8941 Jps
When I run the hadoop namenode command I get some of the following [redacted] errors:
Cannot access storage directory /usr/local/hadoop_store/hdfs/namenode
16/10/11 21:52:45 WARN namenode.FSNamesystem: Encountered exception loading fsimage
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /usr/local/hadoop_store/hdfs/namenode is in an inconsistent state: storage directory does not exist or is not accessible.
By the way, I tried to access the above-mentioned directories and they exist.
Any hint for this newbie? ;-)
You need to give read and write permission on the directory /usr/local/hadoop_store/hdfs/namenode to the user with which you are running the services.
Once done, run the format command: hadoop namenode -format
Then try to start your services.
Delete the files under /app/hadoop/tmp/*, then try formatting the namenode again and run start-dfs.sh and start-yarn.sh.
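A combined sketch of both suggestions, assuming the daemons run as hduser in the hadoop group (adjust the user/group and paths to your setup, and note that formatting erases any existing HDFS metadata):

# give the service user ownership of the namenode storage directory
sudo chown -R hduser:hadoop /usr/local/hadoop_store/hdfs/namenode

# clear stale temporary data (path taken from the answer above)
rm -rf /app/hadoop/tmp/*

# re-format the namenode, then bring the daemons back up
hadoop namenode -format
start-dfs.sh
start-yarn.sh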

Do we need to start Kafka for each user if started as root

Hello, I am new to the Kafka and ZooKeeper concepts. I have installed Kafka and ZooKeeper as root and started them as the root user with nohup. The jps command gives this output:
root#rachita-Aspire-V7-481P:/usr/share/zookeeper/bin# jps
4037 Elasticsearch
1689 QuorumPeerMain
9899 Kafka
1692 Jps
3469 QuorumPeerMain
But when I run jps as the rachita user, the output is:
rachita#rachita-Aspire-V7-481P:/usr/share/zookeeper/bin$ jps
3261 Jps
Do I need to start Kafka for every user on my machine who wishes to use it?
Please give me any suggestions.
Also, Hadoop is installed as a separate user called hduser inside a group called hadoop. Can any user start all the Hadoop daemons, or can only hduser do it?
Please help me with this; I am getting confused.
No, you don't need to start it for each user. The service should be started once with kafka-server-start on each node that is configured as a Kafka broker. If you run kafka-server-start several times on a single node, you will start multiple brokers on it. Your user just does not have permission to see or maintain the service; that is why you don't see it.
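For reference, the usual single-node start looks like this (paths assume a stock Kafka download; your layout under /usr/share may differ). It is run once per broker node, not once per user:

# start ZooKeeper (skip if a system zookeeper service is already running)
bin/zookeeper-server-start.sh config/zookeeper.properties &

# start exactly one broker per node
bin/kafka-server-start.sh config/server.properties &

# note: jps only lists JVMs owned by the current user, so processes
# started by root will not show up in another user's jps output
jps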
Best practice is to create a separate user as a member of the hadoop group for each Hadoop daemon and to start each daemon under its own user account.

Need help adding multiple DataNodes in pseudo-distributed mode (one machine), using Hadoop-0.18.0

I am a student, interested in Hadoop and started to explore it recently.
I tried adding an additional DataNode in the pseudo-distributed mode but failed.
I am following the Yahoo! developer tutorial, so the version of Hadoop I am using is hadoop-0.18.0.
I tried to start up using 2 methods I found online:
Method 1 (link)
I have a problem with this line
bin/hadoop-daemon.sh --script bin/hdfs $1 datanode $DN_CONF_OPTS
--script bin/hdfs doesn't seem to be valid in the version I am using. I changed it to --config $HADOOP_HOME/conf2, with all the configuration files in that directory, but when the script is run it gives the error:
Usage: Java DataNode [-rollback]
Any idea what the error means? The log files are created but the DataNode did not start.
Method 2 (link)
Basically I duplicated the conf folder to a conf2 folder, making the necessary changes documented on the website to hadoop-site.xml and hadoop-env.sh. Then I ran the command
./hadoop-daemon.sh --config ..../conf2 start datanode
it gives the error:
datanode running as process 4190. stop it first.
So I guess this is the 1st DataNode that was started, and the command failed to start another DataNode.
Is there anything I can do to start additional DataNode in the Yahoo VM Hadoop environment? Any help/advice would be greatly appreciated.
The Hadoop start/stop scripts use /tmp as the default directory for storing the PIDs of already started daemons. In your situation, when you start the second datanode, the startup script finds the /tmp/hadoop-someuser-datanode.pid file from the first datanode and assumes that the datanode daemon is already started.
The plain solution is to set the HADOOP_PID_DIR environment variable to something else (but not /tmp). Also, do not forget to update all the network port numbers in conf2.
The smart solution is to start a second VM with a Hadoop environment and join the two into a single cluster. That is the way Hadoop is intended to be used.
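A sketch of the plain solution above, assuming a copied conf2 directory (the PID path is hypothetical, and the port property names are the usual datanode settings, which may differ slightly in 0.18):

# give the second datanode its own PID directory so the startup script
# does not see the first datanode's pid file in /tmp
export HADOOP_PID_DIR=/home/hadoop/pids-dn2

# conf2/hadoop-site.xml must also use its own ports (dfs.datanode.address,
# dfs.datanode.http.address, dfs.datanode.ipc.address) and its own dfs.data.dir
./hadoop-daemon.sh --config $HADOOP_HOME/conf2 start datanode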

cdh4.3: exception in the logs after ./start-dfs.sh, datanode and namenode fail to start

Here are the logs from hadoop-datanode-...log:
FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-1421227885-192.168.2.14-1371135284949 (storage id DS-30209445-192.168.2.41-50010-1371109358645) service to /192.168.2.8:8020
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException): Datanode denied communication with namenode: DatanodeRegistration(0.0.0.0, storageID=DS-30209445-192.168.2.41-50010-1371109358645, infoPort=50075, ipcPort=50020, storageInfo=lv=-40;cid=CID-f16e4a3e-4776-4893-9f43-b04d8dc651c9;nsid=1710848135;c=0)
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:648)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:3498)
My mistake: the namenode can start, but the datanode can't start.
I saw this once too. The namenode server needs to do a reverse lookup request, so an nslookup 192.168.2.41 should return a name; it doesn't, so 0.0.0.0 is recorded instead.
You don't need to hardcode the address into /etc/hosts if you have DNS working correctly (i.e. the in-addr.arpa zone matches the entries in the domain zone file), but if you don't have DNS then you need to help Hadoop out.
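A sketch of the manual fix on the namenode host, using a made-up hostname for the datanode (the real hostname is whatever the datanode calls itself):

# add a line like this to /etc/hosts on the namenode (hostname is hypothetical)
# 192.168.2.41   datanode41.example.local   datanode41

# verify with getent; nslookup bypasses /etc/hosts and queries DNS directly
getent hosts 192.168.2.41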
There seems to be a Name Resolution issue.
Datanode denied communication with namenode:
DatanodeRegistration(0.0.0.0,
storageID=DS-30209445-192.168.2.41-50010-1371109358645,
infoPort=50075, ipcPort=50020,
Here DataNode is identifying itself as 0.0.0.0.
Looks like dfs.hosts enforcement. Can you recheck in your NameNode's hdfs-site.xml that you are definitely not using a dfs.hosts file?
This error may arise if the datanode that is trying to connect to the namenode is listed in the file defined by dfs.hosts.exclude, or if dfs.hosts is used and that datanode is not listed within that file. Make sure the datanode is not listed in the excludes, and if you are using dfs.hosts, add it to the includes. Restart Hadoop after that and run hadoop dfsadmin -refreshNodes.
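A quick way to check this on the NameNode (the config path is an assumption; adjust for your install):

# see whether an include/exclude file is configured at all
grep -B1 -A2 'dfs.hosts' /etc/hadoop/conf/hdfs-site.xml

# after editing the include/exclude files, make the NameNode re-read them
hadoop dfsadmin -refreshNodes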
HTH
Reverse DNS lookup is required when a datanode tries to register with a namenode. I got the same exceptions with Hadoop 2.6.0 because my DNS did not allow reverse lookups.
But you can disable Hadoop's reverse lookup by setting the configuration "dfs.namenode.datanode.registration.ip-hostname-check" to false in hdfs-site.xml.
I got this solution from here and it solved my problem.
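For reference, the standard hdfs-site.xml syntax for that property looks like this; it goes on the NameNode side, typically followed by a NameNode restart:

<property>
  <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
  <value>false</value>
</property>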
