Zookeeper startup issues/confusion - hadoop

Apart from the issue I am already having, I installed ZooKeeper BEFORE installing HBase (which is still not installed), after I saw a video on it. While installing it I faced numerous issues, which I've now overcome, but I am left with one challenging one; hopefully the last one I will have to deal with. So, the installation part went through well. I start ZooKeeper with the following command: sudo /home/hduser/zookeeper/bin/zkServer.sh start and (I am OK with it because) this is the result:
ZooKeeper JMX enabled by default
Using config: /home/hduser/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
YES! IT'S STARTED (after almost 50 minutes of digging on the internet). But nevertheless, when I run jps, this is what I get:
8499 SecondaryNameNode
8162 NameNode
8983 NodeManager
9370 Jps
8313 DataNode
8672 ResourceManager
Exactly!! No QuorumPeerMain! BUT wait.. When I sudo jps, I get this:
8499 -- process information unavailable
9243 QuorumPeerMain
8162 -- process information unavailable
8983 -- process information unavailable
9429 Jps
8313 -- process information unavailable
8672 -- process information unavailable
You see there? There's the QuorumPeerMain (apart from the fact that it says process information unavailable against the perfectly familiar processes), running as process 9243.
Can you tell me why that's happening?
Also, because of this discrepancy (or inconvenience), do you think HBase installation will be an issue?
I don't think it should matter, but this is a Mint machine (Sarah).
Thanks in advance!

QuorumPeerMain is visible only with sudo jps because you started ZooKeeper with sudo /home/hduser/zookeeper/bin/zkServer.sh start, so the process belongs to root. jps normally lists only the JVMs owned by the user who runs it, which is most likely also why sudo jps reports process information unavailable for the Hadoop daemons that run as your normal user. Start ZooKeeper without sudo and it will show up in the plain jps output.
Because you started ZooKeeper with sudo, the files it created in the ZooKeeper directories are owned by root. Change the owner of these directories back to your normal user before starting it without sudo.
Once you make these changes, the HBase installation should not cause any problem.
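The ownership fix above could be sketched roughly as follows. The helper name not_owned_by is made up for illustration, and the hduser:hduser owner and group are an assumption based on the path in the question; if dataDir in zoo.cfg points somewhere else, that directory would need the same treatment.

```shell
# List everything under a directory that is NOT owned by the given user.
# Prints nothing when the whole tree already belongs to that user.
not_owned_by() {
    dir=$1
    user=$2
    find "$dir" ! -user "$user" -print
}

# Hypothetical usage with the path from the question (user/group assumed):
#   sudo /home/hduser/zookeeper/bin/zkServer.sh stop    # stop the root-owned instance
#   not_owned_by /home/hduser/zookeeper hduser          # see what root left behind
#   sudo chown -R hduser:hduser /home/hduser/zookeeper  # hand the files back
#   /home/hduser/zookeeper/bin/zkServer.sh start        # no sudo this time
```

If the helper still prints paths after the chown, those are the files that would keep a non-root start from working.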

Related

Cannot start running on browser the namenode for Hadoop

This is my first time installing Hadoop on Linux (Fedora distro) running in a VM (using Parallels on my Mac). I followed every step of this video, including the textual version of it. Then, when I open localhost (or the equivalent value from hostname) on port 50070, I get the following message:
...can't establish a connection to the server at localhost:50070
By the way, when I run the jps command I don't have the DataNode and NameNode, unlike the listing at the end of the textual version of the tutorial.
Mine has only the following processes running:
6021 NodeManager
3947 SecondaryNameNode
5788 ResourceManager
8941 Jps
When I run the hadoop namenode command I get errors including the following [redacted]:
Cannot access storage directory /usr/local/hadoop_store/hdfs/namenode
16/10/11 21:52:45 WARN namenode.FSNamesystem: Encountered exception loading fsimage
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /usr/local/hadoop_store/hdfs/namenode is in an inconsistent state: storage directory does not exist or is not accessible.
By the way, I tried to access the above-mentioned directory and it exists.
Any hint for this newbie? ;-)
You need to give read and write permission on the directory /usr/local/hadoop_store/hdfs/namenode to the user that runs the services.
Once done, format the NameNode using hadoop namenode -format
Then try to start your services.
Delete the files in /app/hadoop/tmp/*
then format the NameNode again and run start-dfs.sh and start-yarn.sh
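Before formatting, it may help to confirm that the user who starts the services can really use the storage directory, since a directory can exist and still be inaccessible. A minimal sketch, where can_use_dir is a made-up helper name:

```shell
# Succeeds only if the path is a directory that the CURRENT user can
# read, write and enter. Run it as the user that starts the Hadoop services.
can_use_dir() {
    dir=$1
    [ -d "$dir" ] && [ -r "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]
}

# Hypothetical usage with the directory from the error message:
#   if can_use_dir /usr/local/hadoop_store/hdfs/namenode; then
#       echo "directory is usable"
#   else
#       ls -ld /usr/local/hadoop_store/hdfs/namenode   # inspect owner and mode
#   fi
```

Note that root passes the read and write checks almost everywhere, so running this with sudo would hide the very problem it is meant to show.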

Ambari shows the NameNode as stopped, but the NameNode is actually still working

We are using HDP 2.7.1.2.3 with Ambari 2.1.2
After the setup finished, every node's status was correct.
But one day Ambari suddenly showed the NameNode as stopped. (We did not change any Ambari or NameNode configuration.)
However, we can still use HBase and run MapReduce,
so we think the NameNode itself is working normally.
We tried restarting the NameNode and checked the ambari-server log.
It shows:
ServiceComponentHostImpl:949 - Host role transitioned to a new state, serviceComponentName=NAMENODE, oldState=STARTING, currentState=STARTED
HeartBeatHandler:657 - State of service component NAMENODE of service HDFS of cluster wae has changed from STARTED to INSTALLED
We don't understand why its status changes from "STARTED" to "INSTALLED".
On the NameNode side, we checked ambari-agent.log.
It shows one warning:
[Alert][namenode_directory_status] HA nameservice value is present but there are no aliases for {{hdfs-site/dfs.ha.namenodes.{{ha-nameservice}}}}
We think it is irrelevant.
Why does Ambari think the NameNode is stopped?
Is there any way to fix this issue?
Run ambari-server restart from a Linux terminal on the Ambari server node.
Run ambari-agent restart from a Linux terminal on every node in the cluster.
Then run hdfs dfsadmin -report from a terminal as the hdfs user to confirm that all the nodes are up and running.

How to check if hdfs is running?

I would like to see if the hdfs file system for Hadoop is working properly. I know that jps lists the daemons that are running, but I don't actually know which daemons to look for.
I ran the following commands:
$HADOOP_PREFIX/sbin/hadoop-daemon.sh start namenode
$HADOOP_PREFIX/sbin/hadoop-daemon.sh start datanode
$HADOOP_PREFIX/sbin/yarn-daemon.sh start resourcemanager
$HADOOP_PREFIX/sbin/yarn-daemon.sh start nodemanager
Only namenode, resourcemanager, and nodemanager appeared when I entered jps.
Which daemons are supposed to be running in order for hdfs/Hadoop to function? Also, what could you do to fix hdfs if it is not running?
Use any of the following approaches to check the status of your daemons:
The jps command lists all active daemons.
The most appropriate check is:
hadoop dfsadmin -report
This lists the details of the DataNodes, which are essentially what makes up your HDFS storage.
You can also cat any file available in an HDFS path.
So, I spent two weeks validating my setup (it was fine), and finally found this command:
sudo -u hdfs jps
Initially my plain jps command was showing only one process, even though Hadoop 2.6 under Ubuntu 14.04 LTS was up. I had been using sudo to run the startup scripts.
Here is the startup sequence that works, with jps listing multiple processes:
sudo su hduser
/usr/local/hadoop/sbin/start-dfs.sh
/usr/local/hadoop/sbin/start-yarn.sh
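As for which daemons to look for: HDFS needs a NameNode and at least one DataNode, and YARN adds a ResourceManager and NodeManager. A rough way to compare a jps listing against such a list is sketched below; missing_daemons is a made-up helper name, and the expected list is whatever you pass in.

```shell
# Read a jps listing on stdin and print every expected daemon name
# (given as arguments) that does not appear as a process name.
missing_daemons() {
    # Keep the listing in a variable so it can be scanned once per daemon.
    listing=$(cat)
    for daemon in "$@"; do
        # Compare the second column exactly, so NameNode does not
        # accidentally match SecondaryNameNode.
        if ! printf '%s\n' "$listing" |
            awk -v name="$daemon" '$2 == name { found = 1 } END { exit !found }'; then
            printf '%s\n' "$daemon"
        fi
    done
}

# Hypothetical usage, run as the user that started the daemons:
#   jps | missing_daemons NameNode DataNode ResourceManager NodeManager
```

With the listing from the question (NameNode, ResourceManager and NodeManager present) this would print DataNode, which points at the DataNode log as the next place to look.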

Spark Standalone Mode: Worker not starting properly in cloudera

I am new to Spark. I installed Spark using the parcels available in Cloudera Manager.
I configured the files as shown in the following link from Cloudera Enterprise:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/4.8.1/Cloudera-Manager-Installation-Guide/cmig_spark_installation_standalone.html
After this setup, I started all the Spark nodes by running /opt/cloudera/parcels/SPARK/lib/spark/sbin/start-all.sh. But I couldn't run the worker nodes; I got the error below.
[root@localhost sbin]# sh start-all.sh
org.apache.spark.deploy.master.Master running as process 32405. Stop it first.
root@localhost.localdomain's password:
localhost.localdomain: starting org.apache.spark.deploy.worker.Worker, logging to /var/log/spark/spark-root-org.apache.spark.deploy.worker.Worker-1-localhost.localdomain.out
localhost.localdomain: failed to launch org.apache.spark.deploy.worker.Worker:
localhost.localdomain: at java.lang.ClassLoader.loadClass(libgcj.so.10)
localhost.localdomain: at gnu.java.lang.MainThread.run(libgcj.so.10)
localhost.localdomain: full log in /var/log/spark/spark-root-org.apache.spark.deploy.worker.Worker-1-localhost.localdomain.out
localhost.localdomain:starting org.apac
When I run jps command, I got:
23367 Jps
28053 QuorumPeerMain
28218 SecondaryNameNode
32405 Master
28148 DataNode
7852 Main
28159 NameNode
I couldn't run the worker node properly. I actually wanted to install a standalone Spark where the master and the worker run on a single machine. In the slaves file in the Spark directory, I gave the address as "localhost.localdomin", which is my host name. I am not familiar with this settings file. Could anyone please help me with this installation process? I can start the master node, but not the worker nodes.
Thanks & Regards,
bips
Please note the error info below:
localhost.localdomain: at java.lang.ClassLoader.loadClass(libgcj.so.10)
I met the same error when I installed and started the Spark master/workers on CentOS 6.2 x86_64, after making sure that libgcj.x86_64 and libgcj.i686 had been installed on my server. I finally solved it. Below is my solution; I hope it helps you.
It seems that your JAVA_HOME environment variable is not set correctly.
Maybe your JAVA_HOME points to the Java bundled with the system, e.g. java version "1.5.0".
Spark needs Java version >= 1.6.0. If you use Java 1.5.0 to start Spark, you will see this error.
Try export JAVA_HOME="your java home path", then start Spark again.
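To see which Java a given JAVA_HOME really points to, you can print its version line and compare the major version against what Spark needs. A small sketch, where java_major is a made-up helper that only parses a version string you hand it:

```shell
# Turn a Java version string into its major version number.
# Old-style strings start with "1." (1.5.0 means Java 5, 1.8.0_292 means Java 8);
# newer ones start with the major version directly (11.0.2 means Java 11).
java_major() {
    case "$1" in
        1.*)
            rest=${1#1.}                 # drop the leading "1."
            printf '%s\n' "${rest%%.*}"  # keep what precedes the next dot
            ;;
        *)
            printf '%s\n' "${1%%.*}"
            ;;
    esac
}

# Hypothetical usage (java -version writes to stderr, hence 2>&1):
#   "$JAVA_HOME/bin/java" -version 2>&1 | head -n 1
#   java_major 1.5.0      # pass the quoted version string from that first line
```

If the first line mentions gij or libgcj rather than a regular JDK, JAVA_HOME is most likely still pointing at the system's bundled Java.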

Hadoop installation - Datanode running, but not showing in JPS

I have installed CDH3U5 on a 2 node cluster. Everything seems to run fine such as all the services, web UI, MR jobs, HDFS shell commands. However, interestingly, when I started the datanode service, it gave me an OK message that datanode is running as process say X. But when I run JPS, I do not see the label "Datanode" for the process. So the output looks like -
17153 TaskTracker
18908 Jps
16267
The process ID - 16267 is the Datanode process. All other checkpoints have passed. So this seems weird. The same thing happens on the other node in the cluster. Any insight into this behavior and if this is something that needs fixing would be helpful.
Can you check the following and reply?
- the web interface for the NameNode, and what it shows for live nodes
- the log files for the DataNode, to see if there are any exceptions
- whether the DataNode can be pinged/reached over ssh from the NameNode and vice versa
If all of the above look OK, I'm not sure what the problem is, but to fix it you can:
- stop all Hadoop daemons
- delete the temp directory pointed to in conf/core-site.xml on both the NN and the DN
- format the NameNode
- start the daemons
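When jps shows a bare process ID without a name, the process list can still tell you what that JVM is running. A rough sketch, assuming the DataNode was started as a plain java process whose main class is org.apache.hadoop.hdfs.server.datanode.DataNode (a DataNode launched another way may show a different command line); pids_running_class is a made-up helper name:

```shell
# Read "PID COMMAND ARGS..." lines on stdin and print the PID of every
# process that has the given class name as one whole argument.
# Matching whole arguments keeps the filter from matching its own command line.
pids_running_class() {
    awk -v cls="$1" '{
        for (i = 2; i <= NF; i++)
            if ($i == cls) { print $1; break }
    }'
}

# Hypothetical usage:
#   ps -eo pid,args | pids_running_class org.apache.hadoop.hdfs.server.datanode.DataNode
```

If the PID printed here matches the unnamed one in the jps output, the DataNode is running and only the jps label is missing.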
