I installed Hadoop, and HBase is running on top of it. All my Hadoop daemons are up and running. After starting HBase, I can see HMaster when I run the jps command.
I'm running Hadoop in pseudo-distributed mode. When I check the web UI on localhost, it shows that a region server is running.
But why can't I see HRegionServer in the jps output in my Linux terminal?
It might be because hbase.cluster.distributed is not set, or is set to false, in hbase-site.xml.
According to http://hbase.apache.org/book/config.files.html :
hbase.cluster.distributed: The mode the cluster will be in. Possible values are false for standalone mode and true for distributed mode. If false, startup will run all HBase and ZooKeeper daemons together in the one JVM. Default: false
So if you set it to true, you'll see distinct master, region server, and ZooKeeper processes. For example, a pseudo-distributed Hadoop/HBase process list would look like this:
jps
3991 HMaster
4209 HRegionServer
3140 DataNode
3464 TaskTracker
3246 JobTracker
2942 NameNode
3924 HQuorumPeer
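A minimal hbase-site.xml sketch for pseudo-distributed mode might look like the fragment below (the hbase.rootdir URL is an assumption; match it to your NameNode host and port):

```xml
<configuration>
  <!-- run HBase and ZooKeeper daemons as separate processes -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <!-- assumed HDFS location; adjust to your NameNode address -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:8020/hbase</value>
  </property>
</configuration>
```

After changing this setting, restart HBase so the daemons are relaunched in the new mode.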
Related
My question might be a duplicate; I searched for similar questions here, but they didn't solve my problem.
I'm new to Hadoop, and I'm setting up a multi-node cluster with 1 master and 2 slaves. When I run the jps command on the master node, my terminal shows this:
3250 DataNode
3090 NameNode
4099 Jps
3498 SecondaryNameNode
and when I run the jps command on a slave node, my terminal shows this:
3896 DataNode
4684 Jps
4111 SecondaryNameNode
According to this tutorial link, my master node should have this output:
jps
NameNode
SecondaryNameNode
ResourceManager
and my slave nodes should have this:
jps
NodeManager
DataNode
So on the master node there is no ResourceManager, and on the slave nodes there is no NodeManager.
Edit
When I run the start-dfs.sh command, my terminal output shows this:
Starting namenodes on [HadoopMaster]
starting datanodes
starting secondary namenodes [farhan-master]
and when I run the start-yarn.sh command, my terminal output shows this:
starting resourcemanager
starting nodemanager
How do I solve this problem? Thanks in advance.
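One common cause of symptoms like this is that the ResourceManager and NodeManagers are launched by start-yarn.sh but exit almost immediately because yarn-site.xml does not tell the nodes where the ResourceManager runs; the YARN logs under $HADOOP_HOME/logs usually show the exact error. A minimal yarn-site.xml sketch, assuming the master host is named HadoopMaster (the hostname is an assumption taken from the start-dfs.sh output above):

```xml
<configuration>
  <!-- tell every node where the ResourceManager runs (hostname is an assumption) -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>HadoopMaster</value>
  </property>
  <!-- standard auxiliary service needed for the MapReduce shuffle -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```

This file must be distributed to the master and both slaves, and YARN restarted afterwards.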
I am using Apache Hadoop 2.7.1
on a CentOS 7 environment,
and I have an HA cluster which consists of two NameNodes (mn1 and mn2)
and 6 DataNodes.
Issuing jps on mn1 shows:
34734 DFSZKFailoverController
34245 NameNode
31529 DFSAdmin
34551 JournalNode
34822 Jps
3857 QuorumPeerMain
and issuing jps on mn2 shows:
26272 JournalNode
26483 Jps
26110 NameNode
26388 DFSZKFailoverController
2259 QuorumPeerMain
What does the DFSAdmin process in the mn1 jps output refer to?
I noticed that this DFSAdmin process appeared in the following scenario:
When the number of failed JournalNodes exceeds the number of failures the cluster can tolerate, which is given by the formula
(N - 1) / 2, where N is the number of JournalNodes,
the cluster cannot continue to work and the active NameNode shuts down.
When I then started an acceptable number of JournalNodes and the NameNode again,
jps on the active NameNode showed the DFSAdmin process.
The problem was solved by restarting all cluster services again.
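The tolerance formula above can be sketched as a quick shell calculation (the JournalNode count of 3 is an assumption; substitute your own N):

```shell
# With N JournalNodes, edits need a majority quorum, so at most (N - 1) / 2 may fail.
n=3                          # number of JournalNodes (assumption)
tolerable=$(( (n - 1) / 2 ))
echo "A ${n}-node JournalNode ensemble tolerates ${tolerable} failure(s)."
```

So a 3-node ensemble keeps working with 1 JournalNode down, and a 5-node ensemble with 2 down; lose more than that and the active NameNode aborts, as described above.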
I'm new to the big data environment and just started by installing a 3-node Hadoop 2.6 cluster with HA capability using ZooKeeper.
Everything works well so far, and I have tested the failover scenario between NN1 and NN2 using ZooKeeper; it works.
Now I'd like to install Apache Spark on my Hadoop YARN cluster, also with HA capability.
Can anyone guide me through the installation steps? I could only find instructions for setting up Spark in standalone mode, which I have done successfully. Now I want to install it on the YARN cluster along with HA capability.
I have a three-node cluster (NN1, NN2, DN1); the following daemons are currently running on each of these servers.
Daemons running on the master NameNode (NN1):
Jps
DataNode
DFSZKFailoverController
JournalNode
ResourceManager
NameNode
QuorumPeerMain
NodeManager
Daemons running on the standby NameNode (NN2):
Jps
DFSZKFailoverController
NameNode
QuorumPeerMain
NodeManager
JournalNode
DataNode
Daemons running on the DataNode (DN1):
QuorumPeerMain
Jps
DataNode
JournalNode
NodeManager
You should set up ResourceManager HA (http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html). When Spark runs on YARN it doesn't run its own daemon processes, so there is no Spark component that requires HA in YARN mode.
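A minimal yarn-site.xml sketch for ResourceManager HA on this cluster might look like the fragment below; the rm-ids and the choice of NN1/NN2 as ResourceManager hosts, plus the ZooKeeper ensemble on port 2181, are assumptions based on the daemon lists above:

```xml
<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarn-cluster</value>
  </property>
  <!-- two ResourceManagers, assumed to live on the two NameNode hosts -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>NN1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>NN2</value>
  </property>
  <!-- existing ZooKeeper ensemble (port is an assumption) -->
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>NN1:2181,NN2:2181,DN1:2181</value>
  </property>
</configuration>
```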
You can configure Spark in YARN mode; there you can size the driver and executors depending on the cluster capacity, e.g.:
spark.executor.memory <value>
The number of executors is allocated based on your YARN container memory.
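A minimal spark-defaults.conf sketch for YARN mode (the memory sizes and instance count are assumptions; tune them to fit within your YARN container limits):

```
spark.master              yarn
# sizes below are assumptions; keep them within the YARN container limits
spark.executor.memory     2g
spark.executor.instances  2
spark.driver.memory       1g
```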
We have a Hadoop setup with 2 master nodes and 1 slave node.
We have configured the Hadoop cluster. After configuring it, when we execute the jps command, we get the following output on the first master node:
13405 NameNode
14614 Jps
13860 ResourceManager
13650 DataNode
14083 NodeManager
On my second master node, the output is:
9698 Jps
9234 DataNode
9022 NameNode
9450 NodeManager
On my data node, the output is:
21681 NodeManager
21461 DataNode
21878 Jps
I feel that my standby NameNode is not running. Please tell me whether this is right or wrong, and if it's wrong, what the status of my node should be.
You can check the status of a NameNode by running the command below, where the service id (e.g. nn1) is the one defined under dfs.ha.namenodes.<nameservice> in hdfs-site.xml:
hdfs haadmin -getServiceState <serviceId>
When I run the command
for service in /etc/init.d/hadoop*
do
  sudo $service stop
done
it stops all the services,
and when I run
for service in /etc/init.d/hadoop-hdfs-*
do
  sudo $service stop
done
it also stops all the services.
But when I start the services again, sometimes the DataNode comes up and sometimes the NameNode does,
eg:
21270 NameNode
21422 Jps
21374 SecondaryNameNode
2624 HMaster
or
11070 DataNode
11422 Jps
11554 SecondaryNameNode
2554 HMaster
The same thing happens for the JobTracker and TaskTracker.
I tried formatting the NameNode, but it didn't help.
I also tried changing the port of localhost in
core-site.xml from 8020 to 50020,
and also in mapred-site.xml from 8021 to 50020.
This time jps shows NameNode, DataNode, JobTracker, and TaskTracker.
But when I check the browser at localhost:50070 and localhost:50030,
it refers to 8020 instead of 50020.
Why is this happening?
Please help.
Run the following script from the terminal to stop the running Hadoop daemons:
$HADOOP_INSTALL/hadoop/bin/stop-all.sh
Run the following script from the terminal to start the Hadoop daemons:
$HADOOP_INSTALL/hadoop/bin/start-all.sh