I started the Storm nimbus and supervisor daemons with the following commands:
sudo storm nimbus &
sudo storm supervisor &
Is there a way to stop the cluster without using kill -9 on the processes?
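As far as I know, Storm does not ship a stop command for the nimbus and supervisor daemons. A minimal sketch of a gentler alternative to kill -9, assuming the daemons were started as above (the grep pattern is an assumption based on the daemon main-class names):
ps aux | grep -i 'storm.daemon'            # the nimbus/supervisor main classes contain "storm.daemon"
sudo kill <nimbus-pid> <supervisor-pid>    # plain TERM (no -9) lets the JVMs shut down cleanly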
I have zookeeper servers, and I'm trying to install storm using those zk servers.
My storm.yaml file looks like:
storm.zookeeper.servers:
- "ZKSERVER-0"
- "ZKSERVER-1"
- "ZKSERVER-2"
storm.local.dir: "/opt/apache-storm-2.2.0/data"
nimbus.host: "localhost"
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
I tested ping with those ZKSERVERs, and it worked fine.
However, when I start Nimbus with the ./storm nimbus command, it doesn't show any error, but it doesn't return either.
root@69e55d266f5a:/opt/apache-storm-2.2.0/bin:> ./storm nimbus
Running: /usr/jdk64/jdk1.8.0_112/bin/java -server -Ddaemon.name=nimbus -Dstorm.options= -Dstorm.home=/opt/apache-storm-2.2.0 -Dstorm.log.dir=/opt/apache-storm-2.2.0/logs -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib:/usr/lib64 -Dstorm.conf.file= -cp /opt/apache-storm-2.2.0/*:/opt/apache-storm-2.2.0/lib/*:/opt/apache-storm-2.2.0/extlib/*:/opt/apache-storm-2.2.0/extlib-daemon/*:/opt/apache-storm-2.2.0/conf -Xmx1024m -Djava.deserialization.disabled=true -Dlogfile.name=nimbus.log -Dlog4j.configurationFile=/opt/apache-storm-2.2.0/log4j2/cluster.xml org.apache.storm.daemon.nimbus.Nimbus
The terminal just shows the output above, and nothing changes until I press Ctrl+C.
What could be the problem here?
Can you share the log of the nimbus?
Generally, Nimbus stays running until you stop it or it hits an error. If you want to be sure about its status, check the Nimbus log (./logs/nimbus.log).
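For example, using the log directory from the launch command above (adjust if your storm.log.dir differs):
tail -f /opt/apache-storm-2.2.0/logs/nimbus.log    # follow the Nimbus log while it starts up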
On running the ./storm nimbus command, the process starts in the foreground, as your output shows. This is the usual behavior.
If you want to run Storm in the background, run it with the nohup command:
nohup ./storm nimbus > storms.log &
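The same pattern works for the other daemons, for example (assuming you also want the supervisor and the web UI on this machine):
nohup ./storm supervisor > supervisor.log &
nohup ./storm ui > ui.log &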
I'm trying to set up a local Hadoop cluster using Docker and Ambari. The problem I'm having is that the Ambari install check shows NTP is not running, and it is needed to know whether the services installed with Ambari are working. I checked ntpd in the containers and tried to launch it, but it failed:
[root@97ea7075ca78 ~]# service ntpd start
Starting ntpd: [ OK ]
[root@97ea7075ca78 ~]# service ntpd status
ntpd dead but pid file exists
Is there a way to start the NTP daemon in those containers?
In Docker you don't use the service command, since there is no init system inside the container. Just run the ntpd command directly and it should work.
ntpd goes to the background by default. If that were not the case, you would need to run ntpd &.
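For example, inside the container (assuming the classic ntp package is installed; flags differ for other implementations such as chrony):
ntpd -g        # start ntpd; it forks to the background on its own (-g allows a large initial clock step)
ntpd -g -n     # alternatively, -n keeps it in the foreground, e.g. as the container's main process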
I may be searching with the wrong terms, but Google is not telling me how to do this. The question is: how can I restart Hadoop services on Dataproc after changing some configuration files (YARN properties, etc.)?
Services have to be restarted in a specific order throughout the cluster. There must be scripts or tools out there, hopefully in the Dataproc installation, that I can invoke to restart the cluster.
Configuring properties is a common and well supported use case.
You can do this via cluster properties, no daemon restart required. Example:
gcloud dataproc clusters create my-cluster --properties yarn:yarn.resourcemanager.client.thread-count=100
If you're doing something more advanced, like updating service log levels, then you can use systemctl to restart services.
First SSH to a cluster node and run systemctl to see the list of available services. For example, to restart the HDFS NameNode, run sudo systemctl restart hadoop-hdfs-namenode.service.
If this is done as part of an initialization action, then sudo is not needed.
On master nodes:
sudo systemctl restart hadoop-yarn-resourcemanager.service
sudo systemctl restart hadoop-hdfs-namenode.service
On worker nodes:
sudo systemctl restart hadoop-yarn-nodemanager.service
sudo systemctl restart hadoop-hdfs-datanode.service
After that, you can use systemctl status <name> to check the service status; also check the logs in /var/log/hadoop.
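For example, to verify a worker's daemons after restarting them:
sudo systemctl status hadoop-hdfs-datanode.service      # should report active (running)
sudo systemctl status hadoop-yarn-nodemanager.service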
I have set up a Hadoop cluster of 5 virtual machines, using plain vanilla Hadoop. The cluster details are below:
192.168.1.100 - Configured to Run NameNode and SNN daemons
192.168.1.101 - Configured to Run ResourceManager daemon.
192.168.1.102 - Configured to Run DataNode and NodeManager daemons.
192.168.1.103 - Configured to Run DataNode and NodeManager daemons.
192.168.1.104 - Configured to Run DataNode and NodeManager daemons.
I have kept the masters and slaves files on each virtual server.
masters:
192.168.1.100
192.168.1.101
slaves file:
192.168.1.102
192.168.1.103
192.168.1.104
Now, when I run the start-all.sh command from the NameNode machine, how is it able to start all the daemons? I am not able to understand it. There are no adapters installed (or none that I am aware of); there are just plain Hadoop jars present on all the machines, so how is the NameNode machine able to start all the daemons on all the machines (virtual servers)?
Can anyone help me understand this?
The NameNode machine connects to the slaves via SSH and starts the slave services there.
That is why the public SSH keys need to be in ~/.ssh/authorized_keys on the slaves, with their private counterparts present for the user running the Hadoop NameNode.
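A minimal sketch of that setup, assuming a hadoop user on every node (the user name is just an example):
# on the NameNode machine (192.168.1.100), as the user that runs start-all.sh
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa     # generate a key pair once
ssh-copy-id hadoop@192.168.1.102             # repeat for every machine listed in masters and slaves
ssh hadoop@192.168.1.102 hostname            # should now log in without a password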
I am using CDH 5.3 on a multinode cluster, and on top of this I have installed Hive, HBase, Pig and ZooKeeper. It has 5 nodes in total.
Recently the servers were shut down to upgrade the number of cores on each node. First all the datanodes' services were stopped, and later the name node's services.
The commands below were used to stop all the services:
DataNode:
sudo service hbase-regionserver stop
sudo service hadoop-yarn-nodemanager stop
sudo service hadoop-hdfs-datanode stop
Name Node:
sudo service mysql stop
sudo service hive-metastore stop
sudo service zookeeper-server stop
sudo service hbase-master stop
sudo service hadoop-yarn-resourcemanager stop
sudo service hadoop-mapreduce-historyserver stop
sudo service hadoop-hdfs-namenode stop
When the cluster was started again, the name node was started first and then all the datanodes.
But after it was started, the name node would not come out of safe mode, even when all the datanodes were up. Almost all the files in HDFS were corrupted, and the Hive metastore and the HBase namespace were corrupted as well. Due to this, all the data was removed from the cluster.
Could anyone please give me the steps to stop all the services and bring the cluster back up?
Thanks in advance.
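For reference, a rough sketch of one commonly used start order, using the same service names as above (ZooKeeper and HDFS before HBase, MySQL before the Hive metastore); this is only an illustration, not a verified recovery procedure:
# On the name node:
sudo service zookeeper-server start
sudo service hadoop-hdfs-namenode start
# On each data node:
sudo service hadoop-hdfs-datanode start
sudo service hadoop-yarn-nodemanager start
# Back on the name node, once HDFS has left safe mode:
sudo service hadoop-yarn-resourcemanager start
sudo service hadoop-mapreduce-historyserver start
sudo service hbase-master start
# On each data node:
sudo service hbase-regionserver start
# On the name node:
sudo service mysql start
sudo service hive-metastore start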