restart server using shell script in jenkins - shell

We are using jenkins to restart our servers. The order of execution is as follows:
Stop server1
Stop server2
Start server2
Start server1
I have created a Jenkins 'Freestyle project', where server1 and server2 stop as expected and server2 starts as expected.
But the console hangs after server2 has started, and the job never goes on to start server1.
There are no space-related issues in our Jenkins application.
Can anyone let me know how to resolve this issue?
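For illustration only (none of this is from the question): a common cause of this kind of hang is a start command that stays attached to the console, so the build step never returns and Jenkins never reaches the next command. A minimal sketch of an 'Execute shell' build step for the order above, assuming hypothetical stop.sh/start.sh scripts and SSH access from the Jenkins node:

#!/bin/bash
set -e

# stop both servers first (user, hostnames and script paths are placeholders)
ssh user@server1 '/opt/app/bin/stop.sh'
ssh user@server2 '/opt/app/bin/stop.sh'

# start server2 detached, so the step does not block on its console output
ssh user@server2 'nohup /opt/app/bin/start.sh > /tmp/start.log 2>&1 &'

# then start server1 the same way
ssh user@server1 'nohup /opt/app/bin/start.sh > /tmp/start.log 2>&1 &'

If the servers are instead started locally on the Jenkins agent, note that Jenkins' process tree killer may stop them when the build ends; setting BUILD_ID=dontKillMe in the step is the usual workaround.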

Related

How to keep session alive while running the playbook for longer period of time?

I am running an Ansible playbook to restart some of our servers, but we need to sleep for 40 minutes between each server restart. If I sleep for 40 minutes in my playbook, it sleeps for a while, but then my session on the Ubuntu box in prod gets terminated and the whole script stops with it. Is there anything I can add to the Ansible playbook so that it keeps my session alive for the whole time the playbook is running?
# This will restart servers
---
- hosts: tester
  serial: "{{ num_serial }}"
  tasks:
    - name: copy files
      copy: src=conf.prod dest=/opt/process/config/conf.prod owner=goldy group=goldy
    - name: stop server
      command: sudo systemctl stop server_one.service
    - name: start server
      command: sudo systemctl start server_one.service
    - name: sleep for 40 minutes
      pause: minutes=40
I want to sleep for 40 minutes without my Linux session being terminated, and then move on to the next set of server restarts.
I am running Ansible version 2.6.3.
You can run your ansible script inside screen in order to keep the session alive even after disconnection.
Basically what you want to do is ssh into the production server, run screen, then execute the playbook inside the newly created session.
If you ever get disconnected, you can connect back to the server, then run screen -r to get back into your saved session.
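As an illustration (the session and playbook names below are placeholders, not from the question), the workflow looks roughly like this:

# on the prod control machine, start a named screen session
ssh goldy@prod-control-host
screen -S restart-servers

# inside the screen session, run the playbook as usual
ansible-playbook restart.yml -e num_serial=1

# if the SSH connection drops, log back in and reattach
ssh goldy@prod-control-host
screen -r restart-servers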

zookeeper automatic start operation fails when firewall is stopped at rc.local script

I am using Apache Hadoop 2.7.1 on CentOS 7.
My cluster is an HA cluster, and I am using a ZooKeeper quorum for automatic failover.
I want to automate the ZooKeeper start process, and of course the shell script has to stop the firewall first so that the other quorum members can contact the current ZooKeeper node.
I am writing the following script in /etc/rc.d/rc.local:
hostname jn1
systemctl stop firewalld
ZOOKEEPER='/usr/local/zookeeper-3.4.9/'
source /etc/rc.d/init.d/functions
source $ZOOKEEPER/bin/zkEnv.sh
daemon --user root $ZOOKEEPER/bin/zkServer.sh start
But I am facing the problem that when I issue the command
systemctl stop firewalld
in rc.local and then run zkServer status after the host boots, I get the error:
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.4.9/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.
But if I execute the same commands without a script, i.e. manually after my host boots:
systemctl status firewalld
zkServer start
there is no problem and zkServer status shows its mode.
I have noticed a difference in the zookeeper.out log between executing the rc.local script and running the commands manually after boot: the server environment is only read when the commands are executed manually.
What could be the effect of stopping the firewall in the rc.local script on the server environment, and how can I handle it?
I had a big headache with the firewall stop/restart scenarios, and I discovered that stopping the firewall from rc.local is a fake stop.
Because I don't want the firewall to run at all, I ended up with the following solution:
systemctl disable firewalld
https://www.rootusers.com/how-to-disable-the-firewall-in-centos-7-linux/
With the service disabled, the firewall will not run again on any boot.
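For reference, a minimal sketch of that approach on CentOS 7 (standard systemctl commands, nothing specific to this cluster):

systemctl stop firewalld        # stop it for the current boot
systemctl disable firewalld     # keep it from starting on any future boot
systemctl is-enabled firewalld  # should now print "disabled"
systemctl status firewalld      # confirm it is inactive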

Zookeeper cannot be launched remotely via ssh

Strange issue:
ZooKeeper works normally on my cluster if I start it using ./zkServer.sh on each machine individually.
However, when I try to start it remotely from the master node:
ssh 192.168.xxx.xxx "/opt/apache/zookeeper-3.4.5/bin/zkServer.sh start"
it looks fine:
JMX enabled by default
Using config: /opt/apache/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
But actually, zookeeper is not running on that machine, which can be confirmed by jps.
The most strange thing is:
If I directly start zookeeper on that node using ./zkServer.sh start then I can successfully stop it remotely from the master node using
ssh 192.168.xxx.xxx "/opt/apache/zookeeper-3.4.5/bin/zkServer.sh stop"
Why could this happen? Any help would be appreciated.
After a lot of searching I was able to get it working; below is the command that successfully starts it:
ssh -i "somekey.pem" username@hostname 'bash -i -c "~/zookeeper-3.4.6/bin/zkServer.sh start"'
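One likely explanation is that a plain ssh host "command" runs a non-interactive shell, so environment variables such as JAVA_HOME that an interactive login would set are missing; bash -i forces an interactive shell that reads ~/.bashrc. An alternative sketch, assuming the remote login shell is bash, is to source the profile explicitly:

ssh 192.168.xxx.xxx 'source ~/.bash_profile && /opt/apache/zookeeper-3.4.5/bin/zkServer.sh start'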

Why does the clock offset error on the host keep occurring again and again : cloudera

I have stopped ntpd and restarted it, and have run ntpdate pool.ntp.org. The error went away once and the hosts were healthy, but after some time the clock offset error came back.
I also observed that after running ntpdate, the Cloudera web interface stopped working. It says there is a potential configuration mismatch and to fix it and restart Hue.
I have the Cloudera QuickStart VM with CentOS set up on VMware.
Check that the /etc/ntp.conf file is the same across all nodes/masters,
restart ntp,
and add the daemon with chkconfig and set it to on.
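A minimal sketch of those three steps (the second hostname is a placeholder; chkconfig applies to init-based hosts, on systemd hosts use systemctl enable ntpd instead):

diff /etc/ntp.conf <(ssh other-node cat /etc/ntp.conf)   # compare the config with another node
service ntpd restart                                     # restart NTP
chkconfig ntpd on                                        # make sure the daemon starts at boot
chkconfig --list ntpd                                    # verify the runlevels are set to on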
You can fix it by restarting the NTP service, which synchronizes the time with a central source.
You can do this by logging in as root from the command line and running service ntpd restart.
After about a minute the error in CM should go away.
Host Terminal
sudo su
service ntpd restart
A clock offset error occurs in Cloudera Manager if a host/node's NTP service could not be located or did not respond to a request for the clock offset.
Solution:
1) Identify the NTP server IP, or get the NTP server details for your Hadoop cluster.
2) On your Hadoop cluster nodes, edit /etc/ntp.conf.
3) Add entries to ntp.conf:
server [NTP Server IP]
server xxx.xx.xx.x
4) Restart the service. Execute:
service ntpd restart
5) Restart the cluster from Cloudera Manager.
Note: if the problem still persists, reboot your Hadoop nodes and check the process.
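For illustration, the added ntp.conf lines would look something like this (both addresses are placeholders, not servers from this cluster):

# /etc/ntp.conf (excerpt)
server 10.0.0.10 iburst                  # internal NTP server for the cluster
server 0.centos.pool.ntp.org iburst      # optional public fallback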
Check with $ cat /etc/ntp.conf and make sure the configuration file is the same as on the other nodes, then run:
$ systemctl restart ntpd
$ ntpdc -np
$ ntpdate -u 0.centos.pool.ntp.org
$ hwclock --systohc
$ systemctl restart cloudera-scm-agent
After that wait a few seconds to let it auto configure.

Ansible: wait_for in with_items

I have a set of web server processes that I wish to restart one at a time. I want to wait for process N to be ready to service HTTP requests before restarting process N+1
The following works:
- name: restart server 9990
  supervisorctl: name='server_9990' state=restarted
- wait_for: port=9990 delay=1

- name: restart server 9991
  supervisorctl: name='server_9991' state=restarted
- wait_for: port=9991 delay=1
etc.
But I'd really like to do this in a loop. It seems that Ansible doesn't allow multiple tasks inside a loop (in this case, I need two tasks: supervisorctl and wait_for).
Am I missing a way to do this, or is replicating these tasks for each instance of the server really the way to go?
I believe that's not possible with Ansible's default functionality. I think your best bet would be to create your own module. I have seen that modules can call other modules, so you might be able to write a slim module which simply calls the supervisorctl module and then waits for the port to be ready. That module could then be called with with_items.
Another idea is to not use the supervisorctl module in the first place. You could run a shell or script task which manually calls supervisorctl and then waits for the port to open, as in the sketch below.
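A minimal sketch of that second idea, assuming nc is available on the target hosts and the process names and ports follow the pattern in the question:

- name: restart each server and wait for its port to accept connections
  shell: |
    supervisorctl restart server_{{ item }}
    # poll for up to 60 seconds until the port is listening
    for i in $(seq 1 60); do
      nc -z localhost {{ item }} && exit 0
      sleep 1
    done
    echo "server_{{ item }} did not open port {{ item }}" >&2
    exit 1
  with_items:
    - 9990
    - 9991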
