How to run a specific program before systemd's watchdog stops a service - systemd

I have a program which is run by systemd with a service file like this:
[Unit]
Description=...
[Service]
Type=notify
ExecStart=/usr/sbin/myprogram
WatchdogSec=1
KillMode=process
KillSignal=SIGTERM
Restart=always
It sends the respective signal to the watchdog regularly. From time to time, the program seems to hang and is terminated by the watchdog, then restarts. Before the watchdog terminates it, I'd like to capture some information from the program by executing a command or running some other script (e.g. run gdb -p <PID> --batch -ex 'thread apply all backtrace'). How would I do this?

Add an ExecStop= line to your service.
[Service]
ExecStart=....
ExecStop=/path/to/SomeOtherProgram
....
According to the systemd manual, if ExecStop= is set, systemd runs it first; any processes of the service that remain afterwards are then terminated according to KillMode=.
ExecStop=
Commands to execute to stop the service started via
ExecStart=. This argument takes multiple command lines, following the
same scheme as described for ExecStart= above. Use of this setting is
optional. After the commands configured in this option are run, it is
implied that the service is stopped, and any processes remaining for
it are terminated according to the KillMode= setting (see
systemd.kill(5)). If this option is not specified, the process is
terminated by sending the signal specified in KillSignal= when service
stop is requested. Specifier and environment variable substitution is
supported (including $MAINPID, see above).
EDIT
As noted in the comments, this solution may not work when the service is stopped by the watchdog (WatchdogSec=) rather than by a regular stop request.
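One way to package the gdb command from the question is a small shell helper that a stop hook could invoke with the main PID. This is a minimal sketch, not a tested recipe: the function name dump_backtrace, the GDB override variable and the output path are assumptions made for illustration, and as noted above the hook may not fire on a watchdog timeout. Attaching also requires permission to ptrace the target (typically root or the same user).

```shell
# dump_backtrace PID OUTFILE
# Attach gdb to PID in batch mode and append a backtrace of every thread to OUTFILE.
# GDB is a hypothetical override (defaults to "gdb") so the debugger binary can be swapped.
dump_backtrace() {
    pid=$1
    outfile=$2
    {
        printf '=== backtrace of PID %s at %s ===\n' "$pid" "$(date)"
        "${GDB:-gdb}" -p "$pid" --batch -ex 'thread apply all backtrace'
    } >>"$outfile" 2>&1
}
```

For example, dump_backtrace 1234 /tmp/myprogram-backtrace.log would append the backtraces of process 1234 to that file.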

Related

Start systemctl from a Bash script and don't wait for it

I need to call systemctl start myservice near the end of a Bash script, but I really don't care about whether it will be successful or when it intends to return. I just need to start the action. It's others' task to monitor the status of that service. My script should return as quickly as possible, no matter whether that service has completed starting, I'm not depending on that.
My first thought was to use something like this:
# do other work
systemctl start myservice &
echo "done"
# end of script
But I've read that this is problematic with signals or in non-interactive environments, where my script is usually called. So I read on and found the nohup command, but that seems to write output files anywhere and might hang if you don't redirect stdin from /dev/null, they say.
So I still don't know how to do this correctly. I'm open to a generic way to start-and-forget any process from a Bash script, or to one for systemctl specifically, as this will be my only use case for now.
I found a pretty easy solution to this:
systemctl start --no-block myservice
The --no-block option can be used for starting, stopping etc. and it won't wait for the actual process to finish. More details in the manpage of systemctl.
If you simply want to start systemctl and you don't want to wait for it, use exec to replace the current process with the systemctl call. For example, instead of backgrounding the process, simply use:
exec systemctl ....
You may want to include the --no-pager option to ensure that the process isn't piped to a pager which would block waiting for user input, e.g.
exec systemctl --no-pager ....
Of course your echo "done" will never be reached, but that wasn't pertinent to your script.
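For the generic start-and-forget case the question also asks about, one possible sketch is a wrapper around nohup that detaches all three standard streams. nohup only creates nohup.out when stdout is a terminal, so the redirections avoid both the stray output file and the stdin hang mentioned in the question. The function name start_detached is an assumption for illustration.

```shell
# start_detached CMD [ARGS...]
# Start CMD in the background with SIGHUP ignored and stdin, stdout and stderr
# detached, so it keeps no terminal or pipe of the calling script open.
start_detached() {
    nohup "$@" </dev/null >/dev/null 2>&1 &
}
```

A call such as start_detached systemctl start myservice returns immediately, and the script can go on to its final echo "done".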

How to run multiple commands in systemd

I want to run multiple commands from the myapp.service file in systemd:
[Unit]
Description=to serve myapp
[Service]
User=ubuntu
WorkingDirectory=/home/ubuntu/myapp
ExecStart=/home/ubuntu/.local/bin/pserve production.ini http_port=5000
ExecStart=/home/ubuntu/.local/bin/pserve production.ini http_port=5001
Restart=always
[Install]
WantedBy=multi-user.target
It throws an error saying "invalid argument".
I want to run two commands:
pserve production.ini http_port=5000
pserve production.ini http_port=5001
How do I do that?
You can start multiple background processes from one systemd unit, but systemd will not be able to track them for you and do all the nice things that it does to support a daemon, such as send signals to it on various system events or auto-restart it when needed.
Your options, in my order of preference (only the last two keep everything in a single unit):
1. Make the two servers separate units (note you may be able to use the same config file for both, so they are two 'instances' of the same service, which makes sense because they run the same server). You will have two entries in the list of running services when you run 'systemctl'.
2. Make that unit a one-shot (it runs a program that exits and is not monitored or restarted). Make the one-shot command start both servers in the background, e.g.,
sh -c " { pserve production.ini http_port=5000 & pserve production.ini http_port=5001 & } </dev/null >/dev/null 2>&1"
(The unit likely also needs RemainAfterExit=yes, because systemd may otherwise clean up the background processes once the command exits.)
3. Make a script that launches both daemons and watches them, restarting them if needed and killing them when it is killed itself. Then you make that script the 'daemon' that systemd runs. Not really worth it, IMO, because you're doing much of the work that systemd itself is best suited to do. Of course you can spin up a new copy of systemd that is configured to run just those two servers (and make that systemd your 'one-service-for-two-commands' unit), but that seems overkill.
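The first option (two instances of one service) can be sketched with a systemd template unit, where %i is replaced by the instance name given after the "@". The snippet below only prints such a unit. The file name myapp@.service, the function name print_template_unit and the choice to use the port as the instance name are assumptions for illustration; the paths are taken from the question.

```shell
# print_template_unit: write a hypothetical template unit (myapp@.service) to stdout.
# systemd expands %i to the instance name, used here as the HTTP port.
print_template_unit() {
    cat <<'EOF'
[Unit]
Description=to serve myapp on port %i

[Service]
User=ubuntu
WorkingDirectory=/home/ubuntu/myapp
ExecStart=/home/ubuntu/.local/bin/pserve production.ini http_port=%i
Restart=always

[Install]
WantedBy=multi-user.target
EOF
}
```

Saved as /etc/systemd/system/myapp@.service and followed by systemctl daemon-reload, the two servers could then be started with systemctl enable --now myapp@5000 myapp@5001, and each instance is tracked and restarted separately.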

Daemonizing an executable in ansible

I am trying to create a task in Ansible which executes a shell command to run an executable in daemon mode using &. Something like the following:
- name: Start daemon
  shell: myexeprogram arg1 arg2 &
What I am seeing is that if I keep the &, the task returns immediately and the process is not started. If I remove the &, the Ansible task waits for quite some time without returning.
I would appreciate suggestions on the proper way to start a program in daemon mode through Ansible. Please note that I don't want to run this as a service but as an ad hoc background process based on certain conditions.
Running a program with '&' does not make it a daemon; it just runs in the background. To make a "true daemon" your program should do the steps described here.
If your program is written in C, you can call daemon() function, which will do it for you. Then you can start your program even without '&' at the end and it will be running as a daemon.
The other option is to call your program using daemon, which should do the job as well.
- name: Start daemon
  shell: daemon -- myexeprogram arg1 arg2
When you (or Ansible) log out, the hangup signal (SIGHUP) will still be sent to the running process, even though it is running in the background.
You can use nohup to circumvent that.
- name: Start daemon
  shell: nohup myexeprogram arg1 arg2 &
http://en.wikipedia.org/wiki/Nohup
From the brief description on what you want to achieve, it sounds like it would be best for you to set up your executable as a service (using Upstart or similar) and then start/stop it as needed based on the other conditions that require it to be running (or not running).
Trying to run this as a process otherwise will entail having to capture the PID or similar so you can try and shut down the daemon you have started when you need to, with pretty much the same amount of complexity as installing an init config file would take and without the niceties that systems such as Upstart give you with the controls such as start/stop.
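To make the PID bookkeeping mentioned above concrete, here is a minimal sketch of what starting and stopping an ad hoc background process by hand involves. The function names and the PID-file convention are assumptions for illustration; nohup replaces itself with the command, so $! is the PID of the program itself.

```shell
# start_with_pidfile PIDFILE CMD [ARGS...]
# Start CMD detached from the caller and record its PID in PIDFILE.
start_with_pidfile() {
    pidfile=$1
    shift
    nohup "$@" </dev/null >/dev/null 2>&1 &
    echo $! >"$pidfile"
}

# stop_with_pidfile PIDFILE
# Send SIGTERM to the recorded PID and remove the PID file.
stop_with_pidfile() {
    pidfile=$1
    pid=$(cat "$pidfile") || return 1
    kill "$pid" 2>/dev/null
    rm -f "$pidfile"
}
```

Even this small sketch has no restart logic and no protection against a stale PID file, which is the kind of work a service manager already does.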
I found the best way, particularly because I wanted output to be logged, was to use the "daemonize" package. On CentOS/Red Hat it looks like the tasks below. There is probably also an apt package for it.
- name: yum install daemonize
  yum:
    name: daemonize
    state: latest
- name: run in background, log stderr and stdout to file
  shell: daemonize -e /var/log/myprocess.log -o /var/log/myprocess.log /opt/myscripts/myprocess.sh
Adding to the daemonize suggestions above, if you want to start your program in a specific directory you can do:
- name: install daemonize package
  package:
    name: daemonize
    state: latest
- name: start program
  command: daemonize -c /folder/to/run/in /path/to/myexeprogram arg1 arg2
You probably also want the -e and -o flags to log output.

Kafka in supervisor mode

I'm trying to run Kafka under a supervisor so that it restarts automatically after a shutdown. But all the examples of running Kafka use shell scripts, and supervisord is not able to tell which PID to monitor. Can anyone suggest how to accomplish an auto-restart of Kafka?
If you are on a Unix or Linux machine, then this is when /etc/inittab comes in handy. Or you might want to use daemontools. I don't know about Windows though.
We are running Kafka under Supervisord (http://supervisord.org/), and it works like a charm. The run command looks like this (as specified in the supervisord.conf file):
command=/usr/local/bin/pidproxy /var/run/kafka.pid /usr/lib/kafka/bin/kafka-server.sh -f -p /var/run/kafka.pid
The -f flag tells Kafka to start in the foreground. If the -p flag is set, the PID of the Kafka process is written to the specified file.
The pidproxy command is part of the Supervisord distribution. When it receives a termination signal such as SIGTERM (SIGKILL itself cannot be caught), it reads the PID from the specified file and forwards the signal to the corresponding process.

Script which launches another application will bring it down on exit

I have a script which launches another application using nohup my_app &, but when the initial script dies the launched process also goes down. As I understand it, that should not happen since it was run with nohup. The original script was also called with nohup.
What went wrong there?
A very reliable script that has been used successfully for years, and has always terminated after invoking a nohup uses this construct:
nohup ${BinDir}/${Watcher} >${DataDir}/${Watcher}.nohup.out 2>&1 &
Perhaps the problem is that output is not being managed?
nohup does not guarantee that a (child) process keeps running when its (parent) process is killed. nohup is used, for example, when you connect to a server over ssh and start a process there. If you log out, the process will normally terminate (logging out sends SIGHUP to the process, which causes it to terminate); nohup avoids this behaviour, so your process is still running after you have logged out.
If you need a program which keeps running in the background even if its parent process has terminated, try making it a daemon.
It depends on what my_app does: it might set up its own signal handling. You probably know that nohup makes SIGHUP ignored, and that this is inherited by the target program. If that target program does its own signal handling, it might be setting SIGHUP to, for example, SIG_DFL, the default action (which is to die).
To check, run strace -f -o out or truss -f -o out on the command. This will give you all the kernel calls in the file called 'out'. You should be able to spot the SIGHUP handling being changed if it is.
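The inheritance described above can be observed with a small experiment. This is only an illustrative sketch: the function names are made up, and the subshell's trap '' HUP stands in for what nohup does before it runs the target program.

```shell
# The child shell sends SIGHUP to itself and then tries to print a message.
# With SIGHUP ignored by the parent, the ignored state is inherited and the
# child reaches the echo.
with_hup_ignored() {
    (trap '' HUP; sh -c 'kill -HUP $$; echo survived')
}

# Same child without the ignore: the default action kills it before the echo,
# unless the calling environment already ignores SIGHUP.
with_hup_default() {
    sh -c 'kill -HUP $$; echo survived'
}
```

Calling with_hup_ignored prints survived. A program that resets SIGHUP to its default action behaves like the second case even when it was started through nohup.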

Resources