Monit fails to start process - bash

I've written a script that works fine to start and stop a server.
#!/bin/bash
PID_FILE='/var/run/rserve.pid'
start() {
touch $PID_FILE
eval "/usr/bin/R CMD Rserve"
PID=$(ps aux | grep Rserve | grep -v grep | awk '{print $2}')
echo "Starting Rserve with PID $PID"
echo $PID > $PID_FILE
}
stop () {
pkill Rserve
rm $PID_FILE
echo "Stopping Rserve"
}
case $1 in
start)
start
;;
stop)
stop
;;
*)
echo "usage: rserve {start|stop}" ;;
esac
exit 0
If I start it by running
rserve start
and then start monit it will correctly capture the PID and the server:
The Monit daemon 5.3.2 uptime: 0m
Remote Host 'localhost'
status Online with all services
monitoring status Monitored
port response time 0.000s to localhost:6311 [DEFAULT via TCP]
data collected Mon, 13 May 2013 20:03:50
System 'system_gauss'
status Running
monitoring status Monitored
load average [0.37] [0.29] [0.25]
cpu 0.0%us 0.2%sy 0.0%wa
memory usage 524044 kB [25.6%]
swap usage 4848 kB [0.1%]
data collected Mon, 13 May 2013 20:03:50
If I stop it, it will properly kill the process and unmonitor it. However if I start it again, it won't start the server again:
ps ax | grep Rserve | grep -vc grep
1
monit stop localhost
ps ax | grep Rserve | grep -vc grep
0
monit start localhost
[UTC May 13 20:07:24] info : 'localhost' start on user request
[UTC May 13 20:07:24] info : monit daemon at 4370 awakened
[UTC May 13 20:07:24] info : Awakened by User defined signal 1
[UTC May 13 20:07:24] info : 'localhost' start: /usr/bin/rserve
[UTC May 13 20:07:24] info : 'localhost' start action done
[UTC May 13 20:07:34] error : 'localhost' failed, cannot open a connection to INET[localhost:6311] via TCP
Here is the monitrc:
check host localhost with address 127.0.0.1
start = "/usr/bin/rserve start"
stop = "/usr/bin/rserve stop"
if failed host localhost port 6311 type tcp with timeout 15 seconds for 5 cycles
then restart

I had problems starting and stopping a process via a shell script too.
One solution might be to add "/bin/bash" in the config like this:
start program = "/bin/bash /usr/bin/rserve start"
stop program = "/bin/bash /usr/bin/rserve stop"
It worked for me.

monit is a silent killer. It does not tell you anything. Here are things I would check, which monit won't help you identify:
Check the permissions of all the files you are reading or writing. If you are redirecting output to a file, make sure that file is writable by the uid and gid you are using to execute the program
Check the exec permission on the program you are trying to run
Specify full path to any program you are trying to execute ( not strictly necessary, but you don't have to worry about path not being set if you always specify full path )
Make sure you can run the program outside of monit without any error before trying to investigate why monit is not starting.
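The first few checks in that list can be scripted. The following is a minimal sketch, assuming bash; check_program is a hypothetical helper name and not something Monit provides. It only verifies that the path is absolute, exists, and is executable by the current user, so run it as the same user Monit would use.

```shell
#!/bin/bash
# Hypothetical helper (not part of Monit): basic sanity checks on a
# program path before handing it to a start/stop line.
check_program() {
    local prog=$1
    case $prog in
        /*) ;;                                        # absolute path: fine
        *)  echo "not an absolute path: $prog"; return 1 ;;
    esac
    [ -e "$prog" ] || { echo "missing: $prog"; return 1; }
    [ -x "$prog" ] || { echo "not executable: $prog"; return 1; }
    echo "ok: $prog"
}
```

For example, check_program /usr/bin/rserve would report whether the script from the question passes these basic checks.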

If the Monit log is displaying
failed to start (exit status -1) -- no output
Then it may be that you're trying to run a script without any of the Bash infrastructure. You can run such a command by wrapping it in /bin/bash -c, like so:
check process my-process
matching "my-process-name"
start program = "/bin/bash -c '/etc/init.d/my-init-script'"

When monit starts, it checks for its own pidfile and checks whether the process with the matching PID is already running. If it is, monit just wakes up that process.
In your case, check whether this PID is being used by some other process:
ps -ef |grep 4370
If it is, you need to remove the file below (usually under the /run directory) and start monit again:
monit.pid
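The same check can be sketched as a small script. This is a Linux-only illustration that assumes /proc is mounted; pid_belongs_to and pidfile_is_stale are hypothetical helper names, and the pidfile location in the usage note is an assumption that depends on your monitrc.

```shell
#!/bin/bash
# Linux-only sketch: /proc/<pid>/comm holds the name of the process.
# Both helpers are hypothetical; they are not part of Monit.
pid_belongs_to() {
    local pid=$1 expected=$2 actual
    [ -r "/proc/$pid/comm" ] || return 1      # no process with that PID
    read -r actual < "/proc/$pid/comm"
    [ "$actual" = "$expected" ]
}

# Succeeds when the pidfile exists but its PID is not a process
# with the expected name (dead, or reused by something else).
pidfile_is_stale() {
    local pidfile=$1 name=$2 pid
    [ -f "$pidfile" ] || return 1             # nothing to clean up
    read -r pid < "$pidfile"
    if pid_belongs_to "$pid" "$name"; then
        return 1
    fi
    return 0
}
```

A possible use, assuming the pidfile lives at /run/monit.pid, would be: pidfile_is_stale /run/monit.pid monit && rm /run/monit.pid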

For me, the issue was that the stop command was not being run, even though I had explicitly specified "then restart" in the configuration.
The solution was simply to change the start program to:
start program = "/etc/init.d/.... restart"

Automatic shutdown of Windows 10 computer from bash script

I have a problem with my power failure monitoring scheme / automatic shutdown of Windows 10 computer.
On my network I have a Linux box that has a UPS unit connected via USB and maintained with software which makes the computer shut down when power failure occurs.
On another computer, a headless Windows 10 server that is also powered from the same UPS unit, I have set up a simple bash script (cygwin) running from Task Scheduler.
It is supposed to shut down this computer once the aforementioned Linux box stops responding to ping on the network (because it shuts down when power failure is detected).
The UPS unit is set up to cut the power 10 minutes after a power failure, giving plenty of time for the Windows machine to detect the lack of response from the Linux box and execute a shutdown.
Here is the monitoring script code:
#!/bin/bash
upshost="<linux-box-ip>"
echo "UPS Host: $upshost"
while true
do
date
ping -c3 $upshost >/dev/null
if [[ $? -ne 0 ]]
then
echo "WARNING: Ping failed, trying again in 30 seconds ..."
sleep 30
ping -c3 $upshost >/dev/null
[[ $? -eq 0 ]] || break
fi
echo "Host $upshost responded to ping."
sleep 30
done
date
echo "ERROR: Host $upshost didn't respond to ping."
echo "System will shutdown in 1 minute."
shutdown -h +1 "System will shutdown in 1 minute."
NOTE: <linux-box-ip> stands for the actual IP address of my Linux box, which is always the same (reserved in the router setup).
The task scheduler properties:
Runs from SYSTEM account, whether user is logged on or not, with highest privileges.
Triggers: at system startup
Actions: Start a program
program/script -> c:\cygwin64\bin\bash
arguments -> -l -c "/cygdrive/c/bin/pwrwdog.sh > /cygdrive/c/tmp/pwrwdog.trc"
runs in -> c:\cygwin64\bin
Start if any network connection is available.
I see in traces and in task scheduler history that script properly monitors the network response from Linux box, breaks out of the monitoring loop when host stops responding and invokes shutdown command. The problem is - the shutdown doesn't actually happen:
Host <linux-box-ip> responded to ping.
Sun, Jan 3, 2021 7:23:48 AM
Host <linux-box-ip> responded to ping.
Sun, Jan 3, 2021 7:24:20 AM
WARNING: Ping failed, trying again in 30 seconds ...
Sun, Jan 3, 2021 7:25:16 AM
ERROR: Host <linux-box-ip> didn't respond to ping.
System will shutdown in 1 minute.
System will shutdown in 1 minute.
What am I missing?

Keep Track of laravel websocket with monit centos

I'm trying to monitor laravel-websocket with monit instead of supervisord because of the additional options it provides.
So in my /home/rabter/laravelwebsocket.sh:
#!/bin/bash
case $1 in
start)
echo $$ > /var/run/laravelwebsocket.pid;
exec 2>&1 php /home/rabter/core/artisan websockets:serve 1>/tmp/laravelwebsocket.out
;;
stop)
kill `cat /var/run/laravelwebsocket.pid` ;;
*)
echo "usage: laravelwebsocket.sh {start|stop}" ;;
esac
exit 0
And in /etc/monit.d I made a file named cwp.laravelwebsocket with this code:
check process laravelwebsocket with pidfile /var/run/laravelwebsocket.pid
start program "/bin/bash -c /home/rabter/laravelwebsocket.sh start"
stop program "/bin/bash -c /home/rabter/laravelwebsocket.sh stop"
if failed port 6001 then restart
if 4 restarts within 8 cycles then timeout
Unfortunately, when I run monit everything gets monitored except laravel websocket, which is never started, and in the monit status table I see:
Process - laravelwebsocket Execution failed | Does not exist
How can I make monit monitor and start laravel-websocket on startup and on failures, errors, or crashes?
I have looked into "Monitor a Laravel Queue Worker with Monit", but no luck!
Your bash script inserts its own PID into your pid file. Additionally, the php process should be sent to the background when using monit, because monit is a monitoring tool rather than a supervisor.
#!/usr/bin/env bash
case $1 in
start)
php /home/rabter/core/artisan websockets:serve >/tmp/laravelwebsocket.out 2>&1 &
echo $! > /var/run/laravelwebsocket.pid;
;;
stop)
kill $(cat /var/run/laravelwebsocket.pid) ;;
*)
echo "usage: $(basename $0) {start|stop}" ;;
esac
exit 0
Then make that file executable with chmod +x FILEPATH.
This should now work:
check process laravelwebsocket with pidfile /var/run/laravelwebsocket.pid
start program "/home/rabter/laravelwebsocket.sh start"
stop program "/home/rabter/laravelwebsocket.sh stop"
if failed port 6001 then restart
if 4 restarts within 8 cycles then timeout
Do you use monit as init-system for a container? If so, please let me know. Then a few more details apply.
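The difference between $$ and $! is what makes the pid file point at the right process. Here is a minimal sketch of it, assuming bash; sleep stands in for the long-running php process and the pid file is a temporary file, so nothing here touches the real service.

```shell
#!/bin/bash
# Sketch only: 'sleep 30' is a stand-in for the real server command,
# and the pid file is a throwaway temp file.
pidfile=$(mktemp)

sleep 30 >/dev/null 2>&1 &        # redirections must come before the '&'
echo $! > "$pidfile"              # $! is the PID of the background job

child_pid=$(cat "$pidfile")
script_pid=$$                     # $$ is the PID of this script itself

# Record whether the PID in the pid file refers to a live process.
alive=no
kill -0 "$child_pid" 2>/dev/null && alive=yes
echo "script PID: $script_pid, server PID: $child_pid, alive: $alive"

# Clean up the stand-in process and the temp file.
kill "$child_pid"
rm -f "$pidfile"
```

Writing $$ into the pid file, as the original script did, records the wrapper script rather than the server it launched.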

How to set the program as daemon after $DISPLAY is set?

I want to lock my screen with the screensaver every 50 minutes (3000 seconds).
cat /home/rest.sh
while true;do
sleep 3000
xscreensaver-command --lock 1>/dev/null
done
Running sh /home/rest.sh & works.
Now I want to set it up as a daemon.
sudo vim /etc/systemd/system/screensave.service
[Unit]
Description=screensave
[Service]
User=root
ExecStart=/bin/bash /home/rest.sh
StandardError=journal
[Install]
WantedBy=multi-user.target
To enable it as a daemon:
systemctl enable screensave.service
I find that the service does not work as a daemon:
sudo journalctl -u screensave
Jan 24 12:16:50 user systemd[1]: Started screensave.
Jan 24 12:17:22 user bash[621]: xscreensaver-command: warning: $DISPLAY is not set: defaulting to ":0.0".
Jan 24 12:17:22 user bash[621]: No protocol specified
Jan 24 12:17:22 user bash[621]: xscreensaver-command: can't open display :0.0
How can I run it as a daemon after $DISPLAY is set?
This is a very common FAQ. A system daemon cannot easily connect to the X session of any individual user. On a multi-user system, how do you tell which user's session to connect to, anyway? On a single-user system, what should the daemon do if no session is running (as it often isn't at the time the daemon starts up)?
Trying to run a system daemon as any particular user won't work, and giving individual users access to a system daemon is a recipe for security problems. It can be done, but the solution is complex, and probably not something you want to attempt on your own. (Briefly, have the daemon listen to commands on a socket; create a user-space program which knows how to talk to the socket, and build some sort of authorization and authentication so the daemon knows whom it's talking to and can verify that this user is allowed to connect to this display.)
The drop-dead simple solution is to run this from your desktop environment's startup scripts instead. Most desktops have something like "session start-up items" or "autorun on login" hooks.
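On desktops that follow the XDG autostart convention, one way to do that is to drop a .desktop file into the user's autostart directory. This is a minimal sketch under that assumption; write_autostart_entry, the file name rest.desktop, and the Name value are all made up for illustration, and your desktop environment may offer a settings dialog that does the same thing.

```shell
#!/bin/bash
# Sketch: create an XDG autostart entry that launches the loop script
# when the user's desktop session starts. Names are hypothetical.
write_autostart_entry() {
    # Default to the per-user autostart directory unless one is given.
    local dir=${1:-${XDG_CONFIG_HOME:-$HOME/.config}/autostart}
    mkdir -p "$dir"
    cat > "$dir/rest.desktop" <<'EOF'
[Desktop Entry]
Type=Application
Name=Periodic screen lock
Exec=/bin/sh /home/rest.sh
EOF
}
```

Calling write_autostart_entry with no argument writes to the current user's autostart directory. Because the script then runs inside the session, it inherits $DISPLAY from it.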
I'm not running Linux and can't check right now, but the steps to daemonize a process are: close stdin, stdout and stderr; change the current working directory to /; fork twice; and call setsid so that the process becomes a new session leader.
Add something like the following at the beginning of the script. Before relying on it, first check that the exec command creates a new session leader process, using ps -Cbash -o sid,pgid,pid,ppid,comm,args
# checking if current process is a session leader to avoid infinite call
if [[ $(ps -p $$ -osid=) != $$ ]]; then
( cd / ; exec setsid /bin/bash /home/rest.sh & ) </dev/null 1>&0 2>&0 &
exit
fi

Batch or script file to close program with a command then reopen it

I am running a minecraft server off my machine and have found I need to shut down and restart the server once a day. I am trying to write a script file that will give the kill command to the server and after like 30 sec restart it.
Script could be something like
minecraft-start & # change the cmd, keep the '&'
PID=$!;
echo $PID > /tmp/minecraft.pid # store PID, removed if physical server reboot
sleep 24h; # wait 24 hours
kill $(cat /tmp/minecraft.pid) # kill, what about active users and all?
As others have mentioned, this could be in a cron task.
Explanation
$! contains the process ID (PID) of the most recent background command
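The script above only stops the server; the question also asked for a restart after roughly 30 seconds. A minimal sketch of that stop, wait, start cycle follows, assuming bash. SERVER_CMD, PID_FILE, and RESTART_DELAY are placeholder names, and "sleep 300" is only a stand-in for the real server command. A plain kill is abrupt, so the concern about active users still applies.

```shell
#!/bin/bash
# Sketch only: replace SERVER_CMD with the real start command.
SERVER_CMD=${SERVER_CMD:-"sleep 300"}
PID_FILE=${PID_FILE:-/tmp/minecraft.pid}
RESTART_DELAY=${RESTART_DELAY:-30}      # seconds between stop and start

start_server() {
    $SERVER_CMD >/dev/null 2>&1 &       # run in the background
    echo $! > "$PID_FILE"               # remember its PID
}

stop_server() {
    [ -f "$PID_FILE" ] || return 0      # nothing recorded, nothing to stop
    kill "$(cat "$PID_FILE")" 2>/dev/null || true
    rm -f "$PID_FILE"
}

restart_server() {
    stop_server
    sleep "$RESTART_DELAY"
    start_server
}
```

A daily cron entry could then source this file and call restart_server, instead of keeping a script asleep for 24 hours.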

Debugging monit

I find debugging monit to be a major pain. Monit's shell environment basically has nothing in it (no paths or other environment variables). Also, there is no log file that I can find.
The problem is, if the start or stop command in the monit script fails, it is difficult to discern what is wrong with it. Often times it is not as simple as just running the command on the shell because the shell environment is different from the monit shell environment.
What are some techniques that people use to debug monit configurations?
For example, I would be happy to have a monit shell, to test my scripts in, or a log file to see what went wrong.
I've had the same problem. Using monit's verbose command-line option helps a bit, but I found the best way was to create an environment as similar as possible to the monit environment and run the start/stop program from there.
# monit runs as superuser
$ sudo su
# the -i option ignores the inherited environment
# this PATH is what monit supplies by default
$ env -i PATH=/bin:/usr/bin:/sbin:/usr/sbin /bin/sh
# try running start/stop program here
$
I've found the most common problems are environment variable related (especially PATH) or permission-related. You should remember that monit usually runs as root.
Also if you use as uid myusername in your monit config, then you should change to user myusername before carrying out the test.
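For a quick, non-interactive version of the same test, the stripped-down environment can be wrapped in a function. This is a minimal sketch; run_like_monit is a hypothetical helper name, and the PATH value is simply the one quoted above, which may differ between Monit versions.

```shell
#!/bin/bash
# Hypothetical helper: run one command line with an empty environment
# except for the PATH quoted above, using /bin/sh as the shell.
run_like_monit() {
    env -i PATH=/bin:/usr/bin:/sbin:/usr/sbin /bin/sh -c "$1"
}
```

For example, run_like_monit '/usr/bin/rserve start' executes the start command without the variables from your login shell, which makes missing PATH entries or other environment variables show up immediately. Run it as root, or as the configured uid, to match the permissions too.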
Be sure to always double check your conf and monitor your processes by hand before letting monit handle everything. systat(1), top(1) and ps(1) are your friends to figure out resource usage and limits. Knowing the process you monitor is essential too.
Regarding the start and stop scripts, I use a wrapper script to redirect output and inspect the environment and other variables. Something like this:
$ cat monit-wrapper.sh
#!/bin/sh
{
echo "MONIT-WRAPPER date"
date
echo "MONIT-WRAPPER env"
env
echo "MONIT-WRAPPER $*"
"$@"
R=$?
echo "MONIT-WRAPPER exit code $R"
} >/tmp/monit.log 2>&1
Then in monit :
start program = "/home/billitch/bin/monit-wrapper.sh my-real-start-script and args"
stop program = "/home/billitch/bin/monit-wrapper.sh my-real-stop-script and args"
You still have to figure out what information you want in the wrapper, such as process info, IDs, system resource limits, etc.
You can start Monit in verbose/debug mode by adding MONIT_OPTS="-v" to /etc/default/monit (don't forget to restart; /etc/init.d/monit restart).
You can then capture the output using tail -f /var/log/monit.log
[CEST Jun 4 21:10:42] info : Starting Monit 5.17.1 daemon with http interface at [*]:2812
[CEST Jun 4 21:10:42] info : Starting Monit HTTP server at [*]:2812
[CEST Jun 4 21:10:42] info : Monit HTTP server started
[CEST Jun 4 21:10:42] info : 'ocean' Monit 5.17.1 started
[CEST Jun 4 21:10:42] debug : Sending Monit instance changed notification to monit@example.io
[CEST Jun 4 21:10:42] debug : Trying to send mail via smtp.sendgrid.net:587
[CEST Jun 4 21:10:43] debug : Processing postponed events queue
[CEST Jun 4 21:10:43] debug : 'rootfs' succeeded getting filesystem statistics for '/'
[CEST Jun 4 21:10:43] debug : 'rootfs' filesytem flags has not changed
[CEST Jun 4 21:10:43] debug : 'rootfs' inode usage test succeeded [current inode usage=8.5%]
[CEST Jun 4 21:10:43] debug : 'rootfs' space usage test succeeded [current space usage=59.6%]
[CEST Jun 4 21:10:43] debug : 'ws.example.com' succeeded testing protocol [WEBSOCKET] at [ws.example.com]:80/faye [TCP/IP] [response time 114.070 ms]
[CEST Jun 4 21:10:43] debug : 'ws.example.com' connection succeeded to [ws.example.com]:80/faye [TCP/IP]
monit -c /path/to/your/config -v
By default, monit logs to your system message log and you can check there to see what's happening.
Also, depending on your config, you might be logging to a different place
tail -f /var/log/monit
http://mmonit.com/monit/documentation/monit.html#LOGGING
Assuming defaults (as of whatever old version of monit I'm using), you can tail the logs as such:
CentOS:
tail -f /var/log/messages
Ubuntu:
tail -f /var/log/syslog
Mac OSX
tail -f /var/log/system.log
Windows
Here be Dragons
But there is a neato project I found while searching on how to do this out of morbid curiosity: https://github.com/derFunk/monit-windows-agent
Yeah, monit isn't too easy to debug.
Here are a few best practices:
Use a wrapper script that sets up your log file. Write your command arguments in there while you are at it:
shell:
#!/usr/bin/env bash
logfile=/var/log/myjob.log
touch ${logfile}
echo $$ ": ################# Starting " $(date) "########### pid " $$ >> ${logfile}
echo "Command: the-command $*" >> ${logfile} # log your command arguments
{
exec the-command "$@"
} >> ${logfile} 2>&1
That helps a lot.
The other thing I find that helps is to run monit with '-v', which gives you verbosity. So the workflow is
get your wrapper working from the shell "sudo my-wrapper"
then try and get it going from monit, run from the command line with "-v"
then try and get it going from monit, running in the background.
You can also try running monit validate once processes are running, to try and find out if any of them are having problems (and sometimes get more information than you would get in the log files if there are any problems). Beyond that, there's not much more you can do.
