Get output of server crash from server re-spawning script - bash

I currently have Homebridge set up on my Raspberry Pi. When the Pi boots, it starts a script which attempts to keep Homebridge alive. I originally took the script from this answer, which walks you through the rather trivial process of creating such a script. However, I have slightly adapted the script and it now looks like this:
until "homebridge" -s /bin/sh pi; do
echo "Server homebridge crashed with exit code $?. Respawning.." >&2
echo "Looks like Homebridge just crashed, restarting it now..." | mail -s "Homebridge Crash" pi
rm -r /home/pi/.homebridge/accessories/cachedAccessories
sleep 1
done
It is virtually the same as the original script, except that it deletes a folder and waits a second before respawning. It also sends mail to my user (pi) to let me know that the process has died and is being respawned. This has been working perfectly for me, apart from the lack of any debugging information: whilst I do get notified that the process has died, I am not shown the output of the process from when it died. It would be perfect if the mail could include, for example, the last 300 lines before the process exited, in order to aid with debugging following a crash.
What exactly would I need to add to the above script in order to receive a 'log' of what the homebridge output was just before it crashed in order to help with debugging?
Thank you in advance for your help,
Kind regards, Rocco
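One possible approach, sketched under assumptions rather than taken from the original script: send the process output to a log file on each run, and pipe the tail of that file into the mail when the process exits with a non-zero status. The names LOG_FILE, TAIL_LINES, run_logged and crash_report are hypothetical, and the respawn loop is left commented out so the helpers can be tried on their own.

```shell
#!/bin/sh
# Hypothetical log location and line count; adjust to taste.
LOG_FILE="${LOG_FILE:-/tmp/homebridge-respawn.log}"
TAIL_LINES="${TAIL_LINES:-300}"

# Run a command once, capturing stdout and stderr in LOG_FILE.
# The log is overwritten on every run, so it only holds the latest attempt.
# Returns the command's exit status.
run_logged() {
    "$@" >"$LOG_FILE" 2>&1
}

# Print a crash report: a header with the exit code ($1), followed by
# the last TAIL_LINES lines of the captured output.
crash_report() {
    echo "Process exited with code $1. Last $TAIL_LINES lines of output:"
    tail -n "$TAIL_LINES" "$LOG_FILE"
}

# Respawn loop (commented out; replace 'homebridge' with the real start command):
# until run_logged homebridge; do
#     crash_report "$?" | mail -s "Homebridge Crash" pi
#     sleep 1
# done
```

Because the log is truncated at the start of each run, the mailed lines always belong to the run that just crashed.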

Related

Why does this nested bash command with subshells hang? [duplicate]

I have a script (let's call it parent.sh) that makes two calls to a second script (child.sh) that runs a Java process. The child.sh scripts are run in the background by placing an & at the end of the line in parent.sh. However, when I run parent.sh, I need to press Ctrl+C to return to the terminal screen. What is the reason for this? Is it because the child.sh processes are running under the parent.sh process, so parent.sh doesn't die until the children do?
parent.sh
#!/bin/bash
child.sh param1a param2a &
child.sh param1b param2b &
exit 0
child.sh
#!/bin/bash
java com.test.Main
echo "Main Process Stopped" | mail -s "WARNING-Main Process is down." user@email.com
As you can see, I don't want to run the Java process in the background because I want to send a mail out when the process dies. Doing it as above works fine from a functional standpoint, but I would like to know how I can get it to return to the terminal after executing parent.sh.
What I ended up doing was to change parent.sh to the following:
#!/bin/bash
child.sh param1a param2a > startup.log &
child.sh param1b param2b > startup2.log &
exit 0
I would not have come to this solution without your suggestions and root cause analysis of the issue. Thanks!
And apologies for my inaccurate comment. (There was no input, I answered from memory and I remembered incorrectly.)
The following link from the Linux Documentation Project suggests adding a wait after your mail command in child.sh:
http://tldp.org/LDP/abs/html/x9644.html
Summary of the above document
Within a script, running a command in the background with an ampersand (&)
may cause the script to hang until ENTER is hit. This seems to occur with
commands that write to stdout. It can be a major annoyance.
....
....
As Walter Brameld IV explains it:
As far as I can tell, such scripts don't actually hang. It just
seems that they do because the background command writes text to
the console after the prompt. The user gets the impression that
the prompt was never displayed. Here's the sequence of events:
1. Script launches background command.
2. Script exits.
3. Shell displays the prompt.
4. Background command continues running and writing text to the console.
5. Background command finishes.
6. User doesn't see a prompt at the bottom of the output, thinks script is hanging.
If you change child.sh to look like the following you shouldn't experience this annoyance:
#!/bin/bash
java com.test.Main
echo "Main Process Stopped" | mail -s "WARNING-Main Process is down." user@gmail.com
wait
Or as @SebastianStigler states in a comment to your question above:
Add a > /dev/null at the end of the line with mail. mail will otherwise try to start its interactive mode.
This will cause the mail command to write to /dev/null rather than stdout which should also stop this annoyance.
Hope this helps
The process was still linked to the controlling terminal because STDOUT needs somewhere to go. You solved that problem by redirecting to a file ( > startup.log ).
If you're not interested in the output, discard STDOUT completely ( >/dev/null ).
If you're not interested in errors, either, discard both ( &>/dev/null ).
If you want the processes to keep running even after you log out of your terminal, use nohup — that effectively disconnects them from what you are doing and leaves them to quietly run in the background until you reboot your machine (or otherwise kill them).
nohup child.sh param1a param2a &>/dev/null &
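The redirection and nohup advice above can be combined into a small helper. This is only a sketch: launch_detached and LAST_PID are hypothetical names, and it uses the portable `>file 2>&1` form instead of the bash-only `&>`.

```shell
#!/bin/sh
# Start a command under nohup with stdin, stdout and stderr detached
# from the terminal. $1 is the log file; the rest is the command.
# The PID of the background job is left in LAST_PID.
launch_detached() {
    detached_log=$1
    shift
    nohup "$@" >"$detached_log" 2>&1 </dev/null &
    LAST_PID=$!
}

# Possible use in parent.sh (not executed here):
# launch_detached startup.log  child.sh param1a param2a
# launch_detached startup2.log child.sh param1b param2b
```

Since none of the three standard streams point at the terminal, nothing is printed after the prompt returns, and nohup has no reason to create nohup.out.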

How can I tell if a script was run in the background and with nohup?

I've got a script that takes quite a long time to run, as it has to handle many thousands of files. I want to make this script as foolproof as possible. To this end, I want to check whether the user ran the script using nohup and '&', e.g.
me@myHost:/home/me/bin $ nohup doAlotOfStuff.sh &
I want to make 100% sure the script was run with nohup and '&', because it's a very painful recovery process if the script dies in the middle for whatever reason.
How can I check those two key parameters inside the script itself? And if they are missing, how can I stop the script before it gets any farther and complain to the user that they ran the script wrong? Better yet, is there a way I can force the script to run with nohup and &?
Edit: the server environment is AIX 7.1
The ps utility can get the process state. The process state code will contain the character + when running in foreground. Absence of + means code is running in background.
However, it will be hard to tell whether the background script was invoked using nohup. It's also almost impossible to rely on the presence of nohup.out as output can be redirected by user elsewhere at will.
There are 2 ways to accomplish what you want to do. Either bail out and warn the user or automatically restart the script in background.
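#!/bin/bash
# 'local' is only valid inside a function, so use a plain assignment.
mypid=$$
# A '+' in the ps state code means the process is in the foreground.
if [[ $(ps -o stat= -p "$mypid") =~ "+" ]]; then
    echo "Running in foreground."
    # Relaunch this script under nohup in the background with the original arguments.
    exec nohup "$0" "$@" &
    exit
fi
# the rest of the script
...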
In this code, if the process has a state code +, it will print a warning then restart the process in background. If the process was started in the background, it will just proceed to the rest of the code.
If you prefer to bail out and just warn the user, you can remove the exec line. Note that the exit is required either way: because the exec line ends in &, the exec happens in a background subshell, and the exit is what stops the foreground copy of the script.
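For the bail-out variant, a minimal sketch could look like the following. The function names is_foreground_state and require_background are hypothetical, and it assumes, as the answer does, that the local ps accepts `-o stat=` and marks foreground processes with a +.

```shell
#!/bin/sh
# Return success if a ps state code (for example "S+") marks a
# foreground process, i.e. if it contains a '+'.
is_foreground_state() {
    case $1 in
        *+*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Print a usage hint and fail when this shell is running in the foreground.
require_background() {
    if is_foreground_state "$(ps -o stat= -p "$$")"; then
        echo "Please run this script as: nohup $0 &" >&2
        return 1
    fi
}

# At the top of the real script one might write:
# require_background || exit 1
```

Keeping the string check separate from the ps call makes it easy to adapt if the state column looks different on a given platform.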
One good way to find out whether a script is logging to nohup.out is to first check that nohup.out exists, and then to echo a tag and ensure that you can read it back from the file. For example:
echo "complextag"
if ! grep -q "complextag" nohup.out 2>/dev/null; then
    # complain to the user, then exit
    echo "Please run this script with nohup." >&2
    exit 1
fi
This works because if the script's stdout is going to nohup.out, where it should be going (or whatever output file you specified), then the phrase you echo is appended to nohup.out. If it doesn't appear there, then the script was not run using nohup and you can scold the user, perhaps by using a wall command on a temporary broadcast file. (If you want me to elaborate on that I can.)
As for being run in the background, if it's not running you should know by checking nohup.

Killing Subshell with SIGTERM

I'm sure this is really simple, but it's biting me in the face anyway, and I'm a little frustrated and stumped.
So, I have a script which I've managed to boil down to:
#!/bin/sh
sleep 50 | echo
If I run that at the command line, and hit Ctrl-C it stops, like I would expect.
If I send it sigint, using kill, it does nothing.
I thought that was strange, since I thought those should have been the same.
Then, if I send it sigterm, then it also dies, but if I look in ps, the sleep is still running.
What am I missing, here?
This is obviously not the real script, which runs python, and it's more of a problem when it keeps running after start-stop-daemon tries to kill the daemon.
Help me people. I'm dumb.
The reason this happens is that Ctrl-C makes the terminal send SIGINT to every process in the foreground process group, including sleep, whereas the signal you send with kill is delivered only to the script itself. See Child process receives parent's SIGINT for details on this.
You can verify this yourself by using strace -p when hitting ctrl-c or sending sigint; strace will tell you what signals are delivered.
EDIT: I don't think you are dumb. Processes and how they work are seemingly simple, but the details are often complicated, and even experts get confused by this sort of thing.
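If the goal is for a kill of the script's PID to take the child down too, one option is to have the script forward the signal itself. The following is a sketch for a single child command rather than a pipeline; run_and_forward and its variables are hypothetical names, and it always forwards a SIGTERM, whichever of the two trapped signals arrived.

```shell
#!/bin/sh
# Run a command as a background child and forward TERM/INT to it, so
# that killing the script's PID does not leave the child behind.
# Returns the status reported by the first wait.
run_and_forward() {
    "$@" &
    fwd_child=$!
    # kill sends SIGTERM by default.
    trap 'kill "$fwd_child" 2>/dev/null' TERM INT
    wait "$fwd_child"
    fwd_status=$?
    # If wait was interrupted by a trapped signal, wait again to reap the child.
    wait "$fwd_child" 2>/dev/null || true
    trap - TERM INT
    return "$fwd_status"
}

# Possible use (not executed here):
# run_and_forward sleep 50
```

The child is started with & and then waited for because a shell only runs a trap handler once the current foreground command has finished, whereas the wait builtin returns as soon as a trapped signal arrives.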
I did the same thing: I wrote a script named test.sh with the contents below.
#!/bin/sh
sleep 50 | echo
After executing it, I pressed Ctrl-C and it worked fine, meaning the script stopped.
I executed it again, and in another terminal I found the PID with ps -ef|grep test.sh. I then ran kill <pid>, which killed the process. To verify, I ran ps -ef|grep test.sh again and no PID was listed.

How to search a text file for a specific line and then kill a process when finding it with bash?

I have very little experience with scripting, but I'd like to get into it.
Basically, this is for a StarMade server, as its java process doesn't shut down on a crash but rather remains running, so it becomes hard to properly detect crashes, and these occur very frequently.
What I want though, is to find a specific line in the server log file that indicates the server is frozen/has crashed, and kill the process after reading this line, then another script would start it up again (got that part already). The log file continuously gets updated by the server, so the script basically has to "watch" the file somehow.
I couldn't find any exact matches for this idea anywhere, but I tried to scrape something together from many sources, although I'm pretty sure it's entirely wrong:
#!/bin/bash
cd "$(dirname "$0")"
if grep -Fxq "[SERVER] SERVER SHUTDOWN" log.txt.0
then
kill -9 $(pidof StarMade.jar)
else
false
fi
As far as I can tell, this should continuously check the file log.txt.0 to find direct matches with "[SERVER] SERVER SHUTDOWN" and kill the server process when it does, or do nothing when it doesn't. Is this a correct way to do it? I don't want to break things by stupid mistakes.
Try this:
if ( grep -Fxq "[SERVER] SERVER SHUTDOWN" log.txt.0 > /dev/null ); then
kill -9 $(pidof StarMade.jar)
fi
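Note that a single grep only checks the file once. To actually "watch" the log, the check has to be repeated. A minimal polling sketch follows; wait_for_line is a hypothetical helper, and the commented-out pkill line assumes the server was started with StarMade.jar somewhere on its java command line.

```shell
#!/bin/sh
# Block until file $1 contains a line exactly equal to $2, polling
# every $3 seconds (default 5). Uses the same grep flags as the question:
# -F fixed string, -x whole-line match, -q quiet.
wait_for_line() {
    until grep -Fxq -- "$2" "$1" 2>/dev/null; do
        sleep "${3:-5}"
    done
}

# Possible use (not executed here); pkill -f matches the full command line:
# cd "$(dirname "$0")"
# wait_for_line log.txt.0 "[SERVER] SERVER SHUTDOWN" && pkill -9 -f StarMade.jar
```

Each poll rescans the whole file, so a matching line left over from an earlier run would trigger immediately; this sketch assumes the log starts fresh whenever the server is restarted.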
