Pausing and resuming a Bash script - bash

Is there a way to pause a Bash script, then resume it another time, such as after the computer has been rebooted?

The only way to do that AFAIK:
Save any variables or other script context information in a temporary file to capture the state of the script just before the pause. It goes without saying that the script should include a mechanism to check this file to find out whether the previous execution was paused and, if it was, fetch all the context and resume accordingly.
After reboot, manually run the script again, OR, have the script automatically run from your startup profile script.
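A minimal sketch of the state-file idea, assuming the script's work can be expressed as a numbered sequence of steps. STATE_FILE, save_state, load_state and run_steps are hypothetical names chosen for illustration; the default path is under $HOME rather than /tmp because many systems clear /tmp on reboot.

```bash
#!/usr/bin/env bash
# Hypothetical state file; kept outside /tmp so it survives a reboot.
STATE_FILE="${STATE_FILE:-$HOME/.myscript.state}"

# Record the number of the next step to run.
save_state() {
    # Write to a temporary name, then rename, so an interruption
    # mid-write cannot leave a truncated state file.
    printf '%s\n' "$1" > "$STATE_FILE.tmp" && mv "$STATE_FILE.tmp" "$STATE_FILE"
}

# Set next_step from the state file, or to 1 on a fresh start.
load_state() {
    next_step=1
    if [ -f "$STATE_FILE" ]; then
        read -r next_step < "$STATE_FILE"
    fi
}

# Run steps next_step..last, saving progress after each one.
run_steps() {
    local last=$1 i
    load_state
    for (( i = next_step; i <= last; i++ )); do
        echo "running step $i"      # placeholder for the real work
        save_state "$(( i + 1 ))"
    done
    rm -f "$STATE_FILE"             # finished: the next run starts from step 1
}
```

If a run of `run_steps 10` is interrupted during step 4, the file still holds 4, so the next invocation repeats step 4 and carries on from there.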

Try Ctrl-Z to pause the command. I don't think you can pause it and then resume after reboot unless you're keeping state somehow.

You can't pause and resume the same script after a reboot, but a script could arrange to have another script run at some later time. For example, it could create an init script (or a cron job, or a login script, etc) which contained the tasks you want to defer, and then removed itself.

Intriguing...
You can suspend a job in BASH with a CTRL-Z, but you can't resume after a reboot. A reboot initializes the machine and the process that was suspended is terminated.
However, it might be possible to force the process into a core dump via kill -QUIT $pid and then use gdb to restart the script. I tried for a while, but was unable to do it. Maybe someone else can point out the way.

If this applies to your script and the job it does, add checkpoints to it - that means places where all the state of the process is saved to disk before continuing. Then have each individual part check if the output they have to produce is already there, and skip running if it is. That should make a rerun of the script almost as efficient as resuming from the exact same place in execution.
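The checkpoint idea could be sketched like this, assuming each part of the script produces one output file and writes it to standard output. run_step is a hypothetical helper, not something from the answer.

```bash
#!/usr/bin/env bash
# run_step NAME OUTPUT COMMAND [ARGS...]
# Run COMMAND with its stdout captured into OUTPUT, unless OUTPUT
# already exists from an earlier run.
run_step() {
    local name=$1 output=$2
    shift 2
    if [ -e "$output" ]; then
        echo "skip $name"
        return 0
    fi
    echo "run $name"
    # Write to a .partial file and rename only on success, so a
    # half-written file is never mistaken for a finished step.
    if "$@" > "$output.partial"; then
        mv "$output.partial" "$output"
    else
        rm -f "$output.partial"
        return 1
    fi
}
```

A script would then consist of lines such as `run_step sort sorted.txt sort input.txt`, and rerunning it after a reboot skips every step whose output is already on disk.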
Alternatively, run the script in a VM. Freeze the VM before shutting down the real system and resume it afterwards. It would probably take a really huge and complex shell script to make this worth it, though.

Related

Shell script on MacOS freezes when triggered with launchd

This is a strange one.
I have a bash script (let's call it fileChecker.sh) that loops through a directory of files. It checks each file, sends parameters for each to another bash script (uploadToS3.sh) which uploads them to an S3 bucket. The fileChecker.sh triggers the uploadToS3.sh and does not wait for it to finish (I believe this is called forking???). Snippet from the fileChecker.sh triggering the uploadToS3.sh:
sh ("/Users/Shared/Scripts/uploadToS3.sh" "$thefilepath" "$s3" "$thefilename" "$filenameExtension") &
The uploadToS3.sh script uses python and s3cmd to upload the file to the s3 bucket. Snippet from the script:
/usr/local/bin/s3cmd --access_key=$s3AccessKey --secret_key=$s3SecretKey --region=$s3Region --progress put "$thefilepath" "$s3Path""$thefilename"
The Problem: both scripts execute without issue when run manually from an IDE but I need it to run on time intervals every 30 seconds. When I run it with launchd, /Library/LaunchAgents/, the first script (fileChecker.sh) completes without issue. Each execution of the uploadToS3.sh is triggered successfully but never finishes. To be more specific, I check the output from each instance of the uploadToS3.sh and each file starts to upload to the s3 bucket then stops. There are no errors to be found in the stderr. The stdout just has the first details of the upload process.
Any thoughts? I'm happy to add more of the code and more detail if needed. Been stuck on this for a week now and could use any help I can get.
Thank you 🙏
I suspect you're tripping over a feature of launchd that's intended to be helpful, but sometimes causes problems like this: when a daemon or agent exits, launchd will "clean up" (i.e. kill) any leftover subprocesses. It sounds like when fileChecker.sh exits, launchd kills uploadToS3.sh before it has a chance to finish.
Fortunately, this is easy to fix. Add <key>AbandonProcessGroup</key><true/> to the launchd .plist file to disable this behavior.
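A different approach, not part of the answer above, would be to keep the parent script alive until its background jobs are done, so there is nothing left over for launchd to clean up. The sketch below assumes that waiting is acceptable for a job run every 30 seconds; upload_one is a hypothetical stand-in for the call to uploadToS3.sh.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for uploadToS3.sh: pretend to upload, then
# leave a marker file next to the original.
upload_one() {
    sleep 1
    : > "$1.uploaded"
}

# Start one background upload per regular file in DIR, then block
# until all of them have finished.
upload_all() {
    local dir=$1 f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        upload_one "$f" &
    done
    wait    # with no arguments, waits for every background job of this shell
}
```

The uploads still run in parallel; only the script's exit is delayed until the slowest one completes.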

Blocking a bash script running with &

I may have inadvertently launched a bash script containing an infinite cycle whose exit condition may be met next century, if ever. The fact is that I launched the script, as I would do with a nohup program, with
bash [scriptname].sh &
so that (as I get it, which is most probably wrong) I can close the terminal and still keep the script running, as was my intention in developing it. The script should run calculation programmes in my absence and let me gather the results after some time.
Now I want to stop it, but nothing seems to do the trick: I killed the programmes the script had launched, I removed the input file the script was getting orders from and - last and most perfect of accomplishments - I accidentally closed the terminal trying to "exit" the script, which was still giving me error messages.
How can I check whether the script is running (as it does not appear in "top")? Is the '&' relevant? Should I just ask permission to reboot the pc, if that will work and kill everything?
Thank you.
[I put a "Hi everyone" at the beginning but the editor won't let me show it. Oh, well. It's that kind of day.]
Ok, I'll put it right here to prove my stupidity, as I wandered the internet shortly (after a long wandering before writing this post) and found that the line:
kill -9 $(pgrep -f [SCRIPTNAME].sh)
does the trick from any terminal window.
I write this answer to help anyone in the same situation, but feel free to remove the thread if unnecessary (and excuse me for disturbing).
Good that you found it. Here is another way, which works if you start the job from the current shell (rather than a separate shell via bash -c) and still have that shell open:
# put a job in background
sleep 100 &
# save the last PID of background job
MY_PID=$!
# later
kill $MY_PID
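kill -9 gives the script no chance to clean up, so a gentler variant is to send the default SIGTERM first and fall back to SIGKILL only if the process is still there after a few seconds. The following is a sketch of that pattern; stop_by_pid and its retry count are hypothetical, and the PID can come from $! or from pgrep as above.

```bash
#!/usr/bin/env bash
# stop_by_pid PID [SECONDS]
# Ask PID to terminate, wait up to SECONDS (default 5) for it to go
# away, then force-kill it if it is still running.
stop_by_pid() {
    local pid=$1 tries=${2:-5}
    # Plain kill sends SIGTERM; failure means the process is already gone.
    kill "$pid" 2>/dev/null || return 0
    # kill -0 sends no signal, it only tests whether the PID exists.
    while kill -0 "$pid" 2>/dev/null && [ "$tries" -gt 0 ]; do
        sleep 1
        tries=$(( tries - 1 ))
    done
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid"
    fi
    return 0
}
```

For example, `sleep 100 &` followed by `stop_by_pid $!` ends the background sleep with SIGTERM and never reaches the kill -9 branch.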

Ctrl C does not kill foreground process in Unix

I have the following code written in a script named test.csh to start a GUI-based application in the foreground in Solaris Unix. When I run the script and want to kill the GUI process using keyboard Ctrl + C, the process is not getting terminated. If I open the GUI application directly from the terminal, I am able to kill the process using Ctrl + C. Can someone help me understand why I am not able to kill the process invoked from a script?
#! /usr/bin/csh
# some script to set env variables
# GUI Process
vcast
Then I execute the script using the following command. I am not able to terminate the vcast process using Ctrl + C command.
source test.csh
If it is being launched into its own thread then the interrupt signal sent by Ctrl + C may not get to the application. You could add a signal handler to pass the signal on, or look at the process table to see what the process ID is for the app and then kill it. This could also be scripted very easily.
It would be better to execute the script directly, instead of sourcing it.
1) first add #!/bin/csh at the beginning of your script,
2) set it as executable :
$ chmod u+x test.csh
3) execute it directly:
$ ./test.csh
You should then be able to kill it. Bear in mind that the problem may also come from some executable that you are running within your script. Try debugging the script by pasting it line by line into a terminal until you reach the point where it hangs.
Another possible cause is an infinite while loop, that is, a loop whose exit condition is never met. Check for this kind of error too.
Regards

Creating a startup daemon for a shell script in FreeBSD

I am trying to create a file in rc.d/ that will start up a /bin/sh script that I have written. I am following some examples found here:
http://www.freebsd.org/doc/en/articles/rc-scripting/article.html#rc-flags
#!/bin/sh -x
# PROVIDE: copyfiles
. /etc/rc.subr
name=copyfiles
rcvar=copyfiles_enable
pidfile="/var/run/${name}.pid"
command="/var/etc/copy_dat_files.sh -f /var/etc/copydatafiles.conf"
command_args="&"
load_rc_config $name
run_rc_command "$1"
It seems like I am having a problem with the pidfile. Does my script need to be the one that creates the pid file, or does it automatically get created? I have tried both ways, and whether or not I make my script create a pid file, I get an error that the pid file is not readable.
If my script is supposed to make it, what is the proper way to make the pid file?
Thanks
Look at the existing daemons for example (such as /etc/rc.d/mountd). Then look at the subroutines in /etc/rc.subr -- there is code in there to check the PID-file, but nothing creates it.
In other words, you can declare in the daemon-starting script, what the PID-file is, but creating it is up to the daemon. Speaking of the daemons, you may wish to use the daemon(8) utility, if your daemon is, in fact, a shell script. The utility will take care of the PID-file creation for you. (If the daemon is written in C, you can/should use daemon(3) function.)
BTW, in my own opinion, daemons, when opening up the PID-files for creation, should also lock them (with flock(2) or fcntl(2) or lockf(3)). This way, if an instance crashes (or is killed) without removing the PID-file, the next instance will have no problem determining that the file is stale.
In general, a daemon is supposed to create and clean up its own PID file.
From a shell script you can use the following command to create it:
echo $$ >/var/run/${name}.pid
Do not forget to remove the file before exiting the script. Write a cleanup() function that does that and let trap call that function when certain signals occur. Also call cleanup just before exiting the script.
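The advice above could look roughly like this inside the daemon script. This is a sketch under assumptions: PIDFILE defaults to the path from the question's rc.d script, and write_pidfile is a hypothetical helper name.

```bash
#!/bin/sh
# Path assumed from the rc.d script in the question (name=copyfiles).
PIDFILE="${PIDFILE:-/var/run/copyfiles.pid}"

# Remove the PID file; safe to call more than once.
cleanup() {
    rm -f "$PIDFILE"
}

# Record this shell's PID and arrange for the file to be removed
# however the script ends.
write_pidfile() {
    echo $$ > "$PIDFILE"
    # Runs on any normal exit, including an explicit "exit N".
    trap cleanup EXIT
    # Turn the usual termination signals into a normal exit so that
    # the EXIT trap above fires for them as well.
    trap 'exit 1' HUP INT TERM
}
```

The daemon script would call write_pidfile once near the top, before entering its main loop.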

How to kill a group of processes in Clozure CL?

I want to run a shell command within ccl, but this command may hang for some reason. So I want to kill all the subprocesses spawned by this command. How can I do this?
I have tried trivial-shell to run the shell command; when the command does not hang, it works well.
I also use the with-timeout macro from trivial-shell to check for a timeout, but it just gives me a timeout-error condition and the shell process is still hanging there. Here I just want to kill them all and return something.
Thank you all.
As far as I can tell, trivial-shell only provides a synchronous shell call so there's no simple way to terminate ongoing subprocesses.
I suggest calling Clozure Common Lisp's implementation-specific ccl:run-program function with :wait nil to run the jobs asynchronously. You can then call ccl:signal-external-process on the running process to kill it if you need to. Both are described in the Clozure CL documentation.
