Queue bash script - bash

Suppose I have a bash script that I want at most one instance of running at any given time. Suppose the intended behavior for multiple calls to the same bash script is to queue their executions.
Within a single program where the script is analogous to a function, this can be achieved with mutex locks.
How would I approach such a design?

Ok, just off the cuff...
Implement a queueing system inside the script.
Have the script attempt to mkdir a known and standard directory name.
if mkdir /tmp/${0##*/}.Q
then : .. proceed normally
mkdir is atomic, so only one instance can ever succeed.
If it fails, test to see if it's because the directory already exists.
If it does not, there's a problem, report and abort.
If it succeeds, open a FIFO in that directory named with the PID of the script, and write its command line arguments there. Maybe add the PID.
Then clear the args, fork a child, and wait. On SIGCHLD go read the FIFO again. As long as you find stuff, lather/rinse/repeat. When there's no more, shut down.
If it runs with args but the dir already exists, confirm that a copy is already running under another PID, find the FIFO in the dir, write the args (and maybe the PID) to it, and exit.
If it has no args (likely because it's a forked child), read the pipe, set the args, and process them. Confirm that there is a copy running with the PID that matches the FIFO. Delete the FIFO and exit normally when all records are processed.
Haven't tested this at all. Expect to find some flaws.
OR just look for the "flagfile" and sleep a bit in a loop until it does not exist, then create it, run normally, and remove it again when done. That doesn't assure order, but it does keep more than one copy from running.
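A minimal sketch of that flagfile approach, untested and with an arbitrary lock path; using mkdir instead of a plain flag file keeps the test-and-create step atomic, as noted above:
#!/bin/bash
# Simple mutual exclusion, no ordering: wait until the lock directory is free.
lockdir="/tmp/${0##*/}.lock"        # lock location is an arbitrary choice
until mkdir "$lockdir" 2>/dev/null  # mkdir is atomic: only one instance succeeds
do
    sleep 1                         # another copy is running; poll until it exits
done
trap 'rmdir "$lockdir"' EXIT        # release the lock even on abnormal exit
# ... the normal work of the script goes here ...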

Related

When data is piped from one program via | is there a way to detect what that program was from the second program?

Say you have a shell command like
cat file1 | ./my_script
Is there any way from inside the 'my_script' command to detect the command run first as the pipe input (in the above example cat file1)?
I've been digging into it and so far I've not found any possibilities.
I've been unable to find any environment variables set in the process space of the second command recording the full command line; the command data the my_script command sees (via /proc etc.) is just ./my_script and doesn't include any information about it being run as part of a pipe. Checking the process list from inside the second command doesn't seem to provide any data either, since the first process seems to exit before the second starts.
The best information I've found suggests that in bash you can, in some cases, get the exit codes of the processes in the pipe via PIPESTATUS; unfortunately, nothing similar seems to exist for the names of the commands/files in the pipe. My research seems to say it's impossible to do in a generic manner (I can't control how people decide to run my_script, so I can't force third-party pipe-replacement tools to be used instead of built-in shell pipes), but at the same time it doesn't seem like it should be impossible, since the shell has the full command line present as the command is run.
(update adding in later information following on from comments below)
I am on Linux.
I've investigated the /proc/$$/fd data and it almost does the job. If the first command doesn't exit for several seconds while piping data to the second command, you can read /proc/$$/fd/0 to see the value pipe:[PIPEID] that it symlinks to. That can then be used to search through the /proc/*/fd/ data of the other running processes to find another process with the same PIPEID open, which gives you the first process's PID.
However, in most real-world tests of piping I've done, you can't trust that the first command will stay running long enough for the second one to locate its pipe fd in /proc before it exits (which removes the proc data and prevents it being read). So I can't rely on this method returning any information.
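For what it's worth, the lookup described above can be sketched like this (Linux only, best effort, and as noted it only works while the writing process is still alive):
#!/bin/bash
# Identify the process writing into our stdin pipe by matching pipe inodes in /proc.
pipe_id=$(readlink /proc/$$/fd/0)              # e.g. "pipe:[123456]"
case $pipe_id in
    pipe:*) ;;                                 # stdin really is a pipe
    *) echo "stdin is not a pipe" >&2; exit 1 ;;
esac
for fd in /proc/[0-9]*/fd/*; do
    if [ "$(readlink "$fd" 2>/dev/null)" = "$pipe_id" ]; then
        pid=${fd#/proc/}; pid=${pid%%/*}
        [ "$pid" = "$$" ] && continue          # skip ourselves
        echo "possible writer: $pid $(cat /proc/$pid/cmdline 2>/dev/null | tr '\0' ' ')"
    fi
done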

Are shell scripts read in their entirety when invoked?

I ask because I recently made a change to a KornShell (ksh) script that was executing. A short while after I saved my changes, the executing process failed. Judging from the error message, it looked as though the running process had seen some -- but not all -- of my changes. This strongly suggests that when a shell script is invoked, the entire script is not read into memory.
If this conclusion is correct, it suggests that one should avoid making changes to scripts that are running.
$ uname -a
SunOS blahblah 5.9 Generic_122300-61 sun4u sparc SUNW,Sun-Fire-15000
No. Shell scripts are read and executed either line by line or command by command (commands separated by semicolons), with the exception of compound blocks such as if ... fi, which are read as a single chunk before being executed:
A shell script is a text file containing shell commands. When such a file is used as the first non-option argument when invoking Bash, and neither the -c nor -s option is supplied (see Invoking Bash), Bash reads and executes commands from the file, then exits. This mode of operation creates a non-interactive shell.
You can demonstrate that the shell waits for the fi of an if block to execute commands by typing them manually on the command line.
http://www.gnu.org/software/bash/manual/bashref.html#Executing-Commands
http://www.gnu.org/software/bash/manual/bashref.html#Shell-Scripts
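You can see this at an interactive prompt: nothing is executed until the closing fi has been read.
$ if true; then
>     echo "only runs once the whole if ... fi block has been read"
> fi
only runs once the whole if ... fi block has been read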
It's funny that most OSes I know do NOT read the entire content of a script into memory, but run it from disk. Reading the whole file up front would make it safe to change the script while it is running. I don't understand why that isn't done, given that:
scripts are usually very small (and don't take much memory anyway)
at some point, as shown in this thread, people will make changes to a script that is already running anyway
But, acknowledging this, here's something to think about: if you decide that a script is not running OK (because you are writing/changing/debugging it), do you care about the rest of that run? You can go ahead and make the changes, save them, and ignore all output and actions done by the current run.
But... sometimes, depending on the script in question, a subsequent run of the same script (modified or not) can become a problem, because the current/previous run is doing an abnormal run: it will typically skip some commands, or suddenly jump to parts of the script it shouldn't. And THAT may be a problem. It may leave "things" in a bad state, particularly if file manipulation/creation is involved.
So, as a general rule: whether or not the OS supports the feature, it's best to let the current run finish and THEN save the updated script. You can change it already, but don't save it.
It's not like the old days of DOS, where you had only one screen in front of you (one DOS screen), so you can't say you need to wait for the run to complete before you can open the file again.
No, they are not, and there are good reasons for that.
One of the things you should keep in mind is that a shell is not an interpreter, even if there are some similarities. Shells are designed to work with a stream of commands, whether it comes from the TTY, a pipe, a FIFO, or even a socket.
The shell reads from its input source line by line until EOF is returned by the kernel.
Most shells have no extra support for interpreting files; they work with a file as they would work with a terminal.
In fact this is considered a nice feature, because you can do interesting stuff like this: How do Linux binary installers (.bin, .sh) work?
You can take a binary file and prepend a shell script to it. You can't do this with an interpreter, because it parses the whole file (or at least it would try to and fail). A shell just interprets it line by line and doesn't care about the garbage at the end of the file. You just have to make sure the execution of the script terminates before it reaches the binary part, roughly as sketched below.
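A rough sketch of that pattern (the marker string and archive format are made up for illustration; the payload, e.g. a gzipped tar archive, is simply appended after the script):
#!/bin/sh
# Self-extracting script: everything after the marker line is binary payload.
# Built with:  cat extractor.sh payload.tar.gz > installer.sh
skip=$(awk '/^__ARCHIVE_BELOW__$/ { print NR + 1; exit }' "$0")
tail -n +"$skip" "$0" | tar xzf -      # feed the appended archive to tar
exit 0                                 # stop before the shell reads the binary part
__ARCHIVE_BELOW__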

Creating a startup daemon for a shell script in FreeBSD

I am trying to create a file in rc.d/ that will start up a /bin/sh script that I have written. I am following some examples found here:
http://www.freebsd.org/doc/en/articles/rc-scripting/article.html#rc-flags
#!/bin/sh -x
# PROVIDE: copyfiles
. /etc/rc.subr
name=copyfiles
rcvar=copyfiles_enable
pidfile="/var/run/${name}.pid"
command="/var/etc/copy_dat_files.sh -f /var/etc/copydatafiles.conf"
command_args="&"
load_rc_config $name
run_rc_command "$1"
It seems like I am having a problem with the pidfile. Does my script need to be the one that creates the pid file, or does it get created automatically? I have tried both ways, and whether or not I make my script create a pid file, I get an error that the pid file is not readable.
If my script is supposed to make it, what is the proper way to make the pid file?
Thanks
Look at the existing daemons for example (such as /etc/rc.d/mountd). Then look at the subroutines in /etc/rc.subr -- there is code in there to check the PID-file, but nothing creates it.
In other words, you can declare in the daemon-starting script what the PID-file is, but creating it is up to the daemon. Speaking of daemons, you may wish to use the daemon(8) utility if your daemon is, in fact, a shell script. The utility will take care of the PID-file creation for you. (If the daemon is written in C, you can/should use the daemon(3) function.)
BTW, in my own opinion, daemons, when opening their PID-files for creation, should also lock them (with flock(3), fcntl(2), or lockf(3)). This way, if an instance crashes (or is killed) without removing the PID-file, the next instance will have no problem determining that the file is stale.
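For the daemon(8) route mentioned above, the invocation could look roughly like this (paths are simply the ones from the question; check daemon(8) on your release for the exact flags):
# daemon(8) forks the script into the background and writes its PID to the file:
/usr/sbin/daemon -p /var/run/copyfiles.pid /var/etc/copy_dat_files.sh -f /var/etc/copydatafiles.conf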
In general, a daemon is supposed to create and clean up its own PID file.
From a shell-script you can give the following command to create it;
echo $$ >/var/run/${name}.pid
Do not forget to remove the file before exiting the script. Write a cleanup() function that does that and let trap call that function when certain signals occur. Also call cleanup just before exiting the script.
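A bare-bones sketch of that, using the pidfile path from the rc script above:
#!/bin/sh
pidfile="/var/run/copyfiles.pid"
cleanup() {
    rm -f "$pidfile"
}
trap cleanup EXIT INT TERM    # remove the PID file on exit and on common signals
echo $$ > "$pidfile"
# ... the real work of the daemon goes here ...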

How can I capture output of background process

What is the best way of running process in background and receiving its output only when needed?
Intended usage: make a prompt-outputting script with heavy initialization be initialized once per session rather than on every prompt. Note: two-way communication is needed: the shell needs to tell the process when a new prompt is needed and what the last command's status was.
Known solutions:
some explicitly created files on the filesystem (FIFO files, UNIX sockets): it would be better to avoid this, as it means I need to choose a file name, make sure it is garbage-collected on exit, and add something to clean up no-longer-used files in case of a crash.
zsh/zpty module: it is a bit like overkill for this job and does not work in bash.
coprocesses: does not work in bash and AFAIK only one coprocess per session is allowed.
Bash has supported coprocesses since 4.0, but multiple coprocesses are still experimental.
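For reference, a minimal two-way bash (4.0+) coprocess exchange looks like this; the backend loop here is just a stand-in for the heavy prompt script:
#!/bin/bash
# coproc exposes the backend's stdin as ${BACKEND[1]} and its stdout as ${BACKEND[0]}.
coproc BACKEND {
    while read -r request; do
        echo "prompt for: $request"      # stand-in for the real prompt generator
    done
}
echo "last_status=0" >&"${BACKEND[1]}"   # tell the backend a new prompt is needed
read -r prompt <&"${BACKEND[0]}"         # read back the generated prompt
echo "$prompt"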
I would have gone with some explicitly created files, naming them ~/.myThing-$HOSTNAME/fifo if they're per user and host. You can use flock to relatively easily determine if the command is still running and optionally start it:
(
flock -n 123 || exit 1
rm/mkfifo ..
exec yourServer < .. > ..
) 123> ~/".myThing-$HOSTNAME/lockfile"
If the command or server dies, the lock is automatically released and you only have a few zero length files lying around. The next time the server starts, it deletes and sets them up again.
Querying the server would be similar, but exiting if the lock is not in use (and optionally using a wait lock to avoid contention).
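That "exit if the lock is not in use" check can reuse flock on the same lockfile, e.g.:
# If we can grab the lock, nobody is holding it, so the server is not running.
if flock -n ~/".myThing-$HOSTNAME/lockfile" true; then
    echo "server not running" >&2
    exit 1
fi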

Pausing and resuming a Bash script

Is there a way to pause a Bash script, then resume it another time, such as after the computer has been rebooted?
The only way to do that AFAIK:
Save any variables or other script context information in a temporary file, to establish the state of the script just before the pause. It goes without saying that the script should include a mechanism to check this file, to know whether the previous execution was paused and, if it was, fetch all the context and resume accordingly (see the sketch after this list).
After reboot, manually run the script again, OR, have the script automatically run from your startup profile script.
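A bare-bones sketch of that approach; the state-file path and the two steps are made up for illustration:
#!/bin/bash
state_file="$HOME/.myscript.state"       # illustrative location for the saved context
do_first_thing()  { echo "doing the first thing";  sleep 1; }   # placeholder work
do_second_thing() { echo "doing the second thing"; sleep 1; }
step=0
[ -f "$state_file" ] && . "$state_file"  # restore the last checkpoint, if any
if [ "$step" -lt 1 ]; then
    do_first_thing
    step=1; echo "step=$step" > "$state_file"   # checkpoint before continuing
fi
if [ "$step" -lt 2 ]; then
    do_second_thing
    step=2; echo "step=$step" > "$state_file"
fi
rm -f "$state_file"                      # finished; the next run starts from scratch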
Try Ctrl-Z to pause the command. I don't think you can pause it and then resume after reboot unless you're keeping state somehow.
You can't pause and resume the same script after a reboot, but a script could arrange to have another script run at some later time. For example, it could create an init script (or a cron job, or a login script, etc) which contained the tasks you want to defer, and then removed itself.
Intriguing...
You can suspend a job in BASH with a CTRL-Z, but you can't resume after a reboot. A reboot initializes the machine and the process that was suspended is terminated.
However, it might be possible to force the process into a coredump via 'kill -QUIT $pid' and then use gdb to restart the script. I tried for a while, but was unable to do it. Maybe someone else can point out the way.
If this applies to your script and the job it does, add checkpoints to it - that means places where all the state of the process is saved to disk before continuing. Then have each individual part check if the output they have to produce is already there, and skip running if it is. That should make a rerun of the script almost as efficient as resuming from the exact same place in execution.
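In shell terms that can be as simple as guarding each step with a test for its own output file (names are illustrative, and writing to a temporary name first keeps a half-finished run from looking like a completed checkpoint):
# Re-runnable step: skip the expensive work if its result is already on disk.
# expensive_step_1 stands in for the real work of that part of the script.
if [ ! -f results/step1.out ]; then
    mkdir -p results
    expensive_step_1 > results/step1.out.tmp && mv results/step1.out.tmp results/step1.out
fi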
Alternatively, run the script in a VM. Freeze the VM before shutting down the real system and resume it afterwards. It would probably take a really huge and complex shell script to make this worth it, though.
