Removing created temp files in unexpected bash exit

I am creating temporary files from a bash script. I am deleting them at the end of the processing, but since the script is running for quite a long time, if I kill it or simply CTRL-C during the run, the temp files are not deleted.
Is there a way I can catch those events and clean-up the files before the execution ends?
Also, is there some kind of best practice for the naming and location of those temp files?
I'm currently not sure between using:
TMP1=`mktemp -p /tmp`
TMP2=`mktemp -p /tmp`
...
and
TMP1=/tmp/`basename $0`1.$$
TMP2=/tmp/`basename $0`2.$$
...
Or is there maybe a better solution?

I usually create a directory in which to place all my temporary files, and then immediately after, create an EXIT handler to clean up this directory when the script exits.
MYTMPDIR="$(mktemp -d)"
trap 'rm -rf -- "$MYTMPDIR"' EXIT
If you put all your temporary files under $MYTMPDIR, then they will all be deleted when your script exits in most circumstances. Killing a process with SIGKILL (kill -9) kills the process right away though, so your EXIT handler won't run in that case.

You could set a "trap" to execute on exit or on a control-c to clean up.
trap '{ rm -f -- "$LOCKFILE"; }' EXIT
Alternatively, one of my favourite unix-isms is to open a file, and then delete it while you still have it open. The file stays on the file system and you can read and write it, but as soon as your program exits, the file goes away. Not sure how you'd do that in bash, though.
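For what it's worth, the open-then-delete trick can be approximated in a shell with exec redirections. A minimal sketch, assuming a POSIX-style shell; the descriptor numbers 3 and 4 and the variable names are arbitrary choices, not something from the answer above:

```shell
scratch=$(mktemp) || exit 1
exec 3>"$scratch"     # fd 3: write end, opened while the name still exists
exec 4<"$scratch"     # fd 4: separate read end, starting at offset 0
rm -f -- "$scratch"   # unlink the name; the data stays reachable via fds 3 and 4

echo "hello from an unlinked file" >&3
IFS= read -r line <&4
echo "$line"

exec 3>&- 4<&-        # once the last descriptor closes, the kernel frees the space
```

Two descriptors are used because a shell cannot rewind a descriptor; the read end keeps its own offset. Nothing is left to clean up, even if the script is killed with SIGKILL.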
BTW: One argument I'll give in favour of mktemp instead of using your own solution: if the user anticipates your program is going to create huge temporary files, he might want to set TMPDIR to somewhere bigger, like /var/tmp. mktemp recognizes that; your hand-rolled solution (second option) doesn't. I frequently use TMPDIR=/var/tmp gvim -d foo bar, for instance.
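A quick way to see that behaviour, as a sketch that assumes a mktemp which can be called without a template (GNU coreutils allows this; other implementations may want a -t or template argument). The directory stored in big is just a stand-in for somewhere roomy like /var/tmp:

```shell
big=$(mktemp -d) || exit 1            # stand-in for a larger location
f=$(TMPDIR="$big" mktemp) || exit 1   # TMPDIR is set for this one call only
case $f in
    "$big"/*) where="under TMPDIR" ;;
    *)        where="elsewhere" ;;
esac
echo "$f was created $where"
rm -rf -- "$big"
```

A hard-coded /tmp/name.$$ path would ignore the variable entirely.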

You want to use the trap command to handle exiting the script or signals like CTRL-C. See Greg's Wiki for details.
For your tempfiles, using basename $0 is a good idea, as well as providing a template that provides room for enough temp files:
tempfile() {
    tempprefix=$(basename "$0")
    mktemp "/tmp/${tempprefix}.XXXXXX"
}
TMP1=$(tempfile)
TMP2=$(tempfile)
trap 'rm -f -- "$TMP1" "$TMP2"' EXIT

Just keep in mind that the chosen answer relies on a bashism, which means that a solution such as
trap '{ rm -f -- "$LOCKFILE"; }' EXIT
works as intended only in bash (the EXIT trap will not fire on Ctrl+C if the shell is dash or classic sh). If you want compatibility, you still need to enumerate all the signals that you want to trap.
Also keep in mind that when the script exits, the trap for signal 0 (aka EXIT) is always run, so a handler registered for both a signal and EXIT executes twice.
That is the reason not to stack all signals on one line together with EXIT.
To understand this better, look at the following script, which works across different systems without changes:
#!/bin/sh
on_exit() {
echo 'Cleaning up...(remove tmp files, etc)'
}
on_preExit() {
echo
echo 'Exiting...' # Runs just before actual exit,
# shell will execute EXIT(0) after finishing this function
# that we hook also in on_exit function
exit 2
}
trap on_exit EXIT # EXIT = 0
trap on_preExit HUP INT QUIT TERM # 1 2 3 15 (STOP and KILL cannot be trapped)
sleep 3 # some actual code...
exit
This solution gives you more control, since you can run some of your code when the signal actually arrives, just before the final exit (the on_preExit function), and, if needed, run more code at the actual EXIT (the final stage of exit).

GOOD HABITS ARE BEAUTIFUL
Avoid assuming the value of a variable is never going to be changed at some super distant time (especially if such a bug would raise an error).
Do cause trap to expand the value of a variable immediately if applicable to your code. Any variable passed to trap in single quotes has its expansion delayed until the trap fires.
Avoid the assumption that a file name will not contain any spaces.
Do use Bash ${VAR@Q} (Bash 4.4 and later) or $(printf '%q' "$VAR") to overcome issues caused by spaces and other special characters like quotes and carriage returns in file names.
zTemp=$(mktemp --tmpdir "$(basename "$0")-XXX.ps")
trap "rm -f ${zTemp@Q}" EXIT
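The difference between the two quoting styles is easy to see in isolation. A small sketch (the variable f and its values are made up for illustration); each trap is set inside its own command substitution, so its EXIT handler fires as soon as that subshell finishes:

```shell
# Single quotes: $f is expanded when the trap fires, so the later value wins.
lazy=$( f=first; trap 'echo "$f"' EXIT; f=second )

# Double quotes: $f is expanded when the trap is set, so the early value is baked in.
# This is only safe here because "first" contains no spaces or quotes.
eager=$( f=first; trap "echo $f" EXIT; f=second )

echo "lazy=$lazy eager=$eager"
```

When the value may contain spaces or quotes, the eager form needs the quoting treatment described above before it is spliced into the trap string.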

The alternative of using a predictable file name with $$ is a gaping security hole and you should never, ever, ever think about using it. Even if it is just a simple personal script on your single user PC. It is a very bad habit you should not pick up. BugTraq is full of "insecure temp file" incidents. See here, here and here for more information on the security aspect of temp files.
I was initially thinking of quoting the insecure TMP1 and TMP2 assignments, but on second thought that would probably not be a good idea.

I prefer using tempfile which creates a file in /tmp in the safe manner and you do not have to worry about its naming:
tmp=$(tempfile -s "your_suffix")
trap "rm -f '$tmp'" exit

You don't have to bother removing those tmp files created with mktemp. They will be deleted anyway later.
Use mktemp if you can, as it generates more unique file names than a '$$' suffix. It also looks like a more cross-platform way to create temp files than explicitly putting them into /tmp.

Related

How to capture / redirect stdout/stderr from a source command into a variable (in a bash script)?

Usually I capture the output via a subshell:
result="$(command 2>&1)"
If the command is source, the subshell swallows some (all?) changes to the script's shell environment.
How can I capture the output of source into a variable?
Surprisingly tricky question!
My first thought was to use a named pipe (mkfifo(1)), but those have a finite buffer size, so if the sourced script fills up the buffer the script would hang. And you can't use a background process to drain the buffer because you want the output in a variable in the original process eventually.
I'm sure there's a way to make it work entirely in memory, but in the end I think a simple and stupid redirect to a temporary file is the most straightforward and robust solution:
OUTPUT_FILE=$(mktemp)
source other_script.sh >"$OUTPUT_FILE" 2>&1
OUTPUT="$(< "$OUTPUT_FILE")"
rm -f "$OUTPUT_FILE"
(See this question and in particular BashFAQ 062 for security implications of mktemp though.)
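To see that this keeps both the output and the environment changes, here is a self-contained sketch. The sourced file and its contents are invented for the demo, and . is used as the portable spelling of source:

```shell
dir=$(mktemp -d) || exit 1
# Hypothetical script to be sourced: prints a line and sets a variable.
printf '%s\n' 'echo "loading"' 'greeting=hello' > "$dir/other_script.sh"

out_file=$(mktemp) || exit 1
. "$dir/other_script.sh" > "$out_file" 2>&1   # runs in the current shell, not a subshell
output=$(cat "$out_file")
rm -rf -- "$out_file" "$dir"

echo "captured: $output"
echo "variable: $greeting"
```

With result="$(. other_script.sh 2>&1)" the output would be captured just the same, but greeting would be lost when the subshell exits.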

Is lockfile necessary for reading and writing the same file of two processes

I'm working on a Bash script and have run into the following situation:
one bash script will write things into a file, and the other bash script will read things from the same file.
In this case, is a lockfile necessary? I think I don't need one because there is only one reading process and only one writing process, but I'm not sure.
Bash write.sh:
#!/bin/bash
echo 'success' > tmp.log
Bash read.sh:
#!/bin/bash
while :
do
line=$(head -n 1 ./tmp.log)
if [[ "$line" == "success" ]]; then
echo 'done'
break
else
sleep 3
fi
done
BTW, the write.sh could write several key words, such as success, fail etc.
While many programmers ignore this, you can potentially run into a problem because writing to the file is not atomic. When the writer does
echo success > tmp.log
it could be split into two (or more) parts: first it writes suc, then it writes cess\n.
If the reader executes between those steps, it might get just suc rather than the whole success line. Using a lockfile would prevent this race condition.
This is unlikely to happen with short writes from a shell echo command, which is why most programmers don't worry about it. However, if the writer is a C program using buffered output, the buffer could be flushed at arbitrary times, which would likely end with a partial line.
Also, since the reader is reading the file from the beginning each time, you don't have to worry about starting the read where the previous one left off.
Another way to do this is for the writer to write into a file with a different name, then rename the file to what the reader is looking for. Renaming is atomic, so you're guaranteed to read all of it or nothing.
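The write-then-rename approach could look roughly like this. It is a sketch only: publish_status is a made-up helper name, and the temporary file is deliberately created next to the target so that both are on the same file system:

```shell
# publish_status is a hypothetical helper, not part of the original scripts.
publish_status() {
    # $1 = file the reader polls, $2 = status line to publish
    tmp=$(mktemp "$1.XXXXXX") || return 1   # same directory, hence same file system
    printf '%s\n' "$2" > "$tmp" &&
        mv -f -- "$tmp" "$1"                # rename replaces the target in one step
}

workdir=$(mktemp -d) || exit 1
publish_status "$workdir/tmp.log" success
status=$(head -n 1 "$workdir/tmp.log")
echo "$status"
rm -rf -- "$workdir"
```

The reader never sees a half-written tmp.log: before the mv it sees the old file (or none), and after it the complete new one. Note that mktemp creates the file readable by its owner only, which matters if the reader runs as a different user.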
At least from your example, it doesn't look like read.sh really cares about what gets written to tmp.log, only that write.sh has created the file. In that case, all read.sh needs to check is that the file exists.
write.sh can simply be
: > tmp.log
and read.sh becomes
until [ -e tmp.log ]; do
sleep 3
done
echo "done"

Shell script that can check if it was backgrounded at invocation

I have written a script that relies on other server responses (uses wget to pull data), and I want it to always be run in the background unquestionably. I know one solution is to just write a wrapper script that will call my script with an & appended, but I want to avoid that clutter.
Is there a way for a bash (or zsh) script to determine if it was called with say ./foo.sh &, and if not, exit and re-launch itself as such?
The definition of a background process (I think) is that it has a controlling terminal but it is not part of that terminal's foreground process group. I don't think any shell, even zsh, gives you any access to that information through a builtin.
On Linux (and perhaps other unices), the STAT column of ps includes a + when the process is part of its terminal's foreground process group. So a literal answer to your question is that you could put your script's content in a main function and invoke it with:
case $(ps -o stat= -p $$) in
    *+*) main "$@" & ;;
    *) main "$@" ;;
esac
But you might as well run main "$@" & anyway. On Unix, fork is cheap.
However, I strongly advise against doing what you propose. This makes it impossible for someone to run your script and do something else afterwards — one would expect to be able to write your_script; my_postprocessing or your_script && my_postprocessing, but forking the script's main task makes this impossible. Considering that the gain is occasionally saving one character when the script is invoked, it's not worth making your script markedly less useful in this way.
If you really mean for the script to run in the background so that the user can close his terminal, you'll need to do more work — you'll need to daemonize the script, which includes not just backgrounding but also closing all file descriptors that have the terminal open, making the process a session leader and more. I think that will require splitting your script into a daemonizing wrapper script and a main script. But daemonizing is normally done for programs that never terminate unless explicitly stopped, which is not the behavior you describe.
I do not know how to detect this directly, but you can set a variable in the parent script and check for it in the child:
if [[ -z "$_BACKGROUNDED" ]] ; then
    _BACKGROUNDED=1 exec "$0" "$@" & exit
fi
# Put code here
Works both in bash and zsh.
The "tty" command says "not a tty" if you're in the background, or gives the controlling terminal name (/dev/pts/1 for example) if you're in the foreground. A simple way to tell.
Remember that you can't (or, not recommended to) edit the running script. This question and the answers give workarounds.
I haven't written shell scripts in a long time, but I can give you an idea (I hope). You can check the value of $$ (the PID of the process) and compare it with the output of the command "jobs -l". That command lists the PIDs of all the backgrounded processes (jobs), and if the value of $$ appears in its output, the current script is running in the background.

Can a shell script indicate that its lines be loaded into memory initially?

UPDATE: this is a repost of How to make shell scripts robust to source being changed as they run
This is a little thing that bothers me every now and then:
I write a shell script (bash) for a quick and dirty job
I run the script, and it runs for quite a while
While it's running, I edit a few lines in the script, configuring it for a different job
But the first process is still reading the same script file and gets all screwed up.
Apparently, the script is interpreted by loading each line from the file as it is needed. Is there some way that I can have the script indicate to the shell that the entire script file should be read into memory all at once? For example, Perl scripts seem to do this: editing the code file does not affect a process that's currently interpreting it (because it's initially parsed/compiled?).
I understand that there are many ways I could get around this problem. For example, I could try something like:
cat script.sh | sh
or
sh -c "`cat script.sh`"
... although those might not work correctly if the script file is large and there are limits on the size of stream buffers and command-line arguments. I could also write an auxiliary wrapper that copies a script file to a locked temporary file and then executes it, but that doesn't seem very portable.
So I was hoping for the simplest solution that would involve modifications only to the script, not the way in which it is invoked. Can I just add a line or two at the start of the script? I don't know if such a solution exists, but I'm guessing it might make use of the $0 variable...
The best answer I've found is a very slight variation on the solutions offered to How to make shell scripts robust to source being changed as they run. Thanks to camh for noting the repost!
#!/bin/sh
{
# Your stuff goes here
exit
}
This ensures that all of your code is parsed initially; note that the 'exit' is critical to ensuring that the file isn't accessed later to see if there are additional lines to interpret. Also, as noted on the previous post, this isn't a guarantee that other scripts called by your script will be safe.
Thanks everyone for the help!
Use an editor that doesn't modify the existing file, and instead creates a new file then replaces the old file. For example, using :set writebackup backupcopy=no in Vim.
How about a solution to how you edit it.
If the script is running, before editing it, do this:
mv script script-old
cp script-old script
rm script-old
Since the shell keeps the file open, everything will work okay as long as you don't change the contents of the open inode.
The above works because mv will preserve the old inode while cp will create a new one. Since a file's contents will not actually be removed if it is opened, you can remove it right away and it will be cleaned up once the shell closes the file.
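The inode behaviour can be checked directly with ls -i. A small sketch in a throwaway directory; inode_of is a made-up helper and the file contents are placeholders:

```shell
dir=$(mktemp -d) || exit 1
inode_of() { ls -i -- "$1" | awk '{ print $1 }'; }   # hypothetical helper

echo 'echo original' > "$dir/script"
before=$(inode_of "$dir/script")

mv "$dir/script" "$dir/script-old"
moved=$(inode_of "$dir/script-old")   # mv within a directory only renames

cp "$dir/script-old" "$dir/script"
copied=$(inode_of "$dir/script")      # cp creates a new file with its own inode

echo "before=$before moved=$moved copied=$copied"
rm -rf -- "$dir"
```

The first two numbers match and the third differs, so a shell that still holds the old inode open is unaffected by edits to the new copy.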
According to the bash documentation if instead of
#!/bin/bash
body of script
you try
#!/bin/bash
script=$(cat <<'SETVAR'
body of script
SETVAR
)
eval "$script"
then I think you will be in business.
Consider creating a new bang path for your quick-and-dirty jobs. If you start your scripts with:
#!/usr/local/fastbash
or something, then you can write a fastbash wrapper that uses one of the methods you mentioned. For portability, one can just create a symlink from fastbash to bash, or have a comment in the script saying one can replace fastbash with bash.
If you use Emacs, try M-x customize-variable break-hardlink-on-save. Setting this variable will tell Emacs to write to a temp file and then rename the temp file over the original instead of editing the original file directly. This should allow the running instance to keep its unmodified version while you save the new version.
Presumably, other semi-intelligent editors would have similar options.
A self contained way to make a script resistant to this problem is to have the script copy and re-execute itself like this:
#!/bin/bash
if [[ $0 != /tmp/copy-* ]] ; then
    rm -f "/tmp/copy-$$"
    cp -- "$0" "/tmp/copy-$$"
    exec "/tmp/copy-$$" "$@"
    echo "error copying and execing script"
    exit 1
fi
rm -- "$0"
# rest of script...
(This will not work if the path of the original script begins with the characters /tmp/copy-)
(This is inspired by R Samuel Klatchko's answer)

What does $$ mean in the shell?

I once read that one way to obtain a unique filename in a shell for temp files was to use a double dollar sign ($$). This does produce a number that varies from time to time... but if you call it repeatedly, it returns the same number. (The solution is to just use the time.)
I am curious to know what $$ actually is, and why it would be suggested as a way to generate unique filenames.
$$ is the process ID (PID) in bash. Using $$ is a bad idea, because it will usually create a race condition, and allow your shell-script to be subverted by an attacker. See, for example, all these people who created insecure temporary files and had to issue security advisories.
Instead, use mktemp. The Linux man page for mktemp is excellent. Here's some example code adapted from it (GNU mktemp needs the trailing X's in the template):
tempfoo=$(basename "$0")
TMPFILE=$(mktemp -t "${tempfoo}.XXXXXX") || exit 1
echo "program output" >> "$TMPFILE"
In Bash, $$ is the process ID. As noted in the comments, it is not safe to use as a temp filename for a variety of reasons.
For temporary file names, use the mktemp command.
$$ is the id of the current process.
Every process in a UNIX-like operating system has a (temporarily) unique identifier, the PID. No two processes running at the same time can have the same PID, and $$ refers to the PID of the bash instance running the script.
This is very much not a unique identifier in the sense that it will never be reused (indeed, PIDs are reused constantly). What it does give you is a number such that, if another person runs your script, they will get a different identifier whilst yours is still running. Once yours dies, the PID may be recycled and someone else might run your script, get the same PID, and so get the same filename.
As such, it is only really sane to say "$$ gives a filename such that if someone else runs the same script whilst my instance is still running, they will get a different name".
$$ is your PID. It doesn't really generate a unique filename, unless you are careful and no one else does it exactly the same way.
Typically you'd create something like /tmp/myprogramname$$
There're so many ways to break this, and if you're writing to locations other folks can write to it's not too difficult on many OSes to predict what PID you're going to have and screw around -- imagine you're running as root and I create /tmp/yourprogname13395 as a symlink pointing to /etc/passwd -- and you write into it.
This is a bad thing to be doing in a shell script. If you're going to use a temporary file for something, you ought to be using a better language which will at least let you add the "exclusive" flag for opening (creating) the file. Then you can be sure you're not clobbering something else.
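For what it's worth, shells do offer something close to an exclusive create: with the noclobber option (set -C), a > redirection refuses to write to a file that already exists. A hedged sketch; create_exclusive is a made-up helper name, and mktemp remains the simpler choice for real temp files:

```shell
# create_exclusive is a hypothetical helper: it succeeds only if it created the file.
create_exclusive() {
    # The subshell keeps noclobber from leaking into the caller.
    ( set -C; : > "$1" ) 2>/dev/null
}

dir=$(mktemp -d) || exit 1
target="$dir/scratch"

if create_exclusive "$target"; then first=created; else first=refused; fi
if create_exclusive "$target"; then second=created; else second=refused; fi
echo "first attempt: $first, second attempt: $second"
rm -rf -- "$dir"
```

The second attempt is refused because the file already exists, which is the failure you want when someone has planted a file at a predictable name.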
$$ is the pid (process id) of the shell interpreter running your script. It's different for each process running on a system at the moment, but over time the pid wraps around, and after you exit there will be another process with same pid eventually. As long as you're running, the pid is unique to you.
From the definition above it should be obvious that no matter how many times you use $$ in a script, it will return the same number.
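A short sketch that makes this concrete (the variable names are only for illustration). Note that $$ keeps its value even inside a subshell; only a separately started shell reports a different PID:

```shell
outer=$$
inner=$(echo $$)            # a subshell still reports the parent shell's PID
child=$(sh -c 'echo $$')    # a separately started shell reports its own PID

echo "outer=$outer inner=$inner child=$child"
```

So two temp files named with $$ in the same script, or in any of its subshells, would collide with each other.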
You can use, e.g. /tmp/myscript.scratch.$$ as your temp file for things that need not be extremely reliable or secure. It's good practice to delete such temp files at the end of your script, using, for example, the trap command:
trap 'echo "Cleanup in progress"; rm -r -- "$TMP_DIR"' EXIT
$$ is the pid of the current shell process. It isn't a good way to generate unique filenames.
It's the process ID of the bash process. No concurrent processes will ever have the same PID.
The $$ is the process id of the shell in which your script is running. For more details, see the man page for sh or bash. The man pages can be found by either using the command line "man sh", or by searching the web for "shell manpage".
Let me second emk's answer -- don't use $$ by itself as a "unique" anything. For files, use mktemp. For other IDs within the same bash script, use "$$$(date +%s%N)" for a reasonably good chance of uniqueness.
In Fish shell (3.1.2):
The $ symbol can also be used multiple times, as a kind of "dereference" operator (the * in C or C++)
set bar bazz
set foo bar
echo $foo # bar
echo $$foo # same as echo $bar → bazz
Also, you can get the login user's ID with this command, e.g.
echo $(</proc/$$/loginuid). After that, you need to use the getent command to resolve it to a username.
