Bash Script:
#!/bin/bash
...
exec {LOCK} > foo.out
flock -x ${LOCK}
...
I understand that exec without an argument simply redirects all the output of the current shell to the foo.out file. Questions:
What does the first argument to exec, {LOCK}, mean? It seems to have special significance because it is in curly braces (but not ${...}).
What is the value of ${LOCK} and where did it come from (I don't think that I defined this variable)?
This is not valid or useful bash. It will just result in two different error messages.
Instead, the intended code was this:
#!/bin/bash
...
exec {LOCK}> foo.out
flock -x ${LOCK}
...
It uses:
{name}> to open for writing and assign fd number to name
exec to apply the redirection to the current shell, keeping the fd open for the duration of the shell
flock to lock the assigned fd, which it will inherit from the current shell
So effectively, it creates a mutex based on the file foo.out, ensuring that only one instance is allowed to run things after the flock at a time. Any other instances will wait until the previous one is done.
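Put together, a minimal sketch of that pattern (hedged: foo.out is only used as the lock target, and the sleep just stands in for real work) might look like this:
#!/bin/bash
exec {LOCK}>foo.out          # bash picks a free fd and stores its number in $LOCK
flock -x "$LOCK"             # block until we hold an exclusive lock on that fd
echo "only one instance gets here at a time"
sleep 5                      # stand-in for the real work
flock -u "$LOCK"             # optional: the lock is also released when the shell exits
Running two copies of this at the same time shows the second one blocking on the flock line until the first finishes.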
Here is what I finally figured out:
exec {LOCK}> foo.out
opens the file foo.out for writing on a new file descriptor in the current shell (it does not redirect stdout). The number of the open fd is stored in the variable ${LOCK}; having bash choose the fd and assign it to a {name} variable is a bash feature.
flock -x ${LOCK}
is simply locking using the file descriptor.
Related
I'm not familiar with the exec command. A bash tutorial about how to lock files throws this:
exec 200>lockfile
flock 200
...
flock -u 200
I got that it is creating a file named lockfile and assigning it an FD of 200. Then the second command locks that file. When done working with the file, the last command unlocks it.
Doing so, any other concurrent instance of the same script will stay at that second line until the first instance unlocks the file. Cool.
Now, what I don't understand is what is exec doing.
Directly from the bash command line, both options seem to work:
exec 200>lockfile
200>lockfile
But when the second option is used in the script, a "Bad file descriptor" error is raised.
Why is exec needed to avoid the error?
--- edit ---
After some more "serious research", I've found an answer here. The exec command makes the FD stay for the entire script or current shell.
So doing:
200>lockfile flock 200
would work, but a later flock -u 200 would raise a "Bad FD" error.
The manual seems to mention replacing the shell with a given command. What does that have to do with file descriptors?
This is explained in the second sentence:
exec: exec [-cl] [-a name] file [redirection ...]
Exec FILE, replacing this shell with the specified program.
If FILE is not specified, the redirections take effect in this
shell. [...]
Essentially, doing exec 42> foo.txt from inside myscript.sh opens foo.txt for writing on FD 42 in the current process.
This is similar to running ./myscript.sh 42> foo.txt from a shell in the first place, or using open and dup2 in a C program.
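To make the difference concrete, here is a hedged sketch (lockfile is an arbitrary path) contrasting a per-command redirection with an exec redirection:
# Without exec: fd 200 exists only for this one command,
# so a later flock -u 200 fails with "Bad file descriptor".
200>lockfile flock 200

# With exec: fd 200 stays open in the current shell/script,
# so later commands can lock and unlock it.
exec 200>lockfile
flock 200
# ... critical section ...
flock -u 200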
I have a shell script which writes all output to a logfile
and the terminal; this part works fine. But if I execute the script,
a new shell prompt only appears if I press Enter. Why is that, and how do I fix it?
#!/bin/bash
exec > >(tee logfile)
echo "output"
First, when I'm testing this, there always is a new shell prompt; it's just that sometimes the string output comes after it, so the prompt isn't the last thing on screen. Did you happen to overlook it? Either way, there seems to be a race where the shell prints the prompt before the tee running in the background finishes writing its output.
Unfortunately, that cannot be fixed by waiting in the shell for tee; see this question on unix.stackexchange. Fragile workarounds aside, the easiest way I see to solve this is to put your whole script inside a list:
{
your-code-here
} | tee logfile
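Applied to the script from the question, that would look like:
#!/bin/bash
{
    echo "output"
    # ... rest of the script ...
} | tee logfile
Because tee is now a regular foreground part of the pipeline, the shell waits for it to finish before printing the next prompt.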
If I run the following script (suppressing the newline from the echo), I see the prompt, but not "output". The string is still written to the file.
#!/bin/bash
exec > >(tee logfile)
echo -n "output"
What I suspect is this: you have three different file descriptors trying to write to the same file (that is, the terminal): standard output of the shell, standard error of the shell, and the standard output of tee. The shell writes synchronously: first the echo to standard output, then the prompt to standard error, so the terminal is able to sequence them correctly. However, the third file descriptor is written to asynchronously by tee, so there is a race condition. I don't quite understand how my modification affects the race, but it appears to upset some balance, allowing the prompt to be written at a different time and appear on the screen. (I expect output buffering to play a part in this).
You might also try running your script after running the script command, which will log everything written to the terminal; if you wade through all the control characters in the file, you may notice the prompt in the file just prior to the output written by tee. In support of my race condition theory, I'll note that after running the script a few times, it was no longer displaying "abnormal" behavior; my shell prompt was displayed as expected after the string "output", so there is definitely some non-deterministic element to this situation.
@chepner's answer provides great background information.
Here's a workaround - works on Ubuntu 12.04 (Linux 3.2.0) and on OS X 10.9.1:
#!/bin/bash
exec > >(tee logfile)
echo "output"
# WORKAROUND - place LAST in your script.
# Execute an executable (as opposed to a builtin) that outputs *something*
# to make the prompt reappear normally.
# In this case we use the printf *executable* to output an *empty string*.
# Use of `$ec` is to ensure that the script's actual exit code is passed through.
ec=$?; $(which printf) ''; exit $ec
Alternatives:
@user2719058's answer shows a simple alternative: wrapping the entire script body in a group command ({ ... }) and piping it to tee logfile.
An external solution, as @chepner has already hinted at, is to use the script utility to create a "transcript" of your script's output in addition to displaying it:
script -qc yourScript /dev/null > logfile # Linux syntax
This, however, will also capture stderr output; if you wanted to avoid that, use:
script -qc 'yourScript 2>/dev/null' /dev/null > logfile
Note, however, that this will suppress stderr output altogether.
As others have noted, it's not that there's no prompt printed -- it's that the last of the output written by tee can come after the prompt, making the prompt no longer visible.
If you have bash 4.4 or newer, you can wait for your tee process to exit, like so:
#!/usr/bin/env bash
case $BASH_VERSION in ''|[0-3].*|4.[0-3]) echo "ERROR: Bash 4.4+ needed" >&2; exit 1;; esac
exec {orig_stdout}>&1 {orig_stderr}>&2 # back up the original stdout and stderr
exec > >(tee -a "_install_log"); tee_pid=$! # track PID of tee after starting it
cleanup() { # define a function we'll call during shutdown
retval=$?
exec >&$orig_stdout # Copy your original stdout back to FD 1, overwriting the pipe to tee
exec 2>&$orig_stderr # If something overwrites stderr to also go through tee, fix that too
wait "$tee_pid" # Now, wait until tee exits
exit "$retval" # and complete exit with our original exit status
}
trap cleanup EXIT # configure the function above to be called during cleanup
echo "Writing something to stdout here"
I have this command sequence that I'm having trouble understanding:
[me#mine ~]$ (echo 'test'; cat) | bash
echo $?
1
echo 'this is the new shell'
this is the new shell
exit
[me#mine ~]$
As far as I can understand, here is what happens:
A pipe is created.
stdout of echo 'test' is sent to the pipe.
bash receives 'test' on stdin.
echo $? returns 1, which is what happens when you run test without args.
cat runs.
It is copying stdin to stdout.
stdout is sent to the pipe.
bash will execute whatever you type in, but stderr won't get printed to the screen (we used |, not |&).
I have three questions:
It looks like, even though we run two commands, we use the same pipe and bash process for both commands. Is that the case?
Where do the prompts go?
When something like cat uses stdin, does it take exclusive ownership of stdin as long as the shell runs, or can other things use it?
I suspect I'm missing some detail with ttys, but I'm not sure. Any help or details or man excerpt appreciated!
So...
Yes, there's a single pipe sending commands to a single instance of bash. Note:
$ echo 'date "+%T hello $$"; sleep 1; date "+%T world $$"' | bash
22:18:52 hello 72628
22:18:53 world 72628
There are no prompts. From the man page:
An interactive shell is one started without non-option arguments (unless -s is specified) and without the -c option whose standard input and error are both connected to terminals. PS1 is set and $- includes i if bash is interactive.
So a pipe is not an interactive shell, and therefore has no prompt.
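You can see the difference by forcing interactive mode with -i; this is only a rough illustration, and bash may also warn about job control since stdin is a pipe:
$ echo 'date' | bash -i    # with -i, bash prints its prompts (on stderr) even though stdin is a pipe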
Stdin and stdout can only connect to one thing at a time. cat will take stdin from the process that ran it (for example, your interactive shell) and send its stdout through the pipe to bash. If you need multiple things to be able to submit to the stdin of that cat, consider using a named pipe.
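For example, a rough named-pipe sketch (the path /tmp/cmdpipe is just an illustrative name):
mkfifo /tmp/cmdpipe            # create the named pipe
bash < /tmp/cmdpipe &          # bash reads its commands from the pipe
exec 3> /tmp/cmdpipe           # hold a writer open so bash doesn't see EOF yet
echo 'date'   >&3              # now any process that opens the pipe can submit commands
echo 'uptime' >&3
exec 3>&-                      # closing the last writer sends EOF and bash exits
rm /tmp/cmdpipe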
Does that cover it?
I'm looking at https://stackoverflow.com/a/10225050/1737158
In the same question there is an answer that uses the timeout command, but it's not available on all OSes, so I want to avoid it.
What I try to do is:
demo="$(top)" &
TASK_PID=$!
sleep 3
echo "TASK_PID: $TASK_PID"
echo "demo: $demo"
I expect $demo to stay empty, since the top command never ends.
Right now I do get an empty result, which is "acceptable", but when I reuse the same pattern with a command that should return a value, I still get an empty result, which is not OK. E.g.:
demo="$(uptime)" &
TASK_PID=$!
sleep 3
echo "TASK_PID: $TASK_PID"
echo "demo: $demo"
This should return the uptime result, but it doesn't. I also tried to kill the process by TASK_PID, but that didn't work either. If a command fails, I expect to have stderr captured somehow. It can be in a different variable, but it has to be captured and not leaked out.
What happens when you execute var=$(cmd) &
Let's start by noting that the simple command in bash has the form:
[variable assignments] [command] [redirections]
for example
$ demo=$(echo 313) declare -p demo
declare -x demo="313"
According to the manual:
[..] the text after the = in each variable assignment undergoes tilde expansion, parameter expansion, command substitution, arithmetic expansion, and quote removal before being assigned to the variable.
Also, after the [command] above is expanded, the first word is taken to be the name of the command, but:
If no command name results, the variable assignments affect the current shell environment. Otherwise, the variables are added to the environment of the executed command and do not affect the current shell environment.
So, as expected, when demo=$(cmd) is run, the result of $(..) command substitution is assigned to the demo variable in the current shell.
Another point to note is related to the background operator &. It operates on so-called lists, which are sequences of one or more pipelines. Also:
If a command is terminated by the control operator &, the shell executes the command asynchronously in a subshell. This is known as executing the command in the background.
Finally, when you say:
$ demo=$(top) &
# ^^^^^^^^^^^ simple command, consisting ONLY of variable assignment
that simple command is executed in a subshell (call it s1), inside which $(top) is executed in another subshell (call it s2), and the result of this command substitution is assigned to the variable demo inside the shell s1. Since no command name is given, s1 terminates after the variable assignment, and the parent shell never receives the variables set in the child (s1).
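A quick way to see this (uptime here is just a stand-in for any command that actually finishes):
$ demo=$(uptime) &     # the assignment happens in a background subshell
$ wait                 # wait for that subshell to finish
$ echo "demo='$demo'"  # the parent shell never saw the assignment
demo=''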
Communicating with a background process
If you're looking for a reliable way to communicate with the process run asynchronously, you might consider coprocesses in bash, or named pipes (FIFO) in other POSIX environments.
Coprocess setup is simpler, since coproc will set up the pipes for you, but note that you might not be able to read from them reliably if the process is terminated before it writes any output.
#!/bin/bash
coproc top -b -n3
cat <&${COPROC[0]}
FIFO setup would look something like this:
#!/bin/bash
# fifo setup/clean-up
tmp=$(mktemp -td)
mkfifo "$tmp/out"
trap 'rm -rf "$tmp"' EXIT
# bg job, terminates after 3s
top -b >"$tmp/out" -n3 &
# read the output
cat "$tmp/out"
but note that if a FIFO is opened in blocking mode, the writer won't be able to write to it until someone opens it for reading (and starts reading).
Killing after timeout
How you'll kill the background process depends on what setup you've used, but for a simple coproc case above:
#!/bin/bash
coproc top -b
sleep 3
kill -INT "$COPROC_PID"
cat <&${COPROC[0]}
I'd like to source a script (it sets up variables in my shell):
source foo.sh args
But under flock so that only one instance operates at a time (it does a lot of disk access which I'd like to ensure is serialized).
$ source flock lockfile foo.sh args
-bash: source: /usr/bin/flock: cannot execute binary file
and
$ flock lockfile source foo.sh args
flock: source: Success
don't work.
Is there some simple syntax for this I'm missing? Let's assume I can't edit foo.sh to put the locking commands inside it.
You can't source a script directly via flock, because flock is an external command and source is a shell builtin. You actually have two problems because of this:
flock doesn't know any command called source because it's built into bash
Even if flock could run it, the changes would not affect the state of the calling shell as you'd want with source because it's happening in a child process.
And passing flock to source won't work either, because source expects a script, not a binary. To do what you want, you need to lock by fd. Here's an example:
#!/bin/bash
exec 9>lockfile
flock 9
echo whoopie
sleep 5
flock -u 9
Run two instances of this script in the same directory at the same time and you will see one wait for the other.
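If the goal is still to serialize sourcing foo.sh without editing it, one possible sketch along the same lines (lockfile is an arbitrary lock target) is to take the lock in the current shell around the source, rather than in a separate script:
exec 9>lockfile        # open the lock file on fd 9 in the current shell
flock 9                # block until we hold the exclusive lock
source foo.sh args     # runs in this shell, so the variables it sets stick
flock -u 9             # release the lock; fd 9 stays open until the shell exits (or exec 9>&-)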