bash race condition using generated output file name

I am trying to generate output using a randomized file name. "Generating" is simulated with "cat" in this example:
cat report.csv > "test_$(openssl rand -base64 102).csv"
I often get an error like this:
-bash: test_Q6eheRaVfktCTCfWSU/tjRNA1y+6juwlyuo1lEId/7HZTCQIE7/rt+/9MlTI+pjT
9It3l7FtBldMmaqHNWpspwCI5kCpR+s51RA2o9xAZ6BrZ+7UBR5atK9qWdSO/N/X
BAnvDkGm.csv: No such file or directory
The probability of this error increases with the number of random characters, which suggests a race condition. Solving the problem by using a variable for the random characters is obvious and is not what I am asking. Rather, my question is: what are the individual steps that bash performs, and where is the race condition?
I would have thought that bash executes the command as follows:
Create a pipe to capture the output of openssl rand
fork/exec openssl rand, passing that file handle as stdout, and wait for the process to finish (and check error status)
read from the pipe to get the value used in string interpolation, then close the pipe
perform string interpolation to build the output file name
open the output file
fork/exec cat, passing the handle for the output file as stdout
wait for the process to finish (and check error status), then close the output file
Nothing here suggests a race condition. I could imagine that bash instead runs cat in parallel and opens another pipe to buffer its output before it goes into the output file, but that wouldn't cause a "No such file or directory" either.
As was commented, slashes in the filename are an obvious problem, but the error occurs even without slashes. Setting the number of random bytes to 8 sometimes produces errors like this, without a slash and with the correct number of characters (so no slash was hidden):
-bash: test_9od1IhDt5A4=.csv: No such file or directory
The following command waits 2 seconds, then runs the command. In exactly those cases where the strange error message appears, it waits 4 seconds instead. Is there some kind of retry logic in bash that does this?
cat report.csv > "test_$(sleep 2; openssl rand -base64 9).csv"
Confirmed the double execution by echoing to stderr instead of sleeping:
cat report.csv > "test_$(echo foo 1>&2; openssl rand -base64 9).csv"

Several things are happening here.
The key part is that an error in the command substitution causes it to be evaluated twice. This seems to be a bug in the bash version used by Apple. The changelog at https://github.com/bminor/bash/blob/master/CHANGES says for version bash-4.3-alpha: "Fixed a bug that could result in double evaluation of command substitutions when they appear in failed redirections." I ran my tests on "GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin18)", the version pre-installed on macOS.
Steps to reproduce this bug in a simple way:
lap47:~/Documents> echo foo > "test_$(echo bar 1>&2; echo 'foo')"
bar
lap47:~/Documents> echo foo > "test_$(echo bar 1>&2; echo 'f/oo')"
bar
bar
-bash: test_f/oo: No such file or directory
lap47:~/Documents>
The second important part is that evaluating the command substitution twice invokes openssl rand twice, producing different random numbers. The second invocation seems to be used to generate the error message after the redirection failed. This way, the error message does not reflect the condition that caused the error.
Finally, the root cause of the failed redirection is indeed a slash in the file name. This failure causes the substitution to be invoked again, producing a different file name, which quite likely doesn't contain a slash.
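For completeness, the workaround the question set aside is worth writing down, sketched here with hex output so the name can never contain a slash:
# Evaluate the substitution once, in an assignment rather than a redirection;
# 'openssl rand -hex' emits only [0-9a-f], so no '/' can appear in the name.
fname="test_$(openssl rand -hex 8).csv"
cat report.csv > "$fname"
Moving the substitution out of the redirection also sidesteps the double-evaluation bug entirely.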

Output time to a file with the Unix "time" command, but leave the output of the command to the console

I time a command that produces some output. I want to write the real time from the time command to a file, but leave the output of the command itself on the console.
For example, if I do time my_command I get this printed in the console:
several lines of output from my_command ...
real 1m25.970s
user 0m0.427s
sys 0m0.518s
In this case, I want to store only 1m25.970s to a file, but still print the output of the command to the console.
The time command is tricky. The POSIX specification of time doesn't define the default output format, but it does define a format for the -p (presumably for 'POSIX') option. Note the (not easily understood) discussion of command sequences in pipelines.
The Bash specification says time prefixes a 'pipeline', which means that time cmd1 | cmd2 times both cmd1 and cmd2. It writes its results to standard error. The Korn shell is similar.
The POSIX format requires a single space between the tags such as real and the time; the default format often uses a tab instead of a space. Note that the /usr/bin/time command may have yet another output format. It does on macOS, for example, listing 3 times on a single line, by default, with the label after the time value; it supports -p to print in an approximation to the POSIX format (but it has multiple spaces between label and time).
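To make the format difference concrete, here is roughly what the two forms look like in bash (the timings are illustrative):
$ time sleep 1
real	0m1.004s
user	0m0.001s
sys	0m0.002s
$ time -p sleep 1
real 1.00
user 0.00
sys 0.00
Note the tab after the label in the default format and the single space in the -p format.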
You can easily get all the information written to standard error into a file:
(time my_command) 2> log.file
If my_command or any program it invokes reports any errors to standard error, those will go to the log file too. And you will get all three lines of the output from time written to the file.
If your shell is Bash, you may be able to use process substitution to filter some of the output.
I wouldn't try it with a single command line; the hieroglyphs needed to make it work are ghastly and best encapsulated in shell scripts.
For example, here is a shell script, time.filter, to capture the output from time and write only the real time to a log file (default log.file, configurable by providing an alternative log file name as the first argument):
#!/bin/sh
# time.filter: write the 'real' time to a log file, drop the 'user' and
# 'sys' lines, and pass everything else through unchanged.
output="${1:-log.file}"
[ $# -gt 0 ] && shift
sed -E '/^real[[:space:]]+(([0-9]+m)?[0-9]+[.][0-9]+s?)/{ s//\1/; w '"$output"'
d;}
/^(user|sys)[[:space:]]+(([0-9]+m)?[0-9]+[.][0-9]+s?)/d' "$@"
This assumes your sed uses -E to enable extended regular expressions.
The first sed expression finds the line containing the real label and the time after it (in a number of possible formats, but not all). It accepts an optional minutes value such as 60m05.003s, or just a seconds value such as 5.00s, or just 5.0 (POSIX format; at least one digit after the decimal point is required). It captures the time part and writes it to the chosen file (by default, log.file; you can specify an alternative name as the first argument on the command line). Note that even GNU sed treats everything after the w command as the file name; you have to continue with the d (delete) command and the closing brace } on a new line. GNU sed does not require the semicolon after d; BSD (macOS) sed does. The second expression recognizes and deletes the lines reporting the user and sys times. Everything else is passed through unaltered.
The script processes any files you give it after the log file name, or standard input if you give it none. A better command line notation would use an explicit option (-l logfile) and getopts to specify the log file.
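A minimal sketch of that variant (same sed script as above; the option handling is illustrative, not part of the original):
#!/bin/sh
# Hypothetical time.filter variant: take the log file via -l instead of
# as the first positional argument.
output="log.file"
while getopts 'l:' opt; do
    case "$opt" in
    l) output="$OPTARG" ;;
    *) echo "Usage: $0 [-l logfile] [file ...]" >&2; exit 1 ;;
    esac
done
shift $((OPTIND - 1))
sed -E '/^real[[:space:]]+(([0-9]+m)?[0-9]+[.][0-9]+s?)/{ s//\1/; w '"$output"'
d;}
/^(user|sys)[[:space:]]+(([0-9]+m)?[0-9]+[.][0-9]+s?)/d' "$@"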
With that in place, we can devise a program that reports to standard error and standard output — my_command:
echo "nonsense: error: positive numbers are required for argument 1" >&2
dribbler -s 0.4 -r 0.1 -i data -t
echo "apoplexy: unforeseen problems induced temporary amnesia" >&2
You could use cat data instead of the dribbler command. The dribbler command as shown reads lines from data and writes them to standard output, with a random Gaussian-distributed delay between lines. The mean delay is 0.4 seconds; the standard deviation is 0.1 seconds. The other two lines pretend to be commands that report errors to standard error.
My data file contained a nonsense 'poem' called 'The Great Panjandrum'.
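If you don't have dribbler, a rough stand-in is a read loop with a fixed delay (a sketch; it assumes your sleep accepts fractional seconds, as it does on macOS and Linux):
# Echo each line of 'data' with a 0.4-second pause between lines.
while IFS= read -r line; do
    printf '%s\n' "$line"
    sleep 0.4
done < data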
With this background in place, we can run the command, capture the real time in log.file, delete (ignore) the user and system time values, and send the rest of standard error on to standard error by using:
$ (time my_command) 2> >(tee raw.stderr | time.filter >&2)
nonsense: error: positive numbers are required for argument 1
So she went into the garden
to cut a cabbage-leaf
to make an apple-pie
and at the same time
a great she-bear coming down the street
pops its head into the shop
What no soap
So he died
and she very imprudently married the Barber
and there were present
the Picninnies
and the Joblillies
and the Garyulies
and the great Panjandrum himself
with the little round button at top
and they all fell to playing the game of catch-as-catch-can
till the gunpowder ran out at the heels of their boots
apoplexy: unforeseen problems induced temporary amnesia
$ cat log.file
0m7.278s
(The time taken is normally between 6 and 8 seconds. There are 17 lines, so you'd expect it to take around 6.8 seconds at 0.4 seconds per line.) The blank line is from time; it is pretty hard to remove that blank line, and only that blank line, especially as POSIX says it is optional. It isn't worth it.

greater than symbol at beginning of line

I've just seen the following in a script and I'm not sure what it means:
.............
started=$STATUSDIR/.$EVENT_ID-started
errs=$STATUSDIR/.$EVENT_ID-errors
# started is used to capture the time we started, so
# that it can be used as the latest-result marker for
# the next run...
>"$started"
>"$errs"
# store STDERR on FD 3, then point STDERR to $errs
exec 3>&2 2>"$errs"
..............
Specifically, the ">" at the beginning of the lines. The script actually fails with "No such file or directory". The vars are all supplied via subsidiary scripts and there doesn't seem to be any logic to create the directories it's complaining about.
It's not the easiest thing to Google for, so I thought I'd ask it here so that future bash hackers might find the answers you lovely people are able to provide.
This is a redirection. It's the same syntax used for echo hello >file (or its less-conventional but equally-correct equivalent >file echo hello), just without the echo hello. :)
When attached to an empty command, the effect of a redirection is identical to what it would be with a command that ran and immediately exited with no output: It creates (if nonexistent) or truncates (if existent) the output file, closes that file, and lets the script move on to the next command.
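You can watch the create-or-truncate behavior in an interactive shell (the file name is illustrative):
$ echo data > f     # f now holds 5 bytes
$ > f               # redirection with an empty command: f is truncated
$ wc -c < f
0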

Bash script - run process & send to background if good, or else

I need to start up a Golang web server and leave it running in the background from a bash script. If the script in question is syntactically correct (as it will be most of the time) this is simply a matter of issuing a
go run /path/to/index.go &
However, I have to allow for the possibility that index.go is somehow erroneous. I should explain that in Golang this can happen for something as "trivial" as importing a module that you then fail to use. In this case the go run /path/to/index.go bit will return an error message. In the terminal this would be something along the lines of
index.go:4:10: expected...
What I need to be able to do is to somehow change that command above so I can funnel any error messages into a file for examination at a later stage. I tried variants on go run /path/to/index.go >> errors.txt with the terminating & in different positions but to no avail.
I suspect that there is a bash way to do this by altering the priority of evaluation of the command via some judiciously used braces/brackets etc. However, that is way beyond my bash capabilities. I would be most obliged to anyone who might be able to help.
Update
A few minutes later... After a few more experiments I have found that this works
go run /path/to/index.go &> errors.txt &
Quite apart from the fact that I don't in fact understand why it works, there remains the issue that it produces a 0-byte errors.txt file when the command runs to completion without Golang throwing up any error messages. Can someone shed light on what is going on and how it might be improved?
Taken from man bash.
Redirecting Standard Output and Standard Error
This construct allows both the standard output (file descriptor 1) and the standard error output (file descriptor 2) to be redirected to the file whose name is the expansion of word.
There are two formats for redirecting standard output and standard error:
&>word
and
>&word
Of the two forms, the first is preferred. This is semantically equivalent to
>word 2>&1
Appending Standard Output and Standard Error
This construct allows both the standard output (file descriptor 1) and the standard error output (file descriptor 2) to be appended to the file whose name is the expansion of word.
The format for appending standard output and standard error is:
&>>word
This is semantically equivalent to
>>word 2>&1
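A quick way to see both forms in action, using a command group that writes one line to each stream (the file name is illustrative):
$ { echo out; echo err >&2; } &> both.txt     # overwrite: both.txt has 2 lines
$ { echo out; echo err >&2; } &>> both.txt    # append: both.txt now has 4 lines
$ wc -l < both.txt
4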
Narūnas K's answer covers why the &> redirection works.
The reason the file is created anyway is that the shell creates the file before it even runs the command in question.
You can see this by trying no-such-command > file.out and seeing that even though the shell errors because no-such-command doesn't exist the file gets created (using &> on that test will get the shell's error in the file).
This is why you can't do things like sed 'pattern' file > file to edit a file in place.
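The usual workaround for in-place editing is to write to a temporary file and rename it over the original, sketched here with illustrative names:
sed 's/old/new/' file > file.tmp && mv file.tmp file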

First line in file is not always printed in bash script

I have a bash script that prints a line of text into a file, and then calls a second script that prints some more data into the same file. Let's call them script1.sh and script2.sh. The reason it's split into two scripts is that I have different versions of script2.sh.
script1.sh:
rm -f output.txt
echo "some text here" > output.txt
source script2.sh
script2.sh:
./read_time >> output.txt
./run_program
./read_time >> output.txt
Variations on the three lines in script2.sh are repeated.
This seems to work most of the time, but every once in a while the file output.txt does not contain the line "some text here". At first I thought it was because I was calling script2.sh like this: ./script2.sh. But even using source the problem still occurs.
The problem is not reproducible, so even when I try to change something I don't know if it's actually fixed.
What could be causing this?
Edit:
The scripts are very simple. script1 is exactly as you see here, but with different file names. script 2 is what I posted, but then the same 3 lines repeated, and ./run_program can have different arguments. I did a grep for the output file, and for > but it doesn't show up anywhere unexpected.
The way these scripts are used is that script1 is created by a program (the only difference between the versions is the source script2.sh line). This script1.sh is then run on a different computer (Linux on an FPGA, actually) using ssh. Before that is done, the output file is also deleted using ssh. I don't know why; I didn't write all of this. Also, I've checked the code running on the host. The only mention of the output file is when it is deleted using ssh and when it is copied back to the host after script1 is done.
Edit 2:
I finally managed to make the problem reproducible at a reasonable rate by stripping script2.sh of everything but a single line printing into the file. This also let me do the testing a bit faster. Once I had this I got the problem between 1 and 4 times for every 10 runs. Removing the command that was deleting the file over ssh before the script was run seems to have solved the problem. I will test it some more to be sure, but I think it's solved. Although I'm still not sure why it would be a problem. I thought that the ssh command would not exit before all the remove commands were executed.
It is hard to tell without seeing the real code. The most likely explanation is that you have a typo, > instead of >>, somewhere in one of the script2.sh files.
To verify this, set the noclobber option with set -o noclobber. The shell will then refuse to overwrite an existing file with > and report an error, which makes such a typo easy to spot.
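For example, in an interactive bash session (file name illustrative):
$ echo first > existing.txt      # create the file
$ set -o noclobber
$ echo second > existing.txt
-bash: existing.txt: cannot overwrite existing file
$ echo second >| existing.txt    # >| deliberately overrides noclobber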
Another possibility is that the file is removed under certain rare conditions, or damaged by some command that has random access to it: look for commands using this file without >>. Or it is used by some command as both input and output, so the two step on each other: look for the file used with <.
Lastly, you could have a race condition with a command writing to the file in the background, started before that echo.
Can you grep all your scripts for 'output.txt'? What about scripts called inside read_time and run_program?
It looks like something in one of the script2.sh scripts must be either overwriting, truncating or doing a substitution on output.txt.
For example, there could be a > output.txt buried inside a conditional that is rarely true. Just a guess, but it would explain why you don't always see it.
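A concrete way to run that hunt, sketched with an illustrative glob (widen it to cover whatever read_time and run_program call as well):
grep -n 'output\.txt' *.sh | grep -v '>>'
Anything that touches the file other than with >> deserves a close look.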
This is an interesting problem. Please post the solution when you find it!

Annoying cat/sed behavior when running through MATLAB on Mac

I am running a script to parse text email files that can be called by MATLAB or run from the command line. The script looks like this:
#!/bin/bash
MYSED=/opt/local/bin/gsed
"$MYSED" -n "/X-FileName/,/*/p" | "$MYSED" "/X-FileName/d" | "$MYSED" "/\-Original Message\-/q"
If I run cat message_file | ./parser.sh in my Terminal window, I get a parsed text file. If I do the same using the system command in MATLAB, I occasionally get the same parsed text followed by the error message
cat: stdout: Broken pipe
When I was using a sed command instead of a cat command, I was getting the same error message. This happens maybe on 1 percent of the files I am parsing, almost always large files where a lot gets deleted after the Original Message line. I do not get the error when I do not include the last pipe, the one deleting everything after 'Original Message'.
I would like to suppress the error message from cat if possible. Ideally, I would like to understand why running the script through MATLAB gives me an error while running it in Terminal does not? Since it tends to happen on larger files, I am guessing it has to do with a memory limitation, but 'broken pipe' is such a vague error message that I can't be sure. Any hints on either issue would be much appreciated.
I could probably run the script outside of MATLAB and save the processed files, but as some of the files are large I would much rather not duplicate them at this point.
The problem is occurring because of the final gsed command, "$MYSED" "/\-Original Message\-/q". This (obviously) quits as soon as it sees a match, and if the gsed feeding it tries to write anything after that, it'll receive SIGPIPE and quit; if there's enough data, the same will happen to the first gsed, and if there's enough data after that, SIGPIPE will be sent to the original cat command, which reports the error. Whether the error makes it back to cat will depend on timing, buffering, the amount of data, the phase of the moon, etc.
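The silent half of this mechanism is easy to see in isolation:
$ yes | head -n 1    # head exits after one line; yes is killed by SIGPIPE without a message
y
Whether you instead see a message like cat: stdout: Broken pipe depends on the writer's SIGPIPE disposition: a process that ignores the signal gets an EPIPE write error and reports it rather than dying silently. One plausible reason the message shows up under MATLAB but not in Terminal is that MATLAB runs the command with SIGPIPE ignored, and children inherit that disposition.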
My first suggestion would be to put the "$MYSED" "/\-Original Message\-/q" command at the beginning of the pipeline, and have it do the reading from the file (rather than feeding it from cat). This'd mean changing the script to accept the file to read from as an argument:
#!/bin/bash
MYSED=/opt/local/bin/gsed
"$MYSED" "/\-Original Message\-/q" "$#" | "$MYSED" -n "/X-FileName/,/*/p" | "$MYSED" "/X-FileName/d"
...and then run it with ./parser.sh message_file. If my assumptions about the message file format are right, changing the order of the gsed commands this way shouldn't cause trouble. Is there any reason the message file needs to be piped to stdin rather than passed as an argument and read directly?
