Running commands in subdirectories with bash

I have a sequence of directories that I need to run various shell commands on and I've made a short script called dodirs.sh to simplify running the command in each directory:
#!/bin/bash
echo "Running in each directory: $@"
for d in ./*/; do
(
  cd "$d"
  pwd
  eval "$@"
)
done
This is fine for many simple commands, but some have trouble, such as:
grep "free energy TOTEN" OUTCAR | tail -1
which looks for a string in a file located in each directory.
It seems that the pipe and/or the quotes is the trouble since if I say:
dodirs.sh grep "free energy TOTEN" OUTCAR
I get a sensible (if waaaay too long) output along the lines of:
Running in each directory: grep free energy TOTEN OUTCAR
...
OUTCAR: free energy TOTEN = -888.53122906 eV
OUTCAR: free energy TOTEN = -888.53132396 eV
OUTCAR: free energy TOTEN = -888.531324 eV
...
I notice the result of the echo loses the quotes, so that is a bit odd. On the other hand, if I say:
dodirs.sh grep "free energy TOTEN" OUTCAR | tail -1
then I get the nonsensical:
...
grep: energy: No such file or directory
grep: TOTEN: No such file or directory
...
Notice the echo doesn't echo at all now and it is clearly misinterpreting the line.
Is there some way I have to escape characters, or package the parameters inside my dodirs.sh script?
And maybe someone knows of a better approach altogether?

Consider:
#!/bin/bash
# use printf %q to generate a command line identical to what we're actually doing
printf "Running in each directory: " >&2
printf '%q ' "$@" >&2
echo >&2
# use && -- we don't want to execute the command if cd into a given directory failed!
for d in ./*/; do
  (cd "$d" && echo "$PWD" >&2 && "$@")
done
This is much more predictable: it passes exact argument lists through, so for a general command you can just quote it naturally. (This is the same behavior you get with find -exec or other tools that invoke the execv* family of calls with a literal, passed-through argument list; it means you get behavior identical to sudo, chpst, chroot, setsid, etc.)
For a single command, invocation looks like what you'd expect:
dodirs grep "free energy TOTEN" OUTCAR
To execute shell directives, such as pipelines, explicitly execute a shell:
dodirs sh -c 'grep "free energy TOTEN" OUTCAR | tail -n 1'
# ^^ ^^
...or, if you're willing to let callers rely on implementation details (such as the fact that this is implemented with a shell, and exactly which shell it's implemented with), use eval:
dodirs eval 'grep "free energy TOTEN" OUTCAR | tail -n 1'
# ^^^^
This may be slightly more work, but it puts you in line with standard UNIX conventions, and avoids risking shell injection vulnerabilities if callers fail to quote their arguments to be eval-safe.
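The injection hazard is easy to demonstrate with an argument that contains a space; in this sketch, printf is just a stand-in for whatever command a caller might pass:

```shell
#!/usr/bin/env bash
set -- printf '[%s]' 'two words'   # simulate the script's argument list

"$@"        # exact argv: printf receives one data argument
echo
eval "$@"   # re-parsed by the shell: the quoting is lost
echo
```

The first call prints [two words]; the second prints [two][words], because eval re-splits arguments that were already split once.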

The quotes disappear because they aren't necessary once the shell identifies the words to pass to your script as arguments. Inside your script, $1 is grep, $2 is free energy TOTEN, etc.
You do need to escape the pipe (with a backslash \| or by quoting '|'), though, so that it also is passed as an argument to eval.
dodirs.sh grep "free energy TOTEN" OUTCAR \| tail -1
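One way to see exactly what arrives in the script is a throwaway helper that prints its argument count and each argument shell-quoted (show_args is a made-up name for illustration):

```shell
#!/usr/bin/env bash
# show_args: report how many arguments were received, and what each one was
show_args() {
  printf 'argc: %d\n' "$#"
  local a
  for a in "$@"; do
    printf 'arg: %q\n' "$a"
  done
}

# by the time the function (or a script) sees them, the quotes are gone:
show_args grep "free energy TOTEN" OUTCAR
```

This prints argc: 3 followed by the three arguments; the quoted string survives as a single argument even though the quote characters themselves are gone.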

Related

Bash get the command that is piping into a script

Take the following example:
ls -l | grep -i readme | ./myscript.sh
What I am trying to do is get ls -l | grep -i readme as a string variable in myscript.sh. So essentially I am trying to get the whole command before the last pipe to use inside myscript.sh.
Is this possible?
No, it's not possible.
At the OS level, pipelines are implemented with the mkfifo(), dup2(), fork() and execve() syscalls. This doesn't provide a way to tell a program what the commands connected to its stdin are. Indeed, there's not guaranteed to be a string representing a pipeline of programs being used to generate stdin at all, even if your stdin really is a FIFO connected to another program's stdout; it could be that that pipeline was generated by programs calling execve() and friends directly.
The best available workaround is to invert your process flow.
It's not what you asked for, but it's what you can get.
#!/usr/bin/env bash
printf -v cmd_str '%q ' "$@"  # generate a shell command representing our arguments
while IFS= read -r line; do
  printf 'Output from %s: %s\n' "$cmd_str" "$line"
done < <("$@")  # actually run those arguments as a command, and read from it
...and then have your script start the things it reads input from, rather than receiving them on stdin.
...thereafter, ./yourscript ls -l, or ./yourscript sh -c 'ls -l | grep -i readme'. (Of course, never use this except as an example; see ParsingLs).
It can't be done generally, but using the history command in bash it can maybe sort of be done, provided certain conditions are met:
history has to be turned on.
Only one shell has been running and accepting new commands (or, failing that, running myscript.sh) since the start of myscript.sh.
Since command lines with leading spaces are, by default, not saved to the history, the invoking command for myscript.sh must have no leading spaces; or that default must be changed -- see Get bash history to remember only the commands run with space prefixed.
The invoking command needs to end with a &, because without it the new command line wouldn't be added to the history until after myscript.sh was completed.
The script needs to be a bash script (it won't work with /bin/dash), and the calling shell needs a little prep work. Sometime before the script is first run, do:
shopt -s histappend
PROMPT_COMMAND="history -a; history -n"
...this makes the bash history heritable. (Code swiped from unutbu's answer to a related question.)
Then myscript.sh might go:
#!/bin/bash
history -w
printf 'calling command was: %s\n' \
  "$(grep "$0" ~/.bash_history | tail -1)"
Test run:
echo googa | ./myscript.sh &
Output, (minus the "&" associated cruft):
calling command was: echo googa | ./myscript.sh &
The cruft can be halved by changing "&" to "& fg", but the resulting output won't include the "fg" suffix.
I think you should pass it as one string parameter like this
./myscript.sh "$(ls -l | grep -i readme)"
I think that it is possible, have a look at this example:
#!/bin/bash
result=""
while IFS= read -r line; do
  result="$result$line"
done
echo "$result"
Now run this script using a pipe, for example:
ls -l /etc | ./script.sh
I hope that will be helpful for you :)

How to recall a string in shell script

I made a script like this:
#! /usr/bin/bash
a=`ls ../wrfprd/wrfout_d0${i}* | cut -c22-25`
b=`ls ../wrfprd/wrfout_d0${i}* | cut -c27-28`
c=`ls ../wrfprd/wrfout_d0${i}* | cut -c30-31`
d=`ls ../wrfprd/wrfout_d0${i}* | cut -c33-34`
f=$a$b$c$d
echo $f
sed "s/.* startdate=.*/export startdate=${f}/g" ./post_process > post_process2
The echo command works and gives 2008042118, which is what I want, but in the file post_process2 the line comes out as export startdate= and cannot recall the variable f. I want it to produce a line like export startdate=2008042118.
First -- don't use ls here -- it's both expensive in terms of performance (compared to globbing, which is performed internal to the shell without starting any external programs), and doesn't guarantee useful output for the full range of possible filenames, making its use in this context inherently bug-prone. A better way to retrieve pieces from a filename, assuming a ksh-derived shell such as bash or zsh, would look like this:
#!/bin/bash
# this is an array, but we're only going to use the first element
file=( "../wrfprd/wrfout_d0${i}"* )
[[ -e $file ]] || { echo "No file found" >&2; exit 1; }
# note: bash offsets are 0-based, while cut -c columns are 1-based
f=${file:21:4}${file:26:2}${file:29:2}${file:32:2}
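The ${var:offset:length} slices can be sanity-checked against a sample name of the expected shape (the filename below is made up for illustration; real offsets depend on the real path length):

```shell
#!/usr/bin/env bash
# slice year, month, day, hour out of a WRF-style output filename
name='../wrfprd/wrfout_d01_2008-04-21_18:00:00'
f=${name:21:4}${name:26:2}${name:29:2}${name:32:2}
printf '%s\n' "$f"
```

With this sample name the result is 2008042118, matching the value the question's cut pipeline produced.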
Second, don't use sed to modify code -- doing so requires that your runtime user have permission to modify its own code, and moreover invites injection vulnerabilities. Just write your content out to a data file:
printf '%s\n' "$f" >startdate.txt
...and, in your second script, to read in the value from that file:
# if the shebang is #!/bin/bash
startdate=$(<startdate.txt)
# if the shebang is #!/bin/sh
startdate=$(cat startdate.txt)
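A quick round-trip of the write-then-read approach, using a temporary directory (the startdate.txt name is from the answer above):

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)
f=2008042118

# writer side: save the value as data, not code
printf '%s\n' "$f" > "$tmp/startdate.txt"

# reader side: load it back (bash form; use $(cat ...) under plain sh)
startdate=$(<"$tmp/startdate.txt")
printf 'startdate=%s\n' "$startdate"

rm -rf "$tmp"
```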

Either getting original return value from xargs or simulate xargs

I am working with bash. I have a file F containing the command-line arguments for a Java program, and I need to store both outputs of the Java programs, i.e., output on standard output and the exit value. Storing the standard output works via
cat F | xargs java program > Output
But xargs does not give access to the exit-code of the Java program.
So well, I split it, running the program twice, once for standard output, once for the exit code --- but getting the exit code and running it correctly seems impossible. One might try
java program $(cat F)
but that doesn't work if F contains for example " ", namely one command-line argument for program which is a space. The problem is the expansion of the argument $(cat F).
Now I don't see a way to get around that problem? I don't want "$(cat F)", since I want that $(cat F) expands into many strings --- but I don't want further expansion of these strings.
If on the other hand there would be a better xargs, giving access to the original exit value, that would solve the problem, but I am not aware of that.
Does this do what you want?
cat F | xargs bash -c 'java program "$@"; echo "Returned: $?"' - > Output
Or, as @rici correctly points out, avoid the UUOC (useless use of cat):
xargs bash -c 'java program "$@"; echo "Returned: $?"' - < F > Output
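The trailing - deserves a note: bash -c assigns it to $0, so the arguments xargs appends land in $1, $2, ... and are covered by "$@". A sketch with printf standing in for java program:

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)
printf '%s\n' one two three > "$tmp/F"

# the "-" becomes $0 inside the child shell; xargs-supplied args become "$@"
xargs bash -c 'printf "[%s]" "$@"; echo' - < "$tmp/F"

rm -rf "$tmp"
```

This prints [one][two][three]: all three file-supplied tokens reached the inner "$@".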
Alternatively something like (though I haven't thought through all the ramifications of doing this so there may be a reason this is a bad idea).
{ sed -e 's/^/java program /' F | bash -s; echo "Returned $?"; } > Output
This lets you store the return code in a variable, which the xargs versions do not (at least not outside the xargs-spawned shell).
sed -e 's/^/java program /' F | bash -s > Output; ret=$?
To use a ${program} shell variable just expand it directly.
xargs bash -c 'java '"${program}"' "$@"; echo "Returned: $?"' - < F > Output
sed -e 's/^/java '"${program}"' /' F | bash -s > Output; ret=$?
Just beware of characters that are "magic" in the replacement of the s/// command.
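For instance, & in a sed replacement stands for the whole matched text, so a value containing one has to be escaped first. A sketch (the ${program//&/\\&} expansion is a bash-ism; the jar name is invented):

```shell
#!/usr/bin/env bash
program='java -jar a&b.jar'

# unescaped: sed expands & to the matched text
printf 'X\n' | sed -e 's/X/'"$program"'/'

# escaped: the & survives literally
printf 'X\n' | sed -e 's/X/'"${program//&/\\&}"'/'
```

The first line comes out as java -jar aXb.jar (the & was replaced by the match); the second as java -jar a&b.jar.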
I'm afraid the question is really not very clear, so I will make the assumptions here explicit:
The file F has one argument per line, with all whitespace other then newline characters being significant, and with no need to replace backslash escapes such as \t.
You only need to invoke the java program once, with all of the arguments.
You need the exit status to be preserved.
This can be done quite easily in bash by reading F into an array with mapfile:
# Invoked as: do_the_work program < F > output
do_the_work() {
  local -a args
  mapfile -t args
  java "$@" "${args[@]}"
}
The status return of that function is precisely the status return of the java executable, so you could capture it immediately after the call:
do_the_work my_program
rc=$?
For convenience, the function allows you to also specify arguments on the command line; it uses "$@" to pass all the command-line arguments before passing the arguments read from stdin.
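A quick check of the stdin-to-array behavior, with printf standing in for the java invocation (everything here is illustrative):

```shell
#!/usr/bin/env bash
do_the_work() {
  local -a args
  mapfile -t args                  # one array element per input line
  printf '[%s]' "$@" "${args[@]}"  # command-line args first, then stdin args
  echo
}

printf '%s\n' 'a b' c | do_the_work --flag
```

This prints [--flag][a b][c]: the space inside the first line survives as a single array element, and the command-line argument precedes the stdin-derived ones.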
If you have GNU Parallel (and do not mind extra output on STDERR):
cat F | parallel -Xj1 --halt 1 java program > Output
echo $?

overwrite a file then append

I have a loop in my script that appends a list of email addresses to a file "$CRN". If the script is executed again, it appends to the old list. I want it to overwrite the file with the new list rather than append to the old one. I can submit my whole script if needed. I know I could test whether "$CRN" exists and remove the file, but I'm interested in other suggestions. Thanks.
for arg in "$@"; do
  if ls /students | grep -q "$arg"; then
    echo "${arg}@mail.ccsf.edu" >> $CRN
    ((students++))
  elif ls /users | grep -q "$arg$"; then
    echo "${arg}@ccsf.edu" >> $CRN
    ((faculty++))
  fi
done
Better, do this:
CRN="/path/to/file"
: > "$CRN"  # truncate the output file before the loop starts
for arg; do
  if printf '%s\n' /students/* | grep -q "$arg"; then
    echo "${arg}@mail.ccsf.edu" >> "$CRN"
    ((students++))
  elif printf '%s\n' /users/* | grep -q "${arg}$"; then
    echo "${arg}@ccsf.edu" >> "$CRN"
    ((faculty++))
  fi
done
Don't parse ls output! Use a bash glob instead. ls is a tool for looking at file information interactively; its output is formatted for humans and will cause bugs in scripts. Use globs or find instead. Understand why: http://mywiki.wooledge.org/ParsingLs
"Double quote" every expansion, and anything that could contain a special character, eg. "$var", "$#", "${array[#]}", "$(command)". See http://mywiki.wooledge.org/Quotes http://mywiki.wooledge.org/Arguments and http://wiki.bash-hackers.org/syntax/words
Take care with false positives: arg=foo will also match the glob result foobar. If you want word boundaries, you need grep -qw. Up to you.
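The word-boundary caveat is easy to demonstrate:

```shell
#!/usr/bin/env bash
# "foo" matches inside "foobar" with plain grep -q ...
printf '%s\n' foobar | grep -q foo && echo 'plain: match'
# ... but not with -w, which requires word boundaries around the match
printf '%s\n' foobar | grep -qw foo || echo '-w: no match'
```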

Passing multiple arguments to a UNIX shell script

I have the following (bash) shell script, that I would ideally use to kill multiple processes by name.
#!/bin/bash
kill `ps -A | grep $* | awk '{ print $1 }'`
However, while this script works if one argument is passed:
end chrome
(the name of the script is end)
it does not work if more than one argument is passed:
$end chrome firefox
grep: firefox: No such file or directory
What is going on here?
I thought the $* passes multiple arguments to the shell script in sequence. I'm not mistyping anything in my input - and the programs I want to kill (chrome and firefox) are open.
Any help is appreciated.
Remember what grep does with multiple arguments - the first is the word to search for, and the remainder are the files to scan.
Also remember that $*, "$*", and $@ all lose track of white space in arguments, whereas the magical "$@" notation does not.
So, to deal with your case, you're going to need to modify the way you invoke grep. You either need to use grep -F (aka fgrep) with options for each argument, or you need to use grep -E (aka egrep) with alternation. In part, it depends on whether you might have to deal with arguments that themselves contain pipe symbols.
It is surprisingly tricky to do this reliably with a single invocation of grep; you might well be best off tolerating the overhead of running the pipeline multiple times:
for process in "$@"
do
  kill $(ps -A | grep -w "$process" | awk '{print $1}')
done
If the overhead of running ps multiple times like that is too painful (it hurts me to write it, but I've not measured the cost), then you can do something like:
case $# in
(0) echo "Usage: $(basename $0 .sh) procname [...]" >&2; exit 1;;
(1) kill $(ps -A | grep -w "$1" | awk '{print $1}');;
(*) tmp=${TMPDIR:-/tmp}/end.$$
trap "rm -f $tmp.?; exit 1" 0 1 2 3 13 15
ps -A > $tmp.1
for process in "$@"
do
grep "$process" $tmp.1
done |
awk '{print $1}' |
sort -u |
xargs kill
rm -f $tmp.1
trap 0
;;
esac
The use of plain xargs is OK because it is dealing with a list of process IDs, and process IDs do not contain spaces or newlines. This keeps the simple code for the simple case; the complex case uses a temporary file to hold the output of ps and then scans it once per process name in the command line. The sort -u ensures that if some process happens to match all your keywords (for example, grep -E '(firefox|chrome)' would match both), only one signal is sent.
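The deduplication step can be seen in isolation: if one PID matched two of the patterns, sort -u collapses it to a single line before xargs kill sees it (4242 and 5151 below are made-up PIDs).

```shell
#!/usr/bin/env bash
# 4242 matched both patterns; without -u it would be signalled twice
printf '%s\n' 4242 5151 4242 | sort -u
```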
The trap lines etc ensure that the temporary file is cleaned up unless someone is excessively brutal to the command (the signals caught are HUP, INT, QUIT, PIPE and TERM, aka 1, 2, 3, 13 and 15; the zero catches the shell exiting for any reason). Any time a script creates a temporary file, you should have similar trapping around the use of the file so that it will be cleaned up if the process is terminated.
If you're feeling cautious and you have GNU Grep, you might add the -w option so that the names provided on the command line only match whole words.
All the above will work with almost any shell in the Bourne/Korn/POSIX/Bash family (you'd need to use backticks with strict Bourne shell in place of $(...), and the leading parenthesis on the conditions in the case is also not allowed with Bourne shell). In bash, however, you can use an array to get things handled right.
n=0
unset args  # Force args to be an empty array (it could be an env var on entry)
for i in "$@"
do
  args[$((n++))]="-e"
  args[$((n++))]="$i"
done
kill $(ps -A | fgrep "${args[@]}" | awk '{print $1}')
This carefully preserves spacing in the arguments and uses exact matches for the process names. It avoids temporary files. The code shown doesn't validate for zero arguments; that would have to be done beforehand. Or you could add a line args[0]='/collywobbles/' or something similar to provide a default - non-existent - command to search for.
To answer your question, what's going on is that $* expands to a parameter list, and so the second and later words look like files to grep(1).
To process them in sequence, you have to do something like:
for i in $*; do
  echo $i
done
Usually, "$@" (with the quotes) is used in place of $* in cases like this.
See man sh, and check out killall(1), pkill(1), and pgrep(1) as well.
Look into pkill(1) instead, or killall(1) as @khachik comments.
$* should be rarely used. I would generally recommend "$#". Shell argument parsing is relatively complex and easy to get wrong. Usually the way you get it wrong is to end up having things evaluated that shouldn't be.
For example, if you typed this:
end '`rm foo`'
you would discover that if you had a file named 'foo' you don't anymore.
Here is a script that will do what you are asking to have done. It fails if any of the arguments contain '\n' or '\0' characters:
#!/bin/sh
kill $(ps -A | fgrep -e "$(for arg in "$@"; do echo "$arg"; done)" | awk '{ print $1; }')
I vastly prefer $(...) syntax for doing what backtick does. It's much clearer, and it's also less ambiguous when you nest things.
