How do I set a bash script's positional arguments from stdin? - bash

I have a bash script that I wish to read from a file to get it's arguments set. Basically my script reads arguments positionally ($1, $2, $3, etc.)
while test $# -gt 0; do
case $1 in
-h | --help)
echo "Help cruft"
exit 0
;;
esac
shift
done
One of the options I was hoping could be a config file that reads in arguments (for simple and easy config) so I was hoping the set -- command would work (-- to over ride the arguments). However, since they are defined in a file I have to read it in and use xargs to pass them:
-c | --config)
cat $2 | xargs set --
continue
;;
The trouble is that xargs buggers up the -- so I don't know how to accomplish this.
Note: I realize I could use source config_file and have it set variable; might be the final option. I wanted to know if I could do it like above and simplify the documentation.
A simplified example script:
# foo.sh
echo "x y z" | xargs set --
echo $*
# Command line
$ bash foo.sh a b c
xargs: set: No such file or directory
a b c

xargs can't execute set because:
set is a shell built-in, not an external command. xargs only knows how to execute commands. (Some shell built-ins shadow commands with the same name, such as printf, true, and [. So xargs can execute those commands, but the semantics might not be identical to the built-in.)
Even if xargs could execute set, it would have no effect because xargs does not run inside of the shell's environment; every command executed by xargs is a separate process. So you will get no error if you do this:
echo a b c | xargs bash -c 'set -- "${#}"' _
But it also won't do anything useful. (Substitute set with echo and you'll see that it does actually invoke the command.)
How to read arguments from a file.
First, you need to answer the question: what does it mean to have arguments in a file? Are they individual whitespace-separated words with no mechanism to include whitespace in any argument? (That would also be required for xargs to work in its default mode, so it is not a totally unreasonable assumption, although it is almost certainly going to get you into trouble at some point.)
In that case you don't need xargs at all; you can just use command substitution:
set -- $(<file)
While that will work fine, this won't:
echo a b c | set -- $(</dev/stdin)
because the pipeline (created by the | operator) causes the processes on either side to be run in subshells, and consequently the set doesn't modify the current shell's environment variables.
A more robust solution
Suppose that each argument is in a single line in the file, which makes it possible to include whitespace in an argument, but not a newline. Then we could use the useful mapfile built-in to read the arguments into an array, and set the positional arguments from the array. (Or just use the array directly, but that would be a different question.)
mapfile -t args < file
set -- "${args[#]}"
Again, watch out for piping into mapfile; it won't work, for the same reason that it didn't work with set.

Related

Store a command in a variable; implement without `eval`

This is almost the exact same question as in this post, except that I do not want to use eval.
Quick question short, I want to execute the command echo aaa | grep a by first storing it in a string variable Command='echo aaa | grep a', and then running it without using eval.
In the post above, the selected answer used eval. That works for me too. What concerns me a lot is that there are plenty of warnings about eval below, followed by some attempts to circumvent it. However, none of them are able to solve my problem (essentially the OP's). I have commented below their attempts, but since it has been there for a long time, I suppose it is better to post the question again with the restriction of not using eval.
Concrete Example
What I want is a shell script that runs my command when I am happy:
#!/bin/bash
# This script run-this-if.sh runs the commands when I am happy
# Warning: the following script does not work (on nose)
if [ "$1" == "I-am-happy" ]; then
"$2"
fi
$ run-if.sh I-am-happy [insert-any-command]
Your sample usage can't ever work with an assignment, because assignments are scoped to the current process and its children. Because there's no reason to try to support assignments, things get suddenly far easier:
#!/bin/sh
if [ "$1" = "I-am-happy" ]; then
shift; "$#"
fi
This then can later use all the usual techniques to run shell pipelines, such as:
run-if-happy "$happiness" \
sh -c 'echo "$1" | grep "$2"' _ "$untrustedStringOne" "$untrustedStringTwo"
Note that we're passing the execve() syscall an argv with six elements:
sh (the shell to run; change to bash etc if preferred)
-c (telling the shell that the following argument is the code for it to run)
echo "$1" | grep "$2" (the code for sh to parse)
_ (a constant which becomes $0)
...whatever the shell variable untrustedStringOne contains... (which becomes $1)
...whatever the shell variable untrustedStringTwo contains... (which becomes $2)
Note here that echo "$1" | grep "$2" is a constant string -- in single-quotes, with no parameter expansions or command substitutions -- and that untrusted values are passed into the slots that fill in $1 and $2, out-of-band from the code being evaluated; this is essential to have any kind of increase in security over what eval would give you.

Bash get the command that is piping into a script

Take the following example:
ls -l | grep -i readme | ./myscript.sh
What I am trying to do is get ls -l | grep -i readme as a string variable in myscript.sh. So essentially I am trying to get the whole command before the last pipe to use inside myscript.sh.
Is this possible?
No, it's not possible.
At the OS level, pipelines are implemented with the mkfifo(), dup2(), fork() and execve() syscalls. This doesn't provide a way to tell a program what the commands connected to its stdin are. Indeed, there's not guaranteed to be a string representing a pipeline of programs being used to generate stdin at all, even if your stdin really is a FIFO connected to another program's stdout; it could be that that pipeline was generated by programs calling execve() and friends directly.
The best available workaround is to invert your process flow.
It's not what you asked for, but it's what you can get.
#!/usr/bin/env bash
printf -v cmd_str '%q ' "$#" # generate a shell command representing our arguments
while IFS= read -r line; do
printf 'Output from %s: %s\n' "$cmd_str" "$line"
done < <("$#") # actually run those arguments as a command, and read from it
...and then have your script start the things it reads input from, rather than receiving them on stdin.
...thereafter, ./yourscript ls -l, or ./yourscript sh -c 'ls -l | grep -i readme'. (Of course, never use this except as an example; see ParsingLs).
It can't be done generally, but using the history command in bash it can maybe sort of be done, provided certain conditions are met:
history has to be turned on.
Only one shell has been running, or accepting new commands, (or failing that, running myscript.sh), since the start of myscript.sh.
Since command lines with leading spaces are, by default, not saved to the history, the invoking command for myscript.sh must have no leading spaces; or that default must be changed -- see Get bash history to remember only the commands run with space prefixed.
The invoking command needs to end with a &, because without it the new command line wouldn't be added to the history until after myscript.sh was completed.
The script needs to be a bash script, (it won't work with /bin/dash), and the calling shell needs a little prep work. Sometime before the script is run first do:
shopt -s histappend
PROMPT_COMMAND="history -a; history -n"
...this makes the bash history heritable. (Code swiped from unutbu's answer to a related question.)
Then myscript.sh might go:
#!/bin/bash
history -w
printf 'calling command was: %s\n' \
"$(history | rev |
grep "$0" ~/.bash_history | tail -1)"
Test run:
echo googa | ./myscript.sh &
Output, (minus the "&" associated cruft):
calling command was: echo googa | ./myscript.sh &
The cruft can be halved by changing "&" to "& fg", but the resulting output won't include the "fg" suffix.
I think you should pass it as one string parameter like this
./myscript.sh "$(ls -l | grep -i readme)"
I think that it is possible, have a look at this example:
#!/bin/bash
result=""
while read line; do
result=$result"${line}"
done
echo $result
Now run this script using a pipe, for example:
ls -l /etc | ./script.sh
I hope that will be helpful for you :)

Using xargs to run bash scripts on multiple input lists with arguments

I am trying to run a script on multiple lists of files while also passing arguments in parallel. I have file_list1.dat, file_list2.dat, file_list3.dat. I would like to run script.sh which accepts 3 arguments: arg1, arg2, arg3.
For one run, I would do:
sh script.sh file_list1.dat $arg1 $arg2 $arg3
I would like to run this command in parallel for all the file lists.
My attempt:
Ncores=4
ls file_list*.dat | xargs -P "$Ncores" -n 1 [sh script.sh [$arg1 $arg2 $arg3]]
This results in the error: invalid number for -P option. I think the order of this command is wrong.
My 2nd attempt:
echo $arg1 $arg2 $arg3 | xargs ls file_list*.dat | xargs -P "$Ncores" -n 1 sh script.sh
But this results in the error: xargs: ls: terminated by signal 13
Any ideas on what the proper syntax is for passing arguments to a bash script with xargs?
I'm not sure I understand exactly what you want to do. Is it to execute something like these commands, but in parallel?
sh script.sh $arg1 $arg2 $arg3 file_list1.dat
sh script.sh $arg1 $arg2 $arg3 file_list2.dat
sh script.sh $arg1 $arg2 $arg3 file_list3.dat
...etc
If that's right, this should work:
Ncores=4
printf '%s\0' file_list*.dat | xargs -0 -P "$Ncores" -n 1 sh script.sh "$arg1" "$arg2" "$arg3"
The two major problems in your version were that you were passing "Ncores" as a literal string (rather than using $Ncores to get the value of the variable), and that you had [ ] around the command and arguments (which just isn't any relevant piece of shell syntax). I also added double-quotes around all variable references (a generally good practice), and used printf '%s\0' (and xargs -0) instead of ls.
Why did I use printf instead of ls? Because ls isn't doing anything useful here that printf or echo or whatever couldn't do as well. You may think of ls as the tool for getting lists of filenames, but in this case the wildcard expression file_list*.dat gets expanded to a list of files before the command is run; all ls would do with them is look at each one, say "yep, that's a file" to itself, then print it. echo could do the same thing with less overhead. But with either ls or echo the output can be ambiguous if any filenames contain spaces, quotes, or other funny characters. Some versions of ls attempt to "fix" this by adding quotes or something around filenames with funny characters, but that might or might not match how xargs parses its input (if it happens at all).
But printf '%s\0' is unambiguous and predictable -- it prints each string (filename in this case) followed by a NULL character, and that's exactly what xargs -0 takes as input, so there's no opportunity for confusion or misparsing.
Well, ok, there is one edge case: if there aren't any matching files, the wildcard pattern will just get passed through literally, and it'll wind up trying to run the script with the unexpanded string "file_list*.dat" as an argument. If you want to avoid this, use shopt -s nullglob before this command (and shopt -u nullglob afterward, to get back to normal mode).
Oh, and one more thing: sh script.sh isn't the best way to run scripts. Give the script a proper shebang line at the beginning (#!/bin/sh if it uses only basic shell features, #!/bin/bash or #!/usr/bin/env bash if it uses any bashisms), and run it with ./script.sh.

Either getting original return value from xargs or simulate xargs

I am working with bash. I have a file F containing the command-line arguments for a Java program, and I need to store both outputs of the Java programs, i.e., output on standard output and the exit value. Storing the standard output works via
cat F | xargs java program > Output
But xargs does not give access to the exit-code of the Java program.
So well, I split it, running the program twice, once for standard output, once for the exit code --- but getting the exit code and running it correctly seems impossible. One might try
java program $(cat F)
but that doesn't work if F contains for example " ", namely one command-line argument for program which is a space. The problem is the expansion of the argument $(cat F).
Now I don't see a way to get around that problem? I don't want "$(cat F)", since I want that $(cat F) expands into many strings --- but I don't want further expansion of these strings.
If on the other hand there would be a better xargs, giving access to the original exit value, that would solve the problem, but I am not aware of that.
Does this do what you want?
cat F | xargs bash -c 'java program "$#"; echo "Returned: $?"' - > Output
Or, as #rici correctly points out, avoid the UUOC
xargs bash -c 'java program "$#"; echo "Returned: $?"' - < F > Output
Alternatively something like (though I haven't thought through all the ramifications of doing this so there may be a reason this is a bad idea).
{ sed -e 's/^/java program /' F | bash -s; echo "Returned $?"; } > Output
This lets you store the return code in a variable the xargs versions do not (at least not outside the xargs-spawned shell.
sed -e 's/^/java program /' F | bash -s > Output; ret=$?
To use a ${program} shell variable just expand it directly.
xargs bash -c 'java '"${program}"' "$#"; echo "Returned: $?"' - < F > Output
sed -e 's/^/java '"${program}"' /' F | bash -s > Output; ret=$?
Just beware of characters that are "magic" in the replacement of the s/// command.
I'm afraid the question is really not very clear, so I will make the assumptions here explicit:
The file F has one argument per line, with all whitespace other then newline characters being significant, and with no need to replace backslash escapes such as \t.
You only need to invoke the java program once, with all of the arguments.
You need the exit status to be preserved.
This can be done quite easily in bash by reading F into an array with mapfile:
# Invoked as: do_the_work program < F > output
do_the_work() {
local -a args
mapfile -t args
java "$#" "${args[#]}"
}
The status return of that function is precisely the status return of the java executable, so you could capture it immediately after the call:
do_the_work my_program
rc=$?
For convenience, the function allows you to also specify arguments on the command line; it uses "$#" to pass all the command-line arguments before passing the arguments read from stdin.
If you have GNU Parallel (and do not mind extra output on STDERR):
cat F | parallel -Xj1 --halt 1 java program > Output
echo $?

Passing multiple arguments to a UNIX shell script

I have the following (bash) shell script, that I would ideally use to kill multiple processes by name.
#!/bin/bash
kill `ps -A | grep $* | awk '{ print $1 }'`
However, while this script works is one argument is passed:
end chrome
(the name of the script is end)
it does not work if more than one argument is passed:
$end chrome firefox
grep: firefox: No such file or directory
What is going on here?
I thought the $* passes multiple arguments to the shell script in sequence. I'm not mistyping anything in my input - and the programs I want to kill (chrome and firefox) are open.
Any help is appreciated.
Remember what grep does with multiple arguments - the first is the word to search for, and the remainder are the files to scan.
Also remember that $*, "$*", and $# all lose track of white space in arguments, whereas the magical "$#" notation does not.
So, to deal with your case, you're going to need to modify the way you invoke grep. You either need to use grep -F (aka fgrep) with options for each argument, or you need to use grep -E (aka egrep) with alternation. In part, it depends on whether you might have to deal with arguments that themselves contain pipe symbols.
It is surprisingly tricky to do this reliably with a single invocation of grep; you might well be best off tolerating the overhead of running the pipeline multiple times:
for process in "$#"
do
kill $(ps -A | grep -w "$process" | awk '{print $1}')
done
If the overhead of running ps multiple times like that is too painful (it hurts me to write it - but I've not measured the cost), then you probably do something like:
case $# in
(0) echo "Usage: $(basename $0 .sh) procname [...]" >&2; exit 1;;
(1) kill $(ps -A | grep -w "$1" | awk '{print $1}');;
(*) tmp=${TMPDIR:-/tmp}/end.$$
trap "rm -f $tmp.?; exit 1" 0 1 2 3 13 15
ps -A > $tmp.1
for process in "$#"
do
grep "$process" $tmp.1
done |
awk '{print $1}' |
sort -u |
xargs kill
rm -f $tmp.1
trap 0
;;
esac
The use of plain xargs is OK because it is dealing with a list of process IDs, and process IDs do not contain spaces or newlines. This keeps the simple code for the simple case; the complex case uses a temporary file to hold the output of ps and then scans it once per process name in the command line. The sort -u ensures that if some process happens to match all your keywords (for example, grep -E '(firefox|chrome)' would match both), only one signal is sent.
The trap lines etc ensure that the temporary file is cleaned up unless someone is excessively brutal to the command (the signals caught are HUP, INT, QUIT, PIPE and TERM, aka 1, 2, 3, 13 and 15; the zero catches the shell exiting for any reason). Any time a script creates a temporary file, you should have similar trapping around the use of the file so that it will be cleaned up if the process is terminated.
If you're feeling cautious and you have GNU Grep, you might add the -w option so that the names provided on the command line only match whole words.
All the above will work with almost any shell in the Bourne/Korn/POSIX/Bash family (you'd need to use backticks with strict Bourne shell in place of $(...), and the leading parenthesis on the conditions in the case are also not allowed with Bourne shell). However, you can use an array to get things handled right.
n=0
unset args # Force args to be an empty array (it could be an env var on entry)
for i in "$#"
do
args[$((n++))]="-e"
args[$((n++))]="$i"
done
kill $(ps -A | fgrep "${args[#]}" | awk '{print $1}')
This carefully preserves spacing in the arguments and uses exact matches for the process names. It avoids temporary files. The code shown doesn't validate for zero arguments; that would have to be done beforehand. Or you could add a line args[0]='/collywobbles/' or something similar to provide a default - non-existent - command to search for.
To answer your question, what's going on is that $* expands to a parameter list, and so the second and later words look like files to grep(1).
To process them in sequence, you have to do something like:
for i in $*; do
echo $i
done
Usually, "$#" (with the quotes) is used in place of $* in cases like this.
See man sh, and check out killall(1), pkill(1), and pgrep(1) as well.
Look into pkill(1) instead, or killall(1) as #khachik comments.
$* should be rarely used. I would generally recommend "$#". Shell argument parsing is relatively complex and easy to get wrong. Usually the way you get it wrong is to end up having things evaluated that shouldn't be.
For example, if you typed this:
end '`rm foo`'
you would discover that if you had a file named 'foo' you don't anymore.
Here is a script that will do what you are asking to have done. It fails if any of the arguments contain '\n' or '\0' characters:
#!/bin/sh
kill $(ps -A | fgrep -e "$(for arg in "$#"; do echo "$arg"; done)" | awk '{ print $1; }')
I vastly prefer $(...) syntax for doing what backtick does. It's much clearer, and it's also less ambiguous when you nest things.

Resources