Use argument twice from standard output pipelining - shell

I have a command line tool which receives two arguments:
TOOL arg1 -o arg2
I would like to invoke it with the same value for both arg1 and arg2. To make that easy, I thought I would do:
each <arg1_value> | TOOL $1 -o $1
but that doesn't work: $1 is not replaced, and the value is instead added once to the end of the command line.
An explicit example, performing:
cp fileA fileA
returns the error: fileA and fileA are identical (not copied)
While performing:
echo fileA | cp $1 $1
returns the following error:
usage: cp [-R [-H | -L | -P]] [-fi | -n] [-apvX] source_file target_file
cp [-R [-H | -L | -P]] [-fi | -n] [-apvX] source_file ... target_directory
Any ideas?

If you want to use xargs, the [-I] option may help:
-I replace-str
Replace occurrences of replace-str in the initial-arguments with names read from standard input. Also, unquoted blanks do not terminate input items; instead the separator is the newline character. Implies -x and -L 1.
Here is a simple example:
mkdir test && cd test && touch tmp
ls | xargs -I '{}' cp '{}' '{}'
This returns the error cp: tmp and tmp are the same file

With the -I flag, the xargs utility substitutes each line of its input for every occurrence of a placeholder in the command it runs:
$ echo hello | xargs -I XXX echo XXX XXX XXX
hello hello hello
The placeholder XXX (may be any string) is replaced with the entire line of input from the input stream to xargs, so if we give it two lines:
$ printf "hello\nworld\n" | xargs -I XXX echo XXX XXX XXX
hello hello hello
world world world
You may use this with your tool:
$ generate_args | xargs -I XXX TOOL XXX -o XXX
Where generate_args is a script, command or shell function that generates arguments for your tool.
The reason
each <arg1_value> | TOOL $1 -o $1
did not work, apart from each not being a command that I recognise, is that $1 expands to the first positional parameter of the current shell or function.
The following would have worked:
set -- "arg1_value"
TOOL "$1" -o "$1"
because that sets the value of $1 before calling your tool.

You can start a new shell to perform the variable expansion, with sh -c. The -c option takes one argument, the command string to run in that shell. Any further arguments to sh are assigned to $0, $1, and so on, for use inside the command string. For example:
sh -c 'echo $1, i repeat: $1' foo bar baz
runs echo $1, i repeat: $1 with $1 set to bar ($0 is set to foo and $2 to baz), and prints bar, i repeat: bar
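Combined with xargs, the same idea lets a value that arrives on a pipe be used more than once. This is a minimal sketch, not the only way to do it: use_twice is a hypothetical helper name, echo TOOL stands in for the real tool so the command line is only printed, and the trailing sh is a placeholder that fills $0.

```shell
# Hypothetical helper: for each line of input, print the command line that
# would be run, using the value twice. "echo TOOL" stands in for the real tool.
use_twice() {
    # -I {} makes xargs pass one whole input line as a single argument.
    # The "sh" after the command string becomes $0, the input line becomes $1.
    xargs -I {} sh -c 'echo TOOL "$1" -o "$1"' sh {}
}

printf '%s\n' fileA | use_twice
```

Passing the value as a separate argument, rather than substituting it into the command string, keeps the child shell from re-interpreting any special characters in it.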

$1, $2 ... $N are the positional parameters of a script or function: they hold the arguments passed to it and are never filled from a pipe, so they won't work the way you want them to. A pipe connects one command's stdout to the next command's stdin, which is not what you are looking for either.
If you just want a one-liner, use something like
ARG1=hello && tool "$ARG1" "$ARG1"
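If the value really does arrive on a pipe, another option is to read it into a variable first and then use that variable as often as needed. A small sketch, assuming one value per line; run_with_stdin_arg is a hypothetical name and echo TOOL again stands in for the real tool:

```shell
# Hypothetical helper: read one line from stdin and use it twice.
run_with_stdin_arg() {
    # IFS= and -r keep leading/trailing blanks and backslashes intact.
    IFS= read -r arg
    echo TOOL "$arg" -o "$arg"
}

echo fileA | run_with_stdin_arg
```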

Using GNU parallel to use STDIN four times, to print a multiplication table:
seq 5 | parallel 'echo {} \* {} = $(( {} * {} ))'
Output:
1 * 1 = 1
2 * 2 = 4
3 * 3 = 9
4 * 4 = 16
5 * 5 = 25

One could encapsulate the tool using awk:
$ echo arg1 arg2 | awk '{ system("echo TOOL " $1 " -o " $2) }'
TOOL arg1 -o arg2
Remove the echo within the system() call and TOOL should be executed in accordance with requirements:
echo arg1 arg2 | awk '{ system("TOOL " $1 " -o " $2) }'

Double up the data from a pipe, and feed it to a command two at a time, using sed and xargs:
seq 5 | sed p | xargs -L 2 echo
Output:
1 1
2 2
3 3
4 4
5 5

Related

bash or zsh: how to pass multiple inputs to interactive piped parameters?

I have 3 different files that I want to compare
words_freq
words_freq_deduped
words_freq_alpha
For each file, I run a command like so, which I iterate on constantly to compare the results.
For example, I would do this:
$ cat words_freq | grep -v '[soe]'
$ cat words_freq_deduped | grep -v '[soe]'
$ cat words_freq_alpha | grep -v '[soe]'
and then review the results, and then do it again, with an additional filter
$ cat words_freq | grep -v '[soe]' | grep a | grep r | head -n20
a
$ cat words_freq_deduped | grep -v '[soe]' | grep a | grep r | head -n20
b
$ cat words_freq_alpha | grep -v '[soe]' | grep a | grep r | head -n20
c
This continues on until I've analyzed my data.
I would like to write a script that could take the piped portions, and pass it to each of these files, as I iterate on the grep/head portions of the command.
e.g. The following would dump the results of running the 3 commands above AND also compare the 3 results, and dump additional calculations on them
$ myScript | grep -v '[soe]' | grep a | grep r | head -n20
the letters were in all 3 runs, and it took 5 seconds
a
b
c
How can I do this using bash/python or zsh for the myScript part?
EDIT: After asking the question, it occurred to me that I could use eval to do it, like so, which I've added as an answer as well
The following approach allows me to process multiple files by using eval, which I know is frowned upon - any other suggestions are greatly appreciated!
$ myScript "grep -v '[soe]' | grep a | grep r | head -n20"
myScript
#!/usr/bin/env bash
function doIt(){
FILE=$1
CMD="cat $1 | $2"
echo processing file "$FILE"
eval "$CMD"
echo
}
doIt words_freq "$@"
doIt words_freq_deduped "$@"
doIt words_freq_alpha "$@"
You can't stop your shell from interpreting the pipes itself, so using the script like that isn't very practical. You would have to either quote the whole pipeline and eval it, which makes it hard to pass arguments containing spaces, or quote every pipe character and eval the result. Both solutions are rather hacky.
I'd suggest doing one of these two:
Keep your editor open, and put whatever you want to run inside the doIt function itself before you run it. Then run it in your shell without any arguments:
#!/usr/bin/env bash
doIt() {
# grep -v '[soe]' < "$1"
grep -v '[soe]' < "$1" | grep a | grep r | head -n20
}
doIt words_freq
doIt words_freq_deduped
doIt words_freq_alpha
Or you could use a "for" loop directly in your shell, which you can find again in your history with Ctrl+r whenever you need it:
$ for f in words_freq*; do grep -v '[soe]' < "$f" | grep a | grep r | head -n20; done
But if you really want your approach, I tried to make it accept spaces, but it ended up being even hackier:
#!/usr/bin/env bash
doIt() {
local FILE=$1
shift
echo processing file "$FILE"
local args=()
for n in $(seq 1 $#); do
arg=$1
shift
if [[ $arg == '|' ]]; then
args+=('|')
else
args+=("\"$arg\"")
fi
done
eval "cat '$FILE' | ${args[@]}"
}
doIt words_freq "$@"
doIt words_freq_deduped "$@"
doIt words_freq_alpha "$@"
With this version you can use it like this:
$ ./myScript grep "a a" "|" head -n1
Note that you need to quote the |, and that it now handles arguments with spaces.
I am not sure I understood the problem correctly.
My understanding is that you want a script without pipes, with the filtering logic moved into the script and the filter patterns passed as arguments.
Here is a gawk script (gawk is the usual awk on many Linux systems).
It processes the 3 input files in one pass, without piping.
script.awk
BEGIN {
    # Set the record separator to something unlikely to be matched,
    # so that each file is read entirely as a single record.
    RS = "!#!#!#!#!#!#!#"
}
# Act on a file that does not match excludeRegEx
# but does match both includeRegEx1 and includeRegEx2.
$0 !~ excludeRegEx && $0 ~ includeRegEx1 && $0 ~ includeRegEx2 {
    # Run the shell command "head -n20" on the current file.
    system("head -n20 " FILENAME)
}
Running script.awk
awk -v excludeRegEx='[soe]' \
-v includeRegEx1='a' \
-v includeRegEx2='r' \
-f script.awk words_freq words_freq_deduped words_freq_alpha

Set a command to a variable in bash script problem

Trying to run a command as a variable but I am getting strange results
Expected result "1" :
grep -i nosuid /etc/fstab | grep -iq nfs
echo $?
1
Unexpected result as a variable command:
cmd="grep -i nosuid /etc/fstab | grep -iq nfs"
$cmd
echo $?
0
It seems to return 0 because the command itself ran successfully, rather than reflecting the actual outcome. How can I do this better?
You can only execute exactly one command stored in a variable. The pipe is passed as an argument to the first grep.
Example
$ printArgs() { printf %s\\n "$@"; }
# Two commands. The 1st command has parameters "a" and "b".
# The 2nd command prints stdin from the first command.
$ printArgs a b | cat
a
b
$ cmd='printArgs a b | cat'
# Only one command with parameters "a", "b", "|", and "cat".
$ $cmd
a
b
|
cat
How to do this better?
Don't execute the command using variables.
Use a function.
$ cmd() { grep -i nosuid /etc/fstab | grep -iq nfs; }
$ cmd
$ echo $?
1
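If the stored command is a single simple command, with no pipes or redirections, a bash array is a common way to keep it in a variable. The sketch below is bash-only and uses printf as a stand-in for a real command; the names cmd and piped are just illustrations. It also shows that a | stored this way is still an ordinary argument, not a pipe.

```shell
# Bash only: arrays are not part of plain POSIX sh.
# Each array element is one word of the command, so quoting is preserved.
cmd=(printf '%s\n' "first arg" second)
"${cmd[@]}"

# A "|" stored in the array is still just an argument, not a pipe:
# printf receives "a", "|" and "cat" and prints each on its own line.
piped=(printf '%s\n' a "|" cat)
"${piped[@]}"
```

For anything containing a pipeline, a function remains the simpler choice.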
Solution to the actual problem
I see three options to your actual problem:
Use a DEBUG trap and the BASH_COMMAND variable inside the trap.
Enable bash's history feature for your script and use the history command.
Use a function which takes a command string and executes it using eval.
Regarding your comment on the last approach: You only need one function. Something like
execAndLog() {
description="$1"
shift
if eval "$*"; then
info="PASSED: $description: $*"
passed+=("${FUNCNAME[1]}")
else
info="FAILED: $description: $*"
failed+=("${FUNCNAME[1]}")
fi
}
You can use this function as follows
execAndLog 'Scanned system' 'grep -i nfs /etc/fstab | grep -iq noexec'
The first argument is the description for the log, the remaining arguments are the command to be executed.
Using bash -x or set -x will allow you to see what bash executes:
> cmd="grep -i nosuid /etc/fstab | grep -iq nfs"
> set -x
> $cmd
+ grep -i nosuid /etc/fstab '|' grep -iq nfs
As you can see, your pipe | is passed as an argument to the first grep command.

command fails when fed argument via xargs, but not when fed the argument directly

I have a bash function
agg_generror () {
echo $1
find ${folder} -name "${prefix}_*_${1}_${suffix}.count" | xargs -I % sh -c 'cat %; echo "";' | awk 'BEGIN{e=0;t=0} {e+=$1;t+=$2} END{print e/t}' > generror_${1}
}
which if I call directly
agg_generror 17.5
works and doesn't complain.
But if I do
echo 17.5 | xargs -I % sh -c 'agg_generror %'
It fails with
17.5
awk: fatal: division by zero attempted
Why might the behaviour differ in the two cases?
Call the function in the current shell instead, reading the values in a loop:
while read -r; do agg_generror "$REPLY"; done < input.txt
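The question does not show how folder, prefix and suffix are defined, so the following is only a likely explanation. xargs starts a new sh process, and a child process sees only exported variables (and, in bash, only functions exported with export -f). If those variables are empty in the child, find matches no files and awk ends up dividing by zero. A minimal sketch of the variable half of that, using a hypothetical variable demo_folder in place of the question's folder:

```shell
# demo_folder is a hypothetical stand-in for the question's ${folder}.
demo_folder="/data/run1"

# Not exported yet: the child shell started by xargs sees an empty value.
before=$(echo 17.5 | xargs -I % sh -c 'echo "folder=[$demo_folder] arg=%"')

# After export, the same child shell sees the value.
export demo_folder
after=$(echo 17.5 | xargs -I % sh -c 'echo "folder=[$demo_folder] arg=%"')

echo "$before"
echo "$after"
```

The loop above sidesteps the issue because the function is called in the current shell, where all of its variables are visible.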

Pipe stdout to command which itself needs to read from own stdin

I would like to get the stdout from a process into another process not using stdin, as that one is used for another purpose.
In short I want to accomplish something like that:
echo "a" >&4
cat | grep -f /dev/fd/4
I got it running using a file as the source for file descriptor 4, but that is not what I want:
# Variant 1
cat file | grep -f /dev/fd/4 4<pattern
# Variant 2
exec 4<pattern
cat | grep -f /dev/fd/4
exec 4<&-
My best attempt is the following, but it produces an error message:
# Variant 3
cat | (
echo "a" >&4
grep -f /dev/fd/4
) <&4
Error message:
test.sh: line 5: 4: Bad file descriptor
What is the best way to accomplish that?
You don't need to use multiple streams to do this:
$ printf foo > pattern
$ printf '%s\n' foo bar | grep -f pattern
foo
If instead of a static file you want to use the output of a command as the input to -f you can use a process substitution:
$ printf '%s\n' foo bar | grep -f <(echo foo)
foo
For POSIX shells that lack process substitution (e.g. dash, ash, yash), there's always command substitution, provided the command accepts its input as a string (grep does) and the string of search targets isn't especially large (i.e. it doesn't exceed the length limit for the command line):
$ printf '%s\n' foo bar baz | grep "$(echo foo)"
foo
Or, if the input is multi-line, separating the search items with newlines inside the quoted string works the same as grep's OR operator \|:
$ printf '%s\n' foo bar baz | grep "$(printf "%s\n" foo bar)"
foo
bar
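If the patterns really should arrive on a separate file descriptor, as in the question, a here-document can be attached to descriptor 4 while stdin stays free for the data. This is a sketch under a few assumptions: the system provides /dev/fd (Linux and macOS do), and gen_patterns and filter_with_fd4 are hypothetical names standing in for the real pattern-producing command and the filter.

```shell
# Hypothetical stand-in for the command whose output supplies the patterns.
gen_patterns() { printf '%s\n' foo baz; }

# Data comes in on stdin; the patterns come in on file descriptor 4.
filter_with_fd4() {
    grep -f /dev/fd/4 4<<EOF
$(gen_patterns)
EOF
}

printf '%s\n' foo bar baz | filter_with_fd4
```

Because the here-document delimiter is unquoted, the command substitution inside it is expanded, so this also works in shells without process substitution.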

Executing bash -c with xargs

I had a job to perform that involved:
grep lines from a log
find a number in the line
perform basic arithmetic on the number (say, number + 1234)
The final result is a bunch of numbers separated by a newline.
If the input was:
1000
2000
3000
Then the required output was:
2234
3234
4234
I ended up with the following command:
cat log.txt | grep "word" | cut -d'|' -f7 | cut -d' ' -f5 | xargs -n 1 bash -c 'echo $(($1 + 1234))' args
I found the xargs -n 1 bash -c 'echo $(($1 + 1234))' args snippet in an answer to this question but I don't understand the need for the final args argument that is passed in. I can change it to anything, args could be blah, but if I omit it the arithmetic fails and the output is the numbers unchanged:
1000
2000
3000
Could anyone shed some light on why args is a required argument to bash -c?
A simple awk command can do the same - in a clean way:
awk -F'|' '/word/{split($7,a," "); print a[5]+1234}' log.txt
From man bash:
-c If the -c option is present, then commands are read from the first non-option argument command_string. If there are arguments after
the command_string, they are assigned to the positional parameters, starting with $0.
So, for your case, 'args' is a placeholder that goes in $0, making your actual input go in $1.
You should be able to alter your command to:
grep "word" log.txt | cut -d'|' -f7 | cut -d' ' -f5 | xargs -n 1 bash -c 'echo $(($0 + 1234))'
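The effect of the placeholder can be seen by printing $0 and $1 directly. A short sketch, using the sample numbers from the question; the variable names are only for illustration:

```shell
# With the placeholder: "args" becomes $0 and the number becomes $1.
with_placeholder=$(bash -c 'echo "0=$0 1=$1"' args 1000)

# Without it: the number itself lands in $0 and $1 is empty.
without_placeholder=$(bash -c 'echo "0=$0 1=$1"' 1000)

echo "$with_placeholder"
echo "$without_placeholder"

# The xargs part of the question's pipeline, fed with the sample input.
printf '%s\n' 1000 2000 3000 | xargs -n 1 bash -c 'echo $(($1 + 1234))' args
```

Keeping a placeholder and using $1 is the more conventional choice, since $0 is normally reserved for the name of the shell or script and is what appears in error messages.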

Resources