Create a chained command line in a bash function

I have a question: I would like to create a function that, depending on the number of arguments entered, would build a so-called "chained" command line. The current code I wrote looks as follows:
function ignore {
if [ -n "$#" ] && [ "$#" > 0 ]; then
count=$#
if [ ${count} -eq 1 ]; then
return "grep -iv $1"
else
for args in "$#" do
## Here should be code that would put (using pseudo code) as many "grep -iv $a | grep -iv $(a+1) | ... | grep -iv $(a+n)", where the part "$(a+..) represent the placeholder of next argument"
done
fi
fi
}
Any ideas? Thanks
Update
I would like to clarify the above. The function would be used as follows:
some_bash_function | ignore
example:
apt-cache search apache2 | ignore doc lib
Maybe this will help a bit more.

This seems horribly inefficient. A much better solution would look like grep -ive "${array[0]}" -e "${array[1]}" -e "${array[2]}" etc. Here's a simple way to build that.
# don't needlessly use Bash-only function declaration syntax
ignore () {
local args=()
local t
for t; do
args+=(-e "$t")
done
grep -iv "${args[#]}"
}
In the end, git status | ignore foo bar baz is not a lot simpler than git status | grep -ive foo -e bar -e baz so this function might not be worth these 116 bytes (spaces included). But hopefully at least this can work to demonstrate a way to build command lines programmatically. The use of arrays is important; there is no good way to preserve quoting of already quoted values if you smash everything into a single string.
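To see why, here is a tiny demonstration; the show_args helper is made up purely for illustration. Word splitting breaks apart an argument containing spaces once everything lives in one string, while the array keeps it intact:
show_args() { printf '<%s> ' "$@"; echo; }
flat='-e foo bar'          # one string holding what should be two arguments
show_args $flat            # <-e> <foo> <bar>  -- "foo bar" was split apart
args=(-e 'foo bar')
show_args "${args[@]}"     # <-e> <foo bar>    -- preserved as one argument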
A more sustainable solution still is to just combine everything into a single regex. You can do that with grep -iv 'foo\|bar\|baz' though personally, I would probably switch to the more expressive regex dialect of grep -E; grep -ivE 'foo|bar|baz'.
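As a sketch of that single-regex variant (ignore_re is a made-up name, and the arguments are assumed to be plain words rather than patterns containing ERE metacharacters):
ignore_re () {
  local IFS='|'      # "$*" joins the arguments with | between them
  grep -ivE "$*"
}
# e.g. apt-cache search apache2 | ignore_re doc lib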
If you really wanted to build a structure of pipes, I guess a recursive function would work.
# FIXME: slow and ugly, prefer the one above
ignore_slowly () {
if [[ $# -eq 1 ]]; then
grep -iv "$1"
else
local t=$1
shift
grep -iv "$t" | ignore_slowly "$#"
fi
}
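For what it's worth, a quick usage example of the recursive version:
printf '%s\n' foo bar baz qux | ignore_slowly bar baz
# prints foo and qux; the lines containing "bar" and "baz" are dropped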
But generally, you want to minimize the number of processes you create.

Though inefficient, what you want can be done like this:
#!/bin/bash
ignore () {
printf -v pipe 'grep -iv %q | ' "$@"
pipe=${pipe%???} # remove trailing ' | '
bash -c "$pipe"
}
ignore 'regex1' 'regex2' … 'regexn' < file
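To see what the %q format buys you, this is the string the printf call builds for a pattern containing a space, before the trailing ' | ' is stripped:
$ printf 'grep -iv %q | ' foo 'two words'
grep -iv foo | grep -iv two\ words |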

Related

Piping the same output through one or several grep commands on condition

I am currently writing a bash script to filter the output of my LaTeX compilations so that only what I find relevant is printed on the console. As I would like this script to be extremely thorough, I set up different options to toggle different output filters at the same time, depending on the nature of the information produced by the compilation (fatal errors, warnings, over/underfull h/vboxes...).
For those who may not know, we often need to perform several compilations in a row to get a full LaTeX document with correct labels, page numbering, index, table of contents... plus other commands like bibtex or makeglossaries for the bibliography and, well, glossaries. I therefore have a loop that executes everything and stops if a fatal error is encountered, but should continue if there is only a minor warning.
My main command line pipes the pdflatex output through an inverted grep that finds error lines (starting with !). This way, the script stops only if grep finds a fatal error.
: | pdflatex --halt-on-error "$@" | { ! grep --color=auto '^!.*' -A200; }
But when I activate any other filters (e.g. '*.full.*' for over/underfull lines), I need to be able to continue compiling so I can identify whether there is a real need to correct anything (hey, sometimes underfull lines are just not that ugly...).
That means my grep command cannot be inverted as in the first line, and I cannot (or don't know how to) use the same grep with a different regex. Note that if I use a different grep, it should also read from the pdflatex output, so I cannot simply pipe it after the snippet above.
To sum up, it should roughly look like this :
pdflatex --> grep for fatal errors --> if more filters, grep for those filters
--> pass to next step
I came up with several attempts that did not work properly :
This one works only if I want to compile WITH the warnings. Looking only for errors does not work.
latex_compilation() {
: | pdflatex --halt-on-error "$@" | tee >({ ! grep --color=auto '^!.*' -A200; }) >({ grep --color=auto "$warnings_filter" -A5 };) >/dev/null
}
latex_compilation() {
: | pdflatex --halt-on-error "$@" | tee >({ ! grep --color=auto '^!.*' -A200; }) >/dev/null | ({ grep --color=auto "$warnings_filter" -A5 };)
}
or even desperately
latex_compilation() {
: | pdflatex --halt-on-error "$@" |
if [[ "$warnings_on" = true ]]; then
{ grep --color=auto "$warnings_filter" -A5 };
fi
{ ! grep --color=auto '^!.*' -A200; }
}
This one would work but uses 2 compilation processes for each step (you could easily go up to 7-8 compilation steps for a big and complex document). It should be avoided if possible.
latex_compilation() {
if [[ "$warnings_on" = true ]]; then
: | pdflatex --halt-on-error "$@" | \
{ grep --color=auto "$warnings_filter" -A5 };
fi
: | pdflatex --halt-on-error "$@" | \
{ ! grep --color=auto '^!.*' -A200; }
}
I spent hours looking for solutions online, but didn't find any yet.
I really hope this is clear enough, because it is a mess to sum up, let alone write down. You can find the relevant code here if needed for clarity.
This one would work but uses 2 compilation processes
So let's use one.
latex_compilation() {
local tmp
tmp=$(pdflatex ... <&-)
if [[ "$warnings_on" = true ]]; then
grep --color=auto "$warnings_filter" -A5 <<<"$tmp"
fi
! grep --color=auto '^!.*' -A200 <<<"$tmp"
}
Or you can do that asynchronously, by parsing the output in your chosen programming language. For Bash, see https://mywiki.wooledge.org/BashFAQ/001 :
line_is_warning() { .... }
latex_compilation() {
local outputlines=0 failed
while IFS= read -r line; do
if "$warnings_on" && line_is_warning "$line"; do
outputlines=5 # will output 5 lines after
fi
if [[ "$line" =~ ^! ]]; then
failed=1
outputlines=200 # will output 200 lines after
fi
if ((outputlines != 0)); then
((outputlines--))
printf "%s\n" "$line"
fi
done < <(pdflatex ... <&-)
if ((failed)); then return 1; fi
}
But Bash will be extremely slow. Consider using AWK or Python or Perl.
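For reference, here is a rough awk sketch of the same logic; it is only an illustration, mirroring the loop above (including the matching line counting toward the printed window) and reusing the warnings_on / warnings_filter variables:
latex_compilation() {
  pdflatex --halt-on-error "$@" <&- |
  awk -v warn_on="$warnings_on" -v warn_re="$warnings_filter" '
    warn_on == "true" && warn_re != "" && $0 ~ warn_re { out = 5 }   # warning: open a 5-line window
    /^!/ { failed = 1; out = 200 }                                   # fatal error: 200-line window
    out > 0 { print; out-- }
    END { exit failed }'
}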
looking for solutions online
Exactly, you have to write a solution yourself, for your specific requirements.
This one works only if I want to compile WITH the warnings. Looking only for errors does not work.
You can write whole code blocks inside >( ... ) and basically anywhere. The exit status of a pipeline is the exit status of the rightmost command (unless set -o pipefail is in effect). Put the failing command as the rightmost command of the pipeline.
latex_compilation() {
pdflatex --halt-on-error "$@" <&- |
tee >(
if "$warnings_on"; then
grep --color=auto "$warnings_filter" -A5
else
cat >/dev/null
fi
) |
{ ! grep --color=auto '^!.*' -A200; }
}
I suggest using awk filtering patterns.
Read more about awk filtering patterns here.
With awk you can create complex filtering logic: ! = not, && = and, || = or.
For example, suppose you have 3 filtering regexp patterns: Pattern_1, Pattern_2, Pattern_3.
Example 1
You can combine all 3 patterns into a single filter with the following command:
awk '/Pattern_1/ && /Pattern_2/ && /Pattern_3/' scanned_file1 scanned_file2 ...
The result is that only lines matching all 3 patterns are printed.
Example 2
You can combine an inverse filter on all 3 patterns with the following command:
awk '!/Pattern_1/ && !/Pattern_2/ && !/Pattern_3/' scanned_file1 scanned_file2 ...
The result is that only lines matching none of the 3 patterns are printed.
Example 3
You can combine an inverse filter on Pattern_1 with a match on Pattern_2 or Pattern_3:
awk '!/Pattern_1/ && (/Pattern_2/ || /Pattern_3/)' scanned_file1 scanned_file2 ...
The result is that only lines not matching Pattern_1 but matching Pattern_2 or Pattern_3 are printed.
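Applied to the LaTeX question above, for instance, one awk process could keep both fatal error lines and warning lines at once. This is only a filtering sketch: it assumes $warnings_filter is a non-empty ERE, and it does not reproduce the exit-status handling from the earlier answers.
pdflatex --halt-on-error "$@" <&- |
awk -v warn_re="$warnings_filter" '/^!/ || $0 ~ warn_re'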

Calling bash script from bash script

I have made two programs and I'm trying to call one from the other, but this is appearing on my screen:
cp: cannot stat ‘PerShip/.csv’: No such file or directory
cp: target ‘tmpship.csv’ is not a directory
I don't know what to do. Here are the programms. Could somebody help me please?
#!/bin/bash
shipname=$1
imo=$(grep "$shipname" shipsNAME-IMO.txt | cut -d "," -f 2)
cp PerShip/$imo'.csv' tmpship.csv
dist=$(octave -q ShipDistance.m 2>/dev/null)
grep "$shipname" shipsNAME-IMO.txt | cut -d "," -f 2 > IMO.txt
idnumber=$(cut -b 4-10 IMO.txt)
echo $idnumber,$dist
#!/bin/bash
rm -f shipsdist.csv
for ship in $(cat shipsNAME-IMO.txt | cut -d "," -f 1)
do
./FindShipDistance "$ship" >> shipsdist.csv
done
cat shipsdist.csv | sort | head -n 1
The code and error messages presented suggest that the second script is calling the first with an empty command-line argument. That would certainly happen if input file shipsNAME-IMO.txt contained any empty lines or otherwise any lines with an empty first field. An empty line at the beginning or end would do it.
I suggest
using the read command to read the data, and manipulating IFS to parse out comma-delimited fields
validating your inputs and other data early and often
making your scripts behave more pleasantly in the event of predictable failures
More generally, using internal Bash features instead of external programs where the former are reasonably natural.
For example:
#!/bin/bash
# Validate one command-line argument
[[ -n "$1" ]] || { echo empty ship name 1>&2; exit 1; }
# Read and validate an IMO corresponding to the argument
IFS=, read -r dummy imo tail < <(grep -F -- "$1" shipsNAME-IMO.txt)
[[ -f PerShip/"${imo}.csv" ]] || { echo no data for "'$imo'" 1>&2; exit 1; }
# Perform the distance calculation and output the result
cp PerShip/"${imo}.csv" tmpship.csv
dist=$(octave -q ShipDistance.m 2>/dev/null) ||
{ echo "failed to compute ship distance for '${imo}'" 2>&1; exit 1; }
echo "${imo:3:7},${dist}"
and
#!/bin/bash
# Note: the original shipsdist.csv will be clobbered
while IFS=, read -r ship tail; do
# Ignore any empty ship name, however it might arise
[[ -n "$ship" ]] && ./FindShipDistance "$ship"
done < shipsNAME-IMO.txt |
tee shipsdist.csv |
sort |
head -n 1
Note that making the while loop in the second script part of a pipeline will cause it to run in a subshell. That is sometimes a gotcha, but it won't cause any problem in this case.
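As a small illustration of that gotcha (in bash without the lastpipe option), a variable incremented inside a piped-to loop is lost when the subshell exits:
count=0
printf '%s\n' a b c | while read -r line; do ((count++)); done
echo "$count"   # prints 0, not 3, because the loop ran in a subshell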

How to extract code into a function when using xargs -P?

At first, I wrote the code below, and it ran well.
# version1
all_num=10
thread_num=5
a=$(date +%H%M%S)
seq 1 ${all_num} | xargs -n 1 -I {} -P ${thread_num} sh -c 'echo abc{}'
b=$(date +%H%M%S)
echo -e "startTime:\t$a"
echo -e "endTime:\t$b"
Now I want to extract the code into a function, but it is wrong. How can I fix it?
get_file(i){
echo "abc"+i
}
all_num=10
thread_num=5
a=$(date +%H%M%S)
seq 1 ${all_num} | xargs -n 1 -I {} -P ${thread_num} sh -c "$(get_file {})"
b=$(date +%H%M%S)
echo -e "startTime:\t$a"
echo -e "endTime:\t$b"
Because /bin/sh isn't guaranteed to support either printing text that, when evaluated, defines your function, or exporting functions through the environment, we need to do this the hard way: just duplicate the text of the function inside the copy of sh started by xargs.
Other questions on this site already describe how to accomplish this with bash, which is considerably easier. See, for example, How can I use xargs to run a function in a command substitution for each match?
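For comparison, here is a hedged sketch of that bash-only route (it assumes depending on bash is acceptable; the portable /bin/sh version follows below): export -f places the function in the environment, so the child bash started by xargs can call it directly.
#!/bin/bash
get_file() { echo "abc$1"; }
export -f get_file   # bash-specific: pass the function through the environment

all_num=10
thread_num=5
a=$(date +%H%M%S)
seq 1 "${all_num}" | xargs -n 1 -P "${thread_num}" bash -c 'get_file "$1"' _
b=$(date +%H%M%S)
printf 'startTime:\t%s\n' "$a"
printf 'endTime:\t%s\n' "$b"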
#!/bin/sh
all_num=10
thread_num=5
batch_size=1 # but with a larger all_num, turn this up to start fewer copies of sh
a=$(date +%H%M%S) # warning: this is really inefficient
seq 1 ${all_num} | xargs -n "${batch_size}" -P "${thread_num}" sh -c '
get_file() { i=$1; echo "abc ${i}"; }
for arg do
get_file "$arg"
done
' _
b=$(date +%H%M%S)
printf 'startTime:\t%s\n' "$a"
printf 'endTime:\t%s\n' "$b"
Note:
echo -e is not guaranteed to work with /bin/sh. Moreover, for a shell to be truly compliant, echo -e is required to write -e to its output. See Why is printf better than echo? on UNIX & Linux Stack Exchange, and the APPLICATION USAGE section of the POSIX echo specification.
Putting {} in a sh -c '...{}...' position is a Really Bad Idea. Consider the case where you're passed a filename that contains $(rm -rf ~)'$(rm -rf ~)' -- it can't be safely inserted in an unquoted context, a double-quoted context, a single-quoted context, or a heredoc. (A safe pattern is sketched after these notes.)
Note that seq is also nonstandard and not guaranteed to be present on all POSIX-compliant systems. i=0; while [ "$i" -lt "$all_num" ]; do echo "$i"; i=$((i + 1)); done is an alternative that will work on all POSIX systems.
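To make the quoting warning above concrete, here is a minimal contrast between substituting {} into the code and passing it as an argument; the input string is a deliberately hostile, made-up example:
# UNSAFE: {} is pasted into the code sh parses, so $(date) gets executed
printf '%s\n' 'hello$(date)' | xargs -I {} sh -c 'echo {}'
# SAFE: the value arrives as a positional parameter and is never parsed as code
printf '%s\n' 'hello$(date)' | xargs -I {} sh -c 'echo "$1"' _ {}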

How can I expand arguments to a bash function into a chain of piped commands?

I often find myself doing something like this:
something | grep cat | grep bat | grep rat
when all I recall is that those three words must have occurred somewhere, in some order, in the output of something... Now, I could do something like this:
something | grep '.*cat.*bat.*rat.*'
but that implies ordering (bat appears after cat). As such, I was thinking of adding a bash function to my environment called mgrep which would turn:
mgrep cat bat rat
into
grep cat | grep bat | grep rat
but I'm not quite sure how to do it (or whether there is an alternative). One idea would be to loop over the parameters like so:
while (($#)); do
grep $1 some_thing > some_thing
shift
done
cat some_thing
where some_thing is possibly some FIFO, as when one does >(cmd) in bash, but I'm not sure. How would one proceed?
I believe you could generate a pipeline one command at a time, by redirecting stdin at each step. But it's much simpler and cleaner to generate your pipeline as a string and execute it with eval, like this:
CMD="grep '$1' " # consume the first argument
shift
for arg in "$#" # Add the rest in a pipeline
do
CMD="$CMD | grep '$arg'"
done
eval $CMD
This will generate a pipeline of greps that always reads from standard input, as in your model. Note that it protects spaces in quoted arguments, so that it works correctly if you write:
mgrep 'the cat' 'the bat' 'the rat'
Thanks to Alexis, this is what I did:
function mgrep() #grep multiple keywords
{
CMD=''
while (($#)); do
CMD="$CMD grep \"$1\" | "
shift
done
eval ${CMD%| }
}
You can write a recursive function; I'm not happy with the base case, but I can't think of a better one. It seems a waste to need to call cat just to pass standard input to standard output, and the while loop is a bit inelegant:
mgrep () {
local e=$1;
# shift && grep "$e" | mgrep "$@" || while read -r; do echo "$REPLY"; done
shift && grep "$e" | mgrep "$@" || cat
# Maybe?
# shift && grep "$e" | mgrep "$@" || echo "$(</dev/stdin)"
}

best way to find top-level directory for path in bash

I need a command that will return the top level base directory for a specified path in bash.
I have an approach that works, but seems ugly:
echo "/go/src/github.myco.com/viper-ace/psn-router" | cut -d "/" -f 2 | xargs printf "/%s"
It seems there is a better way, however all the alternatives I've seen seem worse.
Thanks for any suggestions!
One option is using awk:
echo "/go/src/github.myco.com/viper-ace/psn-router" |
awk -F/ '{print FS $2}'
/go
As a native-bash approach forking no subshells and invoking no other programs (thus, written to minimize overhead), which works correctly in corner cases including directories with newlines:
topdir() {
local re='^(/+[^/]+)'
[[ $1 =~ $re ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}
Like most other solutions here, invocation will then look something like outvar=$(topdir "$path").
To minimize overhead even further, you could pass in the destination variable name rather than capturing stdout:
topdir() {
local re='^(/+[^/]+)'
[[ $1 =~ $re ]] && printf -v "$2" '%s' "${BASH_REMATCH[1]}"
}
...used as: topdir "$path" outvar, after which "$outvar" will expand to the result.
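For instance, with the path from the question:
path=/go/src/github.myco.com/viper-ace/psn-router
topdir "$path" outvar
printf '%s\n' "$outvar"   # /go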
Not sure it's better, but with sed:
$ echo "/go/src/github.myco.com/viper-ace/psn-router" | sed -E 's_(/[^/]+).*_\1_'
/go
Here's a sed possibility. Still ugly. Handles things like ////////home/path/to/dir. Still blows up on newlines.
$ echo "////home/path/to/dir" | sed 's!/*\([^/]*\).*!\1!g'
/home
Newlines breaking it:
$ cd 'testing '$'\n''this'
$ pwd
/home/path/testing
this
$ pwd | sed 's!/*\([^/]*\).*!/\1!g'
/home
/this
If you know your directories will be rather normally named, your and anubhava's solutions certainly seem to be more readable.
This is bash, sed and tr in a function:
#!/bin/bash
function topdir(){
dir=$( echo "$1" | tr '\n' '_' )
echo "$dir" | sed -e 's#^\(/[^/]*\)\(.*\)$#\1#g'
}
topdir '/go/src/github.com/somedude/someapp'
topdir '/home/somedude'
topdir '/with spaces/more here/app.js'
topdir '/with newline'$'\n''before/somedir/somefile.txt'
Regards!
