multiline awk script inside shell script - shell

#!/usr/bin/tcsh
cmd='BEGIN{c=0}{
if($1=="Net"){print $0}
if($1=="v14")
{
if($4>=200)
{print "Drop more than 200 at "$1}
}
}'
awk -f "$cmd" input_file.txt > output_file.txt
I am trying to execute a shell script which contains a multiline awk script inside it.
I am storing the awk script (especially a multiline one) in a variable cmd and then executing it with awk -f "$cmd" input_file.txt > output_file.txt.
This gives an error like the one below:
awk: fatal: can't open source file `BEGIN{c=0}{
if($1=="Net"){print $0}
if($1=="v14")
{
if($4>=200)
{print "Drop more than 200 at "$1}
}
}' for reading (No such file or directory)
My question is: how do I execute a shell script which contains a multiline awk script inside it?
Can you please help me with this? I couldn't figure it out even after searching Google and the reference manual.

You use awk -f when you want to pass a file name for the script to execute.
Here your awk script is an inline string, so just removing the -f option will fix your issue.
awk "$cmd" input_file.txt > output_file.txt
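For example, here is a minimal runnable version of that fix, with made-up sample input. Note it uses a POSIX shell rather than the question's tcsh, since tcsh cannot assign a multiline string with this syntax:

```shell
#!/bin/sh
# Sample input standing in for the question's input_file.txt:
printf 'Net a b c\nv14 x y 250\nv14 x y 10\n' > input_file.txt

# The question's program, stored verbatim in a shell variable:
cmd='BEGIN{c=0}{
if($1=="Net"){print $0}
if($1=="v14")
{
if($4>=200)
{print "Drop more than 200 at "$1}
}
}'

# No -f here: $cmd is program text, not a file name.
awk "$cmd" input_file.txt > output_file.txt
```

With that input, output_file.txt contains the "Net" line followed by the drop warning for the v14 row whose fourth field is 250.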

Don't write [t]csh scripts; see any of the many results of https://www.google.com/search?q=csh+why+not and use a Bourne-derived shell like bash instead.
Don't store an awk script in a shell variable and then ask awk to interpret the contents of that variable; just wrap the script in a shell function and call that.
So, do something like this:
#!/usr/bin/env bash
foo() {
  awk '
    { print "whatever", $0 }
  ' "${@:--}"
}
foo input_file.txt > output_file.txt

This is the equivalent awk script:
$1=="Net"
$1=="v14" && $4>=200 {print "Drop more than 200 at "$1}
Save it into a file, for example test.awk, and run it as
$ awk -f test.awk input_file > output_file
Or, for simple one-off scripts, you can just run
$ awk '$1=="Net"; $1=="v14" && $4>=200 {print "Drop more than 200 at "$1}' input_file > output_file
Obviously, the above line can be inserted in a shell script as well.
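For example, a runnable sketch of the save-to-file route, with invented sample input:

```shell
#!/bin/sh
# Save the two-rule program to a file, as suggested above:
cat > test.awk <<'EOF'
$1=="Net"
$1=="v14" && $4>=200 {print "Drop more than 200 at "$1}
EOF

# Sample input, standing in for the question's data:
printf 'Net eth0 up\nv14 a b 350\nv14 a b 10\n' > input_file

awk -f test.awk input_file > output_file
```

The bare pattern `$1=="Net"` with no action prints matching lines unchanged, so output_file holds the "Net" line plus one drop warning.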

I don't know about tcsh, but in bash it is also possible using a here-document with process substitution:
#!/usr/bin/bash
awk -f <(cat - <<-'_EOF_'
BEGIN{c=0}{
if($1=="Net"){print $0}
if($1=="v14")
{
if($4>=200)
{print "Drop more than 200 at "$1}
}
}
_EOF_
) input_file.txt > output_file.txt

Related

awk issue inside for loop

I have many files with different names that end with txt.
rtfgtq56.txt
fgutr567.txt
..
So I am running this command
for i in *txt
do
awk -F "\t" '{print $2}' $i | grep "K" | awk '{print}' ORS=';' | awk -F "\t" '{OFS="\t"; print $i, $1}' > ${i%.txt*}.k
done
My problem is that I want to add the name of every file in the first column, so I run this part:
awk -F "\t" '{OFS="\t"; print $i, $1}' > ${i%.txt*}
$i refers to the current file in the for loop,
but it did not work because awk cannot see the shell's $i inside the single-quoted script.
Do you know how I can solve it?
You want to refactor everything into a single Awk script anyway, and take care to quote your shell variables.
for i in *.txt
do
awk -F "\t" '$2 ~ /K/ {a = a ";" $2}
END { print FILENAME, substr(a, 2) }' "$i" > "${i%.txt*}.k"
done
... assuming I untangled your logic correctly. The FILENAME Awk variable contains the current input file name.
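A runnable sketch of that loop with invented sample data, to illustrate FILENAME (the substr(a, 2) strips the leading ";" separator):

```shell
#!/bin/sh
# Two sample files standing in for the question's inputs (contents invented):
printf 'aaa\tK12\nbbb\tX99\n' > rtfgtq56.txt
printf 'ccc\tK34\tzz\n' > fgutr567.txt

# Explicit file list instead of *.txt, to keep the demo deterministic:
for i in rtfgtq56.txt fgutr567.txt
do
  awk -F "\t" '$2 ~ /K/ {a = a ";" $2}
    END { print FILENAME, substr(a, 2) }' "$i" > "${i%.txt*}.k"
done
```

Each .k file starts with its source file's name, followed by the ";"-joined second fields that contain "K".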
More generally, if you genuinely want to pass a variable from a shell script to Awk, you can use
awk -v awkvar="$shellvar" ' .... # your awk script here
# Use awkvar to refer to the Awk variable'
Perhaps see also useless use of grep.
Using the -v option of awk, you can create an awk Variable based on a shell variable.
awk -v i="$i" ....
Another possibility would be to make i an environment variable, which means that awk can access it via the predefined ENVIRON array, i.e. as ENVIRON["i"].
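Both mechanisms side by side, as a small sketch (the file name is made up):

```shell
#!/bin/sh
i="rtfgtq56.txt"   # hypothetical loop variable from the question

# Via -v: the shell variable's value becomes the awk variable "name":
awk -v name="$i" 'BEGIN { print name }'

# Via the environment: export it, then read it from the ENVIRON array:
export i
awk 'BEGIN { print ENVIRON["i"] }'
```

Both print rtfgtq56.txt; -v is the more common idiom, while ENVIRON avoids touching the awk command line at all.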

Writing an AWK instruction in a bash script

In a bash script, I need to do this:
cat<<EOF> ins.exe
grep 'pattern' file | awk '{print $2}' > results
EOF
The problem is that $2 is interpreted as a variable and the file ins.exe ends up containing
"grep 'pattern' file | awk '{print }' > results", without the $2.
I've tried using
echo "grep 'pattern' file | awk '{print $2}' > results" >> ins.exe
But it's the same problem.
How can I fix this?
Just escape the $:
cat<<EOF> ins.exe
awk '/pattern/ { print \$2 }' file > results
EOF
No need to pipe grep to awk, by the way.
With bash, you have another option as well, which is to use <<'EOF'. This means that no expansions will occur within the string.
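For example, with the quoted delimiter the $2 reaches the generated file untouched, with no backslash needed:

```shell
#!/bin/sh
# <<'EOF' (quoted delimiter) suppresses all expansions in the body:
cat <<'EOF' > ins.exe
awk '/pattern/ { print $2 }' file > results
EOF
```

Afterwards ins.exe contains the awk command literally, $2 included.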

Why am I not able to store bash output to shell?

I have the following script:
#!/bin/bash
…code setting array ids, etc…
for i in "${!ids[@]}" ; do
echo "#${ids[i]}_${pos[i]}_${wild[i]}_${sub[i]}"
curl -sS "http://www.uniprot.org/uniprot/"${ids[i]}".fasta";
done |
sed '/^>/ d' |
sed -r 's/[#]+/>/g' |
perl -npe 'chomp if ($.!=1 && !s/^>/\n>/)' > $id.pph.fasta
However the results will not store in the file. I can output the result to the terminal and store in file by doing:
./myscript > result.txt
However I want to do this within the script and output to file outside the loop.
Add
exec 1>result.txt
to the top of the script, and all output will be redirected.
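A minimal sketch of that technique; the subshell here is only to keep the demonstration self-contained, while in the answer above the exec would sit at the top of the script itself:

```shell
#!/bin/sh
(
  exec 1>result.txt   # from here on, all stdout lands in result.txt
  echo "first"
  echo "second"
)
```

Everything the commands after the exec write to stdout ends up in result.txt, with no per-command redirection.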
Here is a variation of your script:
#!/bin/bash
for i in ${!ids[*]}
do
echo ">${ids[i]}_${pos[i]}_${wild[i]}_${sub[i]}"
curl -Ss www.uniprot.org/uniprot/${ids[i]}.fasta
done |
awk '
/>/ {if (z++) printf RS; print; printf RS; getline; next}
1
END {printf RS}
' ORS= > $id.pph.fasta

Calling another program in a shell script and using the returned data

I have a C program named calculate which sorts out the correct data.
Data input (initial file.txt):
abcd!1023!92
ckdw!3251!io
efgh!9873!xk
Data returned:
abcd!1023!92
efgh!9873!xk
My shell script contains:
./calculate | awk -F '!' '{sum += $2} END{print sum}' "$1"
When I run the script ./check file.txt it ignores the values returned from the calculate function and instead calculates from the input file.
How do I fix this so that awk works on the data returned from ./calculate?
You are passing a file to the awk script as well as input.
./calculate | awk -F '!' '{sum += $2} END{print sum}' "$1"
awk only uses one or the other and it prefers file arguments when given.
Drop the "$1" bit from there.
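For example, with an invented stand-in for the calculate program, the corrected pipeline sums the second field of the piped data:

```shell
#!/bin/sh
# Stand-in for ./calculate (invented): emits the filtered rows from the question.
calculate() {
  printf 'abcd!1023!92\nefgh!9873!xk\n'
}

# No trailing file argument, so awk reads the piped data:
calculate | awk -F '!' '{sum += $2} END {print sum}' > total.txt
```

Here awk sums 1023 and 9873 from the pipe, writing 10896 to total.txt.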
Here is my attempt.
sorting.sh (my version of your filtering program)
#!/usr/bin/env bash
egrep 'a|e' input.txt
program.sh (your shell command)
./program.sh | awk -F '!' '{sum += $2} END{print sum}'
#Updated, deleted "$1" as stated by Etan Reisner

shell variable inside awk without the -v option

I noticed that a shell script variable can be used inside an awk script like this:
var="help"
awk 'BEGIN{print "'$var'" }'
Can anyone tell me how to change the value of var inside awk while retaining the value outside of awk?
Similarly to accessing a variable of shell script inside awk, can we access shell array inside awk? If so, how?
It is impossible; the only options you have are:
use command substitution and write output of awk to the variable;
write data to file and then read from the outer shell;
produce shell output and then execute it with eval.
Examples.
Command substitution, one variable:
$ export A=10
$ A=$(awk 'END {print 2*ENVIRON["A"]}' < /dev/null)
$ echo $A
20
Here you multiply A by two and write the result of the multiplication back.
eval; two variables:
$ export A=10
$ export B=10
$ eval $(awk 'END {print "A="2*ENVIRON["A"]"; B="2*ENVIRON["B"]}' < /dev/null)
$ echo $A
20
$ echo $B
20
$ awk 'END {print "A="2*ENVIRON["A"]"; B="2*ENVIRON["B"]}' < /dev/null
A=40; B=40
It uses a file intermediary, but it does work:
var="hello world"
cat > /tmp/my_script.awk.$$ <<EOF
BEGIN { print \"$var\" }
EOF
awk -f /tmp/my_script.awk.$$
rm -f /tmp/my_script.awk.$$
This uses the here-document feature of the shell; check your shell manual for the rules about interpolation within a here document.
