Store String var in awk command - bash

I have a txt file like this:
Dupont Charles
Martin Paul
Dupuis Jean
I want, for each line, to build a login from the first 2 characters of each name.
For instance: Dupont Charles ==> duch
awk '{print tolower(substr($1,1,2)substr($2,1,2))}' liste.txt
works perfectly.
But I want to store each login in a variable and call a bash script (script1) with that variable...
awk '{login=tolower(substr($1,1,2)substr($2,1,2));print $login; script1 $login; }' liste.txt
But it does not work and the content of login is not what I want.

The way to use variables in awk is different from Bash. In awk, a variable does not have a leading $. So what you want to say is:
awk '{login=tolower(substr($1,1,2)substr($2,1,2));print login; }' liste.txt
#                                                       ^
#                                                       instead of $login
However, you want to run a script script1 with this value. For that, you can use system() to call an external command from within awk (a sketch follows below), or use a while loop in bash to handle everything in one place:
while read -r name surname
do
    # first 2 characters of each word, lowercased (${login,,} needs bash 4+)
    login=${name:0:2}${surname:0:2}
    script1 "${login,,}"
done < liste.txt
This uses ${var:offset:length} expansion; the same logic gets, for example, the first 3 characters of a variable:
$ v="123456789"
$ echo ${v:0:3}
123
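If you would rather stay entirely in awk, here is a minimal sketch of the system() approach mentioned above (it assumes script1 is in your PATH and that the logins contain no shell metacharacters):
awk '{login=tolower(substr($1,1,2) substr($2,1,2)); system("script1 " login)}' liste.txt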

awk '{print tolower(substr($1,1,2)substr($2,1,2))}' liste.txt | xargs script1
e.g. using echo instead of script1, which of course I don't have:
$ awk '{print tolower(substr($1,1,2)substr($2,1,2))}' liste.txt | xargs echo
duch mapa duje
or if script1 requires 1 arg at a time:
awk '{print tolower(substr($1,1,2)substr($2,1,2))}' liste.txt | xargs -n1 script1
e.g.
$ awk '{print tolower(substr($1,1,2)substr($2,1,2))}' liste.txt | xargs -n1 echo
duch
mapa
duje

Related

How to pass in a variable to the awk command line

I'm having some trouble passing bash script variables into the awk command line.
Here is pseudocode:
for FILE in $INPUT_DIR/*.txt; do
    filename=`echo $FILE | sed -n 's/^.*\(chr[0-9A-Z]*\).*.vcf$/\1/p'`
    OUTPUT_FILE=$OUTPUT_DIR/$filename.snps.txt
    egrep -v "^#" $FILE | awk '{print $2,$4,$5}' > $OUTPUT_FILE
done
In the final line, where I awk out the columns, I would like the column list to be flexible, i.e. user input. For example, the user might want columns 6, 7, and 8, or columns 133 and 138, or columns 245 through 248. So how do I customize this so that the 'print $2 .... $5' part comes from user input? The user would run the script like: bash script.sh input_dir output_dir [whatever string of columns], and I would get those columns in the output. I tried passing it in, but I guess I'm not getting the syntax right.
With awk, you should declare the variable before using it. This is better than the escape method (awk '{print $'$var'}'):
awk -v var1="$col1" -v var2="$col2" 'BEGIN {print var1,var2 }'
Where $col1 and $col2 would be the input variables.
Maybe you can try passing an input variable as a string like "$2,$4,$5" and printing this variable to get the values (I am not sure whether this works).
The following test works for me:
A="\$3" ; ls -l | awk "{ print $A }"

Get the part of a string after the delimiter

The string format is
Executed: variable_name
What is the simplest way to get the variable_name sub-string?
foo="Executed: variable_name"
echo ${foo##* } # Strip everything up to and including the rightmost space.
or
set -- $foo
echo $2
Unlike other solutions using awk, sed, and whatnot, these don't fork other programs and save thousands of CPU cycles since they execute completely in your shell. They are also more portable (unlike ${i/* /} which is a bashism).
With sed:
echo "Executed: variable_name" | sed 's/[^:]*: //'
Using awk:
echo 'Executed: variable_name' | awk -F' *: *' '{print $2}'
variable_name
If you have the string in a variable:
$ i="Executed: variable_name"
$ echo ${i/* /}
variable_name
If you have the string as output of a command
$ cmd
Executed: variable_name
$ cmd | awk '{print $NF}'
variable_name
Note that 'NF' means "number of fields", so $NF is always the last field on the line. Fields are assumed to be separated by spaces (unless -F specifies otherwise).
If your variable_name could have spaces in it, the -F' *: *' mentioned previously ensures that only the ": " (with any surrounding spaces) is used as a field separator. However, this will preserve spaces at the end of the line if there are any.
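If those trailing spaces matter, they can be stripped in the same awk call with the standard gsub function:
$ cmd | awk -F' *: *' '{gsub(/ +$/, "", $2); print $2}'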
If the line is mixed in with other output, you might need to filter.
Either grep..
$ cmd | grep '^Executed: ' | awk '{print $NF}'
or more clever awk..
$ cmd | awk '/^Executed: /{print $NF}'

How to get the desired output

I am getting input from a shell script, something like:
USER1_OLD:USER1_NEW,USER2_OLD:USER2_NEW ....
The number of key pairs can vary. I need to get output like:
USER1_OLD,USER2_OLD,......
One way using awk:
$ ./script.sh | awk '{printf "%s",NR==1?$1:","$1}' FS=: RS=,
USER1_OLD,USER2_OLD
It's not clear whether you want a trailing comma; if you do, the script can be simpler:
$ ./script.sh | awk '{print $1}' FS=: RS=, ORS=,
USER1_OLD,USER2_OLD,
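Note that the trailing FS=: RS=, (and ORS=,) arguments are awk variable assignments, processed before the piped input is read; the same thing could be written with options as awk -F: -v RS=, -v ORS=, '{print $1}'.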

Explode to Array

I put together this shell script to do two things:
Change the delimiters in a data file ('::' to ',' in this case)
Select the columns I want and append them to a new file
It works, but I want a better way to do this. I specifically want to find an alternative method for exploding each line into an array. Using command line arguments doesn't seem like the way to go. Any comments are welcome.
# Takes a :: separated file as 1st parameter
SOURCE=$1
# create csv target file
TARGET=${SOURCE/dat/csv}
touch $TARGET
# quote the header string so '#' is not parsed as a shell comment
echo "#userId,itemId" > $TARGET
IFS=","
while read LINE
do
    # Replace all matches of :: with a ,
    CSV_LINE=${LINE//::/,}
    set -- $CSV_LINE
    echo "$1,$2" >> $TARGET
done < $SOURCE
Instead of set, you can use an array (this relies on the IFS="," already set in the script):
arr=($CSV_LINE)
echo "${arr[0]},${arr[1]}"
The following would print columns 1 and 2 from infile.dat. Replace $1, $2 with a comma-separated list of the numbered columns you do want.
awk 'BEGIN { FS="::"; OFS="," } { print $1, $2 }' infile.dat > infile.csv
Perl probably has a one-liner to do it.
Awk can probably do it easily too.
My first reaction is a combination of awk and sed:
Sed to convert the delimiters
Awk to process specific columns
cat inputfile | sed -e 's/::/,/g' | awk -F, -v OFS=, '{print $1, $2}'
# Or, to avoid a UUOC award (and prolong the life of your keyboard by 3 characters):
sed -e 's/::/,/g' inputfile | awk -F, -v OFS=, '{print $1, $2}'
awk is indeed the right tool for the job here; it's a simple one-liner.
$ cat test.in
a::b::c
d::e::f
g::h::i
$ awk -F:: -v OFS=, '{$1=$1;print;print $2,$3 >> "altfile"}' test.in
a,b,c
d,e,f
g,h,i
$ cat altfile
b,c
e,f
h,i
$
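The $1=$1 assignment looks like a no-op, but it forces awk to rebuild the record, so the plain print outputs the line joined with the new OFS (,) instead of the original :: separators.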

Using awk in BASH alias or function

I have a command that works just fine at the command line, but not when I try to put it in an alias or function.
$ awk '{print $1}' /tmp/textfile
0
That's correct, as '0' is in position 1 of "textfile".
$ alias a="awk '{print $1}' /tmp/textfile"
$ a
1 0 136 94
That's the entire line in "textfile". I've tried every variety of quotes, parentheses and backticks that I could imagine might work. I can get the same problem in a wide variety of formats.
What am I not understanding?
You need to escape the $ like so:
alias a="awk '{print \$1}' /tmp/textfile"
Otherwise your alias is:
awk '{print }' /tmp/textfile
Which prints the whole file...
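With the escape in place, the alias expands to the original command and prints the first field as expected:
$ a
0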
Alternatively, use a function instead of an alias; the function body keeps its quoting, so $1 is not expanded by the shell:
myfunc(){ awk '{print $1}' /tmp/textfile; }
