Setting multiple fields to bash variables at once - bash

I am trying to assign several fields from awk to bash variables at once.
Right now I can only set the variables one by one:
for line in `cat file.txt`; do
    var1=`echo $line | awk -F, '{print $1}'`
    var2=`echo $line | awk -F, '{print $2}'`
    var3=`echo $line | awk -F, '{print $3}'`
    # Some complex code....
done
I think this is costly because it parses the line several times. Is there a special syntax to set the variables at once? I know that awk has BEGIN and END blocks, but I am trying to avoid them so I don't end up with nested awk.
I plan to place another loop and awk code in the #Some complex code.... part.
for line in `cat file.txt`; do
    var1=`echo $line | awk -F, '{print $1}'`
    var2=`echo $line | awk -F, '{print $2}'`
    var3=`echo $line | awk -F, '{print $3}'`
    for line2 in `cat file_old.txt`; do
        vara=`echo $line2 | awk -F, '{print $1}'`
        varb=`echo $line2 | awk -F, '{print $2}'`
        # Compare $var1,$var2 with $vara,$varb, then do something with either
    done
done

You can set IFS (the internal field separator) to a comma (instead of whitespace) and do the assignments in a while loop:
SAVEIFS=$IFS
IFS=','
while read line; do
    set -- $line
    var1=$1
    var2=$2
    var3=$3
    ...
done < file.txt
IFS=$SAVEIFS
This will save a copy of your current IFS, change it to a , character, and then iterate over each line in your file. The line set -- $line will convert each field (separated by a comma) into a positional parameter ($1, $2, etc.). You can either use these parameters directly, or assign them to other (more meaningful) variable names.
Alternatively, you could use IFS with the answer provided by William:
IFS=','
while read var1 var2 var3; do
    ...
done < file.txt
They are functionally identical; it just comes down to whether you want to explicitly assign var1=$1 or have the variables defined in the while loop's head.

Why are you using awk at all?
while IFS=, read var1 var2 var3; do
    ...
done < file.txt

#!/bin/bash
FILE="/tmp/values.txt"

function parse_csv() {
    local OLDIFS=$IFS i=0 val j name
    IFS=","
    for val in $lines; do
        i=$((i + 1))
        printf -v "var$i" '%s' "$val"   # safer than eval if a value contains spaces
    done
    IFS=$OLDIFS
    for ((j = 1; j <= i; ++j)); do
        name="var$j"
        echo "${!name}" >> "$FILE"
    done
}

> "$FILE"   # truncate once, not on every call
for lines in `cat file_old.txt`; do
    parse_csv
done
The problem you described has only 3 values; is there a chance the number of values may differ and be 4, 5, or undefined? If so, the above will parse the CSV line by line and output each value on a new line in a file called /tmp/values.txt.
Feel free to modify it to match your requirements; it's far more dynamic than defining 3 fixed variables.
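If the number of fields really is unknown, a pure-bash alternative (a sketch; the sample line is invented) is to read each line into an array instead of eval-ing numbered variables:

```shell
#!/bin/bash
# One array element per CSV field, however many there are; no eval required.
line="mysql57,3306,com.mysql.cj.jdbc.Driver,extra"   # hypothetical sample line
IFS=',' read -r -a fields <<< "$line"
echo "number of fields: ${#fields[@]}"
printf '%s\n' "${fields[@]}"
```

The `IFS=','` prefix applies only to the `read` command, so the global IFS never needs saving and restoring.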

Related

Trim line to the first comma (bash)

I have a line from which I need to cut the branch name to the first comma:
commit 2bea9e0351dae65f18d2de11621049b465b1e868 (HEAD, origin/MGB-322, refs/pipelines/36877)
I need to cut out MGB-322.
The number of characters in a line is always different.
awk -F "origin/" '{print $2}' - this is how I cut out
MGB-322, refs/pipelines/36877)
But how to tell it to trim to the first comma?
I tried doing it via substr,
awk -F "origin/" '{print substr ($2,1, index $2 ,)}'
But it is not clear how to correctly specify the comma in index
With any awk. Use / and , as field separator:
awk '{print $3}' FS='[/,]' file
Output:
MGB-322
See: 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR
A fix using the OP's approach: this assumes origin occurs only once in the line; in case it can occur more than once, change $NF to $2 in the following code. Written and tested at https://ideone.com/xjv2we
awk -F"origin/" '{print $NF}' Input_file
sed could also be helpful here; this generic solution is based on the first occurrence of , and /, as per the OP's title. (Written on mobile, so I couldn't test it yet; it should work, though, and I will test it after some time.)
sed 's/\([^,]*\),\([^/]*\)\/\(.*\)/\3/' Input_file
"I need to cut out MGB-322."
You can use cut in two steps:
echo "${line}" | cut -d"/" -f2 | cut -d"," -f1
I would prefer one step with awk (already answered by others) or sed:
echo "${line}" | sed -r 's/.*origin.(.*), refs.*/\1/'
Why spawn procs? bash's built-in parameter parsing will handle this.
If
$: line="commit 2bea9e0351dae65f18d2de11621049b465b1e868 (HEAD, origin/MGB-322, refs/pipelines/36877)"
then
$: [[ "$line" =~ .*origin.(.*), ]] && echo "${BASH_REMATCH[1]}"
MGB-322
or maybe
$: tmp=${line#*, origin/}; echo ${tmp%,*}
MGB-322
or even
$: IFS=",/" read _ _ x _ <<< "$line" && echo $x
MGB-322
c.f. https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html

Bash: read property file into Array

I'm trying to read a property file like this one into a set of arrays:
DATABASE="mysql57"
DB_DRIVER_XA="com.mysql.cj.jdbc.MysqlXADataSource"
DB_DRIVER_CLASS="com.mysql.cj.jdbc.Driver"
DATABASE="db2_111"
DB_DRIVER_XA="com.ibm.db2.jcc.DB2XADataSource"
DB_DRIVER_CLASS="com.ibm.db2.jcc.DB2Driver"
I've found the following grep to be useful to store each key into its array:
filename=conf.properties
dblist=($(grep "DATABASE" $filename))
xadriver=($(grep "DB_DRIVER_XA" $filename))
driver=($(grep "DB_DRIVER_CLASS" $filename))
The problem is that the above solution stores into the array KEY=VALUE:
printf '%s\n' "${dblist[@]}"
DATABASE="mysql57"
DATABASE="db2_111"
I'd like to have in each array only the value. Is there a simple way to do it rather than looping over the array and maybe use "cut" to remove the "KEY=" part?
Sure:
databases=()
xas=()
classes=()
while IFS="=" read -r var value; do
    without_quotes=${value//\"/}
    case $var in
        DATABASE)        databases+=( "$without_quotes" ) ;;
        DB_DRIVER_XA)    xas+=( "$without_quotes" ) ;;
        DB_DRIVER_CLASS) classes+=( "$without_quotes" ) ;;
    esac
done < file
declare -p databases xas classes
declare -a databases='([0]="mysql57" [1]="db2_111")'
declare -a xas='([0]="com.mysql.cj.jdbc.MysqlXADataSource" [1]="com.ibm.db2.jcc.DB2XADataSource")'
declare -a classes='([0]="com.mysql.cj.jdbc.Driver" [1]="com.ibm.db2.jcc.DB2Driver")'
The take-away is to use IFS with the read command to split the line into fields, and store the results in separate variables.
Use awk -F= to split each line into key and value, and sed to strip out the quotes.
dblist=( $(awk -F= '$1=="DATABASE" {print $2}' "$filename" | sed 's/"//g'))
xadriver=($(awk -F= '$1=="DB_DRIVER_XA" {print $2}' "$filename" | sed 's/"//g'))
driver=( $(awk -F= '$1=="DB_DRIVER_CLASS" {print $2}' "$filename" | sed 's/"//g'))
However, it would be better to use readarray to populate the arrays, preventing word splitting on spaces and glob expansion of * and ?:
readarray -t dblist < <(awk -F= '$1=="DATABASE" {print $2}' "$filename" | sed 's/"//g')
readarray -t xadriver < <(awk -F= '$1=="DB_DRIVER_XA" {print $2}' "$filename" | sed 's/"//g')
readarray -t driver < <(awk -F= '$1=="DB_DRIVER_CLASS" {print $2}' "$filename" | sed 's/"//g')
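A quick illustration of why readarray is safer than arr=( $(...) ) (the sample values are invented; note the lone * on the second line):

```shell
#!/bin/bash
# Simulate command output: one value containing spaces, one that looks like a glob.
tmp=$(mktemp)
printf 'a value with spaces\n*\n' > "$tmp"
unsafe=( $(cat "$tmp") )        # word splitting and glob expansion both apply
readarray -t safe < "$tmp"      # one element per line, taken verbatim
echo "unsafe has ${#unsafe[@]} elements, safe has ${#safe[@]}"
rm -f "$tmp"
```

The unsplit version ends up with at least five elements (four words plus whatever * expands to), while readarray preserves the two lines exactly.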

Passing two awk columns into a while read command

SampleFile:
Two Words,Extra
My code:
cat SampleFile | awk -F "," '{print $1" "$2}' | while read var1 var2; do
    echo $var1
done
This prints only Two, and var2 gets Words. Is there a way to pass Two Words into var1?
You don't have to use awk for this. Bash has a built-in variable to determine where words are split:
while IFS=, read -r var1 var2; do
    echo "$var1"
done < SampleFile
IFS is set to a comma, so word splitting takes place at commas.
Instead of piping to the while loop, I use redirection, which has the advantage of not spawning a subshell.
A remark: you don't need cat in this context. Awk can take a file name as an argument, so
cat SampleFile | awk -F "," '{print $1" "$2}'
becomes
awk -F "," '{print $1, $2}' SampleFile
Also, when using print, you don't need to explicitly introduce spaces: if you comma-separate your fields, awk will replace the comma by the value of the OFS (output field separator) variable, which defaults to a space.
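For example, using the question's sample line to make OFS visible:

```shell
# With comma-separated print arguments, awk joins the fields with OFS.
printf 'Two Words,Extra\n' | awk -F, -v OFS=' | ' '{print $1, $2}'
# prints: Two Words | Extra
```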
...| while read var1; do echo "$var1"; done
With only one variable name, read assigns the whole line to var1.

Save output of awk to two different variables

Okay. I am kind of lost and google search isn't helping me much.
I have a command like:
filesize_filename=$(echo $line | awk '{print $5":"$9}')
echo $filesize_filename
1024:/home/test
Now this one saves the two returns or awk'ed items into one variable. I'd like to achieve something like this:
filesize,filename=$(echo $line | awk '{print $5":"$9}')
So I can access them individually like
echo $filesize
1024
echo $filename
/home/test
How to I achieve this?
Thanks.
Populate a shell array with the awk output and then do whatever you like with it:
$ fileInfo=( $(echo "foo 1024 bar /home/test" | awk '{print $2, $4}') )
$ echo "${fileInfo[0]}"
1024
$ echo "${fileInfo[1]}"
/home/test
If the file name can contain spaces then you'll have to adjust the FS and OFS in awk and the IFS in shell appropriately.
You may not need awk at all of course:
$ line="foo 1024 bar /home/test"
$ fileInfo=( $line )
$ echo "${fileInfo[1]}"
1024
$ echo "${fileInfo[3]}"
/home/test
but beware of globbing chars in $line matching on local file names in that last case. I expect there's a more robust way to populate a shell array from a shell variable but off the top of my head I can't think of it.
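One more robust option (a sketch, assuming bash and whitespace-separated fields) is read -a, which splits on IFS but never performs glob expansion:

```shell
#!/bin/bash
line="foo 1024 bar /home/test"
read -r -a fileInfo <<< "$line"   # no globbing, unlike fileInfo=( $line )
echo "${fileInfo[1]}"
echo "${fileInfo[3]}"
```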
Use bash's read with a here-string for that:
read size name <<< "$(awk '{print $5, $9}' <<< "$line")"
# Now you can output them separately
echo "$size"
echo "$name"
You can use process substitution on awk's output:
read filesize filename < <(echo "$line" | awk '{print $5,$9}')
You can totally avoid awk by doing:
read _ _ _ _ filesize _ _ _ filename _ <<< "$line"
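For instance, with a hypothetical ls -l style line where the size is field 5 and the name field 9 (adjust the number of _ placeholders to match your actual line):

```shell
#!/bin/bash
line="-rw-r--r-- 1 user group 1024 Jan 1 12:00 /home/test"  # made-up sample line
read -r _ _ _ _ filesize _ _ _ filename <<< "$line"
echo "$filesize"
echo "$filename"
```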

behavior of awk in read line

$ cat file
11 asasaw121
12 saasks122
13 sasjaks22
$ cat no
while read line
do
var=$(awk '{print $1}' $line)
echo $var
done<file
$ cat yes
while read line
do
var=$(echo $line | awk '{print $1}')
echo $var
done<file
$ sh no
awk: can't open file 11
source line number 1
awk: can't open file 12
source line number 1
awk: can't open file 13
source line number 1
$ sh yes
11
12
13
Why doesn't the first one work? What does awk expect to find in $1 in it? I think understanding this will help me avoid numerous scripting problems.
awk treats its command-line arguments as file names (with no arguments, it reads standard input).
In the following, $line is a string of data, not a file name:
var=$(awk '{print $1}' $line)
You could say (note the double quotes around the variable):
var=$(awk '{print $1}' <<<"$line")
Why doesn't the first one work?
Because of this line:
var=$(awk '{print $1}' $line)
Which assumes $line is a file.
You can make it:
var=$(echo "$line" | awk '{print $1}')
OR
var=$(awk '{print $1}' <<< "$line")
awk '{print $1}' $line
                 ^^ awk expects a file path (or list of file paths) here,
but what it is getting from you is the actual line from the file.
What you want to do is pipe the line into awk, as in your second example.
You got the answers to your specific questions but I'm not sure it's clear that you would never actually do any of the above.
To print the first field from a file you'd either do this:
while read -r first rest
do
printf "%s\n" "$first"
done < file
or this:
awk '{print $1}' file
or this:
cut -d ' ' -f1 <file
The shell loop would NOT be recommended.
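A quick sanity check (recreating the sample file at a temp path) that the awk and cut one-liners agree:

```shell
#!/bin/sh
tmp=$(mktemp)
printf '11 asasaw121\n12 saasks122\n13 sasjaks22\n' > "$tmp"
a=$(awk '{print $1}' "$tmp")
b=$(cut -d ' ' -f1 < "$tmp")
[ "$a" = "$b" ] && echo "awk and cut agree"
rm -f "$tmp"
```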
