Set variable equal to value based on match in bash - bash

I have a script where I want to set a variable equal to a specific value based on a match from a file (in bash).
For example:
File in .csv contains:
Name,ID,Region
Device1,1,USA
Device2,2,UK
I want to declare variables at the beginning like this:
region1=USA
region2=UK
region3=Ireland
etc...
Then, whilst reading the csv, I need to match the Region column's value to the global variable set at the beginning of the file, to then use this in an API. So if a device in the csv has a region set of USA, I should be able to use region1 during the update call in the API. I want to use a while loop to iterate over the csv file line by line, and update each device's region.
Does anyone maybe know how I can achieve this? Any help would be greatly appreciated.
PS: This is not a homework assignment before anyone asks :-)

Would you please try the following:
declare -A regions # an associative array
# declare your variables
region1=USA
region2=UK
region3=Ireland
# associate each region's name with region's code
for i in "${!region@}"; do # expands to region1, region2, region3
varname="$i"
regions[${!varname}]="$i" # maps "USA" to "region1", "UK" to "region2" ..
done
while IFS=, read -r name id region; do
((nr++)) || continue # skips the header line
region_group="${regions[$region]}"
echo "region=$region, region_group=$region_group" # for a test
# call API here
done < file.csv
Output:
region=USA, region_group=region1
region=UK, region_group=region2
region=Ireland, region_group=region3
BTW if the variable declaration at the beginning is under your control, it will be easier to say:
# declare an associative array
declare -A regions=(
[USA]="region1"
[UK]="region2"
[Ireland]="region3"
)
while IFS=, read -r name id region; do
((nr++)) || continue # skip the header line
region_group="${regions[$region]}"
echo "region=$region, region_group=$region_group" # for a test
# call API here
done < file.csv
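A device whose region has no entry in the array makes ${regions[$region]} expand to an empty string, which would then be passed on to the API call. The following is a minimal sketch of one way to guard against that. The lookup_region helper, the "unmapped" placeholder and the Device3/France row are assumptions added for illustration; they are not part of the question's data.

```shell
#!/usr/bin/env bash
# Sketch only: lookup_region and the "unmapped" placeholder are hypothetical.
declare -A regions=([USA]="region1" [UK]="region2" [Ireland]="region3")

lookup_region() {
    local name=$1
    # An empty subscript is an error for associative arrays, so guard it.
    if [[ -z $name ]]; then
        echo "unmapped"
        return
    fi
    # Fall back to a placeholder when the region has no mapping.
    echo "${regions[$name]:-unmapped}"
}

while IFS=, read -r name id region; do
    [[ $name == "Name" ]] && continue      # skip the header line
    echo "$name -> $(lookup_region "$region")"
done <<'EOF'
Name,ID,Region
Device1,1,USA
Device2,2,UK
Device3,3,France
EOF
```

Whether an unmapped region should be skipped, logged or treated as a fatal error depends on the API being called; the placeholder only makes the case visible.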

Use awk to create a lookup table. eg:
$ cat a.sh
#!/bin/sh
cat << EOD |
region1=USA
region2=UK
region3=Ireland
EOD
{
awk 'NR==FNR { lut[$2] = $1; next}
{$3 = lut[$3]} 1
' FS== /dev/fd/3 FS=, OFS=, - << EOF
Name,ID,Region
Device1,1,USA
Device2,2,UK
EOF
} 3<&0
$ ./a.sh
Name,ID,
Device1,1,region1
Device2,2,region2
or, slightly less obscure (and more portable, since it just uses regular files; I'm not sure on which platforms /dev/fd/3 is actually valid):
$ cat input
Name,ID,Region
Device1,1,USA
Device2,2,UK
$ cat lut
region1=USA
region2=UK
region3=Ireland
$ awk 'NR==FNR{lut[$2] = $1; next} {$3 = lut[$3]} 1' FS== lut FS=, OFS=, input
Name,ID,
Device1,1,region1
Device2,2,region2
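Both versions blank out the header's third column, because Region is not a key in the lookup table. A possible variation, sketched here with temporary files standing in for lut and input, prints the header unchanged and writes "unknown" (a placeholder chosen for this example, as is the Device3/France row) when a region has no mapping.

```shell
#!/usr/bin/env bash
# Sketch only: temporary files stand in for the lut and input files.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf '%s\n' 'region1=USA' 'region2=UK' 'region3=Ireland' > "$tmpdir/lut"
printf '%s\n' 'Name,ID,Region' 'Device1,1,USA' 'Device2,2,UK' 'Device3,3,France' > "$tmpdir/input"

out=$(awk '
    NR == FNR { lut[$2] = $1; next }                # first file: build the lookup table
    FNR == 1  { print; next }                       # second file: keep the header as-is
    { $3 = ($3 in lut) ? lut[$3] : "unknown" }      # replace the region, or mark it
    1                                               # print the (modified) record
' FS== "$tmpdir/lut" FS=, OFS=, "$tmpdir/input")

printf '%s\n' "$out"
```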

Related

Reading CSV file in Shell Scripting

I am trying to read values from a CSV file dynamically based on the header. Here's how my input files can look like.
File 1:
name,city,age
john,New York,20
jane,London,30
or
File 2:
name,age,city,country
john,20,New York,USA
jane,30,London,England
I may not be following the best way to accomplish this but I tried the following code.
#!/bin/bash
{
read -r line
line=`tr ',' ' ' <<< $line`
while IFS=, read -r `$line`
do
echo $name
echo $city
echo $age
done
} < file.txt
I am expecting the above code to read the values of the header as the variable names. I know that the order of columns can differ between input files, but I expect every file to have name, city and age columns. Is this the right approach? If so, what is the fix for the above code, which fails with the error "line7: name: command not found"?
The issue is caused by the backticks. Bash will evaluate the contents and replace the backticks with the output from the command it just evaluated.
You can simply use the variable after the read command to achieve what you want:
#!/bin/bash
{
read -r line
line=`tr ',' ' ' <<< $line`
echo "$line"
while IFS=, read -r $line ; do
echo "person: $name -- $city -- $age"
done
} < file.txt
Some notes on your code:
The backtick syntax is legacy syntax, it is now preferred to use $(...) to evaluate commands. The new syntax is more flexible.
You can enable automatic script failure with set -euo pipefail (see here). This will make your script stop if it encounters an error.
Your code is currently very sensitive to invalid header data:
with a file like
n ame,age,city,country
john,20,New York,USA
jane,30,London,England
your script (or rather the version in the beginning of my answer) will run without errors but with invalid output.
It is also good practice to quote variables to prevent unwanted splitting.
To make it much more robust, you can change it as follows:
#!/bin/bash
set -euo pipefail
# -e and -o pipefail will make the script exit
# in case of command failure (or piped command failure)
# -u will exit in case a variable is undefined
# (in your case, if the header is invalid)
{
read -r line
readarray -d, -t header < <(printf "%s" "$line")
# using an array allows to detect if one of the header entries
# contains an invalid character
# the printf is needed because bash would add a newline to the
# command input if using a here-string (<<<).
while IFS=, read -r "${header[@]}" ; do
echo "$name"
echo "$city"
echo "$age"
done
} < file.txt
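Because each header entry is handed to read as a variable name, it can help to reject headers that are not valid shell identifiers before entering the loop. The sketch below shows one way to do that; is_valid_name and check_header are hypothetical helper names introduced for this example.

```shell
#!/usr/bin/env bash
# Sketch only: is_valid_name and check_header are hypothetical helpers.

# Succeeds when the argument is usable as a shell variable name.
is_valid_name() {
    [[ $1 =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]]
}

# Splits a header line on commas and validates every column name.
check_header() {
    local line=$1 field
    local -a header
    IFS=, read -r -a header <<< "$line"
    for field in "${header[@]}"; do
        if ! is_valid_name "$field"; then
            printf 'invalid column name: %s\n' "$field" >&2
            return 1
        fi
    done
}

check_header 'name,age,city,country' && echo "header ok"
check_header 'n ame,age,city,country' || echo "header rejected"
```

A check like this reports the offending column by name instead of relying on set -u to trip over an undefined variable later.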
A slightly different approach can let awk handle the field separation and ordering of the desired output given either of the input files. Below awk stores the desired output order in the f[] (field) array set in the BEGIN rule. Then on the first line in a file (FNR==1) the array a[] is deleted and filled with the headings from the current file. At that point you just loop over the field names in-order in the f[] array and output the corresponding field from the current line, e.g.
awk -F, '
BEGIN { f[1]="name"; f[2]="city"; f[3]="age" } # desired order
FNR==1 { # on first line read header
delete a # clear a array
for (i=1; i<=NF; i++) # loop over headings
a[$i] = i # index by heading, val is field no.
next # skip to next record
}
{
print "" # optional newline between outputs
for (i=1; i<=3; i++) # loop over desired field order
if (f[i] in a) # validate field in a array
print $a[f[i]] # output fields value
}
' file1 file2
Example Use/Output
In your case with the content you show in file1 and file2, you would have:
$ awk -F, '
> BEGIN { f[1]="name"; f[2]="city"; f[3]="age" } # desired order
> FNR==1 { # on first line read header
> delete a # clear a array
> for (i=1; i<=NF; i++) # loop over headings
> a[$i] = i # index by heading, val is field no.
> next # skip to next record
> }
> {
> print "" # optional newline between outputs
> for (i=1; i<=3; i++) # loop over desired field order
> if (f[i] in a) # validate field in a array
> print $a[f[i]] # output fields value
> }
> ' file1 file2
john
New York
20
jane
London
30
john
New York
20
jane
London
30
Where both files are read and handled identically despite having different field orderings. Let me know if you have further questions.
If using Bash version ≥ 4.2, it is possible to use an associative array to capture an arbitrary number of fields with their name as a key:
#!/usr/bin/env bash
# Associative array to store column names as keys, with their values
declare -A fields
# Array to store columns names with index
declare -a column_name
# Array to store row's values
declare -a line
# Commands block consuming CSV input
{
# Read first line to capture column names
IFS=, read -r -a column_name
# Process records
while IFS=, read -r -a line; do
# Store column values to corresponding field name
for ((i=0; i<${#column_name[@]}; i++)); do
# Fills fields' associative array
fields["${column_name[i]}"]="${line[i]}"
done
# Dump fields for debug|demo purpose
# Processing of each captured value could go there instead
declare -p fields
done
} < file.txt
Sample output with file 2
declare -A fields=([country]="USA" [city]="New York" [age]="20" [name]="john" )
declare -A fields=([country]="England" [city]="London" [age]="30" [name]="jane" )
For older Bash versions without associative arrays, use indexed column names instead:
#!/usr/bin/env bash
# Array to store columns names with index
declare -a column_name
# Array to store values for a line
declare -a value
# Commands block consuming CSV input
{
# Read first line to capture column names
IFS=, read -r -a column_name
# Process records
while IFS=, read -r -a value; do
# Print record separator
printf -- '--------------------------------------------------\n'
# Print captured field name and value
for ((i=0; i<"${#column_name[@]}"; i++)); do
printf '%-18s: %s\n' "${column_name[i]}" "${value[i]}"
done
done
} < file.txt
Output:
--------------------------------------------------
name : john
age : 20
city : New York
country : USA
--------------------------------------------------
name : jane
age : 30
city : London
country : England

Read values from a file, increment or change them and store them again in the same place

So, I have a bash script which reads several variables from different external files, increments or changes these variables and then stores the new values in the files.
Something like this:
var1=$(< file1)
var2=$(< file2)
var3=$(< file3)
# then for example:
((var1=var1+1))
((var1=var1-1))
var3=foo
echo $var1 > file1
echo $var2 > file2
echo $var3 > file3
This works just fine, but I find it a bit bulky, especially when there are a lot of variables stored like this. I think it would be more elegant to store all the values in a single file which could look something like this:
#File containing values
var1=1
var2=2
var3=foo
Unfortunately I can't figure out how to read the values from such a file and store the new values in the same place afterwards? I have looked into sed and awk but so far I couldn't find a solution that works in this particular case.
Any Suggestions?
An awk script can handle this: it finds every name=value line whose value is numeric and increments that value:
awk 'BEGIN {FS=OFS="="} NF==2 && $2+0 == $2 {++$2} 1' file
#File containing values
var1=2
var2=3
var3=foo
If you want to save changes inline then use this gnu-awk command:
awk -i inplace 'BEGIN {FS=OFS="="} NF==2 && $2+0 == $2 {++$2} 1' file
Explanation:
FS=OFS="=": Set input and output field separator to =
NF==2: Number of fields is 2
&&: ANDed with
$2+0 == $2: Find only numeric values
++$2: increment 2nd field
1: Print each line
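The -i inplace option is specific to GNU awk (4.1 or later). With other awk implementations, the usual workaround is to write the result to a temporary file and then rename it over the original. A minimal sketch, in which the sample file is created with mktemp only so that the example is self-contained:

```shell
#!/usr/bin/env bash
# Sketch only: "$file" stands in for the real values file.
file=$(mktemp)
tmp=$(mktemp)
trap 'rm -f "$file" "$tmp"' EXIT
printf '%s\n' '#File containing values' 'var1=1' 'var2=2' 'var3=foo' > "$file"

# Write the updated content to a temporary file, then replace the original
# only if awk succeeded.
awk 'BEGIN {FS=OFS="="} NF==2 && $2+0 == $2 {++$2} 1' "$file" > "$tmp" &&
    mv "$tmp" "$file"

cat "$file"
```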
Ok, since my question appears to have been imprecise, I accepted the answer by @anubhava as correct even though it didn't quite work for me. It seems to be the correct answer to my question as asked, and it pointed me in the right direction. Based on that answer I found a solution that works for me:
I now have a file named 'storage' containing all the variable names and values like this:
var1 1
var2 1
var3 foo
In my script there are three scenarios:
Incrementing or decrementing silently
A value is read from the file (by searching for the variable name and reading the last field in that line), silently incremented or decremented and saved to the file again:
awk '/var1/{++$NF} {print > "storage" }' storage # incrementing
awk '/var1/{--$NF} {print > "storage" }' storage # decrementing
Toggle between two values
Depending on user input a variable can be set to one of two values for example like this:
PS3="Please choose an option"
options=("Option 1" "Option 2")
select opt in "${options[@]}"
do
case $opt in
"Option 1")
awk '/var2/{$NF=0} {print > "storage" }' storage # this sets the value to 0
break
;;
"Option 2")
awk '/var2/{$NF=1} {print > "storage" }' storage # this sets the value to 1
break
;;
esac
done
Reading user input
The script reads a value from the file and prints it. Then it waits for user input and stores the input in the file
var3=$(awk '/var3/{print $NF}' storage) # reading the current value from the file and storing it in the variable
echo The current value is $var3
read -p "Please enter the new value" var3
awk -v var3="$var3" '/var3/{$NF=var3} {print > "storage" }' storage # writing the new value to the file
This does exactly what I was looking for. So, thank you @anubhava for pointing me in the right direction!
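Two details of the commands above may be worth guarding against: a pattern such as /var1/ also matches names like var10, and print > "storage" rewrites the file that awk is still reading, which can lose data once the file grows beyond what awk reads in one go. The sketch below compares the first field exactly and goes through a temporary file. update_value and get_value are hypothetical helper names, and the storage file is created with mktemp only to keep the example self-contained.

```shell
#!/usr/bin/env bash
# Sketch only: update_value and get_value are hypothetical helpers.
storage=$(mktemp)
trap 'rm -f "$storage"' EXIT
printf '%s\n' 'var1 1' 'var2 1' 'var3 foo' > "$storage"

# get_value KEY: print the last field of the line whose first field is KEY.
get_value() {
    awk -v key="$1" '$1 == key { print $NF }' "$storage"
}

# update_value KEY NEW_VALUE: rewrite the matching line via a temporary file.
# Note: awk -v interprets backslash escapes in the value it is given.
update_value() {
    local key=$1 value=$2 tmp
    tmp=$(mktemp) || return
    awk -v key="$key" -v value="$value" \
        '$1 == key { $NF = value } { print }' "$storage" > "$tmp" &&
        mv "$tmp" "$storage"
}

update_value var1 "$(( $(get_value var1) + 1 ))"   # increment
update_value var3 bar                              # store new input
cat "$storage"
```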

How can I assign each column value to Its name?

I have a MetaData.csv file that contains many values to perform an analysis. All I want are:
1- Reading column names and making variables similar to column names.
2- Put values in each column into variables as an integer that can be read by other commands. column_name=Its_value
MetaData.csv:
MAF,HWE,Geno_Missing,Inds_Missing
0.05,1E-06,0.01,0.01
I wrote the following codes but it doesn't work well:
#!/bin/bash
Col_Names=$(head -n 1 MetaData.csv) # Cut header (comma sep)
Col_Names=$(echo ${Col_Names//,/ }) # Convert header to space sep
Col_Names=($Col_Names) # Convert header to an array
for i in $(seq 1 ${#Col_Names[@]}); do
N="$(head -1 MetaData.csv | tr ',' '\n' | nl |grep -w
"${Col_Names[$i]}" | tr -d " " | awk -F " " '{print $1}')";
${Col_Names[$i]}="$(cat MetaData.csv | cut -d"," -f$N | sed '1d')";
done
Output:
HWE=1E-06: command not found
Geno_Missing=0.01: command not found
Inds_Missing=0.01: command not found
cut: 2: No such file or directory
cut: 3: No such file or directory
cut: 4: No such file or directory
=: command not found
Expected output:
MAF=0.05
HWE=1E-06
Geno_Missing=0.01
Inds_Missing=0.01
Problems:
1- I want to use array length (${#Col_Names[@]}) as the final iteration which is 5, but the array index starts from 0 (0-4). So the MAF column was not captured by the loop. The loop also iterates twice (once 0-4 and again 2-4!).
2- When I tried to call values in variables (echo $MAF), they were empty!
Any solution is really appreciated.
This produces the expected output you posted from the sample input you posted:
$ awk -F, -v OFS='=' 'NR==1{split($0,hdr); next} {for (i=1;i<=NF;i++) print hdr[i], $i}' MetaData.csv
MAF=0.05
HWE=1E-06
Geno_Missing=0.01
Inds_Missing=0.01
If that's not all you need then edit your question to clarify your requirements.
If I'm understanding your requirements correctly, would you please try something like:
#!/bin/bash
nr=1 # initialize input line number to 1
while IFS=, read -r -a ary; do # split the line on "," then assign "ary" to the fields
if (( nr == 1 )); then # handle the header line
col_names=("${ary[@]}") # assign column names
else # handle the body lines
for (( i = 0; i < ${#ary[@]}; i++ )); do
printf -v "${col_names[i]}" '%s' "${ary[i]}"
# assign the variable "${col_names[i]}" to the input field
done
# now you can access the values via its column name
echo "Fnames=$Fnames"
echo "MAF=$MAF"
fname_list+=("$Fnames") # create a list of Fnames
fi
(( nr++ )) # increment the input line number
done < MetaData.csv
echo "${fname_list[@]}" # print the list of Fnames
Output:
Fnames=19.vcf.gz
MAF=0.05
Fnames=20.vcf.gz
MAF=
Fnames=21.vcf.gz
MAF=
Fnames=22.vcf.gz
MAF=
19.vcf.gz 20.vcf.gz 21.vcf.gz 22.vcf.gz
The statement IFS=, read -a ary is mostly equivalent to your
first three lines; it splits the input on ",", and assigns the
array variable ary to the field values.
There are several ways to use a variable's value as a variable name
(Indirect Variable References). printf -v VarName Value is one of them.
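As a small illustration of that technique: passing the value through an explicit '%s' format keeps printf from interpreting characters such as % or \ in the data, and ${!name} reads the variable back by name. The names and values arrays below are made up for this sketch.

```shell
#!/usr/bin/env bash
# Sketch only: the names and values arrays are example data.
names=(MAF HWE Geno_Missing)
values=(0.05 1E-06 '100%')

# Assign each value to the variable named by the matching entry in names.
# With '%s' as the format, a value like "100%" is stored literally.
for i in "${!names[@]}"; do
    printf -v "${names[i]}" '%s' "${values[i]}"
done

# Read the variables back through indirect expansion.
for n in "${names[@]}"; do
    echo "$n=${!n}"
done
```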
[EDIT]
Based on the OP's updated input file, here is another version:
#!/bin/bash
nr=1 # initialize input line number to 1
while IFS=, read -r -a ary; do # split the line on "," then assign "ary" to the fields
if (( nr == 1 )); then # handle the header line
col_names=("${ary[@]}") # assign column names
else # handle the body lines
for (( i = 0; i < ${#ary[@]}; i++ )); do
printf -v "${col_names[i]}" '%s' "${ary[i]}"
# assign the variable "${col_names[i]}" to the input field
done
fi
(( nr++ )) # increment the input line number
done < MetaData.csv
for n in "${col_names[@]}"; do # iterate over the variable names
echo "$n=${!n}" # print variable name and its value
done
# you can also specify the variable names literally as follows:
echo "MAF=$MAF HWE=$HWE Geno_Missing=$Geno_Missing Inds_Missing=$Inds_Missing"
Output:
MAF=0.05
HWE=1E-06
Geno_Missing=0.01
Inds_Missing=0.01
MAF=0.05 HWE=1E-06 Geno_Missing=0.01 Inds_Missing=0.01
As for the output, the first four lines are printed by echo "$n=${!n}" and the last line is printed by echo "MAF=$MAF ....
You can choose either statement depending on your usage of the variables in the following code.
I don't really think you can implement a robust CSV reader/parser in Bash, but you can implement one that works to some extent with simple CSV files. For example, a very simple Bash-implemented CSV reader might look like this:
#!/bin/bash
set -e
ROW_NUMBER='0'
HEADERS=()
while IFS=',' read -ra ROW; do
if test "$ROW_NUMBER" == '0'; then
for (( I = 0; I < ${#ROW[@]}; I++ )); do
HEADERS["$I"]="${ROW[I]}"
done
else
declare -A DATA_ROW_MAP
for (( I = 0; I < ${#ROW[@]}; I++ )); do
DATA_ROW_MAP[${HEADERS["$I"]}]="${ROW[I]}"
done
# DEMO {
echo -e "${DATA_ROW_MAP['Fnames']}\t${DATA_ROW_MAP['Inds_Missing']}"
# } DEMO
unset DATA_ROW_MAP
fi
ROW_NUMBER=$((ROW_NUMBER + 1))
done
Note that it has multiple disadvantages:
it only works with ,-separated fields (truly "C"SV);
it cannot handle multiline records;
it cannot handle field escapes;
it considers the first row always represents a header row.
This is why many commands may produce and consume \0-delimited data just because this control character may be easier to use. Now what I'm not sure about is whether test is the only external command executed by bash (I believe it is, but it can be probably re-implemented using case so that no external test is executed?).
Example of use (with the demo output):
./read-csv.sh < MetaData.csv
19.vcf.gz 0.01
20.vcf.gz
21.vcf.gz
22.vcf.gz
I wouldn't recommend using this parser at all, but would recommend using a more CSV-oriented tool (Python would probably be the easiest choice; or, if your favorite language is R, as you mentioned, then that is probably another option for you: Run R script from command line).

Parse out key=value pairs into variables

I have a bunch of different kinds of files I need to look at periodically, and what they have in common is that the lines have a bunch of key=value type strings. So something like:
Version=2 Len=17 Hello Var=Howdy Other
I would like to be able to reference the names directly from awk... so something like:
cat some_file | ... | awk '{print Var, $5}' # prints Howdy Other
How can I go about doing that?
The closest you can get is to parse the variables into an associative array first thing every line. That is to say,
awk '{ delete vars; for(i = 1; i <= NF; ++i) { n = index($i, "="); if(n) { vars[substr($i, 1, n - 1)] = substr($i, n + 1) } } Var = vars["Var"] } { print Var, $5 }'
More readably:
{
delete vars; # clean up previous variable values
for(i = 1; i <= NF; ++i) { # walk through fields
n = index($i, "="); # search for =
if(n) { # if there is one:
# remember value by name. The reason I use
# substr over split is the possibility of
# something like Var=foo=bar=baz (that will
# be parsed into a variable Var with the
# value "foo=bar=baz" this way).
vars[substr($i, 1, n - 1)] = substr($i, n + 1)
}
}
# if you know precisely what variable names you expect to get, you can
# assign to them here:
Var = vars["Var"]
Version = vars["Version"]
Len = vars["Len"]
}
{
print Var, $5 # then use them in the rest of the code
}
$ cat file | sed -r 's/[[:alnum:]]+=/\n&/g' | awk -F= '$1=="Var"{print $2}'
Howdy Other
Or, avoiding the useless use of cat:
$ sed -r 's/[[:alnum:]]+=/\n&/g' file | awk -F= '$1=="Var"{print $2}'
Howdy Other
How it works
sed -r 's/[[:alnum:]]+=/\n&/g'
This places each key,value pair on its own line.
awk -F= '$1=="Var"{print $2}'
This reads the key-value pairs. Since the field separator is chosen to be =, the key ends up as field 1 and the value as field 2. Thus, we just look for lines whose first field is Var and print the corresponding value.
Since discussion in commentary has made it clear that a pure-bash solution would also be acceptable:
#!/bin/bash
case $BASH_VERSION in
''|[0-3].*) echo "ERROR: Bash 4.0 required" >&2; exit 1;;
esac
while read -r -a words; do # iterate over lines of input
declare -A vars=( ) # refresh variables for each line
set -- "${words[@]}" # update positional parameters
for word; do
if [[ $word = *"="* ]]; then # if a word contains an "="...
vars[${word%%=*}]=${word#*=} # ...then set it as an associative-array key
fi
done
echo "${vars[Var]} $5" # Here, we use content read from that line.
done <<<"Version=2 Len=17 Hello Var=Howdy Other"
The <<<"Input Here" could also be <file.txt, in which case lines in the file would be iterated over.
If you wanted to use $Var instead of ${vars[Var]}, then substitute printf -v "${word%%=*}" %s "${word#*=}" in place of vars[${word%%=*}]=${word#*=}, and remove references to vars elsewhere. Note that this doesn't allow for a good way to clean up variables between lines of input, as the associative-array approach does.
I will try to explain a very generic way to do this, which you can adapt easily if you want to print out other stuff.
Assume you have a string which has a format like this:
key1=value1 key2=value2 key3=value3
or more generic
key1_fs2_value1_fs1_key2_fs2_value2_fs1_key3_fs2_value3
With fs1 and fs2 two different field separators.
You would like to make a selection or some operations with these values. To do this, the easiest is to store these in an associative array:
array["key1"] => value1
array["key2"] => value2
array["key3"] => value3
array["key1","full"] => "key1=value1"
array["key2","full"] => "key2=value2"
array["key3","full"] => "key3=value3"
This can be done with the following function in awk:
function str2map(str,fs1,fs2,map, n,tmp) {
n=split(str,map,fs1)
for (;n>0;n--) {
split(map[n],tmp,fs2);
map[tmp[1]]=tmp[2]; map[tmp[1],"full"]=map[n]
delete map[n]
}
}
So, after processing the string, you have the full flexibility to do operations in any way you like:
awk '
function str2map(str,fs1,fs2,map, n,tmp) {
n=split(str,map,fs1)
for (;n>0;n--) {
split(map[n],tmp,fs2);
map[tmp[1]]=tmp[2]; map[tmp[1],"full"]=map[n]
delete map[n]
}
}
{ str2map($0," ","=",map) }
{ print map["Var","full"] }
' file
The advantage of this method is that you can easily adapt your code to print any other key you are interested in, or even make selections based on this, example:
(map["Version"] < 3) { print map["var"]/map["Len"] }
The simplest and easiest way is to use the string substitution like this:
property='my.password.is=1234567890=='
name=${property%%=*}
value=${property#*=}
echo "'$name' : '$value'"
The output is:
'my.password.is' : '1234567890=='
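The choice of operators matters when the value itself contains = signs. A short sketch comparing the four related expansions on the same string (short_cut and long_cut are names made up for this example):

```shell
#!/usr/bin/env bash
property='my.password.is=1234567890=='

name=${property%%=*}        # drop the longest suffix matching "=*"
value=${property#*=}        # drop the shortest prefix matching "*="
short_cut=${property%=*}    # shortest suffix: only the final "=" is dropped
long_cut=${property##*=}    # longest prefix: everything up to the last "=" is dropped

printf '%s\n' "name=$name" "value=$value" "short_cut=$short_cut" "long_cut=$long_cut"
```

Only the %% and single # pair splits at the first =, which is what keeps the trailing == inside the value.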
Using bash's set command, we can split the line into positional parameters like awk.
For each word, we'll try to read a name value pair delimited by =.
When we find a value, assign it to the variable named $key using bash's printf -v feature.
#!/usr/bin/env bash
line='Version=2 Len=17 Hello Var=Howdy Other'
set -- $line
for word in "$@"; do
IFS='=' read -r key val <<< "$word"
test -n "$val" && printf -v "$key" '%s' "$val"
done
echo "$Var $5"
output
Howdy Other
SYNOPSIS
an awk-based solution that doesn't require manually checking the fields to locate the desired key pair :
approach being avoid splitting unnecessary fields or arrays - only performing regex match via function call when needed
only returning FIRST occurrence of input key value. Subsequent matches along the row are NOT returned
i just called it S() cuz it's the closest letter to $
I only included an array (_) of the 3 test values for demo purposes. Those aren't needed. In fact, no state information is being kept at all
caveat being : key-match must be exact - this version of the code isn't for case-insensitive or fuzzy/agile matching
Tested and confirmed working on
- gawk 5.1.1
- mawk 1.3.4
- mawk-2/1.9.9.6
- macos nawk
CODE
# gawk profile, created Fri May 27 02:07:53 2022
{m,n,g}awk '
function S(__,_) {
return \
! match($(_=_<_), "(^|["(_="[:blank:]]")")"(__)"[=][^"(_)"*") \
? "^$" \
: substr(__=substr($-_, RSTART, RLENGTH), index(__,"=")+_^!_)
}
BEGIN { OFS = "\f" # This array is only for testing
_["Version"] _["Len"] _["Var"] # purposes. Feel free to discard at will
} {
for (__ in _) {
print __, S(__) } }'
OUTPUT
Var
Howdy
Len
17
Version
2
So either call the fields in BAU fashion
- $5, $0, $NF, etc
or call S(QUOTED_KEY_VALUE), case-sensitive, like S("Version") to get back 2.
As a safeguard, to prevent mis-interpreting null strings or invalid inputs as $0, a non-match returns ^$ instead of an empty string.
As a bonus, it can safely handle values in multibyte unicode, both for values and even for keys, regardless of whether ur awk is UTF-8-aware or not :
1 ✜
🤡
2 Version
2
3 Var
Howdy
4 Len
17
5 ✜=🤡 Version=2 Len=17 Hello Var=Howdy Other
I know this is particularly regarding awk but mentioning this as many people come here for solutions to break down name = value pairs ( with / without using awk as such).
I found below way simple straight forward and very effective in managing multiple spaces / commas as well -
Source: http://jayconrod.com/posts/35/parsing-keyvalue-pairs-in-bash
changes="foo=red bar=green baz=blue"
#use below if var is in CSV (instead of space as delim)
changes=`echo $changes | tr ',' ' '`
for change in $changes; do
set -- `echo $change | tr '=' ' '`
echo "variable name == $1 and variable value == $2"
#can assign value to a variable like below
eval my_var_$1=$2;
done
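Since eval executes whatever ends up in the name or value, it is risky when the input is not fully trusted. One alternative, sketched below under the assumption that Bash 4 or later is available, stores the pairs in an associative array instead of creating variables; settings is a name chosen for this example.

```shell
#!/usr/bin/env bash
# Sketch only: "settings" is a hypothetical array name; requires Bash 4+.
changes="foo=red bar=green baz=blue"

declare -A settings
for pair in $changes; do
    # Skip words without "=" and words with an empty key.
    [[ $pair == ?*=* ]] || continue
    settings[${pair%%=*}]=${pair#*=}
done

echo "foo is ${settings[foo]}, bar is ${settings[bar]}, baz is ${settings[baz]}"
```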

set multiple variables from one awk command?

This is a very common script:
#!/bin/bash
teststr="col1 col2"
var1=`echo ${teststr} | awk '{print $1}'`
var2=`echo ${teststr} | awk '{print $2}'`
echo var1=${var1}
echo var2=${var2}
However, I don't like this, especially when there are more fields to parse.
I guess there should be a better way like:
(var1,var2)=`echo ${teststr} | awk '{print $1 $2}'`
(in my imagination)
Is that so?
Thanks for any help to improve efficiency and save some CPU power.
This might work for you:
var=(col0 col1 col2)
echo "${var[1]}"
col1
Yes, you can, but the best practice is to use the awk way to pass variables to awk.
Example using shell script variables
awk -v awkVar1="$scriptVar1" -v awkVar2="$scriptVar2" '<your awk code>'
Example using environment variables
awk 'BEGIN { awkVar1 = ENVIRON["ENV_VAR1"]; awkVar2 = ENVIRON["ENV_VAR2"] } <your awk code>'
It's possible to use script and environment variables at the same time
awk -v awkVar2="$scriptVar2" 'BEGIN { awkVar1 = ENVIRON["ENV_VAR1"] } <your awk code>'
You may find bash tricks to circumvent the awk way to do it, but it's not safe.
Explanation and more examples
Awk works this way because it's a programming language by itself and has its own way to use variables 'inside' awk statements.
By 'inside' i mean the part between the single quotes.
Let's see an example, where we turn off DHCP in a config file, all done using variables in a shell script. I'm going to explain the last line of code.
The script isn't optimal; its main purpose is to use script variables. Explaining how the script does its job is out of scope of this answer; the focus is on explaining the use of variables.
#!/bin/bash
# set some variables
# set path to the config file to edit
CONFIG_FILE=/etc/netplan/01-netcfg.yaml
# find the line number of the line to change using awk and assign it to a variable
DHCP_LINE=$(awk '/dhcp4: yes/{print FNR}' $CONFIG_FILE)
# get the number of spaces used for indentation using awk and assign it to a variable
SPACES=$(awk -v awkDHCP_LINE="$DHCP_LINE" 'FNR==awkDHCP_LINE {print match($0,/[^ ]|$/)-1}' $CONFIG_FILE)
# find DHCP setting and turn it off if needed
awk -v awkDHCP_LINE="$DHCP_LINE" -v awkSPACES="$SPACES" 'FNR==awkDHCP_LINE {sub("dhcp4: yes", "dhcp4: no")}' $CONFIG_FILE
Let's break this last line up to pieces for explanation.
awk -v awkDHCP_LINE="$DHCP_LINE" -v awkSPACES="$SPACES"
This part above assigns the value of the DHCP_LINE script variable to the awkDHCP_LINE awk variable and the value of the SPACES script variable to the awkSPACES awk variable.
Please note, that the SPACES variable is passed to awk for the sake of showing how to pass multiple variables only; the awk command doesn't process it.
'FNR==awkDHCP_LINE {sub("dhcp4: yes", "dhcp4: no")}'
This one above is the 'inside' part of awk where the variable(s) passed to awk can be used.
$CONFIG_FILE
This part is outside awk, a generic script variable is used to specify the file that should be processed.
I hope this clears things a bit :)
Note: if you have lots of variables to pass, the solution provided by @potong may prove a better approach depending on your use case.
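A quick way to see both mechanisms side by side: -v copies a shell value into an awk variable, while ENVIRON is an array that awk code reads directly, so it is indexed inside the program text rather than on the command line. The variable names and values in this sketch are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch only: scriptVar and ENV_VAR1 hold example values.
scriptVar='from script'
export ENV_VAR1='from environment'

# awkVar is set through -v; ENV_VAR1 is looked up in ENVIRON inside awk.
result=$(awk -v awkVar="$scriptVar" \
    'BEGIN { print awkVar " / " ENVIRON["ENV_VAR1"] }')

echo "$result"
```

One difference to keep in mind: awk processes backslash escape sequences in values given with -v, whereas values read from ENVIRON are taken as they are.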
Bash has Array Support, We just need to supply values dynamically :)
function test_set_array_from_awk(){
# Note : -a is required as declaring array
declare -a myArr;
# Hard Coded Values
# myArr=( "Foo" "Bar" "other" );
# echo "${myArr[1]}" # Print Bar
# Dynamic values
myArr=( $(echo "" | awk '{print "Foo"; print "Bar"; print "Fooo-And-Bar"; }') );
# Value @index 0
echo "${myArr[0]}" # Print Foo
# Value @index 1
echo "${myArr[1]}" # Print Bar
# Array Length
echo ${#myArr[@]} # Print 3 as array length
# Safe Reading with Default value
echo "${myArr[10]-"Some-Default-Value"}" # Print Some-Default-Value
echo "${myArr[10]-0}" # Print 0
echo "${myArr[10]-''}" # Print ''
echo "${myArr[10]-}" # Print nothing
# With Dynamic Index
local n=2
echo "${myArr["${n}"]-}" # Print Fooo-And-Bar
}
# calling test function
test_set_array_from_awk
Bash Array Documentation : http://tldp.org/LDP/abs/html/arrays.html
You can also use the shell's set builtin to place whitespace-separated (or, more accurately, IFS-separated) words into the variables $1, $2 and so on:
#!/bin/bash
teststr="col1 col2"
set -- $teststr
echo "$1" # col1
echo "$2" # col2
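When the fields really do have to come out of awk, one way to fill several variables from a single awk call is to let read split its output. This is a minimal sketch that assumes Bash (process substitution and here-strings) and fields that contain no whitespace of their own; the three-column teststr is example data.

```shell
#!/usr/bin/env bash
# Sketch only: teststr is example data; requires Bash for < <(...) and <<<.
teststr="col1 col2 col3"

# awk prints the selected fields on one line; read splits them into variables.
read -r var1 var2 < <(awk '{ print $1, $3 }' <<< "$teststr")

echo "var1=$var1"
echo "var2=$var2"
```

This runs awk once regardless of how many variables are assigned, which addresses the concern about repeating the pipeline per field.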
