I want to read the following variables from a data file in bash.
#/tmp/input.dat
$machie=1234-567*890ABC
$action=REPLACE
$location=test_location
#!/usr/bin/env bash
case $BASH_VERSION in
''|[0-3].*) echo "ERROR: Bash 4.0 or newer is required" >&2; exit 1;;
esac
# read input filename from command line, default to "/tmp/input.dat"
input_file=${1:-/tmp/input.dat}
declare -A vars=()
while IFS= read -r line; do
[[ $line = "#"* ]] && continue # Skip comments in input
[[ $line = *=* ]] || continue # Skip lines not containing an "="
line=${line#'$'} # strip leading "$"
key=${line%%=*} # remove everything after first "=" to get key
value=${line#*=} # remove everything before first "=" to get value
vars[$key]=$value # add key/value pair to associative array
done <"$input_file"
# print the variables we read for debugging purposes
declare -p vars >&2
echo "Operation is ${vars[action]}; location is ${vars[location]}" >&2
See:
BashFAQ #1 - How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?
BashFAQ #6 - How can I use variable variables (indirect variables, pointers, references) or associative arrays?; here, we're using associative arrays, but you could use the same technique to assign directly to named variables.[1]
Parameter expansion, the syntax used for isolating the "key" and "value" sections of each line; also covered in BashFAQ #100.
[1] - Note that if you aren't going to use associative arrays, as suggested in this answer, for security reasons it's best to use a prefixed namespace: printf -v "var_$key" %s "$value" -- generating variable names you would dereference as $var_action or $var_location -- is much safer than printf -v "$key" %s "$value", as the former ensures that your data file can't overwrite a security-critical environment variable such as PATH or LD_PRELOAD, by way of causing such attempts to harmlessly set $var_PATH or $var_LD_PRELOAD.
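The prefixed-namespace variant from footnote [1] can be sketched as a minimal, self-contained demo (the sample input mirrors the question's file, inlined here so the snippet runs on its own):

```shell
#!/usr/bin/env bash
# Sketch of the footnote's prefixed-namespace variant: values are assigned
# via printf -v "var_$key", so a hostile data file can at worst set
# var_PATH or var_LD_PRELOAD, never PATH or LD_PRELOAD themselves.
input=$'#/tmp/input.dat\n$machie=1234-567*890ABC\n$action=REPLACE\n$location=test_location'
while IFS= read -r line; do
  [[ $line = "#"* ]] && continue    # skip comments
  [[ $line = *=* ]] || continue     # skip lines without an "="
  line=${line#'$'}                  # strip leading "$"
  key=${line%%=*}
  value=${line#*=}
  printf -v "var_$key" %s "$value"  # sets var_machie, var_action, var_location
done <<<"$input"
echo "Operation is $var_action; location is $var_location"
```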
I'm trying to read a line from output, which looks like this (it comes from slurm, for those who are familiar with it):
cpu=00:00:00,energy=0,fs/disk=1389.75K,mem=556K,pages=0,vmem=203640K
After reading the line, there should be variables cpu, energy, a.s.o. with the respective value.
Initially I tried to source the output via piping:
line=$(tr ',/' ';_' <<< "cpu=00:00:00,energy=0,fs/disk=1389.75K,mem=556K,pages=0,vmem=203640K")
source <<< $line
. <<< $line
But that doesn't work since source and . need a file. So my working attempt now is:
file=$(mktemp)
{ sacct [...] | tr ',/' ';_' > $file && source $file && rm $file; } || echo "Error"
My question would be: is there a better way to achieve the same result without creating a temporary file?
Another way that avoids the unsafe eval:
#!/usr/bin/env bash
line="cpu=00:00:00,energy=0,fs/disk=1389.75K,mem=556K,pages=0,vmem=203640K"
# Turn the / into an underscore since / can't be in an identifier
# Then read the line into an array splitting on commas
IFS=, read -ra vars <<<"${line//\//_}"
# Define all the variables
declare -- "${vars[@]}"
# And display them
declare -p cpu energy fs_disk mem pages vmem
prints out
declare -- cpu="00:00:00"
declare -- energy="0"
declare -- fs_disk="1389.75K"
declare -- mem="556K"
declare -- pages="0"
declare -- vmem="203640K"
If using Bash version 4.0 or newer, it is safer to parse the line into an associative array, which can store arbitrary key strings. Otherwise, some keys will fail dramatically: fs/disk is not a valid Bash identifier and cannot be used as a variable name:
#!/usr/bin/env bash
line='cpu=00:00:00,energy=0,fs/disk=1389.75K,mem=556K,pages=0,vmem=203640K'
# Read the line into a regular array, splitting keys and values at , and = signs
IFS=,= read -r -a kv <<<"$line"
# Generate the associative array element declarations
# by printf %q-quoting each [key]=value pair
# shellcheck disable=SC2155 # Safely generated declaration
declare -A map="($(
printf '[%q]=%q ' "${kv[@]}"
))"
# Print out a nice output for demo purpose:
for k in "${!map[@]}"; do
printf '%-8s %s\n' "$k" "${map[$k]}"
done
Output:
fs/disk 1389.75K
vmem 203640K
cpu 00:00:00
pages 0
energy 0
mem 556K
Alternate method to populate the Associative array from the line, using a loop:
declare -A map=()
while IFS='=' read -r -d, k v; do
map["$k"]="$v"
done <<<"$line,"  # trailing comma so read -d, also captures the last pair
try
eval `echo 'cpu=00:00:00,energy=0,fs/disk=1389.75K,mem=556K,pages=0,vmem=203640K' | tr ',/' ';_'`
I have an external program which hands me a bunch of information via stdin ($1) to my script.
I get a line like the following:
session number="2018/06/20-234",data name="XTRDF_SLSLWX3_FSLO",data group="Testing",status="Error",data type="0"
Now I want to use this line split into single variables.
I thought about two ways until now:
INPUT='session number="2018/06/20-234",data name="XTRDF_SLSLWX3_FSLO",data group="Testing",status="Error",data type="0"'
echo "$INPUT" | tr ',' '\n' | tr ' ' '_' > vars.tmp
set vars.tmp
This will do the job until I have a data_name value with a space in it: my tr command automatically changes the space to _ and the assigned variable is no longer correct in upcoming checks.
So I thought about loading the input into an array, doing some pattern substitution on the array to delete everything up to and including the =, and doing the variable assignments afterwards:
INPUT='session number="2018/06/20-234",data name="XTRDF_SLSLWX3_FSLO",data group="Testing",status="Error",data type="0"'
IFS=',' read -r -a array <<< "$INPUT"
array=("${array[@]/#*=/}")
session_number="${array[0]}"
data_name="${array[1]}"
....
But now I get strange behaviour cutting the input if there is a = somewhere in the data name or data group, and I have no idea if this is the way to do it. I'm pretty sure there should be no = in the data name or data group fields (unlike a space), but you never know...
How could I do this?
Simple Case: No Commas Within Strings
If you don't need to worry about commas or literal quotes inside the quoted data, the following handles the case you asked about (stray =s within the data) sanely:
#!/usr/bin/env bash
case $BASH_VERSION in ''|[123].*) echo "ERROR: Requires bash 4.0 or newer" >&2; exit 1;; esac
input='session number="2018/06/20-234",data name="XTRDF_SLSLWX3_FSLO",data group="Testing",status="Error",data type="0"'
declare -A data=( )
IFS=, read -r -a pieces <<<"$input"
for piece in "${pieces[@]}"; do
key=${piece%%=*} # delete everything past the *first* "=", ignoring later ones
value=${piece#*=} # delete everything before the *first* "=", ignoring later ones
value=${value#'"'} # remove leading quote
value=${value%'"'} # remove trailing quote
data[$key]=$value
done
declare -p data
...results in (whitespace added for readability, otherwise literal output):
declare -A data=(
["data type"]="0"
[status]="Error"
["data group"]="Testing"
["data name"]="XTRDF_SLSLWX3_FSLO"
["session number"]="2018/06/20-234"
)
Handling Commas Inside Quotes
Now, let's say you do need to worry about commas inside your quotes! Consider the following input:
input='session number="123",error="Unknown, please try again"'
Now, if we try to split on commas without considering their position, we'll get error="Unknown as one field, with please try again left over as a stray value.
To solve this, we can use GNU awk with the FPAT feature.
#!/usr/bin/env bash
case $BASH_VERSION in ''|[123].*) echo "ERROR: Requires bash 4.0 or newer" >&2; exit 1;; esac
input='session number="123",error="Unknown, please try again"'
# Why do so many awk people try to write one-liners? Isn't this more readable?
awk_script='
BEGIN {
FPAT = "[^=,]+=(([^,]+)|(\"[^\"]+\"))"
}
{
printf("%s\0", NF)
for (i = 1; i <= NF; i++) {
printf("%s\0", $i)
}
}
'
while :; do
IFS= read -r -d '' num_fields || break
declare -A data=( )
for ((i=0; i<num_fields; i++)); do
IFS= read -r -d '' piece || break
key=${piece%%=*}
value=${piece#*=}
value=${value#'"'}
value=${value%'"'}
data[$key]=$value
done
declare -p data # maybe invoke a callback here, before going on to the next line
done < <(gawk "$awk_script" <<<"$input")
...whereafter output is properly:
declare -A data=(["session number"]="123" [error]="Unknown, please try again" )
I generate a text file based on nodes in the cluster.
cat primary_nodes.txt
clusterNodea
clusterNodeb
..
....
clusterNoden
When I try to generate a variable for each line, it gives me the following output:
while read PRIMARY_NODE$((i++)); do
echo $PRIMARY_NODE1
echo $PRIMARY_NODE2
done < primary_node.txt
clusterNodea
clusterNodea
clusterNodeb
What I want is:
It should return the total no. of nodes
Each line should be assigned to a PRIMARY_NODE1..n variable incrementally
Return all variables with their values.
The Right Thing: An Array
In bash 4:
readarray -t primary_node <primary_node.txt
Thereafter:
echo "${primary_node[0]}" # clusterNodea
echo "${primary_node[1]}" # clusterNodeb
Or to iterate over the values:
for node in "${primary_node[@]}"; do
echo "Processing node $node"
done
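The asker's first requirement (the total number of nodes) falls out of the array for free; a quick self-contained sketch (the demo file path under /tmp is assumed here):

```shell
#!/usr/bin/env bash
# Demo: the node count the question asks for is just the array's length.
# /tmp/primary_node_demo.txt is a throwaway file created for this sketch.
printf '%s\n' clusterNodea clusterNodeb clusterNoden >/tmp/primary_node_demo.txt
readarray -t primary_node </tmp/primary_node_demo.txt
echo "${#primary_node[@]} nodes"   # 3 nodes
echo "${primary_node[2]}"          # clusterNoden
```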
The Wrong Thing: Distinct Variables
i=0
while IFS= read -r line; do
printf -v "primary_node$((i++))" '%s' "$line"
done <primary_node.txt
echo "$primary_node1"
References
BashFAQ #1 - "How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?"
BashFAQ #5 - "How can I use array variables?"
BashFAQ #6 - "How can I use variable variables (indirect variables, pointers, references) or associative arrays?"
Arrays are your friend here:
readarray -t MYARRAY <primary_nodes.txt
$ echo ${MYARRAY[0]}
clusterNodea
$ echo ${MYARRAY[1]}
clusterNodeb
Note that it's possible to also use the following:
$ MYARRAY=($(cat primary_nodes.txt))
however, this should be avoided, as file globbing and literal whitespace can give unexpected results, as Charles Duffy points out below.
A bit of dynamic variable assignment using declare and indirect expansion:
#!/bin/bash
i=1
while IFS= read -r line; do
# The 'declare' syntax creates variables on the fly with the values
# read from the file
declare primaryNode$((i++))="$line"
done <file
count="$((i-1))"
# Now the variables are created; those can be individually accessed as
# '$primaryNode1'..'$primaryNoden'; but to print it on a loop, use
# indirect expansion using ${!var} syntax
for ((idx=1; idx<=count; idx++)); do
temp=primaryNode$idx
printf "PRIMARY NODE%s=%s\n" "$idx" "${!temp}"
done
I'm not exactly sure how to word the title, but what I am trying to do is to set username to $DEV_ENVIRONMENT, $STAGE_ENVIRONMENT, or $PROD_ENVIRONMENT respectively, each of which is defined in my properties file.
My function would take in one argument (DEV, STAGE, or PROD) and check the hostname against the edges defined in $_ENVIRONMENT.
while IFS=',' read -ra line
do
for i in "${line[@]}"
do
if [ $(hostname) = $i ]
then
username="$1"_USERNAME
break
fi
done
done <<< "${1}_ENVIRONMENT"
So for instance, if I pass in DEV, then I would like username set to $DEV_USERNAME and I'd like the while loop to search through the nodes defined in $DEV_ENVIRONMENT, both of which would be values read in from my properties file.
Bash supports indirect expansion, so you can do:
var=${1}_ENVIRONMENT
username=${!var}
rather than the slightly more cumbersome and potentially dangerous eval:
eval username=\$${1}_ENVIRONMENT
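As a quick standalone sketch of the indirection form (DEV_ENVIRONMENT and its value here are stand-ins for whatever the properties file defines):

```shell
#!/usr/bin/env bash
# Sketch of indirect expansion: DEV_ENVIRONMENT stands in for a value
# that would normally come from the properties file.
DEV_ENVIRONMENT='devedge1,devedge2'
set -- DEV                  # simulate the function's first argument
var=${1}_ENVIRONMENT        # var holds the *name* "DEV_ENVIRONMENT"
echo "${!var}"              # expands the variable named by $var
```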
So for your code:
envvar=${1}_ENVIRONMENT
while IFS=',' read -ra line
do
for i in "${line[@]}"
do
if [ "$(hostname)" = "$i" ]
then
var="$1"_USERNAME
username=${!var}
break
fi
done
done <<< "${!envvar}"  # indirection again: expand e.g. $DEV_ENVIRONMENT, not the literal name
I have a bash script that is being used in a CGI. The CGI sets the $QUERY_STRING environment variable by reading everything after the ? in the URL. For example, http://example.com?a=123&b=456&c=ok sets QUERY_STRING=a=123&b=456&c=ok.
Somewhere I found the following ugliness:
b=$(echo "$QUERY_STRING" | sed -n 's/^.*b=\([^&]*\).*$/\1/p' | sed "s/%20/ /g")
which will set $b to whatever was found in $QUERY_STRING for b. However, my script has grown to have over ten input parameters. Is there an easier way to automatically convert the parameters in $QUERY_STRING into environment variables usable by bash?
Maybe I'll just use a for loop of some sort, but it'd be even better if the script was smart enough to automatically detect each parameter and maybe build an array that looks something like this:
${parm[a]}=123
${parm[b]}=456
${parm[c]}=ok
How could I write code to do that?
Try this:
saveIFS=$IFS
IFS='=&'
parm=($QUERY_STRING)
IFS=$saveIFS
Now you have this:
parm[0]=a
parm[1]=123
parm[2]=b
parm[3]=456
parm[4]=c
parm[5]=ok
In Bash 4, which has associative arrays, you can do this (using the array created above):
declare -A array
for ((i=0; i<${#parm[@]}; i+=2))
do
array[${parm[i]}]=${parm[i+1]}
done
which will give you this:
array[a]=123
array[b]=456
array[c]=ok
Edit:
To use indirection in Bash 2 and later (using the parm array created above):
for ((i=0; i<${#parm[@]}; i+=2))
do
declare var_${parm[i]}=${parm[i+1]}
done
Then you will have:
var_a=123
var_b=456
var_c=ok
You can access these directly:
echo $var_a
or indirectly:
for p in a b c
do
name="var_$p"
echo ${!name}
done
If possible, it's better to avoid indirection since it can make code messy and be a source of bugs.
You can break $QUERY down using IFS. For example, setting it to &:
$ QUERY="a=123&b=456&c=ok"
$ echo $QUERY
a=123&b=456&c=ok
$ IFS="&"
$ set -- $QUERY
$ echo $1
a=123
$ echo $2
b=456
$ echo $3
c=ok
$ array=($@)
$ for i in "${array[@]}"; do IFS="=" ; set -- $i; echo $1 $2; done
a 123
b 456
c ok
And you can save to a hash/dictionary in Bash 4+
$ declare -A hash
$ for i in "${array[@]}"; do IFS="=" ; set -- $i; hash[$1]=$2; done
$ echo ${hash["b"]}
456
Please don't use the evil eval junk.
Here's how you can reliably parse the string and get an associative array:
declare -A param
while IFS='=' read -r -d '&' key value && [[ -n "$key" ]]; do
param["$key"]=$value
done <<<"${QUERY_STRING}&"
If you don't like the key check, you could do this instead:
declare -A param
while IFS='=' read -r -d '&' key value; do
param["$key"]=$value
done <<<"${QUERY_STRING:+"${QUERY_STRING}&"}"
Listing all the keys and values from the array:
for key in "${!param[@]}"; do
echo "$key: ${param[$key]}"
done
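Putting the pieces above together as a runnable self-check (the query string is a demo value, not from a real CGI):

```shell
#!/usr/bin/env bash
# Self-contained run of the read -d '&' loop with a demo query string.
QUERY_STRING='a=123&b=456&c=ok'
declare -A param
while IFS='=' read -r -d '&' key value && [[ -n "$key" ]]; do
  param["$key"]=$value
done <<<"${QUERY_STRING}&"
for key in "${!param[@]}"; do
  echo "$key: ${param[$key]}"
done
```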
I packaged the sed command up into another script:
$ cat getvar.sh
s='s/^.*'${1}'=\([^&]*\).*$/\1/p'
echo $QUERY_STRING | sed -n $s | sed "s/%20/ /g"
and I call it from my main cgi as:
id=`./getvar.sh id`
ds=`./getvar.sh ds`
dt=`./getvar.sh dt`
...etc, etc - you get idea.
works for me even with a very basic busybox appliance (my PVR in this case).
To convert the contents of QUERY_STRING into bash variables, use the following command:
eval $(echo ${QUERY_STRING//&/;})
The inner step, echo ${QUERY_STRING//&/;}, substitutes all ampersands with semicolons producing a=123;b=456;c=ok which the eval then evaluates into the current shell.
The result can then be used as bash variables.
echo $a
echo $b
echo $c
The assumptions are:
values will never contain '&'
values will never contain ';'
QUERY_STRING will never contain malicious code
While the accepted answer is probably the most beautiful one, there might be cases where security is super-important, and it needs to be well-visible from your script too.
In such a case, first I wouldn't use bash for the task, but if it must be done for some reason, it might be better to avoid the newer array/dictionary features, because you can't be sure exactly how they are escaped.
In this case, the good old primitive solutions might work:
QS="${QUERY_STRING}"
while [ "${QS}" != "" ]
do
nameval="${QS%%&*}"
QS="${QS#$nameval}"
QS="${QS#&}"
name="${nameval%%=*}"
val="${nameval#$name}"
val="${val#=}"
# and here we have $name and $val as names and values
# ...
done
This iterates over the name-value pairs of QUERY_STRING, and there is no way to circumvent it with any tricky escape sequence: the double quote is a very strong thing in bash, and apart from a single variable-name substitution, which is fully controlled by us, nothing can be tricked.
Furthermore, you can inject your own processing code into "# ...". This enables you to allow only your own, well-defined (and, ideally, short) list of the allowed variable names. Needless to say, LD_PRELOAD shouldn't be one of them. ;-)
Furthermore, no variable will be exported, and only QS, nameval, name and val are used.
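A runnable check of the primitive loop with a sample query string (collecting results into one string purely for the demo, and taking everything after the first = as the value):

```shell
#!/usr/bin/env bash
# Self-contained check of the primitive parsing loop; QUERY_STRING is a demo value.
QUERY_STRING='a=123&b=456&c=ok'
QS="${QUERY_STRING}"
result=""
while [ "${QS}" != "" ]; do
  nameval="${QS%%&*}"       # first name=value pair
  QS="${QS#"$nameval"}"     # drop it from the remainder
  QS="${QS#&}"              # drop the separating "&"
  name="${nameval%%=*}"
  val="${nameval#*=}"       # everything after the first "="
  result="${result}${name}=${val};"
done
echo "$result"   # a=123;b=456;c=ok;
```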
Following the correct answer, I've made some changes myself to support array variables, as in this other question. I also added a decode function, whose author I could not find to give credit.
Code appears somewhat messy, but it works. Changes and other recommendations would be greatly appreciated.
function cgi_decodevar() {
[ $# -ne 1 ] && return
local v t h
# replace all + with whitespace and append %%
t="${1//+/ }%%"
while [ ${#t} -gt 0 -a "${t}" != "%" ]; do
v="${v}${t%%\%*}" # digest up to the first %
t="${t#*%}" # remove digested part
# decode if there is anything to decode and if not at end of string
if [ ${#t} -gt 0 -a "${t}" != "%" ]; then
h=${t:0:2} # save first two chars
t="${t:2}" # remove these
v="${v}"`echo -e \\\\x${h}` # convert hex to special char
fi
done
# return decoded string
echo "${v}"
return
}
saveIFS=$IFS
IFS='=&'
VARS=($QUERY_STRING)
IFS=$saveIFS
for ((i=0; i<${#VARS[@]}; i+=2))
do
curr="$(cgi_decodevar ${VARS[i]})"
next="$(cgi_decodevar ${VARS[i+2]})"
prev="$(cgi_decodevar ${VARS[i-2]})"
value="$(cgi_decodevar ${VARS[i+1]})"
array=${curr%"[]"}
if [ "$curr" == "$next" ] && [ "$curr" != "$prev" ] ;then
j=0
declare var_${array}[$j]="$value"
elif [ $i -gt 1 ] && [ "$curr" == "$prev" ]; then
j=$((j + 1))
declare var_${array}[$j]="$value"
else
declare var_$curr="$value"
fi
done
I would simply replace the & with ;. It will become something like:
a=123;b=456;c=ok
So now you just need to evaluate and read your vars:
eval `echo "${QUERY_STRING}"|tr '&' ';'`
echo $a
echo $b
echo $c
A nice way to handle CGI query strings is to use Haserl which acts as a wrapper around your Bash cgi script, and offers convenient and secure query string parsing.
To bring this up to date, if you have a recent Bash version then you can achieve this with regular expressions:
q="$QUERY_STRING"
re1='^(\w+=\w+)&?'
re2='^(\w+)=(\w+)$'
declare -A params
while [[ $q =~ $re1 ]]; do
q=${q##*${BASH_REMATCH[0]}}
[[ ${BASH_REMATCH[1]} =~ $re2 ]] && params+=([${BASH_REMATCH[1]}]=${BASH_REMATCH[2]})
done
If you don't want to use associative arrays then just change the penultimate line to do what you want. For each iteration of the loop the parameter is in ${BASH_REMATCH[1]} and its value is in ${BASH_REMATCH[2]}.
Here is the same thing as a function, in a short test script that iterates over the array and outputs the query string's parameters and their values:
#!/bin/bash
QUERY_STRING='foo=hello&bar=there&baz=freddy'
get_query_string() {
local q="$QUERY_STRING"
local re1='^(\w+=\w+)&?'
local re2='^(\w+)=(\w+)$'
while [[ $q =~ $re1 ]]; do
q=${q##*${BASH_REMATCH[0]}}
[[ ${BASH_REMATCH[1]} =~ $re2 ]] && eval "$1+=([${BASH_REMATCH[1]}]=${BASH_REMATCH[2]})"
done
}
declare -A params
get_query_string params
for k in "${!params[#]}"
do
v="${params[$k]}"
echo "$k : $v"
done
Note the parameters end up in the array in reverse order (it's associative so that shouldn't matter).
why not this
$ echo "${QUERY_STRING}"
name=carlo&last=lanza&city=pfungen-CH
$ saveIFS=$IFS
$ IFS='&'
$ eval $QUERY_STRING
$ IFS=$saveIFS
now you have this
name = carlo
last = lanza
city = pfungen-CH
$ echo "name is ${name}"
name is carlo
$ echo "last is ${last}"
last is lanza
$ echo "city is ${city}"
city is pfungen-CH
@giacecco
To include a hyphen in the regex, you could change two lines in the answer from @starfry.
Change these two lines:
local re1='^(\w+=\w+)&?'
local re2='^(\w+)=(\w+)$'
To these two lines:
local re1='^(\w+=(\w+|-|)+)&?'
local re2='^(\w+)=((\w+|-|)+)$'
For all those who couldn't get it working with the posted answers (like me),
this guy figured it out.
Can't upvote his post unfortunately...
Let me repost the code here real quick:
#!/bin/sh
if [ "$REQUEST_METHOD" = "POST" ]; then
if [ "$CONTENT_LENGTH" -gt 0 ]; then
read -n $CONTENT_LENGTH POST_DATA <&0
fi
fi
#echo "$POST_DATA" > data.bin
IFS='=&'
set -- $POST_DATA
#2- Value1
#4- Value2
#6- Value3
#8- Value4
echo $2 $4 $6 $8
echo "Content-type: text/html"
echo ""
echo "<html><head><title>Saved</title></head><body>"
echo "Data received: $POST_DATA"
echo "</body></html>"
Hope this is of help for anybody.
Cheers
Actually I liked bolt's answer, so I made a version which works with Busybox as well (ash in Busybox does not support here strings).
This code will accept key1 and key2 parameters, all others will be ignored.
while IFS= read -r -d '&' KEYVAL && [[ -n "$KEYVAL" ]]; do
case ${KEYVAL%=*} in
key1) KEY1=${KEYVAL#*=} ;;
key2) KEY2=${KEYVAL#*=} ;;
esac
done <<END
$(echo "${QUERY_STRING}&")
END
One can use bash-cgi.sh, which processes:
the query string into the $QUERY_STRING_GET key and value array;
the post request data (x-www-form-urlencoded) into the $QUERY_STRING_POST key and value array;
the cookies data into the $HTTP_COOKIES key and value array.
It demands bash version 4.0 or higher (to define the key and value arrays above).
All processing is done by bash alone (i.e. in a single process), without any external dependencies or additional process invocations.
It has:
the check for the max length of data that can be transferred to its input,
as well as processed as query string and cookies;
the redirect() procedure to produce a redirect to itself with the extension changed to .html (useful for one-page sites);
the http_header_tail() procedure to output the last two strings of the HTTP(S) response header;
the $REMOTE_ADDR value sanitizer against possible injections;
the parser and evaluator of escaped UTF-8 symbols embedded in the values passed to $QUERY_STRING_GET, $QUERY_STRING_POST and $HTTP_COOKIES;
the sanitizer of the $QUERY_STRING_GET, $QUERY_STRING_POST and $HTTP_COOKIES values against possible SQL injections (escaping like the mysql_real_escape_string php function does, plus the escaping of # and $).
It is available here:
https://github.com/VladimirBelousov/fancy_scripts
This works in dash, using a for-in loop:
IFS='&'
for f in $query_string; do
value=${f##*=}
key=${f%%=*}
# if you need environment variable -> eval "qs_$key=$value"
done
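A self-contained run of the dash-compatible loop with a sample string (printing pairs instead of the eval assignment from the comment):

```shell
#!/bin/sh
# Demo of the dash-compatible loop; query_string is a sample value.
# After the loop, $key and $val hold the last pair processed.
query_string='a=123&b=456&c=ok'
old_ifs=$IFS
IFS='&'
for f in $query_string; do
  value=${f##*=}
  key=${f%%=*}
  printf '%s -> %s\n' "$key" "$value"
done
IFS=$old_ifs
```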