Bash: Split two strings directly into associative array - bash

I have two strings of same number of substrings divided by a delimiter.
I need to create key-value pairs from substrings.
Short example:
Input:
firstString='00011010:00011101:00100001'
secondString='H:K:O'
delimiter=':'
Desired result:
${translateMap['00011010']} -> 'H'
${translateMap['00011101']} -> 'K'
${translateMap['00100001']} -> 'O'
So, I wrote:
IFS="$delimiter" read -ra fromArray <<< "$firstString"
IFS="$delimiter" read -ra toArray <<< "$secondString"
declare -A translateMap
curIndex=0
for from in "${fromArray[#]}"; do
translateMap["$from"]="${toArray[curIndex]}"
((curIndex++))
done
Is there any way to create the associative array directly from 2 strings without the unneeded arrays and loop? Something like:
IFS="$delimiter" read -rA translateMap["$(read -ra <<< "$firstString")"] <<< "$secondString"
Is it possible?

A (somewhat convoluted) variation on #accdias's answer of assigning the values via the declare -A command, but will need a bit of explanation for each step ...
First we need to break the 2 variables into separate lines for each item:
$ echo "${firstString}" | tr "${delimiter}" '\n'
00011010
00011101
00100001
$ echo "${secondString}" | tr "${delimiter}" '\n'
H
K
O
What's nice about this is that we can now process these 2 sets of key/value pairs as separate files.
NOTE: For the rest off this discussion I'm going to replace "${delimiter}" with ':' to make this a tad bit (but not much) less convoluted.
Next we make use of the paste command to merge our 2 'files' into a single file; we'll also designate ']' as the delimiter between key/value mappings:
$ paste -d ']' <(echo "${firstString}" | tr ':' '\n') <(echo "${secondString}" | tr ':' '\n')
00011010]H
00011101]K
00100001]O
We'll now run these results through a couple sed patterns to build our array assignments:
$ paste -d ']' <(echo "${firstString}" | tr ':' '\n') <(echo "${secondString}" | tr ':' '\n') | sed 's/^/[/g;s/]/]=/g'
[00011010]=H
[00011101]=K
[00100001]=O
What we'd like to do now is use this output in the typeset -A command but unfortunately we need to build the entire command and then eval it:
$ evalstring="typeset -A kv=( "$(paste -d ']' <(echo "${firstString}" | tr ':' '\n') <(echo "${secondString}" | tr ':' '\n') | sed 's/^/[/g;s/]/]=/g')" )"
$ echo "$evalstring"
typeset -A kv=( [00011010]=H
[00011101]=K
[00100001]=O )
If we want to remove the carriage returns and put on a single line we append another tr at the output from the sed command:
$ evalstring="typeset -A kv=( "$(paste -d ']' <(echo "${firstString}" | tr ':' '\n') <(echo "${secondString}" | tr ':' '\n') | sed 's/^/[/g;s/]/]=/g' | tr '\n' ' ')" )"
$ cat "${evalstring}"
typeset -A kv=( [00011010]=H [00011101]=K [00100001]=O )
At this point we can eval our auto-generated typeset -A command:
$ eval "${evalstring}"
And now loop through our array displaying the key/value pairs:
$ for i in ${!kv[#]}; do echo "kv[${i}] = ${kv[${i}]}"; done
kv[00011010] = H
kv[00100001] = O
kv[00011101] = K
Hey, I did say this would be a bit convoluted! :-)

It is probably not what you expect, but this works:
key_string="A:B:C:D"
val_string="1:2:3:4"
declare -A map
while [ -n "$key_string" ] && [ -n "$val_string" ]; do
IFS=: read -r key key_string <<<"$key_string"
IFS=: read -r val val_string <<<"$val_string"
map[$key]="$val"
done
for key in "${!map[#]}"; do echo "$key => ${map[$key]}"; done
It uses recursion in the read function to reassign the string value.
The downside of this method is that it destroys the original strings. The while-loop checks constantly if both strings have a non-zero length.
Next to the above in pure bash, you could any command to generate the associative array. See How do I populate a bash associative array with command output?
This generally looks like:
declare -A map="( $( magic_command ) )"
where the magic_command generates an output like
[key1]=val1
[key2]=val2
[key3]=val3
In this case we use the command:
paste -d "" <(echo "[${key_string//:/]=$'\n'[}]=") \
<(echo "${val_string//:/$'\n'}")
where we use bash substitution to replace the delimiter with a newline. However, any other magic_command might do. For completion:
key_string="A:B:C:D"
val_string="1:2:3:4"
declare -A map="( $(paste -d "" <(echo "[${key_string//:/]=$'\n'[}]=") \
<(echo "${val_string//:/$'\n'}")) )"
for key in "${!map[#]}"; do echo "$key => ${map[$key]}"; done
Both examples generate the following output
D => 4
C => 3
B => 2
A => 1

Not exactly the answer for what you asked but at least it is shorter:
key='00011010:00011101:00100001'
value='H:K:O'
ifs=':'
IFS="$ifs" read -ra keys <<< "$key"
IFS="$ifs" read -ra values <<< "$value"
declare -A kv
for ((i=0; i<${#keys[*]}; i++)); do
kv[${keys[i]}]=${values[i]}
done
As a side note, you can initialize an associative array in one step with:
declare -A kv=([key1]=value1 [key2]=value2 [keyn]=valuen)
But I don't know how to use that in your case.

If values in your strings won't use spaces i would suggest this approach
firstString='00011010:00011101:00100001'
secondString='H:K:O'
delimiter=':'
declare -A translateMap
firstArray=( ${firstString//$delimiter/' '} )
secondArray=( ${secondString//$delimiter/' '} )
for i in ${!firstArray[#]}; {
translateMap[firstArray[$i]}]=${secondArray[$i]}
}

Related

Bash Associative Array from String?

A command emits the string: "[abc]=kjlkjkl [def]=yutuiu [ghi]=jljlkj"
I want to load a bash associative array using these key|value pairs, but the result I'm getting is a single row array where the key is formed of the first pair [abc]=kjlkjkl and the value is the whole of the rest of the string, so: declare -p arr returns declare -A arr["[abc]=kjlkjkl"]="[def]=yutuiu [ghi]=jljlkj"
This is what I am doing at the moment. Where am I going wrong please?
declare -A arr=()
while read -r a b; do
arr["$a"]="$b"
done < <(command that outputs the string "[abc]=kjlkjkl [def]=yutuiu [ghi]=jljlkj")
You need to parse it: split the string on spaces, split each key-value pair on the equals sign, and get rid of the brackets.
Here's one way, using tr to replace the spaces with newlines, then tr again to remove all brackets (including any that occur in a value), then IFS="=" to split the key-value pairs. I'm sure this could be done more effectively, like with AWK or Perl, but I don't know how.
declare -A arr=()
while IFS="=" read -r a b; do
arr["$a"]="$b"
done < <(
echo "[abc]=kjlkjkl [def]=yutuiu [ghi]=jljlkj" |
tr ' ' '\n' |
tr -d '[]'
)
echo "${arr[def]}" # -> yutuiu
See Cyrus's answer for another take on this, with the space and equals steps combined.
Append this to your command which outputs the string:
| tr ' =' '\n ' | tr -d '[]'
You can use the "eval declare" trick - but be sure your input is clean.
#! /bin/bash
s='[abc]=kjlkjkl [def]=yutuiu [ghi]=jljlkj'
eval declare -A arr=("$s")
echo ${arr[def]} # yutuiu
If the input is insecure, don't use it. Imagine (don't try) what would happen if
s='); rm -rf / #'
The "proper" good™ solution would be to write your own parser and tokenize the input. For example read the input char by char, handle [ and ] and = and space and optionally quoting. After parsing the string, assign the output to an associative array.
A simple way could be:
echo "[abc]=kjlkjkl [def]=yutuiu [ghi]=jljlkj" |
xargs -n1 |
{
declare -A arr;
while IFS= read -r line; do
if [[ "$line" =~ ^\[([a-z]*)\]=([a-z]*)$ ]]; then
arr[${BASH_REMATCH[1]}]=${BASH_REMATCH[2]}
fi
done
declare -p arr
}
outputs:
declare -A arr=([abc]="kjlkjkl" [ghi]="jljlkj" [def]="yutuiu" )

exporting environment variables with spaces using jq

So, I'm trying to export an environment variable that comes from an api that returns json values. Would like to use jq to just do a one liner, but if the values have spaces I cannot get it working
Trying without surrounding the value in quotes
/app/src $ $(echo '{"params":[{ "Name":"KEY","Value":"value with space"}]}' | jq
-r '.params[] | "export " + .Name + "=" + .Value')
/app/src $ printenv KEY
value
/app/src $
Next, I try wrapping the value in quotes
/app/src $ $(echo '{"params":[{ "Name":"KEY","Value":"value with space"}]}' | jq
-r '.params[] | "export " + .Name + "=\"" + .Value + "\""')
sh: export: space": bad variable name
/app/src $
For all of the below, I'm assuming that:
json='{"params":[{ "Name":"KEY","Value":"value with space"}]}'
It can be done, but ONLY IF YOU TRUST YOUR INPUT.
A solution that uses eval might look like:
eval "$(jq -r '.params[] | "export \(.Name | #sh)=\(.Value | #sh)"' <<<"$json")"
The #sh builtin in jq escapes content to be eval-safe in bash, and the eval invocation then ensures that the content goes through all parsing stages (so literal quotes in the data emitted by jq become syntactic).
However, all solutions that allow arbitrary shell variables to be assigned have innate security problems, as the ability to set variables like PATH, LD_LIBRARY_PATH, LD_PRELOAD and the like can be leveraged into arbitrary code execution.
Better form is to generate a NUL-delimited key/value list...
build_kv_nsv() {
jq -j '.params[] |
((.Name | gsub("\u0000"; "")),
"\u0000",
(.Value | gsub("\u0000"; "")),
"\u0000")'
}
...and either populate an associative array...
declare -A content_received=( )
while IFS= read -r -d '' name && IFS= read -r -d '' value; do
content_received[$name]=$value
done < <(build_kv_nsv <<<"$json")
# print the value of the populated associative array
declare -p content_received
...or to use a namespace that's prefixed to guarantee safety.
while IFS= read -r -d '' name && IFS= read -r -d '' value; do
printf -v "received_$name" %s "$value" && export "received_$name"
done < <(build_kv_nsv <<<"$json")
# print names and values of our variables that start with received_
declare -p "${!received_#}" >&2
If the values are known not to contain (raw) newlines, and if you have access to mapfile, it may be worthwhile considering using it, e.g.
$ json='{"params":[{ "Name":"KEY","Value":"value with space"}]}'
$ mapfile -t KEY < <( jq -r '.params[] | .Value' <<< "$json" )
$ echo N=${#KEY[#]}
N=1
If the values might contain (raw) newlines, then you'd need a version of mapfile with the -d option, which could be used as illustrated below:
$ json='{"params":[{ "Name":"KEY1","Value":"value with space"}, { "Name":"KEY2","Value":"value with \n newline"}]}'
$ mapfile -d $'\0' KEY < <( jq -r -j '.params[] | .Value + "\u0000"' <<< "$json" )
$ echo N=${#KEY[#]}
N=2

convert factor to numeric in bash

What's the most efficient way to convert a factor vector (not all levels are unique) into a numeric vector in bash? The values in the numeric vector do not matter as long as each represents a unique level of the factor.
To illustrate, this would be the R equivalent to what I want to do in bash:
numeric<-seq_along(levels(factor))[factor]
I.e.:
factor
AV1019A
ABG1787
AV1019A
B77hhA
B77hhA
numeric
1
2
1
3
3
Many thanks.
It is most probably not the most efficient, but maybe something to start.
#!/bin/bash
input_data=$( mktemp )
map_file=$( mktemp )
# your example written to a file
echo -e "AV1019A\nABG1787\nAV1019A\nB77hhA\nB77hhA" >> $input_data
# create a map <numeric, factor> and write to file
idx=0
for factor in $( cat $input_data | sort -u )
do
echo $idx $factor
let idx=$idx+1
done > $map_file
# go through your file again and replace values with keys
while read line
do
key=$( cat $map_file | grep -e ".* ${line}$" | awk '{print $1}' )
echo $key
done < $input_data
# cleanup
rm -f $input_data $map_file
I initially wanted to use associative arrays, but it's a bash 4+ feature only and not available here and there. If you have bash 4 then you have one file less, which is obviously more efficient.
#!/bin/bash
# your example written to a file
input_data=$( mktemp )
echo -e "AV1019A\nABG1787\nAV1019A\nB77hhA\nB77hhA" >> $input_data
# declare an array
declare -a factor_map=($( cat $input_data | sort -u | tr "\n" " " ))
# go through your file replace values with keys
while read line
do
echo ${factor_map[#]/$line//} | cut -d/ -f1 | wc -w | tr -d ' '
done < $input_data
# cleanup
rm -f $input_data

Simple method to shuffle the elements of an array in BASH shell?

I can do this in PHP but am trying to work within the BASH shell. I need to take an array and then randomly shuffle the contents and dump that to somefile.txt.
So given array Heresmyarray, of elements a;b;c;d;e;f; it would produce an output file, output.txt, which would contain elements f;c;b;a;e;d;
The elements need to retain the semicolon delimiter. I've seen a number of bash shell array operations but nothing that seems even close to this simple concept. Thanks for any help or suggestions!
The accepted answer doesn't match the headline question too well, though the details in the question are a bit ambiguous. The question asks about how to shuffle elements of an array in BASH, and kurumi's answer shows a way to manipulate the contents of a string.
kurumi nonetheless makes good use of the 'shuf' command, while siegeX shows how to work with an array.
Putting the two together yields an actual "simple method to shuffle the elements of an array in BASH shell":
$ myarray=( 'a;' 'b;' 'c;' 'd;' 'e;' 'f;' )
$ myarray=( $(shuf -e "${myarray[#]}") )
$ printf "%s" "${myarray[#]}"
d;b;e;a;c;f;
From the BashFaq
This function shuffles the elements of an array in-place using the Knuth-Fisher-Yates shuffle algorithm.
#!/bin/bash
shuffle() {
local i tmp size max rand
# $RANDOM % (i+1) is biased because of the limited range of $RANDOM
# Compensate by using a range which is a multiple of the array size.
size=${#array[*]}
max=$(( 32768 / size * size ))
for ((i=size-1; i>0; i--)); do
while (( (rand=$RANDOM) >= max )); do :; done
rand=$(( rand % (i+1) ))
tmp=${array[i]} array[i]=${array[rand]} array[rand]=$tmp
done
}
# Define the array named 'array'
array=( 'a;' 'b;' 'c;' 'd;' 'e;' 'f;' )
shuffle
printf "%s" "${array[#]}"
Output
$ ./shuff_ar > somefile.txt
$ cat somefile.txt
b;c;e;f;d;a;
If you just want to put them into a file (use redirection > )
$ echo "a;b;c;d;e;f;" | sed -r 's/(.[^;]*;)/ \1 /g' | tr " " "\n" | shuf | tr -d "\n"
d;a;e;f;b;c;
$ echo "a;b;c;d;e;f;" | sed -r 's/(.[^;]*;)/ \1 /g' | tr " " "\n" | shuf | tr -d "\n" > output.txt
If you want to put the items in array
$ array=( $(echo "a;b;c;d;e;f;" | sed -r 's/(.[^;]*;)/ \1 /g' | tr " " "\n" | shuf | tr -d " " ) )
$ echo ${array[0]}
e;
$ echo ${array[1]}
d;
$ echo ${array[2]}
a;
If your data has &#abcde;
$ echo "a;&#abcde;c;d;e;f;" | sed -r 's/(.[^;]*;)/ \1 /g' | tr " " "\n" | shuf | tr -d "\n"
d;c;f;&#abcde;e;a;
$ echo "a;&#abcde;c;d;e;f;" | sed -r 's/(.[^;]*;)/ \1 /g' | tr " " "\n" | shuf | tr -d "\n"
&#abcde;f;a;c;d;e;

Redirect output to a bash array

I have a file containing the string
ipAddress=10.78.90.137;10.78.90.149
I'd like to place these two IP addresses in a bash array. To achieve that I tried the following:
n=$(grep -i ipaddress /opt/ipfile | cut -d'=' -f2 | tr ';' ' ')
This results in extracting the values alright but for some reason the size of the array is returned as 1 and I notice that both the values are identified as the first element in the array. That is
echo ${n[0]}
returns
10.78.90.137 10.78.90.149
How do I fix this?
Thanks for the help!
do you really need an array
bash
$ ipAddress="10.78.90.137;10.78.90.149"
$ IFS=";"
$ set -- $ipAddress
$ echo $1
10.78.90.137
$ echo $2
10.78.90.149
$ unset IFS
$ echo $# #this is "array"
if you want to put into array
$ a=( $# )
$ echo ${a[0]}
10.78.90.137
$ echo ${a[1]}
10.78.90.149
#OP, regarding your method: set your IFS to a space
$ IFS=" "
$ n=( $(grep -i ipaddress file | cut -d'=' -f2 | tr ';' ' ' | sed 's/"//g' ) )
$ echo ${n[1]}
10.78.90.149
$ echo ${n[0]}
10.78.90.137
$ unset IFS
Also, there is no need to use so many tools. you can just use awk, or simply the bash shell
#!/bin/bash
declare -a arr
while IFS="=" read -r caption addresses
do
case "$caption" in
ipAddress*)
addresses=${addresses//[\"]/}
arr=( ${arr[#]} ${addresses//;/ } )
esac
done < "file"
echo ${arr[#]}
output
$ more file
foo
bar
ipAddress="10.78.91.138;10.78.90.150;10.77.1.101"
foo1
ipAddress="10.78.90.137;10.78.90.149"
bar1
$./shell.sh
10.78.91.138 10.78.90.150 10.77.1.101 10.78.90.137 10.78.90.149
gawk
$ n=( $(gawk -F"=" '/ipAddress/{gsub(/\"/,"",$2);gsub(/;/," ",$2) ;printf $2" "}' file) )
$ echo ${n[#]}
10.78.91.138 10.78.90.150 10.77.1.101 10.78.90.137 10.78.90.149
This one works:
n=(`grep -i ipaddress filename | cut -d"=" -f2 | tr ';' ' '`)
EDIT: (improved, nestable version as per Dennis)
n=($(grep -i ipaddress filename | cut -d"=" -f2 | tr ';' ' '))
A variation on a theme:
$ line=$(grep -i ipaddress /opt/ipfile)
$ saveIFS="$IFS" # always save it and put it back to be safe
$ IFS="=;"
$ n=($line)
$ IFS="$saveIFS"
$ echo ${n[0]}
ipAddress
$ echo ${n[1]}
10.78.90.137
$ echo ${n[2]}
10.78.90.149
If the file has no other contents, you may not need the grep and you could read in the whole file.
$ saveIFS="$IFS"
$ IFS="=;"
$ n=$(</opt/ipfile)
$ IFS="$saveIFS"
A Perl solution:
n=($(perl -ne 's/ipAddress=(.*);/$1 / && print' filename))
which tests for and removes the unwanted characters in one operation.
You can do this by using IFS in bash.
First read the first line from file.
Seoncd convert that to an array with = as delimeter.
Third convert the value to an array with ; as delimeter.
Thats it !!!
#!/bin/bash
IFS='\n' read -r lstr < "a.txt"
IFS='=' read -r -a lstr_arr <<< $lstr
IFS=';' read -r -a ip_arr <<< ${lstr_arr[1]}
echo ${ip_arr[0]}
echo ${ip_arr[1]}

Resources