Bash: Remove values present in a file from a variable

I have a variable which has values such as
"abc.def.ghi aa.bbb.ccc kk.lll.mmm ppp.qqq.lll"
and a file which has values
kk.lll.mmm
abc.def.ghi
I want to remove these values from the variable. The file and the variable share some values, but not in the same order.

If you don't care about ordering, one of the cleanest ways to maintain a set is with an associative array. This gives you O(1) lookups and removals, which is much better performance than you'd get with sed.
## Convert from a string to an array
str="abc.def.ghi aa.bbb.ccc kk.lll.mmm ppp.qqq.lll"
read -r -a array <<<"$str"
## ...and from there to an *associative* array
declare -A items=( )
for item in "${array[@]}"; do
items[$item]=1
done
## ...whereafter you can remove items in O(1) time
while IFS= read -r line; do
unset "items[$line]"
done <file
## Write list of remaining items
printf 'Remaining item: %q\n' "${!items[@]}"
By the way, much of this code could be skipped if the original data were in associative-array form to start with:
# if the assignment looked like this, could just start at the "while read" loop.
declare -A items=( [abc.def.ghi]=1 [aa.bbb.ccc]=1 [kk.lll.mmm]=1 [ppp.qqq.lll]=1 )
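If the original order does matter, the same idea can be turned around: load the file's values into the associative array and filter the original list against it. This is only a minimal sketch; the `removals` variable is a hypothetical stand-in for the file's contents, and it assumes bash 4 or later.

```bash
#!/usr/bin/env bash
str="abc.def.ghi aa.bbb.ccc kk.lll.mmm ppp.qqq.lll"

# Hypothetical stand-in for the file from the question.
removals=$'kk.lll.mmm\nabc.def.ghi'

# Load the values to remove into an associative array used as a set.
declare -A remove=()
while IFS= read -r line; do
    if [[ -n $line ]]; then
        remove[$line]=1
    fi
done <<<"$removals"

# Walk the original list in order and keep anything not in the set.
read -r -a array <<<"$str"
kept=()
for item in "${array[@]}"; do
    if [[ -z ${remove[$item]:-} ]]; then
        kept+=("$item")
    fi
done

result="${kept[*]}"
echo "$result"   # prints: aa.bbb.ccc ppp.qqq.lll
```

With a real file, replace the here-string with `done <file`.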

Related

How to concatenate string to comma-separated element in bash

I am new to Bash coding. I would like to concatenate a string to each element of a comma-separated strings "array".
This is an example of what I have in mind:
s=a,b,c
# Here a function to concatenate the string "_string" to each of them.
# Expected result:
a_string,b_string,c_string
One way:
$ s=a,b,c
$ echo ${s//,/_string,}_string
a_string,b_string,c_string
Using a proper array is generally a much more robust solution. It allows the values to contain literal commas, whitespace, etc.
s=(a b c)
printf '%s\n' "${s[@]/%/_string}"
As suggested by chepner, you can use IFS="," to merge the result with commas.
(IFS=","; echo "${s[*]/%/_string}")
(The subshell is useful to keep the scope of the IFS reassignment from leaking to the current shell.)
Simply, you could use a for loop
main() {
local input='a,b,c'
local append='_string'
# create an 'output' variable that is empty
local output=
# convert the input into an array called 'items' (without the commas)
IFS=',' read -ra items <<< "$input"
# loop over each item in the array, and append whatever string we want, in this case, '_string'
for item in "${items[@]}"; do
output+="${item}${append},"
done
# in the loop, a comma was added after every element. now, we must remove the trailing one so there are only commas _in between_ elements
output=${output%,}
echo "$output"
}
main
I've split it up in three steps:
Make it into an actual array.
Append _string to each element in the array using Parameter expansion.
Turn it back into a scalar (for which I've made a function called turn_array_into_scalar).
#!/bin/bash
function turn_array_into_scalar() {
local -n arr=$1 # -n makes `arr` a reference to the array `s`
local IFS=$2 # set the field separator to ,
arr="${arr[*]}" # "join" over IFS and assign it back to `arr`
}
s=a,b,c
# make it into an array by turning , into newline and reading into `s`
readarray -t s < <(tr , '\n' <<< "$s")
# append _string to each string in the array by using parameter expansion
s=( "${s[@]/%/_string}" )
# use the function to make it into a scalar again and join over ,
turn_array_into_scalar s ,
echo "$s"
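The split / append / join steps from the answers above can also be wrapped in one small function. This is just a sketch: the name `append_to_each` is made up for illustration, and it assumes the elements themselves contain no commas or newlines.

```bash
#!/usr/bin/env bash

# append_to_each INPUT SUFFIX
# Hypothetical helper: splits INPUT on commas, appends SUFFIX to each
# element, and prints the result joined with commas again.
append_to_each() {
    local input=$1 suffix=$2
    local -a parts
    IFS=',' read -r -a parts <<<"$input"   # split on commas
    parts=("${parts[@]/%/$suffix}")        # append suffix to every element
    local IFS=','                          # "${parts[*]}" joins on the first IFS character
    printf '%s\n' "${parts[*]}"
}

result=$(append_to_each 'a,b,c' '_string')
echo "$result"   # prints: a_string,b_string,c_string
```

Because IFS is declared local inside the function, the caller's IFS is left untouched.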

How do I add whitespace to a string while filling it up in a for-loop in Bash?

Have a string as follows:
files="applications/dbt/Dockerfile applications/dbt/cloudbuild.yaml applications/dataform/Dockerfile applications/dataform/cloudbuild.yaml"
Want to extract the first two directories and save it as another string like this:
"applications/dbt applications/dbt applications/dataform applications/dataform"
But while filling up the second string, it's being saved as
applications/dbtapplications/dbtapplications/dataformapplications/dataform
What I tried:
files="applications/dbt/Dockerfile applications/dbt/cloudbuild.yaml applications/dataform/Dockerfile applications/dataform/cloudbuild.yaml"
arr=($files)
#extracting the first two directories and saving it to a new string
for i in ${arr[@]}; do files2+=$(echo "$i" | cut -d/ -f 1-2); done
echo $files2
files2 echoes the following
applications/dbtapplications/dbtapplications/dataformapplications/dataform
Reusing your code as much as possible:
(assuming you only need to remove the last path component):
arr=( applications/dbt/Dockerfile applications/dbt/cloudbuild.yaml applications/dataform/Dockerfile applications/dataform/cloudbuild.yaml )
#extracting the first two directories and saving it to a new string
for file in "${arr[@]}"; do
files2+="${file%/*} "
done
echo "$files2"
applications/dbt applications/dbt applications/dataform applications/dataform
You could use a for loop as requested
for dir in ${files};
do file2+=$(printf '%s ' "${dir%/*}")
done
which will give output
$ echo "$file2"
applications/dbt applications/dbt applications/dataform applications/dataform
However, it would be much easier with sed
$ sed -E 's~([^/]*/[^/]*)[^ ]*~\1~g' <<< $files
applications/dbt applications/dbt applications/dataform applications/dataform
Convert the string into an array first, assuming there is no whitespace (spaces, tabs, or newlines) embedded in your path names. Something like:
#!/usr/bin/env bash
files="applications/dbt/Dockerfile applications/dbt/cloudbuild.yaml applications/dataform/Dockerfile applications/dataform/cloudbuild.yaml"
mapfile -t array <<< "${files// /$'\n'}"
Now check the value of the array
declare -p array
Output
declare -a array=([0]="applications/dbt/Dockerfile" [1]="applications/dbt/cloudbuild.yaml" [2]="applications/dataform/Dockerfile" [3]="applications/dataform/cloudbuild.yaml")
Remove the last path component (everything from the final / onward) from each element of the array.
new_array=("${array[@]%/*}")
Check the new value
declare -p new_array
Output
declare -a new_array=([0]="applications/dbt" [1]="applications/dbt" [2]="applications/dataform" [3]="applications/dataform")
Now that the values are in an array, assign it to a variable or do whatever you like with it. As was mentioned in the comments, use an array from the start.
Assign the first two directories/paths to a variable (weird requirement):
new_var="${new_array[@]::2}"
declare -p new_var
Output
declare -- new_var="applications/dbt applications/dbt"
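Note that `${file%/*}` only strips the last path component, which happens to equal "the first two directories" for the paths in the question. If some paths could be nested more deeply, a sketch along these lines keeps exactly the first two components instead. The deeper sample path is invented for illustration, and the code assumes every path has at least two components and no whitespace.

```bash
#!/usr/bin/env bash
# Sample input; the second path is a made-up, more deeply nested example.
files="applications/dbt/Dockerfile applications/dbt/sub/dir/cloudbuild.yaml applications/dataform/Dockerfile"

read -r -a paths <<<"$files"
dirs=()
for path in "${paths[@]}"; do
    # Split on "/": the first two fields are kept, the rest lands in "_".
    IFS=/ read -r first second _ <<<"$path"
    dirs+=("$first/$second")
done

# Joining with "${dirs[*]}" puts a single space between elements.
files2="${dirs[*]}"
echo "$files2"   # prints: applications/dbt applications/dbt applications/dataform
```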

Dynamically creating associative arrays in bash

I have a variable ($OUTPUT) that contains the following name / value pairs:
member_id=4611686018429783292
platform=Xbox
platform_id=1
character_id=2305843009264966985
period_dt=2020-11-25 20:31:14.923158 UTC
mode=all Crucible modes
mode_id=5
activities_entered=18
activities_won=10
activities_lost=8
assists=103
kills=233
average_kill_distance=15.729613
total_kill_distance=3665
seconds_played=8535
deaths=118
average_lifespan=71.72269
total_lifespan=8463.277
opponents_defeated=336
efficiency=2.8474576
kills_deaths_ratio=1.9745762
kills_deaths_assists=2.411017
suicides=1
precision_kills=76
best_single_game_kills=-1
Each line ends with \n.
I want to loop through them, parse them into an associative array, and then access the values in the array by the variable names:
while read line
do
key=${line%%=*}
value=${line#*=}
echo $key=$value
data[$key]="$value"
done < <(echo "$OUTPUT")
#this always prints the last value
echo ${data['seconds_played']}
This seems to work, i.e. key/value print the right values, but when I try to pull any values from the array, it always returns the last value (in this case -1).
I feel like I'm missing something obvious, but have been banging my head against it for a couple of hours.
UPDATE: My particular issue is that I'm running a version of bash (3.2.57 on OSX) that doesn't support associative arrays. I'll mark the correct answer below.
Without declare -A data, data is a normal (indexed) array. In an indexed array, the subscript first undergoes the usual expansions and is then evaluated as an arithmetic expression, and inside an arithmetic expression an unset variable evaluates to 0. So data[$key] becomes data[seconds_played], the variable seconds_played is not defined, and the assignment is effectively data[0]=something every time.
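A quick way to see this for yourself (a small sketch; it assumes none of the key names happen to exist as shell variables):

```bash
#!/usr/bin/env bash
# Make sure nothing is left over from earlier experiments.
unset data seconds_played kills

# No "declare -A": data becomes an indexed array, so each subscript is
# evaluated arithmetically and the unset names evaluate to 0.
data[seconds_played]=8535
data[kills]=233

declare -p data                  # shows a single element at index 0
echo "${data[seconds_played]}"   # prints: 233 (the last value written)
```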
Add declare -A data and it "should work". You could also just:
declare -A data
while IFS== read -r key value; do
data["$key"]="$value"
done <<<"$OUTPUT"
Try declaring data as an associative array before populating it, eg:
$ typeset -A data # declare as an associative array
$ while read line
do
key=${line%%=*}
value=${line#*=}
echo $key=$value
data[$key]="$value"
done <<< "${OUTPUT}"
$ typeset -p data
declare -A data=([mode]="all Crucible modes" [period_dt]="2020-11-25 20:31:14.923158 UTC" [deaths]="118" [best_single_game_kills]="-1" [efficiency]="2.8474576" [precision_kills]="76" [activities_entered]="18" [seconds_played]="8535" [total_lifespan]="8463.277" [average_lifespan]="71.72269" [character_id]="2305843009264966985" [kills]="233" [activities_won]="10" [average_kill_distance]="15.729613" [activities_lost]="8" [mode_id]="5" [assists]="103" [suicides]="1" [total_kill_distance]="3665" [platform]="Xbox" [kills_deaths_ratio]="1.9745762" [platform_id]="1" [kills_deaths_assists]="2.411017" [opponents_defeated]="336" [member_id]="4611686018429783292" )
$ echo "${data['seconds_played']}"
8535
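Since the asker turned out to be on bash 3.2, where declare -A is not available, here is one possible workaround that avoids associative arrays entirely: look each key up by scanning the text. It is only a sketch. The function name `get_value` is made up, the sample OUTPUT is a shortened copy of the data in the question, and every lookup is a linear scan, so it is only reasonable for small inputs.

```bash
#!/usr/bin/env bash
# Shortened sample of the name/value pairs from the question.
OUTPUT=$'member_id=4611686018429783292\nmode=all Crucible modes\nseconds_played=8535\nbest_single_game_kills=-1'

# get_value KEY
# Hypothetical helper: prints the value of the first line whose key matches.
get_value() {
    local wanted=$1 key value
    while IFS='=' read -r key value; do
        if [ "$key" = "$wanted" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done <<<"$OUTPUT"
    return 1   # key not found
}

seconds=$(get_value seconds_played)
mode=$(get_value mode)
echo "$seconds"   # prints: 8535
echo "$mode"      # prints: all Crucible modes
```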

Bash. Associative array iteration (ordered and without duplicates)

I have two problems handling associative arrays. First one is that I can't keep a custom order on it.
#!/bin/bash
#First part, I just want to print it ordered in the custom created order (non-alphabetical)
declare -gA array
array["PREFIX_THIS","value"]="true"
array["PREFIX_IS","value"]="false"
array["PREFIX_AN","value"]="true"
array["PREFIX_ORDERED","value"]="true"
array["PREFIX_ARRAY","value"]="true"
for item in "${!array[@]}"; do
echo "${item}"
done
Desired output is:
PREFIX_THIS,value
PREFIX_IS,value
PREFIX_AN,value
PREFIX_ORDERED,value
PREFIX_ARRAY,value
But I'm obtaining this:
PREFIX_IS,value
PREFIX_ORDERED,value
PREFIX_THIS,value
PREFIX_AN,value
PREFIX_ARRAY,value
Until here the first problem. For the second problem, the order is not important. I added more stuff to the associative array and I just want to loop on it without duplicates. Adding this:
array["PREFIX_THIS","text"]="Text for the var"
array["PREFIX_IS","text"]="Another text"
array["PREFIX_AN","text"]="Text doesn't really matter"
array["PREFIX_ORDERED","text"]="Whatever"
array["PREFIX_ARRAY","text"]="More text"
I just want to loop over "PREFIX_THIS", "PREFIX_IS", "PREFIX_AN", etc... printing each one only once. I just want to print doing an "echo" on loop (order is not important for this part, just to print each one only once). Desired output:
PREFIX_ORDERED
PREFIX_AN
PREFIX_ARRAY
PREFIX_IS
PREFIX_THIS
I achieved it by doing "dirty" stuff, but there must be a more elegant way. This is my working but not very elegant approach:
already_set=""
var_name=""
for item in "${!array[@]}"; do
var_name="${item%,*}"
if [[ ! ${already_set} =~ "${var_name}" ]]; then
echo "${var_name}"
already_set+="${item}"
fi
done
Any help? Thanks.
Iteration Order
As Inian pointed out in the comments, you cannot fix the order in which "${!array[@]}" expands for associative arrays. However, you can store all keys inside a normal array that you can order manually.
keysInCustomOrder=(PREFIX_{THIS,IS,AN,ORDERED,ARRAY})
for key in "${keysInCustomOrder[@]}"; do
echo "do something with ${array[$key,value]}"
done
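If you would rather not maintain the key list by hand, one option is to record each key in an indexed array the first time it is assigned. A minimal sketch, where `add_entry` and `order` are names invented for this example and bash 4 or later is assumed:

```bash
#!/usr/bin/env bash
declare -A array=()
order=()   # keys in the order they were first inserted

# add_entry KEY VALUE
# Hypothetical helper: stores VALUE under KEY and remembers new keys in "order".
add_entry() {
    local key=$1 value=$2
    if [[ -z ${array[$key]+set} ]]; then   # key not present yet
        order+=("$key")
    fi
    array[$key]=$value
}

add_entry "PREFIX_THIS,value" true
add_entry "PREFIX_IS,value" false
add_entry "PREFIX_AN,value" true
add_entry "PREFIX_IS,value" true   # an update; the key is not recorded twice

for key in "${order[@]}"; do
    printf '%s=%s\n' "$key" "${array[$key]}"
done
```

The loop prints the three keys in insertion order, with PREFIX_IS,value showing its updated value.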
Unique Prefixes of Keys
For your second problem: a["key1","key2"] is the same as a["key1,key2"]. In bash, arrays are always 1D therefore there is no perfect solution. However, you can use the following one-liner as long as , is never part of key1.
$ declare -A array=([a,1]=x [a,2]=y [b,1]=z [c,1]=u [c,2]=v)
$ printf %s\\n "${!array[@]}" | cut -d, -f1 | sort -u
a
b
c
If your keys may also contain line breaks, delimit each key with a null byte (\0) instead (the -z options used here are GNU extensions).
printf %s\\0 "${!array[@]}" | cut -zd, -f1 | sort -zu
Alternatively, you could use reference variables to simulate 2D arrays; however, I would advise against using them.
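The unique prefixes can also be collected without external tools by using a second associative array as a "seen" set. This is a sketch under the same assumption as above (no comma inside key1); the names `seen` and `prefixes` are invented here, and the order of the result is unspecified.

```bash
#!/usr/bin/env bash
declare -A array=([a,1]=x [a,2]=y [b,1]=z [c,1]=u [c,2]=v)

# Use the part before the first comma as a key in a second associative
# array; assigning the same key twice simply overwrites it.
declare -A seen=()
for key in "${!array[@]}"; do
    seen[${key%%,*}]=1
done

prefixes=("${!seen[@]}")
printf '%s\n' "${prefixes[@]}"   # a, b and c, one per line, in no particular order
```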

Open file with two columns and dynamically create variables

I'm wondering if anyone can help. I've not managed to find much in the way of examples and I'm not sure where to start coding wise either.
I have a file with the following contents...
VarA=/path/to/a
VarB=/path/to/b
VarC=/path/to/c
VarD=description of program
...
The columns are delimited by '=', and some of the items in the second column may contain spaces, as they aren't just paths.
Ideally I'd love to open this in my script once and store the first column as the variable and the second as the value, for example...
echo $VarA
...
/path/to/a
echo $VarB
...
/path/to/b
Is this possible or am I living in a fairy land?
Thanks
You might be able to use the following loop:
while IFS== read -r name value; do
declare "$name=$value"
done < file.txt
Note, though, that a line like foo="3 5" would include the quotes in the value of the variable foo.
Note that a shell variable name may contain only letters, digits, and underscores, so a name with a minus sign or another special character will not work.
You may consider using a Bash associative array for storing keys and values together:
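One way to guard against such names is to validate each one before assigning. The following is a sketch rather than a drop-in solution: the sample data (including the deliberately invalid bad-name line) is invented, and printf -v is used for the assignment.

```bash
#!/usr/bin/env bash
# Invented sample data; the last line has a name that is not a valid identifier.
config=$'VarA=/path/to/a\nVarD=description of program\nbad-name=oops'

while IFS='=' read -r name value; do
    # Accept only names made of letters, digits and underscores
    # that do not start with a digit.
    if [[ $name =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]]; then
        printf -v "$name" '%s' "$value"
    else
        echo "skipping invalid name: $name" >&2
    fi
done <<<"$config"

echo "$VarA"   # prints: /path/to/a
echo "$VarD"   # prints: description of program
```

To read from a real file, replace the here-string with `done < file.txt`.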
# declare an associative array
declare -A arr
# read file and populate the associative array
while IFS== read -r k v; do
arr["$k"]="$v"
done < file
# check output of our array
declare -p arr
declare -A arr='([VarA]="/path/to/a" [VarC]="/path/to/c" [VarB]="/path/to/b" [VarD]="description of program" )'
What about source my-file? It won't work for values that contain unquoted spaces (such as the VarD line), but it handles plain name=value lines. This is an example:
reut@reut-home:~$ cat src
test=123
test2=abc/def
reut@reut-home:~$ echo $test $test2
reut@reut-home:~$ source src
reut@reut-home:~$ echo $test $test2
123 abc/def
