I'm trying to read a property file like this one into a set of arrays:
DATABASE="mysql57"
DB_DRIVER_XA="com.mysql.cj.jdbc.MysqlXADataSource"
DB_DRIVER_CLASS="com.mysql.cj.jdbc.Driver"
DATABASE="db2_111"
DB_DRIVER_XA="com.ibm.db2.jcc.DB2XADataSource"
DB_DRIVER_CLASS="com.ibm.db2.jcc.DB2Driver"
I've found the following grep commands useful for storing each key into its own array:
filename=conf.properties
dblist=($(grep "DATABASE" $filename))
xadriver=($(grep "DB_DRIVER_XA" $filename))
driver=($(grep "DB_DRIVER_CLASS" $filename))
The problem is that the above solution stores KEY=VALUE into the array:
printf '%s\n' "${dblist[@]}"
DATABASE="mysql57"
DATABASE="db2_111"
I'd like each array to hold only the values. Is there a simple way to do it, rather than looping over the array and maybe using "cut" to remove the "KEY=" part?
Sure:
databases=()
xas=()
classes=()
while IFS="=" read -r var value; do
without_quotes=${value//\"/}
case $var in
DATABASE) databases+=( "$without_quotes" ) ;;
DB_DRIVER_XA) xas+=( "$without_quotes" ) ;;
DB_DRIVER_CLASS) classes+=( "$without_quotes" ) ;;
esac
done < file
declare -p databases xas classes
declare -a databases='([0]="mysql57" [1]="db2_111")'
declare -a xas='([0]="com.mysql.cj.jdbc.MysqlXADataSource" [1]="com.ibm.db2.jcc.DB2XADataSource")'
declare -a classes='([0]="com.mysql.cj.jdbc.Driver" [1]="com.ibm.db2.jcc.DB2Driver")'
The takeaway is to use IFS with the read command to split each line into fields and store the results in separate variables.
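A minimal sketch of that splitting, applied to one line of the sample file:
IFS="=" read -r var value <<< 'DATABASE="mysql57"'
echo "$var"    # DATABASE
echo "$value"  # "mysql57"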
Use awk -F= to split each line into key and value, and sed to strip out the quotes.
dblist=( $(awk -F= '$1=="DATABASE" {print $2}' "$filename" | sed 's/"//g'))
xadriver=($(awk -F= '$1=="DB_DRIVER_XA" {print $2}' "$filename" | sed 's/"//g'))
driver=( $(awk -F= '$1=="DB_DRIVER_CLASS" {print $2}' "$filename" | sed 's/"//g'))
Better yet, use readarray to populate the arrays; it prevents word splitting on spaces and glob expansion of * and ?.
readarray -t dblist < <(awk -F= '$1=="DATABASE" {print $2}' "$filename" | sed 's/"//g')
readarray -t xadriver < <(awk -F= '$1=="DB_DRIVER_XA" {print $2}' "$filename" | sed 's/"//g')
readarray -t driver < <(awk -F= '$1=="DB_DRIVER_CLASS" {print $2}' "$filename" | sed 's/"//g')
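The process substitution (< <(...)) is load-bearing here: if you piped into readarray instead, it would run in a subshell (with bash's default options) and the array would vanish when the pipeline ends. A minimal sketch of that failure mode, using a throwaway demo array:
awk -F= '$1=="DATABASE" {print $2}' "$filename" | sed 's/"//g' | readarray -t demo
echo "${#demo[@]}"   # 0 -- readarray ran in a subshell, so demo never reaches this shell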
I have a TSV file with 3 columns; its path is assigned to paramfile.
Here is my script:
#! /bin/bash -l
paramfile=/path/to/file
while
sample=`sed -n ${number}p $paramfile | awk '{print $1}'`
Reads1=`sed -n ${number}p $paramfile | awk '{print $2}'`
Reads2=`sed -n ${number}p $paramfile | awk '{print $3}'`
do
./program.sh $sample $reads1 $reads2
done
I want it to read the TSV line by line, and for each line take the content of each column and pass it to my program, to be used as options for program.sh.
I know I haven't got the loop quite right. What am I missing?
read with a ‘custom’ $IFS can read TSV* into variables, e.g.:
#!/bin/bash
paramfile=/path/to/file
while IFS="$(printf '\t')" read -r sample reads1 reads2 _
do
./program.sh "${sample}" "${reads1}" "${reads2}"
done < "${paramfile}"
The _ is for dropping any trailing cells.
And I took the liberty to quote all variables, as one should.
*Not quoted TSV, though.
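To sanity-check the loop, here is a sketch with a hypothetical one-line params.tsv and echo standing in for program.sh:
printf 'sample1\treads_R1.fq\treads_R2.fq\n' > params.tsv
while IFS="$(printf '\t')" read -r sample reads1 reads2 _
do
    echo "sample=${sample} reads1=${reads1} reads2=${reads2}"
done < params.tsv
# sample=sample1 reads1=reads_R1.fq reads2=reads_R2.fq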
I'm writing this script to count some variables from an input file. I can't figure out why it is not counting the elements in the array (should be 500) but only counts 1.
#initializing variables
timeout=5
headerFile="lab06.output"
dataFile="fortune500.tsv"
dataURL="http://www.tech.mtu.edu/~toarney/sat3310/lab09/"
dataPath="/home/pjvaglic/Documents/labs/lab06/data/"
curlOptions="--silent --fail --connect-timeout $timeout"
#creating the array
declare -a myWebsitearray #=('cut -d '\t' -f3 "dataPath$dataFile"')
#obtaining the data file
wget $dataURL$dataFile -O $dataPath$dataFile
#getting rid of the crap from dos
sed -e "s/^m//" $dataPath$dataFile | readarray -t $myWebsitesarray
readarray -t myWebsitesarray < <(cut -d, -f3 $dataPath$dataFile)
myWebsitesarray=("${#myWebsitesarray[#]:1}")
#printf '%s\n' "${myWebsitesarray2[#]}"
websitesCount=${#myWebsitesarray[*]}
echo $websitesCount
You are overwriting your array with the count of elements in this line
myWebsitesarray=("${#myWebsitesarray[#]:1}")
Remove the hash sign
myWebsitesarray=("${myWebsitesarray[#]:1}")
Also, @chepner's suggestions are good to follow.
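To illustrate the difference the hash makes, on a small throwaway array:
arr=(a b c)
echo "${#arr[@]}"    # 3 -- with the hash: the element count
echo "${arr[@]:1}"   # b c -- without it: a slice starting at index 1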
I have a file with contents:
abc|r=1,f=2,c=2
abc|r=1,f=2,c=2;r=3,f=4,c=8
I want a result like below:
abc|r=1,f=2,c=2|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|3
The third column is the r value; a new line should be produced for each occurrence of r.
I have tried with:
for i in `cat $xxxx.txt`
do
#echo $i
live=$(echo $i | awk -F " " '{print $1}')
home=$(echo $i | awk -F " " '{print $2}')
echo $live
done
but it is not working properly. I am a beginner with sed/awk and not sure how I can use them. Can someone please help with this?
awk to the rescue!
$ awk -F'[,;|]' '{c=0;
for(i=2;i<=NF;i++)
if(match($i,/^r=/)) a[c++]=substr($i,RSTART+2);
delim=substr($0,length($0))=="|"?"":"|";
for(i=0;i<c;i++) print $0 delim a[i]}' file
abc|r=1,f=2,c=2|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|3
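To see how the -F'[,;|]' separator carves a record into fields, you can run it over the first sample line:
$ echo 'abc|r=1,f=2,c=2' | awk -F'[,;|]' '{for(i=1;i<=NF;i++) print i, $i}'
1 abc
2 r=1
3 f=2
4 c=2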
Use an inner routine (made up of GNU grep, sed, and tr) to compile a second more elaborate sed command, the output of which needs further cleanup with more sed. Call the input file "foo".
sed -n $(grep -no 'r=[0-9]*' foo | \
sed 's/^[0-9]*/&s#.*#\&/;s/:r=/|/;s/.*/&#p;/' | \
tr -d '\n') foo | \
sed 's/|[0-9|]*|/|/'
Output:
abc|r=1,f=2,c=2|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|3
Looking at the inner sed code:
grep -no 'r=[0-9]*' foo | \
sed 's/^[0-9]*/&s#.*#\&/;s/:r=/|/;s/.*/&#p;/' | \
tr -d '\n'
Its purpose is to parse foo on the fly (when foo changes, so will the output), and in this instance come up with:
1s#.*#&|1#p;2s#.*#&|1#p;2s#.*#&|3#p;
Which is almost perfect, but it leaves in old data on the last line:
sed -n '1s#.*#&|1#p;2s#.*#&|1#p;2s#.*#&|3#p;' foo
abc|r=1,f=2,c=2|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1|3
That stale |1 on the last line is what the final sed 's/|[0-9|]*|/|/' removes.
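You can verify that final cleanup in isolation on the offending line:
$ echo 'abc|r=1,f=2,c=2;r=3,f=4,c=8|1|3' | sed 's/|[0-9|]*|/|/'
abc|r=1,f=2,c=2;r=3,f=4,c=8|3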
Here is a pure bash solution. I wouldn't recommend actually using this, but it might help you understand better how to work with files in bash.
# Iterate over each line, splitting into three fields
# using | as the delimiter. (f3 is only there to make
# sure a trailing | is not included in the value of f2)
while IFS="|" read -r f1 f2 f3; do
# Create an array of variable groups from $f2, using ;
# as the delimiter
IFS=";" read -a groups <<< "$f2"
for group in "${groups[@]}"; do
# Get each variable from the group separately
# by splitting on ,
IFS=, read -a vars <<< "$group"
for var in "${vars[@]}"; do
# Split each assignment on =, create
# the variable for real, and quit once we
# have found r
IFS== read name value <<< "$var"
declare "$name=$value"
[[ $name == r ]] && break
done
# Output the desired line for the current value of r
printf '%s|%s|%s\n' "$f1" "$f2" "$r"
done
done < "$xxxx.txt"
Changes for ksh:
read -A instead of read -a.
typeset instead of declare.
If <<< is a problem, you can use a here document instead. For example:
IFS=";" read -A groups <<EOF
$f2
EOF
I'm new to bash scripting... I'm trying to sort and store unique values from an array into another array.
eg:
list=('a','b','b','b','c','c');
I need,
unique_sorted_list=('b','c','a')
I tried a couple of things, but they didn't help me:
sorted_ids=($(for v in "${ids[@]}"; do echo "$v"; done | sort | uniq | xargs))
or
sorted_ids=$(echo "${ids[@]}" | tr ' ' '\n' | sort -u | tr '\n' ' ')
Can you please help me with this?
Try:
$ list=(a b b b c c)
$ unique_sorted_list=($(printf "%s\n" "${list[@]}" | sort -u))
$ echo "${unique_sorted_list[@]}"
a b c
Update based on comments:
$ uniq=($(printf "%s\n" "${list[@]}" | sort | uniq -c | sort -rnk1 | awk '{ print $2 }'))
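With the sample list from above, that variant orders by descending frequency, matching the b, c, a ordering the question asked for:
$ echo "${uniq[@]}"
b c a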
The accepted answer doesn't work if array elements contain spaces.
Try this instead:
readarray -t unique_sorted_list < <( printf "%s\n" "${list[@]}" | sort -u )
In Bash, readarray is a synonym for the built-in mapfile command. See help mapfile for details.
The -t option removes the trailing newline (added by the printf format) from each line read.
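A short demonstration with hypothetical elements containing spaces:
$ list=("foo bar" "baz qux" "foo bar")
$ readarray -t unique_sorted_list < <( printf "%s\n" "${list[@]}" | sort -u )
$ declare -p unique_sorted_list
declare -a unique_sorted_list=([0]="baz qux" [1]="foo bar")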