parse only first delimiter in bash - bash

Please find below function (keyvalue.sh) that parses a configuration file with key value pairs to return the value for passed argument key.
It works fine, if the value don't have any = (equals to operator), but if the value contains = (equals to) operator, it returns incorrect value.
function getValueForKey(){
while read -r line
do
#echo $line
key=`echo $line | cut -d = -f1`
value=`echo $line | cut -d = -f2`
if [ "$2" == "$key" ]; then
echo $value
fi;
done < "$1"
}
Please find below sample key-value configuration file (keys.txt) :-
Scala_Url="http://downloads.lightbend.com/scala/2.11.8/scala-2.11.8.tgz"
Zookeeper_Url="http://www-eu.apache.org/dist/zookeeper/stable/zookeeper-3.4.10.tar.gz"
Eclipse_Url="http://www.eclipse.org/downloads/download.php?file=/technology/epp/downloads/release/neon/3/eclipse-jee-neon-3-win32-x86_64.zip&mirror_id=1135"
Also, find below sample execution :-
$ls
keys.txt keyvalue.sh
$
$
$
$cat keys.txt
Scala_Url="http://downloads.lightbend.com/scala/2.11.8/scala-2.11.8.tgz"
Zookeeper_Url="http://www-eu.apache.org/dist/zookeeper/stable/zookeeper-3.4.10.tar.gz"
Eclipse_Url="http://www.eclipse.org/downloads/download.php?file=/technology/epp/downloads/release/neon/3/eclipse-jee-neon-3-win32-x86_64.zip&mirror_id=1135"
$
$
$. keyvalue.sh
$
$getValueForKey keys.txt "Scala_Url"
"http://downloads.lightbend.com/scala/2.11.8/scala-2.11.8.tgz"
$
$
$
$
$getValueForKey keys.txt "Zookeeper_Url"
"http://www-eu.apache.org/dist/zookeeper/stable/zookeeper-3.4.10.tar.gz"
$
$
$
$
$
$
$getValueForKey keys.txt "Eclipse_Url"
"http://www.eclipse.org/downloads/download.php?file
$
$
$
$
$
$cat keyvalue.sh
function getValueForKey(){
while read -r line
do
#echo $line
key=`echo $line | cut -d = -f1`
value=`echo $line | cut -d = -f2`
if [ "$2" == "$key" ]; then
echo $value
fi;
done < "$1"
}$
$
$
$
$

You shouldn't use cut at all for this:
getValueForKey(){
while IFS== read -r key value;
do
if [ "$2" = "$key" ]; then
echo "$value"
fi;
done < "$1"
}
read will split the line on the input separator =, and if there are more fields than named variables, it assigns all of the remaining line to the final variable named (in this case, value).
But really you should change your format. At the very least, sort the input and use look to find the values.

William Pursell's helpful answer is an effective pure shell solution, but such solutions are inevitably slow, which is why William recommends a key-sorted configuration file combined with look.
An alternative that doesn't require sorting is to use awk:
getValueForKey() {
awk -F= -v key="$2" '$1 == key { sub(/^[^=]+=/, ""); print }' "$1"
}
-F= splits each line into fields by all occurrences of =
We can still use $1, the 1st field (key field), to compare it to the key value of interest.
To output the corresponding value, however, a string substitution is used to ensure that the value is output as-is, even if it contains = instances:
sub(/^[^=]+=/, "") replaces (sub()) everything from the start of the line (^) up to the first = instance ([^=]+ matches a nonempty sequence (+) of characters other than (^) = followed by =) with the empty string, leaving just the value.
Sample call:
$ getValueForKey keys.txt 'Eclipse_Url'
"http://www.eclipse.org/downloads/download.php?file=/technology/epp/downloads/release/neon/3/eclipse-jee-neon-3-win32-x86_64.zip&mirror_id=1135"

Related

How to read each line in .tfvars and export key value as environmental var in bash?

I want to read from .tfvars which is variables_file_2 and export each variable (key) and value of it as environmental variable in bash so I can access it in my script
grep "^[^#]" "${variables_file_2}" > manifest.tfvars
while read line;
do
export "$(echo "${line}" | tr -d "\"")"
done < manifest.tfvars
Currently getting
export: `{={': not a valid identifier
as error.
sample of .tfvars is
region = "us-east-2"
You might want to use something more robust:
LANG=C awk -v OFS='=' '
$1 ~ /^[[:alpha:]_][[:alnum:]_]*$/ && $2 == "=" && match($0,/".*"/) {
print $1, substr($0,RSTART+1,RLENGTH-2)
}
' "${variables_file_2}" > manifest.tfvars
while IFS='=' read -r var value
do
export "$var"="$value"
done < manifest.tfvars
note: the awk assumes that the = are always surrounded by at least a space character
The code
grep "^[^#]" "${variables_file_2}" > manifest.tfvars
Looks like you're trying to remove/delete lines starting with a #
and the tr seems like trying to delete the quotes around the assignment after the = sign.
Something like this might do it.
#!/usr/bin/env bash
while IFS= read -r line; do
echo export -- "$line"
done < <(sed '/^#/d;s/ = /=/;s/"//g' "${variables_file_2}")
With the assumption that the input from "${variables_file_2}" is something like.
#
region = "us-east-2"
#
With
sed '/^#/d;s/ = /=/;s/"//g' "${variables_file_2}"
Or even like
sed '/^#/d;s/^\([^ ]*\) = "\([^"].*\)"/\1=\2/' "${variables_file_2}"
The output should be.
region=us-east-2
Remove the echo if you're ok with the output.

how to change words with the same words but with number at the back bash

I have a file for example with the name file.csv and content
adult,REZ
man,BRB
women,SYO
animal,HIJ
and a line that is nor a directory nor a file
file.csv BRB1 REZ3 SYO2
And what I want to do is change the content of the file with the words that are on the line and then get the nth letter of that word with the number at the end of the those words in capital
and the output should then be
umo
I know that I can get over the line with
for i in "${#:2}"
do
words+=$(echo "$i ")
done
and then the output is
REZ3 BRB1 SYO2
Using awk:
Pass the string of values as an awk variable and then split them into an array a. For each record in file.csv, iterate this array and if the second field of current record matches the first three characters of the current array value, then strip the target character from the first field of the current record and append it to a variable. Print the value of the aggregated variable.
awk -v arr="BRB1 REZ3 SYO2" -F, 'BEGIN{split(arr,a," ")} {for (v in a) { if ($2 == substr(a[v],0,3)) {n=substr(a[v],length(a[v]),1); w=w""substr($1,n,1) }}} END{print w}' file.csv
umo
You can also put this into a script:
#!/bin/bash
words="${2}"
src_file="${1}"
awk -v arr="$words" -F, 'BEGIN{split(arr,a," ")} \
{for (v in a) { \
if ($2 == substr(a[v],0,3)) { \
n=substr(a[v],length(a[v]),1); \
w=w""substr($1,n,1);
}
}
} END{print w}' "$src_file"
Script execution:
./script file.csv "BRB1 REZ3 SYO2"
umo
This is a way using sed.
Create a pattern string from command arguments and convert lines with sed.
#!/bin/bash
file="$1"
pat='s/^/ /;Te;'
for i in ${#:2}; do
pat+=$(echo $i | sed 's#^\([^0-9]*\)\([0-9]*\)$#s/.\\{\2\\}\\(.\\).*,\1$/\\1/;#')
done
pat+='Te;H;:e;${x;s/\n//g;p}'
eval "sed -n '$pat' $file"
Try this code:
#!/bin/bash
declare -A idx_dic
filename="$1"
pattern_string=""
for i in "${#:2}";
do
pattern_words=$(echo "$i" | grep -oE '[A-Z]+')
index=$(echo "$i" | grep -oE '[0-9]+')
pattern_string+=$(echo "$pattern_words|")
idx_dic["$pattern_words"]="$index"
done
pattern_string=${pattern_string%|*}
while IFS= read -r line
do
line_pattern=$(echo $line | grep -oE $pattern_string)
[[ -n $line_pattern ]] && line_index="${idx_dic[$line_pattern]}" && echo $line | awk -v i="$line_index" '{split($0, chars, ""); printf("%s", chars[i]);}'
done < $filename
first find the capital words pattern and catch the index corresponding
then construct the hole pattern words string which connect with |
at last, iterate the every line according to the pattern string, and find the letter by the index
Execute this script.sh like:
bash script.sh file.csv BRB1 REZ3 SYO2

Grep -rl from a .txt list

I'm trying to locate a list of strings from a .txt file, the search target is a directory of multiple .csv (locating which .csv contain the string)
I already find how to do it manually:
grep -rl doggo C:\dirofcsv\
The next step is to to it from a list of hundreds of terms.
I tried grep -rl -f list.txt C:\dirofcsv < print.txt but I only have the last term printed.. I want to have the results lines by lines.
I'm missing something but I don't know where.
I'm working on windows with a term emulator.
EDIT: I've found how to list the terms from a file.Now I need to see which terms have which result like " doggo => file2, file4" did I need to write a loop ?
Thanks community.
grep -rl -f list.txt C:\dirofcsv >> print.txt
You are looking to append lines to the print.txt file and so will need to use >> as opposed to > which will overwrite what is already in the file.
To get the output listed in the output required in your edited requirement, you can use a loop redirected back into awk:
awk '/^FILE -/ { fil=$3; # When the output start with "FILE -" set fil to the third space delimited field
next # Skip to the next line
}
{ arr[fil][$0]="" # Set up a 2 dimensional array with the search term (fil) as the first index and the name of the file the second
}
END { for (i in arr) { # Loop through the array
printf "%s => ",i; First print the search term in the format required
for (j in arr[i]) {
printf "%s,",j # Print the file name followed by a comma
}
printf "\n" # Print a new line
}
}' <<< "$(while read line # Read list.txt line by line
do
echo "FILE - $line"; Echo a marker for identification in awk
grep -l "$line" C:\dirofcsv ; # Grep for the line
done < list.txt)" >> print.txt
One liner:
awk '/^FILE -/ { fil=$3;next } { arr[fil][$0]="" } END { for (i in arr) { printf "%s => ",i;for (j in arr[i]) { printf "%s,",j } printf "\n" } }' <<< "$(while read line;do echo "FILE - $line";grep -l "$line" C:\dirofcsv done < list.txt)" >> print.txt
I think you meant to pass the command as:
grep -rl -f list.txt C:\dirofcsv >> print.txt
Give it a shot. It should take all patterns from list.txt line by line and search in the directory C:\dirofcsv for files with matching patterns and print their names to print.txt file.
Try this for printing without a loop (just like you asked in comments ;-)
One Line Answer
dir=C:\dirofcsv
listfile=list.txt
eval $(jq -Rsr 'split("\n") | map(select(length > 0)) | reduce .[] as $line ([]; . + ["echo \($line) :; grep -rl \($line) \($dir); echo"]) | (join("; "))' --arg dir "$dir" < "$listfile")
Another solution, for explanation say:
unset li
readarray li -u <"$listfile"
quoted_commands="$(jq -R 'reduce inputs as $line ([]; . + ["echo \($line) :; grep -rl \($line) \($dir); echo"]) | (join("; "))' \
--arg dir $dir \
<<< $(echo; printf "%s" "${li[#]}"))"
quoted_commands=${quoted_commands%\"}
commands=${quoted_commands#\"}
eval $commands
Breaking down the command for better explaination in comments:
# read contents of listfile in li
unset li && readarray li -u <"$listfile"
# add the content to new list so that it prints the list elements in new-lines
# also add a newline at top as it will be discarded by jq (in this case only)
list="$(echo; printf "%s" "${li[#]}";)"
# pass jq command
quoted_commands="$(jq -R 'reduce inputs as $line
([]; . + ["echo \($line) :; grep -rl \($line) \($dir); echo"])
| (join("; "))' \
--arg dir $dir <<< "$list")"
# the elements are read with reduce filter and converted to JSON Array of corresponding commands to execute
# the commands for all elements of list are joined with join filter
# trim quotes to execute commands properly
commands=$(sed -e 's/^"//' -e 's/"$//' <<< "$quoted_commands")
# run commands
eval "$commands"
You may want to print the above variables. Take care to use quotes in echo/printf while doing so, i.e., echo "$variable".
Replacement of sed command:
signgle_quoted=${quoted%\"}
commands=${signgle_quoted#\"}
echo "$commands"
I am now using the following implementations (though the dictionary implementation uses a for loop, the key : value implementation doesn't, and is a single line command):
# print an Associative bash array as a JSON dictionary
print_dict()
{
declare -n ref
ref=$1
for k in $(echo "${!ref[#]}")
do
printf '{"name":"%s", "value":"%s"}\n' "$k" "${ref[$k]}"
done | jq -s 'reduce .[] as $i ({}; .[$i.name] = $i.value)'
}
#-------------------------------------------------------------------------
# print the grep output in key : value format
function list_grep()
{
local listfile=$1
local dir=$2
eval $(jq -Rsr 'split("\n") | map(select(length > 0)) | reduce .[] as $line ([]; . + ["echo \($line) :; grep -rl \($line) \($dir); echo"]) | (join("; "))' --arg dir "$dir" < "$listfile")
}
#-------------------------------------------------------------------------
# print the grep output as JSON dictionary
function dict_grep()
{
local listfile=$1
local dir=$2
eval declare -A Arr=\($(eval echo $(jq -Rrs 'split("\n") | map(select(length > 0)) | reduce .[] as $k ([]; . + ["[\($k)]=\\\"$(grep -rl \($k) tmp)\\\""]) | (join(" "))' --arg dir $dir < tmp/list.txt))\)
print_dict Arr
}
#-------------------------------------------------------------------------
# call:
list_grep $listfile $dir
dict_grep $listfile $dir
-Himanshu

Parsing .ini file in bash

I have a below properties file and would like to parse it as mentioned below. Please help in doing this.
.ini file which I created :
[Machine1]
app=version1
[Machine2]
app=version1
app=version2
[Machine3]
app=version1
app=version3
I am looking for a solution in which ini file should be parsed like
[Machine1]app = version1
[Machine2]app = version1
[Machine2]app = version2
[Machine3]app = version1
[Machine3]app = version3
Thanks.
Try:
$ awk '/\[/{prefix=$0; next} $1{print prefix $0}' file.ini
[Machine1]app=version1
[Machine2]app=version1
[Machine2]app=version2
[Machine3]app=version1
[Machine3]app=version3
How it works
/\[/{prefix=$0; next}
If any line begins with [, we save the line in the variable prefix and then we skip the rest of the commands and jump to the next line.
$1{print prefix $0}
If the current line is not empty, we print the prefix followed by the current line.
Adding spaces
To add spaces around any occurrence of =:
$ awk -F= '/\[/{prefix=$0; next} $1{$1=$1; print prefix $0}' OFS=' = ' file.ini
[Machine1]app = version1
[Machine2]app = version1
[Machine2]app = version2
[Machine3]app = version1
[Machine3]app = version3
This works by using = as the field separator on input and = as the field separator on output.
I love John1024's answer. I was looking for exactly that. I have created a bash function that allows me to lookup sections or specific keys based on his idea:
function iniget() {
if [[ $# -lt 2 || ! -f $1 ]]; then
echo "usage: iniget <file> [--list|<section> [key]]"
return 1
fi
local inifile=$1
if [ "$2" == "--list" ]; then
for section in $(cat $inifile | grep "\[" | sed -e "s#\[##g" | sed -e "s#\]##g"); do
echo $section
done
return 0
fi
local section=$2
local key
[ $# -eq 3 ] && key=$3
# https://stackoverflow.com/questions/49399984/parsing-ini-file-in-bash
# This awk line turns ini sections => [section-name]key=value
local lines=$(awk '/\[/{prefix=$0; next} $1{print prefix $0}' $inifile)
for line in $lines; do
if [[ "$line" = \[$section\]* ]]; then
local keyval=$(echo $line | sed -e "s/^\[$section\]//")
if [[ -z "$key" ]]; then
echo $keyval
else
if [[ "$keyval" = $key=* ]]; then
echo $(echo $keyval | sed -e "s/^$key=//")
fi
fi
fi
done
}
So given this as file.ini
[Machine1]
app=version1
[Machine2]
app=version1
app=version2
[Machine3]
app=version1
app=version3
then the following results are produced
$ iniget file.ini --list
Machine1
Machine2
Machine3
$ iniget file.ini Machine3
app=version1
app=version3
$ iniget file.ini Machine1 app
version1
$ iniget file.ini Machine2 app
version2
version3
Again, thanks to #John1024 for his answer, I was pulling my hair out trying to create a simple bash ini parser that supported sections.
Tested on Mac using GNU bash, version 5.0.0(1)-release (x86_64-apple-darwin18.2.0)
You can try using awk:
awk '/\[[^]]*\]/{ # Match pattern like [...]
a=$1;next # store the pattern in a
}
NF{ # Match non empty line
gsub("=", " = ") # Add space around the = character
print a $0 # print the line
}' file
Excellent answers here. I made some modifications to #davfive's function to fit it better to my use case. This version is largely the same except it allows for whitespace before and after = characters, and allows values to have spaces in them.
# Get values from a .ini file
function iniget() {
if [[ $# -lt 2 || ! -f $1 ]]; then
echo "usage: iniget <file> [--list|<section> [key]]"
return 1
fi
local inifile=$1
if [ "$2" == "--list" ]; then
for section in $(cat $inifile | grep "^\\s*\[" | sed -e "s#\[##g" | sed -e "s#\]##g"); do
echo $section
done
return 0
fi
local section=$2
local key
[ $# -eq 3 ] && key=$3
# This awk line turns ini sections => [section-name]key=value
local lines=$(awk '/\[/{prefix=$0; next} $1{print prefix $0}' $inifile)
lines=$(echo "$lines" | sed -e 's/[[:blank:]]*=[[:blank:]]*/=/g')
while read -r line ; do
if [[ "$line" = \[$section\]* ]]; then
local keyval=$(echo "$line" | sed -e "s/^\[$section\]//")
if [[ -z "$key" ]]; then
echo $keyval
else
if [[ "$keyval" = $key=* ]]; then
echo $(echo $keyval | sed -e "s/^$key=//")
fi
fi
fi
done <<<"$lines"
}
For taking disparate sectional and tacking the section name (including 'no-section'/Default together) to each of its related keyword (along with = and its keyvalue), this one-liner AWK will do the trick coupled with a few clean-up regex.
ini_buffer="$(echo "$raw_buffer" | awk '/^\[.*\]$/{obj=$0}/=/{print obj $0}')"
Will take your lines and output them like you wanted:
+++ awk '/^\[.*\]$/{obj=$0}/=/{print obj $0}'
++ ini_buffer='[Machine1]app=version1
[Machine2]app=version1
[Machine2]app=version2
[Machine3]app=version1
[Machine3]app=version3'
A complete solution to the INI-format File
As Clonato, INI-format expert said that for the latest INI version 1.4 (2009-10-23), there are several other tricky aspects to the INI file:
character set constraint for section name
character set constraint for keyword
And lastly is for the keyvalue to be able to handle pretty much anthing that is not used in the section and keyword name; that includes nesting of quotes inside a pair of same single/double-quote.
Except for the nesting of quotes, a INI-format Github complete solution to parsing INI-format file with default section:
# syntax: ini_file_read <raw_buffer>
# outputs: formatted bracket-nested "[section]keyword=keyvalue"
ini_file_read()
{
local ini_buffer raw_buffer hidden_default
raw_buffer="$1"
# somebody has to remove the 'inline' comment
# there is a most complex SED solution to nested
# quotes inline comment coming ... TBA
raw_buffer="$(echo "$raw_buffer" | sed '
s|[[:blank:]]*//.*||; # remove //comments
s|[[:blank:]]*#.*||; # remove #comments
t prune
b
:prune
/./!d; # remove empty lines, but only those that
# become empty as a result of comment stripping'
)"
# awk does the removal of leading and trailing spaces
ini_buffer="$(echo "$raw_buffer" | awk '/^\[.*\]$/{obj=$0}/=/{print obj $0}')" # original
ini_buffer="$(echo "$ini_buffer" | sed 's/^\s*\[\s*/\[/')"
ini_buffer="$(echo "$ini_buffer" | sed 's/\s*\]\s*/\]/')"
# finds all 'no-section' and inserts '[Default]'
hidden_default="$(echo "$ini_buffer" \
| egrep '^[-0-9A-Za-z_\$\.]+=' | sed 's/^/[Default]/')"
if [ -n "$hidden_default" ]; then
echo "$hidden_default"
fi
# finds sectional and outputs as-is
echo "$(echo "$ini_buffer" | egrep '^\[\s*[-0-9A-Za-z_\$\.]+\s*\]')"
}
The unit test for this StackOverflow post is included in this file:
https://github.com/egberts/bash-ini-file
Source:
https://github.com/egberts/easy-admin/blob/main/test/section-regex.sh
https://cloanto.com/specs/ini/#escapesequences

Unix file pattern issue: append changing value of variable pattern to copies of matching line

I have a file with contents:
abc|r=1,f=2,c=2
abc|r=1,f=2,c=2;r=3,f=4,c=8
I want a result like below:
abc|r=1,f=2,c=2|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|3
The third column value is r value. A new line would be inserted for each occurrence.
I have tried with:
for i in `cat $xxxx.txt`
do
#echo $i
live=$(echo $i | awk -F " " '{print $1}')
home=$(echo $i | awk -F " " '{print $2}')
echo $live
done
but is not working properly. I am a beginner to sed/awk and not sure how can I use them. Can someone please help on this?
awk to the rescue!
$ awk -F'[,;|]' '{c=0;
for(i=2;i<=NF;i++)
if(match($i,/^r=/)) a[c++]=substr($i,RSTART+2);
delim=substr($0,length($0))=="|"?"":"|";
for(i=0;i<c;i++) print $0 delim a[i]}' file
abc|r=1,f=2,c=2|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|3
Use an inner routine (made up of GNU grep, sed, and tr) to compile a second more elaborate sed command, the output of which needs further cleanup with more sed. Call the input file "foo".
sed -n $(grep -no 'r=[0-9]*' foo | \
sed 's/^[0-9]*/&s#.*#\&/;s/:r=/|/;s/.*/&#p;/' | \
tr -d '\n') foo | \
sed 's/|[0-9|]*|/|/'
Output:
abc|r=1,f=2,c=2|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|3
Looking at the inner sed code:
grep -no 'r=[0-9]*' foo | \
sed 's/^[0-9]*/&s#.*#\&/;s/:r=/|/;s/.*/&#p;/' | \
tr -d '\n'
It's purpose is to parse foo on-the-fly (when foo changes, so will the output), and in this instance come up with:
1s#.*#&|1#p;2s#.*#&|1#p;2s#.*#&|3#p;
Which is almost perfect, but it leaves in old data on the last line:
sed -n '1s#.*#&|1#p;2s#.*#&|1#p;2s#.*#&|3#p;' foo
abc|r=1,f=2,c=2|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1|3
...which old data |1 is what the final sed 's/|[0-9|]*|/|/' removes.
Here is a pure bash solution. I wouldn't recommend actually using this, but it might help you understand better how to work with files in bash.
# Iterate over each line, splitting into three fields
# using | as the delimiter. (f3 is only there to make
# sure a trailing | is not included in the value of f2)
while IFS="|" read -r f1 f2 f3; do
# Create an array of variable groups from $f2, using ;
# as the delimiter
IFS=";" read -a groups <<< "$f2"
for group in "${groups[#]}"; do
# Get each variable from the group separately
# by splitting on ,
IFS=, read -a vars <<< "$group"
for var in "${vars[#]}"; do
# Split each assignment on =, create
# the variable for real, and quit once we
# have found r
IFS== read name value <<< "$var"
declare "$name=$value"
[[ $name == r ]] && break
done
# Output the desired line for the current value of r
printf '%s|%s|%s\n' "$f1" "$f2" "$r"
done
done < $xxxx.txt
Changes for ksh:
read -A instead of read -a.
typeset instead of declare.
If <<< is a problem, you can use a here document instead. For example:
IFS=";" read -A groups <<EOF
$f2
EOF

Resources