Advanced AWK formatting - bash

I am having a problem with this awk command. It is not producing the result I want given this input file. Can someone help me with this, please?
I am searching for a "Class:" value of "ABC". When I find ABC, I would like to assign the values associated with userName/serviceList/hostList/portNumber to variables (please see the output section).
awk -v q="\"" '/ABC/{f=1;c++}
f && /userName|serviceList|hostList|portNumber/
{sub(":",c"=",$1);
print $1 q $3 q
}
/port:/{f=0;print ""}' filename
The file contains the following input:
Instance: Ths is a test
Class: ABC
Variables:
udpRecvBufSize: Numeric: 8190000
userName: String:test1
pingInterval: Numeric: 2
blockedServiceList: String:
acceptAllServices: Boolean: False
serviceList: String: ABC
hostList: String: 159.220.108.3
protocol: String: JJJJ
portNumber: Numeric: 20001
port: String: RTR_LLLL
Children:
Instance: The First Server in the Loop
Class: Servers
Variables:
pendout: Numeric: 0
overflows: Counter: 0
peakBufferUsage: Numeric: 100
bufferPercentage: Gauge: 1 (0,100)
currentBufferUsage: Numeric: 1
pendingBytesOut: Numeric: 0
pendingBytesIn: Numeric: 1
pingsReceived: Counter: 13597
pingsSent: Counter: 87350
clientToServerPings: Boolean: True
serverToClientPings: Boolean: True
numInputBuffers: Numeric: 10
maxOutputBuffers: Numeric: 100
guaranteedOutputBuffers: Numeric: 100
lastOutageDuration: String: 0:00:00:00
peakDisconnectTime: String:
totalDisconnectTime: String: 0:00:00:00
disconnectTime: String:
disconnectChannel: Boolean: False
enableDacsPermTest: Boolean: False
enableFirewall: Boolean: False
dacsPermDenied: Counter: 0
dacsDomain: String:
compressPercentage: Gauge: 0 (0,100)
uncompBytesSentRate: Gauge: 0 (0,9223372036854775807)
Instance: Ths is a test
Class: ABC
Variables:
udpRecvBufSize: Numeric: 8190000
userName: String:test2
pingInterval: Numeric: 4
blockedServiceList: String:
acceptAllServices: Boolean: False
serviceList: String: DEF
hostList: String: 159.220.111.2
protocol: String: ffff
portNumber: Numeric: 20004
port: String: JJJ_LLLL
Children:
This is the output I am looking for (assigning to variables):
userName1="test1"
serviceList1="ABC"
hostList1="159.220.108.3"
portNumber1="20001"
userName2="test2"
serviceList2="DEF"
hostList2="159.220.111.2"
portNumber2="20004"

If your intention is to assign to a series of variables, then rather than parsing the whole file at once, perhaps you could just extract the specific parts that you're interested in one by one. For example:
$ awk -F'\n' -v RS= -v record=1 -v var=userName 'NR == record { for (i=1; i<=NF; ++i) if (sub("^\\s*" var ".*:\\s*", "", $i)) print $i }' file
test1
$ awk -F'\n' -v RS= -v record=1 -v var=serviceList 'NR == record { for (i=1; i<=NF; ++i) if (sub("^\\s*" var ".*:\\s*", "", $i)) print $i }' file
ABC
The awk script could be put inside a shell function and used like this:
parse_file() {
    record=$1
    var=$2
    file=$3
    awk -F'\n' -v RS= -v record="$record" -v var="$var" 'NR == record {
        for (i=1; i<=NF; ++i) if (sub("^\\s*" var ".*:\\s*", "", $i)) print $i
    }' "$file"
}
userName1=$(parse_file 1 userName file)
serviceList1=$(parse_file 1 serviceList file)
# etc.

$ awk -F: -v q="\"" '/Class: ABC/{f=1;c++;print ""} \
f && /userName|serviceList|hostList|portNumber/ \
{gsub(/ /,"",$1); \
gsub(/ /,"",$3); \
print $1 c "=" q $3 q} \
/Children:/{f=0}' vars
userName1="test1"
serviceList1="ABC"
hostList1="159.220.108.3"
portNumber1="20001"
userName2="test2"
serviceList2="DEF"
hostList2="159.220.111.2"
portNumber2="20004"
It increments the counter for each "Class: ABC" pattern and sets a flag, then formats and prints the selected entries until the terminating pattern of the block is reached. This limits the context to the text between the two patterns.
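If the end goal is actual shell variables rather than printed text, one option is to evaluate the generated assignments -- a hedged sketch, safe only because these particular values contain no characters special to the shell:
eval "$(awk -F: -v q="\"" '/Class: ABC/{f=1;c++}
f && /userName|serviceList|hostList|portNumber/ {gsub(/ /,"",$1); gsub(/ /,"",$3); print $1 c "=" q $3 q}
/Children:/{f=0}' vars)"
echo "$userName1"    # test1
echo "$portNumber2"  # 20004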

Assuming bash 4.0 or newer, there's no need for awk here at all:
flush() {
    if (( ${#hostvars[@]} )); then
        for varname in userName serviceList hostList portNumber; do
            [[ ${hostvars[$varname]} ]] && {
                printf '%q=%q\n' "$varname" "${hostvars[$varname]}"
            }
        done
        printf '\n'
    fi
    hostvars=( )
}

class=
declare -A hostvars=( )
while read -r line; do
    [[ $line = *"Class: "* ]] && class=${line#*"Class: "}
    [[ $class = ABC ]] || continue
    case $line in
        *:*:*)
            IFS=$': \t' read -r varName varType value <<<"$line"
            hostvars[$varName]=$value
            ;;
        *"Variables:"*)
            flush
            ;;
    esac
done
flush
Notable points:
The full set of defined variables is collected in the hostvars associative array (what other languages might call a "map" or "hash"), even though we're only printing the four names defined to be of interest. More interesting logic could thus be written that combines multiple variables to decide what to output, etc.
The flush function is defined outside the loop so it can be used in multiple places -- both when starting a new block (as detected here by seeing Variables:) and at end-of-file.
The output differs from what you requested in that it includes quotes only when necessary -- but that quoting is guaranteed to be correct and sufficient for bash to parse, without room for security holes, even if the strings being emitted would otherwise contain security-relevant content. Think about correctly handling a case where serviceList contains $(rm -rf /*)'$(rm -rf /*)' (the duplication being present to escape single quotes); printf %q makes this easy, whereas awk has no equivalent.
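For instance, a quick illustration of what printf %q buys you (illustrative value, not taken from the data above):
$ serviceList='$(rm -rf /tmp/x)'
$ printf '%q=%q\n' serviceList "$serviceList"
serviceList=\$\(rm\ -rf\ /tmp/x\)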

Solution in TXR:
@(collect)
@(skip)Class: ABC
Variables:
@ (gather)
userName: String:@user
serviceList: String: @servicelist
hostList: String: @hostlist
portNumber: Numeric: @port
@ (until)
Children:
@ (end)
@(end)
@(deffilter shell-esc
  ("\"" "\\\"") ("$" "\\$") ("`" "\\`")
  ("\\" "\\\\"))
@(output :filter shell-esc)
@ (repeat :counter i)
userName@(succ i)="@user"
serviceList@(succ i)="@servicelist"
hostList@(succ i)="@hostlist"
portNumber@(succ i)="@port"
@ (end)
@(end)
Run:
$ txr data.txr data
userName1="test1"
serviceList1="ABC"
hostList1="159.220.108.3"
portNumber1="20001"
userName2="test2"
serviceList2="DEF"
hostList2="159.220.111.2"
portNumber2="20004"
Note 1: Escaping is necessary if the data may contain characters which are special between quotes in the target language. The shell-esc filter is based on the assumption that the generated variable assignments are shell syntax. It can easily be replaced.
Note 2: The code assumes that each Class: ABC has all of the required variables present. It will not work right if some are missing. There are two ways to address that by tweaking the @(gather) line:
failure:
@(gather :vars (user servicelist hostlist port))
Meaning: fail if any of these four variables are not gathered. The consequence is that an entire Class: ABC section with missing variables is skipped.
default missing:
@(gather :vars (user (servicelist "ABC") hostlist port))
Meaning: must gather the four variables user, servicelist, hostlist and port. However, if servicelist is missing, then it gets the default value "ABC" and is treated as if it had been found.

Related

Print string variable that stores the output of a command in Bash

I need to place the output of a command in Bash into a string variable.
Each value should be separated by a space. There are many options to do that, but I cannot use mapfile or read options (I'm using a Bash < 4 version on macOS).
This is the command:
values="$(mycommand | awk 'NR > 2 { printf "%s\n", $2 }')"
where mycommand is just a cloud command that gets some values. This is the output of echo $values (which I think is a string ending with \n for each value):
55369972
75369973
85369974
95369975
This is what I'm trying to do:
I need to iterate over the variable values so I can print each value individually. Desired output in the for loop:
value: 55369972
value: 75369973
value: 85369974
value: 95369975
but I'm getting this:
value: 55369972 75369973 85369974 95369975
# Getting the id field of the values
values="$(mycommand| awk 'NR > 2 { printf "%s\n", $2 }')"
# Replacing the new line with a space so I can iterate over each value
new_values="${values//$'\n'/ }"
# new_values=("${values//$'\n'/ }")
# Checking if I can print each value correctly
for i in "${new_values[#]}"
# for i in "$new_values"
do
echo "value: ${i}"
done
Also, I cannot use things like
# shellcheck disable=xxx
values=($(echo "${values}" | tr "\n" " "))
As I'm getting error messages when checking the code...
Any idea what I'm doing wrong in my code?
try this:
#!/bin/bash
values="$(mycommand | awk 'NR > 2 { printf "%s\n", $2 }')"
for v in $values; do
    echo value: $v
done
Your step that replaces the newlines with spaces renders it as a string. If you want to split that string into a list, you should put it in parentheses (based on this answer).
This should do what you are expecting:
# Getting the id field of the values
values="$(mycommand| awk 'NR > 2 { printf "%s\n", $2 }')"
# Replacing the new line with a space
new_values=("${values//$'\n'/ }")
# Checking if I can print the values correctly
for i in ${new_values}
do
    echo "value: ${i}"
done
where new_values=("${values//$'\n'/ }") is the crucial part; then you need to avoid putting it in quotes when you iterate over it (or you turn it back into a string).
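You can see what that assignment actually builds -- a single-element array; it's the later unquoted expansion that performs the word splitting -- by inspecting it with declare -p:
$ values=$'55369972\n75369973'
$ new_values=("${values//$'\n'/ }")
$ declare -p new_values
declare -a new_values=([0]="55369972 75369973")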
Since I can't paste code into the comments, I post an answer, but the credits go to @akathimy above.
This works for me (solution #1):
#!/bin/bash
# Getting the id field of the values
values="55369972 75369973 85369974 95369975"
#
for v in $values; do
    echo value: "$v"
done
and this also (solution #2):
#!/bin/bash
# Getting the id field of the values
values="55369972
75369973
85369974
95369975"
#
for v in $values; do
    echo value: "$v"
done
Edit: and what about this one (solution #3)?
#!/bin/bash
# Getting the id field of the values
values=("55369972
75369973
85369974
95369975")
#
for v in ${values[@]}; do
    echo value: "$v"
done
This last one works for me, and perhaps also for you. Let me know.

Convert a key:value file w/ comments into JSON document with UNIX tools

I have a file in a subset of YAML with data such as the below:
# This is a comment
# This is another comment
spark:spark.ui.enabled: 'false'
spark:spark.sql.adaptive.enabled: 'true'
yarn:yarn.nodemanager.log.retain-seconds: '259200'
I need to convert that into a JSON document looking like this (note that strings containing booleans and integers still remain strings):
{
"spark:spark.ui.enabled": "false",
"spark:spark.sql.adaptive.enabled": "true",
"yarn:yarn.nodemanager.log.retain-seconds", "259200"
}
The closest I got was this:
cat << EOF > ./file.yaml
> # This is a comment
> # This is another comment
>
>
> spark:spark.ui.enabled: 'false'
> spark:spark.sql.adaptive.enabled: 'true'
> yarn:yarn.nodemanager.log.retain-seconds: '259200'
> EOF
echo {$(cat file.yaml | grep -o '^[^#]*' | sed '/^$/d' | awk -F": " '{sub($1, "\"&\""); print}' | paste -sd "," - )}
which, apart from looking rather gnarly, doesn't give the correct answer; it returns:
{"spark:spark.ui.enabled": 'false',"spark:spark.sql.adaptive.enabled": 'true',"dataproc:dataproc.monitoring.stackdriver.enable": 'true',"spark:spark.submit.deployMode": 'cluster'}
which, if I pipe to jq causes a parse error.
I'm hoping I'm missing a much much easier way of doing this but I can't figure it out. Can anyone help?
Implemented in pure jq (tested with version 1.6):
#!/usr/bin/env bash
jq_script=$(cat <<'EOF'
def content_for_line:
  "^[[:space:]]*([#]|$)" as $ignore_re |            # regex for comments, blank lines
  "^(?<key>.*): (?<value>.*)$" as $content_re |     # regex for actual k/v pairs
  "^'(?<value>.*)'$" as $quoted_re |                # regex for values in single quotes
  if test($ignore_re) then {} else                  # empty lines add nothing to the data
    if test($content_re) then (                     # non-empty: match against $content_re
      capture($content_re) as $content |            # ...and put the groups into $content
      $content.key as $key |                        # string before ": " becomes $key
      (if ($content.value | test($quoted_re)) then  # if value contains literal quotes...
        ($content.value | capture($quoted_re)).value # ...take string from inside quotes
      else
        $content.value                              # no quotes to strip
      end) as $value |                              # result of the above block becomes $value
      {"\($key)": "\($value)"}                      # and return a map from one key to one value
    ) else
      # we get here if a line didn't match $ignore_re *or* $content_re
      error("Line \(.) is not recognized as a comment, empty, or valid content")
    end
  end;

# iterate over our input lines, passing each one to content_for_line and merging the result
# into the object we're building, which we eventually return as our result.
reduce inputs as $item ({}; . + ($item | content_for_line))
EOF
)

# jq -R: read input as raw strings
# jq -n: don't read from stdin until requested with "input" or "inputs"
jq -Rn "$jq_script" <file.yaml >file.json
Unlike syntax-unaware tools, this can never generate output that isn't valid JSON; and it can easily be extended with application-specific logic (for example, to emit some values but not others as numeric literals rather than string literals) by adding an additional filter stage to inspect and modify the output of content_for_line.
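For example, a hedged sketch of such an extra filter stage (the key name is just an illustration), applied to the generated file:
jq 'with_entries(if .key == "yarn:yarn.nodemanager.log.retain-seconds"
                 then .value |= tonumber
                 else . end)' file.json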
Here's a no-frills but simple solution:
def tidy: sub("^ *'?";"") | sub(" *'?$";"");
def kv: split(":") | [ (.[:-1] | join(":")), (.[-1]|tidy)];
reduce (inputs| select( test("^ *#|^ *$")|not) | kv) as $row ({};
.[$row[0]] = $row[1] )
Invocation
jq -n -R -f tojson.jq input.txt
You can do it all in awk using gsub and sprintf, for example:
(edit to add "," separating json records)
awk 'BEGIN {ol=0; print "{" }
/^[^#]/ {
    if (ol) print ","
    gsub ("\047", "\042")
    $1 = sprintf (" \"%s\":", substr ($1, 1, length ($1) - 1))
    printf "%s %s", $1, $2
    ol++
}
END { print "\n}" }' file.yaml
(note: though jq is the proper tool for json formatting)
Explanation
awk 'BEGIN { ol=0; print "{" } calls awk, setting the output-line variable ol=0 for "," output control and printing the header "{",
/^[^#]/ { matches only non-comment lines,
if (ol) print "," if the output-line count ol is greater than zero, outputs a "," (and a newline) to terminate the previous record,
gsub ("\047", "\042") replaces all single quotes with double quotes,
$1 = sprintf (" \"%s\":", substr ($1, 1, length ($1) - 1)) adds leading whitespace and double quotes around the first field (minus its last character, the trailing colon) and then appends a ':' at the end,
printf "%s %s", $1, $2 outputs the reformatted fields without a trailing newline,
ol++ increments the output-line count, and
END { print "\n}" } closes by printing a newline and the "}" footer.
Example Use/Output
Just select/paste the awk command above (changing the filename as needed)
$ awk 'BEGIN {ol=0; print "{" }
> /^[^#]/ {
>     if (ol) print ","
>     gsub ("\047", "\042")
>     $1 = sprintf (" \"%s\":", substr ($1, 1, length ($1) - 1))
>     printf "%s %s", $1, $2
>     ol++
> }
> END { print "\n}" }' file.yaml
{
"spark:spark.ui.enabled": "false",
"spark:spark.sql.adaptive.enabled": "true"
}

Extract values from command output to a JSON

I am extracting values from a cloud foundry command. It has to be done via the shell. Here is how the file looks like:
User-Provided:
end: 123.12.12.12
text_pass: 980
KEY: 000
Running Environment Variable Groups:
BLUEMIX_REGION: ibm:yp:us-north
Staging Environment Variable Groups:
BLUEMIX_REGION: ibm:yp:us-south
I want to extract everything from end to KEY. Please note that User-Provided will always be the start, but the end key can be any value; there will always be a blank line after the block.
How do I extract everything between "User-Provided" and the blank line and put it in a JSON file, which I will later use to parse?
So far I'm able to do this:
cf env space | awk -F 'end:' '{print $2}'
this gives me the value of end but not the whole object.
Expected output:
{
"end": "123.12.12.12"
"text_pass": "980"
"KEY": "000"
}
cf env space | awk '/User-Provided/{a = 1; next}/^$/{a = 0} a'
end: 123.12.12.12
text_pass: 980
KEY: 000
When the pattern User-Provided is encountered, the variable a is set; when a blank line is encountered, a is unset. Lines are printed only while a is set.
Edited answer:
cf env space | awk -F" *: *" '/User-Provided/{a=1;print"{";next}/^$/{a=0} END{print "\n}"} a{if(c)printf(","); printf("%s", "\n\""$1"\" : \""$NF"\""); c=1}'
This will give the output:
{
"end" : "123.12.12.12",
"text_pass" : "980",
"KEY" : "000"
}
Latest edit:
cf env space | awk '/User-Provided/{a=1;print"{";next}/^$/{a=0} END{print "\n}"} a{if(c)printf(","); sub(/:$/,"",$1); printf("%s", "\n\""$1"\" : \""$NF"\""); c=1}'
In awk:
$ awk '/^end:/,/^KEY:/' file
end: 123.12.12.12
text_pass: 980
KEY: 000
/.../,/.../ names the start and end markers; the lines between them (inclusive) are printed.
However, the output requirements complicate the program a bit:
$ awk '
BEGIN { FS=": *";OFS=":" }           # set appropriate delimiters
/^end:/ { print "{";f=1 }            # print at start marker and raise flag
f { print "\"" $1"\"","\"" $2"\"" }  # while flag is up, print the quoted pair
/^KEY:/ { print "}";f="" }           # at end marker, print closing brace and lower flag
' file
{
"end":"123.12.12.12"
"text_pass":"980"
"KEY":"000"
}
If you want to use an empty line as the end marker, use /^$/ && f instead of /^KEY:/, as in the sketch below.
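A sketch of that variant (note the /^$/ && f rule must come before the f rule so the blank line itself isn't printed, and it assumes the block is followed by a blank line):
$ awk '
BEGIN { FS=": *";OFS=":" }           # set appropriate delimiters
/^end:/ { print "{";f=1 }            # print at start marker and raise flag
/^$/ && f { print "}";f="" }         # at blank line, print closing brace and lower flag
f { print "\"" $1"\"","\"" $2"\"" }  # while flag is up, print the quoted pair
' file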

Parse out key=value pairs into variables

I have a bunch of different kinds of files I need to look at periodically, and what they have in common is that the lines have a bunch of key=value type strings. So something like:
Version=2 Len=17 Hello Var=Howdy Other
I would like to be able to reference the names directly from awk... so something like:
cat some_file | ... | awk '{print Var, $5}' # prints Howdy Other
How can I go about doing that?
The closest you can get is to parse the variables into an associative array at the start of every line. That is to say,
awk '{ delete vars; for(i = 1; i <= NF; ++i) { n = index($i, "="); if(n) { vars[substr($i, 1, n - 1)] = substr($i, n + 1) } } Var = vars["Var"] } { print Var, $5 }'
More readably:
{
    delete vars;                   # clean up previous variable values
    for(i = 1; i <= NF; ++i) {     # walk through fields
        n = index($i, "=");        # search for =
        if(n) {                    # if there is one:
            # remember value by name. The reason I use
            # substr over split is the possibility of
            # something like Var=foo=bar=baz (that will
            # be parsed into a variable Var with the
            # value "foo=bar=baz" this way).
            vars[substr($i, 1, n - 1)] = substr($i, n + 1)
        }
    }
    # if you know precisely what variable names you expect to get, you can
    # assign to them here:
    Var = vars["Var"]
    Version = vars["Version"]
    Len = vars["Len"]
}
{
    print Var, $5                  # then use them in the rest of the code
}
$ cat file | sed -r 's/[[:alnum:]]+=/\n&/g' | awk -F= '$1=="Var"{print $2}'
Howdy Other
Or, avoiding the useless use of cat:
$ sed -r 's/[[:alnum:]]+=/\n&/g' file | awk -F= '$1=="Var"{print $2}'
Howdy Other
How it works
sed -r 's/[[:alnum:]]+=/\n&/g'
This places each key=value pair on its own line.
awk -F= '$1=="Var"{print $2}'
This reads the key-value pairs. Since the field separator is chosen to be =, the key ends up as field 1 and the value as field 2. Thus, we just look for lines whose first field is Var and print the corresponding value.
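To see the intermediate stream the first stage produces (note the leading blank line, since sed inserts a newline before the first match too):
$ sed -r 's/[[:alnum:]]+=/\n&/g' <<<'Version=2 Len=17 Hello Var=Howdy Other'

Version=2
Len=17 Hello
Var=Howdy Other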
Since discussion in commentary has made it clear that a pure-bash solution would also be acceptable:
#!/bin/bash
case $BASH_VERSION in
    ''|[0-3].*) echo "ERROR: Bash 4.0 required" >&2; exit 1;;
esac

while read -r -a words; do            # iterate over lines of input
    declare -A vars=( )               # refresh variables for each line
    set -- "${words[@]}"              # update positional parameters
    for word; do
        if [[ $word = *"="* ]]; then  # if a word contains an "="...
            vars[${word%%=*}]=${word#*=}  # ...then set it as an associative-array key
        fi
    done
    echo "${vars[Var]} $5"            # Here, we use content read from that line.
done <<<"Version=2 Len=17 Hello Var=Howdy Other"
The <<<"Input Here" could also be <file.txt, in which case lines in the file would be iterated over.
If you wanted to use $Var instead of ${vars[Var]}, then substitute printf -v "${word%%=*}" %s "${word#*=}" in place of vars[${word%%=*}]=${word#*=}, and remove references to vars elsewhere. Note that this doesn't allow for a good way to clean up variables between lines of input, as the associative-array approach does; see the sketch below.
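A minimal sketch of that printf -v variant (with the caveat just mentioned: values persist between lines of input):
while read -r -a words; do
    set -- "${words[@]}"              # update positional parameters
    for word; do
        [[ $word = *"="* ]] && printf -v "${word%%=*}" %s "${word#*=}"
    done
    echo "$Var $5"                    # prints: Howdy Other
done <<<"Version=2 Len=17 Hello Var=Howdy Other"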
I will try to explain a very generic way to do this, which you can easily adapt if you want to print out other stuff.
Assume you have a string which has a format like this:
key1=value1 key2=value2 key3=value3
or more generic
key1_fs2_value1_fs1_key2_fs2_value2_fs1_key3_fs2_value3
With fs1 and fs2 two different field separators.
You would like to make a selection or perform some operations with these values. The easiest way to do this is to store them in an associative array:
array["key1"] => value1
array["key2"] => value2
array["key3"] => value3
array["key1","full"] => "key1=value1"
array["key2","full"] => "key2=value2"
array["key3","full"] => "key3=value3"
This can be done with the following function in awk:
function str2map(str,fs1,fs2,map,   n,tmp) {
    n=split(str,map,fs1)
    for (;n>0;n--) {
        split(map[n],tmp,fs2);
        map[tmp[1]]=tmp[2]; map[tmp[1],"full"]=map[n]
        delete map[n]
    }
}
So, after processing the string, you have the full flexibility to do operations in any way you like:
awk '
function str2map(str,fs1,fs2,map,   n,tmp) {
    n=split(str,map,fs1)
    for (;n>0;n--) {
        split(map[n],tmp,fs2);
        map[tmp[1]]=tmp[2]; map[tmp[1],"full"]=map[n]
        delete map[n]
    }
}
{ str2map($0," ","=",map) }
{ print map["Var","full"] }
' file
The advantage of this method is that you can easily adapt your code to print any other key you are interested in, or even make selections based on it, for example:
(map["Version"] < 3) { print map["Var"]/map["Len"] }
The simplest and easiest way is to use the string substitution like this:
property='my.password.is=1234567890=='
name=${property%%=*}
value=${property#*=}
echo "'$name' : '$value'"
The output is:
'my.password.is' : '1234567890=='
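Applied to one of the pairs from the question's sample line, the same expansions give:
word='Version=2'
echo "${word%%=*}"   # Version -- everything before the first "="
echo "${word#*=}"    # 2 -- everything after the first "="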
Using bash's set command, we can split the line into positional parameters like awk.
For each word, we'll try to read a name value pair delimited by =.
When we find a value, assign it to the variable named $key using bash's printf -v feature.
#!/usr/bin/env bash
line='Version=2 Len=17 Hello Var=Howdy Other'
set $line
for word in "$#"; do
IFS='=' read -r key val <<< "$word"
test -n "$val" && printf -v "$key" "$val"
done
echo "$Var $5"
output
Howdy Other
SYNOPSIS
an awk-based solution that doesn't require manually checking the fields to locate the desired key pair:
the approach avoids splitting unnecessary fields or arrays -- only performing a regex match via a function call when needed
only the FIRST occurrence of the input key's value is returned; subsequent matches along the row are NOT returned
i just called it S() cuz it's the closest letter to $
I only included an array (_) of the 3 test values for demo purposes. Those aren't needed; in fact, no state information is being kept at all
caveat: the key match must be exact -- this version of the code isn't for case-insensitive or fuzzy/agile matching
Tested and confirmed working on
- gawk 5.1.1
- mawk 1.3.4
- mawk-2/1.9.9.6
- macos nawk
CODE
# gawk profile, created Fri May 27 02:07:53 2022
{m,n,g}awk '
function S(__,_) {
return \
! match($(_=_<_), "(^|["(_="[:blank:]]")")"(__)"[=][^"(_)"*") \
? "^$" \
: substr(__=substr($-_, RSTART, RLENGTH), index(__,"=")+_^!_)
}
BEGIN { OFS = "\f" # This array is only for testing
_["Version"] _["Len"] _["Var"] # purposes. Feel free to discard at will
} {
for (__ in _) {
print __, S(__) } }'
OUTPUT
Var
Howdy
Len
17
Version
2
So either call the fields in BAU fashion ($5, $0, $NF, etc.), or call S(QUOTED_KEY_VALUE), case-sensitive, like S("Version") to get back 2.
As a safeguard, to prevent mis-interpreting null strings or invalid inputs as $0, a non-match returns ^$ instead of an empty string.
As a bonus, it can safely handle values in multibyte unicode, both for values and even for keys, regardless of whether your awk is UTF-8-aware or not:
1 ✜
🤡
2 Version
2
3 Var
Howdy
4 Len
17
5 ✜=🤡 Version=2 Len=17 Hello Var=Howdy Other
I know this question is specifically about awk, but I'm mentioning this since many people come here for solutions to break name=value pairs down (with or without awk as such).
I found the approach below simple and straightforward, and it is very effective at managing multiple spaces/commas as well:
Source: http://jayconrod.com/posts/35/parsing-keyvalue-pairs-in-bash
change="foo=red bar=green baz=blue"
#use below if var is in CSV (instead of space as delim)
change=`echo $change | tr ',' ' '`
for change in $changes; do
set -- `echo $change | tr '=' ' '`
echo "variable name == $1 and variable value == $2"
#can assign value to a variable like below
eval my_var_$1=$2;
done
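After the loop, the dynamically created variables can be checked like so (acceptable here only because the input is trusted; eval will happily execute anything in the data):
echo "$my_var_foo"   # red
echo "$my_var_baz"   # blue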

How to get specific data from block of data based on condition

I have a file like this:
[group]
enable = 0
name = green
test = more
[group]
name = blue
test = home
[group]
value = 48
name = orange
test = out
There may be one or more spaces/tabs between the label, the =, and the value.
The number of lines may vary in every block.
I'd like to have the name, but only if enable = 0 is not present in the block.
So output should be:
blue
orange
Here is what I have managed to create:
awk -v RS="group" '!/enable = 0/ {sub(/.*name[[:blank:]]+=[[:blank:]]+/,x);print $1}'
blue
orange
There are several faults with this:
I am not able to set RS to [group]; both RS="[group]" and RS="\[group\]" fail. It will also break if name or another label contains group.
I would prefer not to use an RS with multiple characters, since that is gawk-only.
Does anyone have another suggestion? sed or awk, and not a long chain of commands.
If you know that groups are always separated by empty lines, set RS to the empty string:
$ awk -v RS="" '!/enable = 0/ {sub(/.*name[[:blank:]]+=[[:blank:]]+/,x);print $1}'
blue
orange
@devnull explained in his answer that GNU awk also accepts regular expressions in RS, so you could only split at [group] if it is on its own line:
gawk -v RS='(^|\n)[[]group]($|\n)' '!/enable = 0/ {sub(/.*name[[:blank:]]+=[[:blank:]]+/,x);print $1}'
This makes sure we're not splitting at evil names like
[group]
enable = 0
name = [group]
name = evil
test = more
Your problem seems to be:
I am not able to set RS to [group]; both RS="[group]" and RS="\[group\]" fail.
Saying:
RS="[[]group[]]"
should yield the desired result.
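For instance, a hedged sketch pairing that RS with the command from the question (GNU awk; the NF guard skips the empty record before the first [group]):
awk -v RS='[[]group[]]' 'NF && !/enable = 0/ {sub(/.*name[[:blank:]]+=[[:blank:]]+/,x); print $1}' file
blue
orange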
In these situations where there are clearly name = value statements within a record, I like to first populate an array with those mappings, e.g.:
map["<name>"] = <value>
and then just use the names to reference the values I want. In this case:
$ awk -v RS= -F'\n' '
{
    delete map
    for (i=1;i<=NF;i++) {
        split($i,tmp,/ *= */)
        map[tmp[1]] = tmp[2]
    }
}
map["enable"] !~ /^0$/ {
    print map["name"]
}
' file
blue
orange
If your version of awk doesn't support deleting a whole array then change delete map to split("",map).
Compared to using REs and/or sub()s, etc., it makes the solution much more robust and extensible in case you want to compare and/or print the values of other fields in the future.
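For example, a hedged tweak that additionally requires the test field to be out:
map["enable"] !~ /^0$/ && map["test"] == "out" {
    print map["name"]
}
which would print only orange from the sample input.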
Since your records are separated by blank lines, you should consider putting awk in paragraph mode. If you must test for the [group] identifier, simply add code to handle that. Here's some example code that should fulfill your requirements. Run it like:
awk -f script.awk file.txt
Contents of script.awk:
BEGIN {
    RS=""
}

{
    for (i=2; i<=NF; i+=3) {
        if ($i == "enable" && $(i+2) == 0) {
            f = 1
        }
        if ($i == "name") {
            r = $(i+2)
        }
    }
}

!(f) && r {
    print r
}

{
    f = 0
    r = ""
}
Results:
blue
orange
This might work for you (GNU sed):
sed -n '/\[group\]/{:a;$!{N;/\n$/!ba};/enable\s*=\s*0/!s/.*name\s*=\s*\(\S\+\).*/\1/p;d}' file
Read the [group] block into the pattern space then substitute out the colour if the enable variable is not set to 0.
sed -n '...' runs sed in silent mode: no output unless requested, i.e. by a p or P command.
/\[group\]/{...} when we have a line which contains [group], do what is found inside the curly braces.
:a;$!{N;/\n$/!ba} to do a loop we need a place to loop to; :a is that place. $ is the end-of-file address and $! means not the end of file, so $!{...} means do what is found inside the curly braces when not at the end of the file. N appends a newline and the next input line to the pattern space, and /\n$/!ba branches (b) back to a as long as the pattern space does not end in an empty line. So this collects all lines from a line containing [group] to an empty line (or the end of file).
/enable\s*=\s*0/!s/.*name\s*=\s*\(\S\+\).*/\1/p if the collected lines contain enable = 0, do not substitute out the colour. Or, to put it another way: if the lines collected so far do not contain enable = 0, reduce the pattern space to just the colour and print it.
If you don't want to use the record separator, you could use a dummy variable like this:
#!/usr/bin/awk -f
function endgroup() {
    if (e == 1) {
        print n
    }
}
$1 == "name" {
    n = $3
}
$1 == "enable" && $3 == 0 {
    e = 0;
}
$0 == "[group]" {
    endgroup();
    e = 1;
}
END {
    endgroup();
}
You could actually use Bash for this.
while read -r line; do
    # test n (set while reading the previous line) before updating it
    if [ "${n:-0}" -eq 0 ] && [[ $line =~ name[[:space:]]+=[[:space:]]+([a-z]+) ]]; then
        echo "${BASH_REMATCH[1]}"
    fi
    if [[ $line == "enable = 0" ]]; then
        n=1
    else
        n=0
    fi
done < file
This will only work however if enable = 0 is always only one line above the line with name.
