How to get specific data from block of data based on condition - bash

I have a file like this:
[group]
enable = 0
name = green
test = more
[group]
name = blue
test = home
[group]
value = 48
name = orange
test = out
There may be one or more spaces/tabs between the label, the =, and the value.
The number of lines may vary in every block.
I would like to get the name, but only if the block does not contain enable = 0.
So output should be:
blue
orange
Here is what I have managed to create:
awk -v RS="group" '!/enable = 0/ {sub(/.*name[[:blank:]]+=[[:blank:]]+/,x);print $1}'
blue
orange
There are several faults with this:
I am not able to set RS to [group]; both RS="[group]" and RS="\[group\]" fail. The current approach will also break if name or other labels contain the string group.
I would prefer not to use a multi-character RS, since that is GNU awk only.
Does anyone have another suggestion? sed or awk, and not a long chain of commands.

If you know that groups are always separated by empty lines, set RS to the empty string:
$ awk -v RS="" '!/enable = 0/ {sub(/.*name[[:blank:]]+=[[:blank:]]+/,x);print $1}' file
blue
orange
devnull explained in his answer that GNU awk also accepts regular expressions in RS, so you could split at [group] only when it is on its own line:
gawk -v RS='(^|\n)[[]group]($|\n)' '!/enable = 0/ {sub(/.*name[[:blank:]]+=[[:blank:]]+/,x);print $1}'
This makes sure we're not splitting at evil names like
[group]
enable = 0
name = [group]
name = evil
test = more

Your problem seems to be:
I am not able to set RS to [group]; both RS="[group]" and
RS="\[group\]" fail.
Saying:
RS="[[]group[]]"
should yield the desired result.
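For example, plugged back into the command from the question (a sketch; note that a multi-character RS is still a GNU awk extension, and the extra /name/ guard skips the empty record before the first [group]):
gawk -v RS='[[]group[]]' '
    !/enable = 0/ && /name/ {
        sub(/.*name[[:blank:]]+=[[:blank:]]+/, "")
        print $1
    }' file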

In situations like this, where there are clearly name = value statements within a record, I like to first populate an array with those mappings, e.g.:
map["<name>"] = <value>
and then just use the names to reference the values I want. In this case:
$ awk -v RS= -F'\n' '
{
    delete map
    for (i=1; i<=NF; i++) {
        split($i,tmp,/ *= */)
        map[tmp[1]] = tmp[2]
    }
}
map["enable"] !~ /^0$/ {
    print map["name"]
}
' file
blue
orange
If your version of awk doesn't support deleting a whole array, then change delete map to split("",map).
Compared to using REs and/or sub()s, etc., this makes the solution much more robust and extensible in case you want to compare and/or print the values of other fields in the future.
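For instance (a hypothetical extension, not part of the original answer), printing the test value alongside the name for enabled groups needs only one more lookup in the array:
awk -v RS= -F'\n' '
{
    delete map
    for (i=1; i<=NF; i++) {
        split($i,tmp,/ *= */)
        map[tmp[1]] = tmp[2]
    }
}
map["enable"] !~ /^0$/ {
    print map["name"], map["test"]
}
' file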

Since you have blank-line-separated records, you should consider putting awk in paragraph mode. If you must test for the [group] identifier, simply add code to handle that. Here's some example code that should fulfill your requirements. Run it like:
awk -f script.awk file.txt
Contents of script.awk:
BEGIN {
    RS=""
}

{
    for (i=2; i<=NF; i+=3) {
        if ($i == "enable" && $(i+2) == 0) {
            f = 1
        }
        if ($i == "name") {
            r = $(i+2)
        }
    }
}

!(f) && r {
    print r
}

{
    f = 0
    r = ""
}
Results:
blue
orange

This might work for you (GNU sed):
sed -n '/\[group\]/{:a;$!{N;/\n$/!ba};/enable\s*=\s*0/!s/.*name\s*=\s*\(\S\+\).*/\1/p;d}' file
Read the [group] block into the pattern space then substitute out the colour if the enable variable is not set to 0.
sed -n '...' sets sed to run in silent mode: no output unless specified, i.e. by a p or P command.
/\[group\]/{...} when we have a line which contains [group], do what is found inside the curly braces.
:a;$!{N;/\n$/!ba} to do a loop we need a place to loop to; :a is the place to loop to. $ is the end-of-file address and $! means not the end of file, so $!{...} means do what is found inside the curly braces when it is not the end of file. N means append a newline and the next line to the current pattern space, and /\n$/!ba means branch (b) back to a while the pattern space does not end with an empty line. So this collects all lines from a line that contains [group] to an empty line (or end of file).
/enable\s*=\s*0/!s/.*name\s*=\s*\(\S\+\).*/\1/p if the lines collected contain enable = 0 then do not substitute out the colour. Or to put it another way: if the lines collected so far do not contain enable = 0, do substitute out the colour and print (p) it. The trailing d then deletes the pattern space and starts the next cycle.

If you don't want to use the record separator, you could use a dummy variable like this:
#!/usr/bin/awk -f

function endgroup() {
    if (e == 1) {
        print n
    }
}

$1 == "name" {
    n = $3
}

$1 == "enable" && $3 == 0 {
    e = 0;
}

$0 == "[group]" {
    endgroup();
    e = 1;
}

END {
    endgroup();
}

You could actually use Bash for this.
while read line; do
    if [[ $line == "enable = 0" ]]; then
        n=1
    else
        n=0
    fi
    if [ $n -eq 0 ] && [[ $line =~ name[[:space:]]+=[[:space:]]([a-z]+) ]]; then
        echo ${BASH_REMATCH[1]}
    fi
done < file
This will only work however if enable = 0 is always only one line above the line with name.
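If that assumption doesn't hold, here is a sketch of a variation (mine, not part of the original answer) that tracks one flag per block, assuming every block starts with a [group] line:
enabled=1
name=""
while read -r line; do
    if [[ $line == "[group]" ]]; then
        # a new block starts: flush the previous one first
        [[ $enabled -eq 1 && -n $name ]] && echo "$name"
        enabled=1
        name=""
    elif [[ $line =~ ^enable[[:blank:]]*=[[:blank:]]*0$ ]]; then
        enabled=0
    elif [[ $line =~ ^name[[:blank:]]*=[[:blank:]]*(.+)$ ]]; then
        name=${BASH_REMATCH[1]}
    fi
done < file
# flush the last block
[[ $enabled -eq 1 && -n $name ]] && echo "$name"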

Bash script to compare and generate csv datafile

I have two CSV files, DATA1.csv and DATA2.csv; the content is something like this (with headers):
DATA1.csv
Client Name;strnu;addr;fav
MAD01;HDGF;11;V PO
CVOJF01;HHD-;635;V T
LINKO10;DH--JDH;98;V ZZ
DATA2.csv
USER;BINin;TYPE
XXMAD01XXXHDGFXX;11;N
KJDGD;635;M
CVOJF01XXHHD;635;N
Issues:
The values of the 1st and 2nd columns of DATA1.csv exist, embedded at random positions, in the first column of DATA2.csv.
For example, MAD01;HDGF exists in the first column of DATA2 as ***MAD01***HDGF** (* can be alphanumeric and/or symbol characters), and MAD01;HDGF might not appear in that order in the USER column of DATA2.
The value of strnu in DATA1 is equal to the value of the column BINin in DATA2.
The column fav in DATA1 is the same as TYPE in DATA2, because V T = M and V PO = N (some other values may exist, but we won't need them; line 3 of DATA1, for example, should be ignored).
N.B.: some data may exist in one file but not the other.
My bash script needs to generate a new CSV file that should contain:
The column USER from DATA2
Client Name and strnu from DATA1
BINin from DATA2, only if it's equal to the corresponding line and value of strnu in DATA1
TYPE using the M/N format, making sure to respect the condition that V T = M and V PO = N
The first thing I tried was using grep to search for lines that exist in both files:
#!/bin/sh

DATA1="${1}"
DATA2="${2}"

for i in $(cat $DATA1 | awk -F";" '{print $1".*"$2}' | sed 1d) ; do
    grep "$i" $DATA2
done
Result:
$ ./script.sh DATA1.csv DATA2.csv
MAD01;HDGF;11;V PO
XXMAD01XXXHDGFXX;11;N
CVOJF01;HHD-;635;V T
LINKO10;DH--JDH;98;V PO
Using grep and awk I could find lines that are present in both the DATA1 and DATA2 files, but it doesn't work for all the lines, and I guess that's because of the - and other special characters present in column 2 of DATA1 (they can be ignored).
I don't know how I can generate a new CSV that would mix the lines present in both files, but the expected generated CSV should look like this:
USER;Client Name;strnu;BINin;TYPE
XXMAD01XXXHDGFXX;MAD01;HDGF;11;N
CVOJF01XXHHD;CVOJF01;HHD-;635;M
This can be done in a single awk program. This is join.awk:
BEGIN {
    FS = OFS = ";"
    print "USER", "Client Name", "strnu", "BINin", "TYPE"
}
FNR == 1 { next }
NR == FNR {
    strnu[$1] = $2
    next
}
{
    for (client in strnu) {
        strnu_pattern = strnu[client]
        gsub(/-/, "", strnu_pattern)
        if ($1 ~ client && $1 ~ strnu_pattern) {
            print $1, client, strnu[client], $2, $3
            break
        }
    }
}
and then
awk -f join.awk DATA1.csv DATA2.csv
outputs
USER;Client Name;strnu;BINin;TYPE
XXMAD01XXXHDGFXX;MAD01;HDGF;11;N
CVOJF01XXHHD;CVOJF01;HHD-;635;N
Assumptions/understandings:
ignore lines from DATA1.csv where the fav field is not one of V T or V PO
when matching fields we need to ignore any hyphens in the DATA1.csv fields
when matching fields, the strings from DATA1.csv can show up in either order in DATA2.csv
the last line of the expected output should end with 635;N
One awk idea:
awk '
BEGIN { FS=OFS=";"
        print "USER","Client Name","strnu","BINin","TYPE"  # print new header
}
FNR==1 { next }                                            # skip input headers
FNR==NR { if ($4 == "V PO" || $4 == "V T") {               # only process if fav is one of "V PO" or "V T"
              cnames[FNR]=$1                               # save client name
              strnus[FNR]=$2                               # save strnu
          }
          next
}
{ for (i in cnames) {                                      # loop through array indices
      cname=cnames[i]                                      # make copy of client name ...
      strnu=strnus[i]                                      # and strnu so that we can ...
      gsub(/-/,"",cname)                                   # strip hyphens from both ...
      gsub(/-/,"",strnu)                                   # in order to perform the comparisons ...
      if (index($1,cname) && index($1,strnu)) {            # if cname and strnu both exist in $1 then index()>=1 in both cases, so ...
          print $1,cnames[i],strnus[i],$2,$3               # print to stdout
          next                                             # we found a match so break from the loop and go to the next line of input
      }
  }
}
' DATA1.csv DATA2.csv
This generates:
USER;Client Name;strnu;BINin;TYPE
XXMAD01XXXHDGFXX;MAD01;HDGF;11;N
CVOJF01XXHHD;CVOJF01;HHD-;635;N

AWK print block that does NOT contain specific text

I have the following data file:
variable "ARM_CLIENT_ID" {
description = "Client ID for Service Principal"
}
variable "ARM_CLIENT_SECRET" {
description = "Client Secret for Service Principal"
}
# [.....loads of code]
variable "logging_settings" {
description = "Logging settings from TFVARs"
}
variable "azure_firewall_nat_rule_collections" {
default = {}
}
variable "azure_firewall_network_rule_collections" {
default = {}
}
variable "azure_firewall_application_rule_collections" {
default = {}
}
variable "build_route_tables" {
description = "List of Route Table keys that need direct internet prior to Egress FW build"
default = [
"shared_services",
"sub_to_afw"
]
}
There are 2 things I wish to do:
print the variable names without the inverted commas
ONLY print the variable names if the code block does NOT contain default
I know I can print the variable names like so: awk '{ gsub("\"", "") }; (/variable/ && $2 !~ /^ARM_/) { print $2}'
I know I can print the code blocks with: awk '/variable/,/^}/', which results in:
# [.....loads of code output before this]
variable "logging_settings" {
  description = "Logging settings from TFVARs"
}
variable "azure_firewall_nat_rule_collections" {
  default = {}
}
variable "azure_firewall_network_rule_collections" {
  default = {}
}
variable "azure_firewall_application_rule_collections" {
  default = {}
}
variable "build_route_tables" {
  description = "List of Route Table keys that need direct internet prior to Egress FW build"
  default = [
    "shared_services",
    "sub_to_afw"
  ]
}
However, I cannot work out how to print the code blocks only if they don't contain default. I know I will need to use an if statement, and perhaps some variables, but I am unsure how.
This code block should NOT appear in the output for which I grab the variable name:
variable "build_route_tables" {
description = "List of Route Table keys that need direct internet prior to Egress FW build"
default = [
"shared_services",
"sub_to_afw"
]
}
End output should NOT contain those that had default:
# [.....loads of code output before this]
expressroute_settings
firewall_settings
global_settings
peering_settings
vnet_transit_object
vnet_shared_services_object
route_tables
logging_settings
Preferably I would like to keep this a single AWK command or file, with no piping; I have uses for this where piping is not an option.
EDIT: updated the ideal outputs (missed some examples of those with default)
Assumptions and collection of notes from OP's question and comments:
all variable definition blocks end with a right brace (}) in the first column of a new line
we only display variable names (sans the double quotes)
we do not display the variable names if the body of the variable definition contains the string default
we do not display the variable name if it starts with the string ARM_
One (somewhat verbose) awk solution:
NOTE: I've copied the sample input data into my local file variables.dat
awk -F'"' ' # use double quotes as the input field separator
/^variable / && $2 !~ "^ARM_" { varname = $2 # if line starts with "^variable ", and field #2 is not like "^ARM_", save field #2 for later display
printme = 1 # enable our print flag
}
/variable/,/^}/ { if ( $0 ~ "default" ) # within the range of a variable definition, if we find the string "default" ...
printme = 0 # disable the print flag
next # skip to next line
}
printme { print varname # if the print flag is enabled then print the variable name and then ...
printme = 0 # disable the print flag
}
' variables.dat
This generates:
logging_settings
$ awk -v RS= '!/default =/{gsub(/"/,"",$2); print $2}' file
ARM_CLIENT_ID
ARM_CLIENT_SECRET
[.....loads
logging_settings
Of course the output doesn't match yours, since your expected output is inconsistent with the posted input data.
Using GNU awk:
awk -v RS="}" '/variable/ && !/default/ && !/ARM/ { var=gensub(/(^.*variable ")(.*)(".*{.*)/, "\\2", 1); print var }' file
Set the record separator to "}" and then check for records that contain "variable", don't contain "default", and don't contain "ARM". Use gensub to split the string into three sections based on regular expressions and set the variable var to the second section (gensub takes the match number as its third argument and operates on $0 by default). Print the var variable.
Output:
logging_settings
Another variation on awk, using a skip variable to control the array index holding the variable names:
awk '
/^[[:blank:]]*#/ { next }
$1=="variable" { gsub(/["]/,"",$2); vars[skip?n:++n]=$2; skip=0 }
$1=="default" { skip=1 }
END { if (skip) n--; for(i=1; i<=n; i++) print vars[i] }
' code
The first rule just skips comment lines. If you want to skip "ARM_" variables, then you can add a test on $2.
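For instance (an untested tweak of the rule above, assuming ARM_ blocks carry no default of their own, as in the sample), strip the quotes first and then skip ARM_ names:
$1=="variable" { gsub(/["]/,"",$2); if ($2 ~ /^ARM_/) next; vars[skip?n:++n]=$2; skip=0 }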
Example Use/Output
With your example code in code, all variables without default are:
$ awk '
> /^[[:blank:]]*#/ { next }
> $1=="variable" { gsub(/["]/,"",$2); vars[skip?n:++n]=$2; skip=0 }
> $1=="default" { skip=1 }
> END { if (skip) n--; for(i=1; i<=n; i++) print vars[i] }
> ' code
ARM_CLIENT_ID
ARM_CLIENT_SECRET
logging_settings
Here's another maybe shorter solution.
$ awk -F'"' '/^variable/&&$2!~/^ARM_/{v=$2} /default =/{v=0} /}/&&v{print v; v=0}' file
logging_settings

Parse out key=value pairs into variables

I have a bunch of different kinds of files I need to look at periodically, and what they have in common is that the lines have a bunch of key=value type strings. So something like:
Version=2 Len=17 Hello Var=Howdy Other
I would like to be able to reference the names directly from awk... so something like:
cat some_file | ... | awk '{print Var, $5}' # prints Howdy Other
How can I go about doing that?
The closest you can get is to parse the variables into an associative array first thing every line. That is to say,
awk '{ delete vars; for(i = 1; i <= NF; ++i) { n = index($i, "="); if(n) { vars[substr($i, 1, n - 1)] = substr($i, n + 1) } } Var = vars["Var"] } { print Var, $5 }'
More readably:
{
    delete vars;                 # clean up previous variable values
    for(i = 1; i <= NF; ++i) {   # walk through fields
        n = index($i, "=");      # search for =
        if(n) {                  # if there is one:
            # remember value by name. The reason I use
            # substr over split is the possibility of
            # something like Var=foo=bar=baz (that will
            # be parsed into a variable Var with the
            # value "foo=bar=baz" this way).
            vars[substr($i, 1, n - 1)] = substr($i, n + 1)
        }
    }
    # if you know precisely what variable names you expect to get, you can
    # assign to them here:
    Var = vars["Var"]
    Version = vars["Version"]
    Len = vars["Len"]
}
{
    print Var, $5                # then use them in the rest of the code
}
$ cat file | sed -r 's/[[:alnum:]]+=/\n&/g' | awk -F= '$1=="Var"{print $2}'
Howdy Other
Or, avoiding the useless use of cat:
$ sed -r 's/[[:alnum:]]+=/\n&/g' file | awk -F= '$1=="Var"{print $2}'
Howdy Other
How it works
sed -r 's/[[:alnum:]]+=/\n&/g'
This places each key,value pair on its own line.
awk -F= '$1=="Var"{print $2}'
This reads the key-value pairs. Since the field separator is chosen to be =, the key ends up as field 1 and the value as field 2. Thus, we just look for lines whose first field is Var and print the corresponding value.
Since discussion in commentary has made it clear that a pure-bash solution would also be acceptable:
#!/bin/bash
case $BASH_VERSION in
    ''|[0-3].*) echo "ERROR: Bash 4.0 required" >&2; exit 1;;
esac

while read -r -a words; do               # iterate over lines of input
    declare -A vars=( )                  # refresh variables for each line
    set -- "${words[@]}"                 # update positional parameters
    for word; do
        if [[ $word = *"="* ]]; then     # if a word contains an "="...
            vars[${word%%=*}]=${word#*=} # ...then set it as an associative-array key
        fi
    done
    echo "${vars[Var]} $5"               # Here, we use content read from that line.
done <<<"Version=2 Len=17 Hello Var=Howdy Other"
The <<<"Input Here" could also be <file.txt, in which case lines in the file would be iterated over.
If you wanted to use $Var instead of ${vars[Var]}, then substitute printf -v "${word%%=*}" %s "${word#*=}" in place of vars[${word%%=*}]=${word#*=}, and remove references to vars elsewhere. Note that this doesn't allow for a good way to clean up variables between lines of input, as the associative-array approach does.
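That variant of the loop (a sketch) would look like:
while read -r -a words; do
    set -- "${words[@]}"
    for word; do
        if [[ $word = *"="* ]]; then
            printf -v "${word%%=*}" %s "${word#*=}"   # sets e.g. $Var, $Len directly
        fi
    done
    echo "$Var $5"
done <<<"Version=2 Len=17 Hello Var=Howdy Other"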
I will try to explain a very generic way to do this, which you can easily adapt if you want to print out other stuff.
Assume you have a string which has a format like this:
key1=value1 key2=value2 key3=value3
or more generic
key1_fs2_value1_fs1_key2_fs2_value2_fs1_key3_fs2_value3
With fs1 and fs2 two different field separators.
You would like to make a selection or some operations with these values. To do this, the easiest is to store these in an associative array:
array["key1"] => value1
array["key2"] => value2
array["key3"] => value3
array["key1","full"] => "key1=value1"
array["key2","full"] => "key2=value2"
array["key3","full"] => "key3=value3"
This can be done with the following function in awk:
function str2map(str,fs1,fs2,map,    n,tmp) {
    n = split(str,map,fs1)
    for (; n>0; n--) {
        split(map[n],tmp,fs2)
        map[tmp[1]] = tmp[2]; map[tmp[1],"full"] = map[n]
        delete map[n]
    }
}
So, after processing the string, you have the full flexibility to do operations in any way you like:
awk '
function str2map(str,fs1,fs2,map,    n,tmp) {
    n = split(str,map,fs1)
    for (; n>0; n--) {
        split(map[n],tmp,fs2)
        map[tmp[1]] = tmp[2]; map[tmp[1],"full"] = map[n]
        delete map[n]
    }
}
{ str2map($0," ","=",map) }
{ print map["Var","full"] }
' file
The advantage of this method is that you can easily adapt your code to print any other key you are interested in, or even make selections based on this, example:
(map["Version"] < 3) { print map["var"]/map["Len"] }
The simplest and easiest way is to use string substitution like this:
property='my.password.is=1234567890=='
name=${property%%=*}
value=${property#*=}
echo "'$name' : '$value'"
The output is:
'my.password.is' : '1234567890=='
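Applied per word of the example line from the question (a sketch combining this with word splitting; the =-less words are skipped):
line='Version=2 Len=17 Hello Var=Howdy Other'
for property in $line; do
    [[ $property == *=* ]] || continue   # only process key=value words
    name=${property%%=*}
    value=${property#*=}
    echo "'$name' : '$value'"
done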
Using bash's set command, we can split the line into positional parameters like awk.
For each word, we'll try to read a name value pair delimited by =.
When we find a value, assign it to the variable named $key using bash's printf -v feature.
#!/usr/bin/env bash

line='Version=2 Len=17 Hello Var=Howdy Other'
set $line

for word in "$@"; do
    IFS='=' read -r key val <<< "$word"
    test -n "$val" && printf -v "$key" "$val"
done

echo "$Var $5"
output
Howdy Other
SYNOPSIS
An awk-based solution that doesn't require manually checking the fields to locate the desired key pair:
the approach avoids splitting unnecessary fields or arrays - a regex match via a function call is only performed when needed
only the FIRST occurrence of the input key's value is returned; subsequent matches along the row are NOT returned
I just called it S() because it's the closest letter to $
I only included an array (_) of the 3 test values for demo purposes. Those aren't needed; in fact, no state information is being kept at all
caveat: the key match must be exact - this version of the code isn't for case-insensitive or fuzzy/agile matching
Tested and confirmed working on
- gawk 5.1.1
- mawk 1.3.4
- mawk-2/1.9.9.6
- macos nawk
CODE
# gawk profile, created Fri May 27 02:07:53 2022
{m,n,g}awk '
function S(__, _) {
    return \
    ! match($(_=_<_), "(^|["(_="[:blank:]]")")"(__)"[=][^"(_)"*") \
      ? "^$" \
      : substr(__=substr($-_, RSTART, RLENGTH), index(__,"=")+_^!_)
}

BEGIN { OFS = "\f"                   # This array is only for testing
    _["Version"] _["Len"] _["Var"]   # purposes. Feel free to discard at will.
} {
    for (__ in _) {
        print __, S(__)
    }
}'
OUTPUT
Var
Howdy
Len
17
Version
2
So either call the fields in BAU fashion ($5, $0, $NF, etc.), or call S(QUOTED_KEY_VALUE), case-sensitive, like S("Version") to get back 2.
As a safeguard, to prevent mis-interpreting null strings or invalid inputs as $0, a non-match returns ^$ instead of an empty string.
As a bonus, it can safely handle values in multibyte unicode, both for values and even for keys, regardless of whether your awk is UTF-8-aware or not:
1 ✜
🤡
2 Version
2
3 Var
Howdy
4 Len
17
5 ✜=🤡 Version=2 Len=17 Hello Var=Howdy Other
I know this is particularly regarding awk, but I'm mentioning this since many people come here for solutions to break down name = value pairs (with or without using awk as such).
I found the way below simple, straightforward, and very effective at managing multiple spaces/commas as well:
Source: http://jayconrod.com/posts/35/parsing-keyvalue-pairs-in-bash
change="foo=red bar=green baz=blue"
#use below if var is in CSV (instead of space as delim)
change=`echo $change | tr ',' ' '`
for change in $changes; do
set -- `echo $change | tr '=' ' '`
echo "variable name == $1 and variable value == $2"
#can assign value to a variable like below
eval my_var_$1=$2;
done
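With the space-delimited example above, the loop prints (and sets my_var_foo, my_var_bar, my_var_baz):
variable name == foo and variable value == red
variable name == bar and variable value == green
variable name == baz and variable value == blue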

bash scripting: how to convert a log with keys into csv

I have a log with a table-like format:
ge-1/0/0.0      up    down  inet   10.100.100.1/24
                                   multiservice
ge-1/0/2.107    up    up    inet   10.187.132.193/27
                                   10.187.132.194/27
                                   multiservice
ge-1/1/4        up    up
ge-1/1/5.0      up    up    inet   10.164.69.209/30
                                   iso
                                   mpls
                                   multiservice
How can we convert it to a CSV format like below?
ge-1/0/0.0,up,down,inet|multiservice,10.100.100.1/24
ge-1/0/2.107,up,up,inet|multiservice,"10.187.132.193/27,10.187.132.194/27"
ge-1/1/4,up,up
ge-1/1/5.0,up,up,inet|iso|mpls|multiservice,10.164.69.209/30
I've tried grep interfacename -A4, but it displays other interfaces' information too.
#!/bin/bash

show() {
    [ "$ge" ] || return
    [ "$add_quotes" ] && iprange="\"$iprange\""
    out="$ge,$upd1,$upd2,$service,$iprange"
    out="${out%%,}"
    echo "${out%%,}"
}

while read line
do
    case "$line" in
        ge*)
            show
            read ge upd1 upd2 service iprange < <( echo "$line" )
            add_quotes=
            ;;
        [0-9]*)
            iprange="$iprange,$line"
            add_quotes=Y
            ;;
        *)
            service="$service|$line"
            ;;
    esac
done

# Show last line
show
With your sample data provided as stdin, this script returns:
ge-1/0/0.0,up,down,inet|multiservice,10.100.100.1/24
ge-1/0/2.107,up,up,inet|multiservice,"10.187.132.193/27,10.187.132.194/27"
ge-1/1/4,up,up
ge-1/1/5.0,up,up,inet|iso|mpls|multiservice,10.164.69.209/30
How it works: This script reads from stdin line by line (while read line). Each line is then classified into one of three types: (a) a new record (i.e. a line that starts with "ge-"), (b) a continuation record that provides another IP range (i.e. a record that starts with a number), or (c) a continuation line that provides another service (i.e. a record that starts with a letter). Taking these cases in turn:
(a) When the line contains the start of a new record, that means that the previous record has ended, so we print it out with the show function. Then we read from the new line the five columns that I have named: ge upd1 upd2 service iprange. And, we reset the add_quotes variable to empty.
(b) When the line contains just another IP range, we add that to the current IP range. As per the example in the question, combinations of two or more IP ranges are separated by a comma and enclosed in quotes. Thus, we set add_quotes to "Y".
(c) When the line contains an additional service, we add that to the service variable. As per the example in the question, two services are separated by a vertical bar "|" and no quotes are used.
The function show first checks to make sure that there is a record to show by checking that the ge variable is non-empty. If it is empty, then the return statement is executed so that the function exits (returns) without processing any of its further statements. If $ge was non-empty, the function proceeds to the next statement which adds quotes around the IP range variable if they are needed. It then combines the variables with commas separating them, removes trailing commas (as per the example in the question), and sends the result to stdout.
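If the script above is saved as, say, tocsv.sh (a hypothetical name), it can be fed the log on stdin:
bash tocsv.sh < log.txt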
parselog.awk
#!/usr/bin/gawk -f
BEGIN {
    RS = "[^\n]*\n( [^\n]*\n)*"
    OFS = ","
}

length(RT) > 0 {
    $0 = RT  # See: http://stackoverflow.com/a/11917783/27581
    opts = ""
    ips = ""
    for (i = 4; i <= NF; ++i) {
        if (isIP($i)) {
            ips = append(ips, $i, ",")
        } else {
            opts = append(opts, $i, "|")
        }
    }
    print $1, $2, $3, opts, "\"" ips "\""
}

function isIP(str) {
    return str ~ /^[0-9]/
}

function append(list, val, separator) {
    if (length(list) > 0) {
        list = list separator
    }
    return list val
}
Usage
$ ./parselog.awk < log.txt
ge-1/0/0.0,up,down,inet|multiservice,"10.100.100.1/24"
ge-1/0/2.107,up,up,inet|multiservice,"10.187.132.193/27,10.187.132.194/27"
ge-1/1/4,up,up,,""
ge-1/1/5.0,up,up,inet|iso|mpls|multiservice,"10.164.69.209/30"

Show different context on different grep keyword?

I know -A -B -C could be used to show context around the grep keyword.
My question is, how to show different context on different keyword?
For example, how do I show -A 5 for cat, -B 4 for dog, and -C 1 for monkey:
egrep -A3 "cat|dog|monkey" <file>
// this just shows 3 after-context lines for each keyword.
I don't think there's any way to do it with a single grep call, but you could run it through grep once for each pattern and concatenate the output:
var=$(grep -n -A 5 cat file)$'\n'$(grep -n -B 4 dog file)$'\n'$(grep -n -C 1 monkey file)
var=$(sort -un <(echo "$var"))
Now echo "$var" will produce the same output as you would have gotten from your single command, plus line numbers and context indicators (the : prefix indicates a line that matched the pattern exactly, and the - prefix indicates a line included because of the -A, -B and/or -C options).
The reason I included the line numbers is to preserve the order of the results you would have seen had you managed to do this in one statement. If you like them, great; if not, you can use the following line to cut them out:
var=$(cut -d: -f2- <(echo "$var") | cut -d- -f2-)
This passes the output through cut once to remove the exactly-matching lines' prefixes, then again to remove the context matches' prefixes.
Pretty? No. But it works.
I'm afraid grep won't do that. You'll have to use a different tool. Perhaps write your own program.
Something like this would do it:
awk '
BEGIN{ ARGV[ARGC++] = ARGV[1] }
function prtB(nr) { for (i=FNR-nr; i<FNR; i++) print a[i] }
function prtA(nr) { for (i=FNR+1; i<=FNR+nr; i++) print a[i] }
NR==FNR  { a[NR]=$0; next }
/cat/    { print; prtA(5) }
/dog/    { prtB(4); print }
/monkey/ { prtB(1); print; prtA(1) }
' file
Check the math on the loops in the functions. You didn't say how you'd want to handle lines that contain both monkey AND dog, for example.
EDIT: here's an untested solution that prints the maximum context around any match, lets you specify the contexts on the command line, and won't use as much memory as the cheap-and-cheerful solution above:
awk -v cxts="cat:0:5\ndog:4:0\nmonkey:1:1" '
BEGIN {
    ARGV[ARGC++] = ARGV[1]
    numCxts = split(cxts,cxtsA,RS)
    for (i=1; i<=numCxts; i++) {
        regex = cxtsA[i]
        n = split(regex,rangeA,/:/)
        sub(/:[^:]+:[^:]+$/,"",regex)
        endA[regex] = rangeA[n]
        startA[regex] = rangeA[n-1]
        regexA[regex]
    }
}
NR==FNR {
    for (regex in regexA) {
        if ($0 ~ regex) {
            start = NR - startA[regex]
            end = NR + endA[regex]
            for (i=start; i<=end; i++) {
                prt[i]
            }
        }
    }
    next
}
FNR in prt
' file
Separate the searched-for patterns in the cxts variable with whatever your RS value is, newline by default.
