Using awk in a wrong approach - bash

I am told I used awk in the wrong way in the code below, but I am dumbfounded as to how to improve my code so that it is simpler to read.
read -r bookName
read -r authorName
if grep -iqx "$bookName:$authorName" cutText.txt
then
lineNum=`awk -v bookName="$bookName" -v authorName="$authorName" '$0 ~ bookName ":" authorName {print NR} BEGIN{IGNORECASE=1}' BookDB.txt`
echo "Enter a new title"
read -r newTitle
awk -F":" -v bookName="$bookName" -v newTitle="$newTitle" -v lineNum="$lineNum" 'NR==lineNum{gsub(bookName, newTitle)}1' cutText.txt > temp2.txt
mv -f temp2.txt cutText.txt
else
echo "Error"
fi
My cutText.txt contains content as shown below:
Hairy Potter:Rihanna
MARY IS A LITTLE LAMB:Kenny
Sing along:May
This program basically updates a title in cutText.txt. If a user wants to change MARY IS A LITTLE LAMB to Mary is not a lamb, he will enter the new title and cutText.txt will replace the original title with Mary is not a lamb.
A problem arises now: if a user enters "Mary is a little lamb" for $newTitle, this code just doesn't work, because it doesn't take case into account.
It will only work if the user types "MARY IS A LITTLE LAMB". I came to be aware that BEGIN{IGNORECASE=1} is gawk-specific, therefore it cannot be used in awk.
How can I script this better so I can ignore case in user input? Thank you!

This uses exact string matching and so cannot fail on partial matches or if your old title contains : or regexp metacharacters or if the new title contains backreferences (e.g. &) or if a backslash (\) appears in any field or any of the other situations that your other scripts to date will fail on:
$ cat tst.sh
read -r oldTitle
read -r authorName
echo "Enter a new title"
read -r newTitle
awk '
BEGIN {
ot=ARGV[1]; nt=ARGV[2]; an=ARGV[3]
ARGV[1] = ARGV[2] = ARGV[3] = ""
}
tolower($0) == tolower(ot":"an) {
$0 = nt":"an
found = 1
}
{ print }
END {
if ( !found ) {
print "Error" | "cat>&2"
}
}
' "$oldTitle" "$newTitle" "$authorName" cutText.txt > temp2.txt &&
mv -f temp2.txt cutText.txt
$ cat cutText.txt
Hairy Potter:Rihanna
MARY IS A LITTLE LAMB:Kenny
Sing along:May
$ ./tst.sh
mary is a little lamb
kenny
Enter a new title
Mary is not a lamb
$ cat cutText.txt
Hairy Potter:Rihanna
Mary is not a lamb:kenny
Sing along:May
I'm populating the awk variables from ARGV[] because if I populated them using -v var=val or var=val in the arg list then any backslashes would be interpreted and so \t, for example, would become a literal tab character. See the shell FAQ article I wrote about that a long time ago - http://cfajohnson.com/shell/cus-faq-2.html#Q24.
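A quick demonstration of the difference (note that the \t in the argument is two literal characters):
$ awk -v var='a\tb' 'BEGIN{print var}'
a	b
$ awk 'BEGIN{var=ARGV[1]; ARGV[1]=""; print var}' 'a\tb'
a\tb
The -v version prints a real tab because awk expanded the escape sequence; the ARGV version preserves the backslash.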
I changed bookName to oldTitle, btw just because that seems to make more sense in relation to newTitle. No functional difference.
When doing any text manipulation it's extremely important to understand the differences between strings and the various regexp flavors (BREs/EREs/PCREs) and between partial and full matches.
grep operates on BREs by default, on EREs given the -E arg, on PCREs given the -P arg, and on strings given the -F arg.
sed operates on BREs by default, on EREs given the -E arg. sed does not support PCREs. sed also cannot operate on strings and to make your regexps behave as if they were strings is painful, see is-it-possible-to-escape-regex-metacharacters-reliably-with-sed.
awk operates on both EREs and strings by default. You just use EREs with regexp operators and strings with string operators (see https://www.gnu.org/software/gawk/manual/gawk.html#String-Functions).
So if, as in your case, you need all characters in your text treated literally then that is a string, not a regexp, so you should not be using sed on it, and if you want to quickly find a string in a file and are happy with a partial match, you should use grep, but if you want to do anything beyond that such as change a string in a file or do an exact match then you should use awk.
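A quick demonstration of the difference (grep -F and awk's index() are string operations, plain grep is a regexp operation):
$ echo 'abc' | grep 'a.c'
abc
$ echo 'abc' | grep -F 'a.c'
$ echo 'a.c' | awk 'index($0,"a.c")'
a.c
The first command matches abc because . is a regexp metacharacter that matches any character; the string-based variants only match a literal a.c.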

To get you started, create the following files:
r.awk
function asplit(str, arr, sep, temp, i, n) { # make an assoc array from str
n = split(str, temp, sep)
for (i = 1; i <= n; i++)
arr[temp[i]]++
return n
}
function regexpify(s,    back, quote, rest, all, meta, n, c, u, l, ans, i) { # escape ERE metachars; letters match either case
back = "\\"; quote = "\"";
rest = "^$.[]|()*+?"
all = back quote rest
asplit(all, meta, "")
n = length(s)
for (i=1; i<=n; i++) {
c = substr(s, i, 1)
if (c in meta)
ans = ans back c
else if ((u = toupper(c)) != (l = tolower(c)))
ans = ans "[" l u "]"
else
ans = ans c
}
return ans
}
BEGIN {
old = regexpify(old)
sep = ":"; m = length(sep)
}
NR == n {
i = index($0, sep)
fst = substr($0, 1, i-m)
scn = substr($0, i+m )
gsub(old, new, fst)
print fst sep scn
next
}
{
print
}
cutText.txt
Hairy Potter:Rihanna
MARY IS A LITTLE LAMB:Kenny
Sing along:May
Usage:
awk -v n=2 -v old="MArY iS A LIttLE lAmb" -v new="Mary is not a lamb" -f r.awk cutText.txt
Expected output:
Hairy Potter:Rihanna
Mary is not a lamb:Kenny
Sing along:May
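Since regexpify backslash-escapes ERE metacharacters (e.g. it turns a.b into [aA]\.[bB]), metacharacters in the old title are matched literally. A quick check with made-up data:
$ printf 'a.b:me\naxb:you\n' | awk -v n=2 -v old='a.b' -v new='X' -f r.awk
a.b:me
axb:you
Line 2 is left alone even though the regexp a.b would have matched axb.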

OK GUYS I JUST REALISED I AM DUMB AS ****
I was tearing my hair out for the whole day and all I had to do was this:
lineNum=`grep -in "$bookName:$authorName" BookDB.txt | cut -f1 -d":"`
sed -i "${lineNum}s/$bookName/$newTitle/I" BookDB.txt cutText.txt
Omg I feel like killing myself.

Related

How to do operations depending on the presence of a specific string in bash?

I am working with a csv file, so imagine I have this column:
5;10;>11;20;<14
My desired output would be:
5;10;12;20;13
So I would like to add +1 to those values who have the greater than (>) symbol and to subtract 1 to those values with a lesser than (<) symbol with bash language. I have tried something weird with sed but given that it interprets those changes as strings it didn't work out.
Any suggestions?
With awk (tested with GNU awk):
$ awk -F\; -v OFS=\; '
{
for(i = 1; i <= NF; i++) {
if($i ~ /^<[[:digit:]]+$/) {
sub(/^</,"",$i)
$i--
}
else if($i ~ /^>[[:digit:]]+$/) {
sub(/^>/,"",$i)
$i++
}
}
} 1' <<< "5;10;>11;20;<14"
5;10;12;20;13
Warning: use the following if and only if you trust your input file and you are 100% sure it does not contain malicious fields (see the final note).
With GNU sed (and assuming your shell is bash), a bit shorter but also a bit more difficult to understand (as usual with sed):
$ sed -E '
s/<([[:digit:]]+)/$((\1-1))/g
s/>([[:digit:]]+)/$((\1+1))/g
s/.*/printf "%s\n" "&"/e
' <<< "5;10;>11;20;<14"
5;10;12;20;13
That is (where N is a string of digits), substitute all <N with $((N-1)), all >N with $((N+1)), substitute the resulting string S with printf "%s\n" "S", execute it with bash and replace with the output (this is what the e modifier of the substitute command does). In your example the input string successively becomes:
5;10;>11;20;$((14-1))
5;10;$((11+1));20;$((14-1))
printf "%s\n" "5;10;$((11+1));20;$((14-1))"
5;10;12;20;13
The reason why there is a serious security issue here is that if one of your fields is, for instance, $(rm -rf ~/*) it will simply and recursively delete your entire home directory... So, if you do not control the input prefer the awk version.
5;10;>11;20;<14
|
{m,g}awk '
BEGIN {
_*=(OFS= "") (__-=_^= FS ="("(\
___="\31\17")"|"(____="\16\24")")+"
} {
gsub(";[<>][0-9]+",____ "&" ___)
gsub(____ ";[<>]", "&" ___)
NF
for(_+=(_^=($_=$_)<"")+_;_<=NF;_++) {
if ($_~"^[0-9]+$") {
$_+=__^($(_+__)~"[<]$")
}
} print $(_=_<_) }'
=
5;10;>12;20;<13

awk substitution ascii table rules bash

I want to perform a hierarchical set of (non-recursive) substitutions in a text file.
I want to define the rules in an ascii file "table.txt" which contains lines of whitespace-separated pairs of strings:
aaa 3
aa 2
a 1
I have tried to solve it with an awk script "substitute.awk":
BEGIN { while (getline < file) { subs[$1]=$2; } }
{ line=$0; for(i in subs)
{ gsub(i,subs[i],line); }
print line;
}
When I call the script giving it the string "aaa":
echo aaa | awk -v file="table.txt" -f substitute.awk
I get
21
instead of the desired "3". Permuting the lines in "table.txt" doesn't help. Who can explain what the problem is here, and how to circumvent it? (This is a simplified version of my actual task, where I have a large file containing ascii-encoded phonetic symbols which I want to convert into LaTeX code. The ascii encoding of the symbols contains {$, &, -, %, [a-z], [0-9], ...}.)
Any comments and suggestions are welcome!
PS:
Of course in this application for a substitution table.txt:
aa ab
a 1
an original string "aa" should be converted into "ab" and not "1b". That means a string which was produced by applying a rule must be left untouched.
How to account for that?
The order of the loop for (i in subs) is undefined by default.
In newer versions of awk you can use PROCINFO["sorted_in"] to control the sort order. See section 12.2.1 Controlling Array Traversal and (the linked) section 8.1.6 Using Predefined Array Scanning Orders for details about that.
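For illustration, with this particular data descending string order happens to list longer patterns first (in general, though, string order is not length order):
$ gawk 'BEGIN {
    subs["aaa"]; subs["aa"]; subs["a"]
    PROCINFO["sorted_in"] = "@ind_str_desc"
    for (p in subs) print p
}'
aaa
aa
a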
Alternatively, if you can't or don't want to do that you could store the replacements in numerically indexed entries in subs and walk the array in order manually.
To do that you will need to store both the pattern and the replacement in the value of the array and that will require some care to combine. You can consider using SUBSEP or any other character that cannot be in the pattern or replacement and then split the value to get the pattern and replacement in the loop.
Also note the caveats etc. with getline listed on http://awk.info/?tip/getline and consider not using it manually but instead using NR==FNR{...} and just listing table.txt as the first file argument to awk.
Edit: Actually, for the manual loop version you could also just keep two arrays one mapping input file line number to the patterns to match and another mapping patterns to replacements. Then looping over the line number array will get you the pattern and the pattern can be used in the second array to get the replacement (for gsub).
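A minimal sketch of that two-array idea (file and variable names are illustrative), with table.txt listed as the first file argument instead of read via getline:
NR == FNR {                    # first file: the substitution table
    pats[FNR] = $1             # line number -> pattern
    repl[$1]  = $2             # pattern -> replacement
    n = FNR
    next
}
{
    for (i = 1; i <= n; i++)   # apply the rules in file order
        gsub(pats[i], repl[pats[i]])
    print
}
Saved as e.g. subst2.awk, it runs as: echo aaa | awk -f subst2.awk table.txt -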
Instead of storing the replacements in an associative array, put them in two arrays indexed by integer (one array for the strings to replace, one for the replacements) and iterate over the arrays in order:
BEGIN {i=0; while (getline < file) { subs[i]=$1; repl[i++]=$2}
n = i}
{ for(i=0;i<n;i++) { gsub(subs[i],repl[i]); }
print;
}
It seems like perl's zero-width word boundary is what you want. It's a pretty straightforward conversion from the awk:
#!/usr/bin/env perl
use strict;
use warnings;
my %subs;
BEGIN{
open my $f, '<', 'table.txt' or die "table.txt:$!";
while(<$f>) {
my ($k,$v) = split;
$subs{$k}=$v;
}
}
while(<>) {
while(my($k, $v) = each %subs) {
s/\b$k\b/$v/g;
}
print;
}
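Saving that as e.g. substitute.pl (a name picked for illustration), the table.txt from the question gives:
$ echo aaa | perl substitute.pl
3
The word boundaries also make the (random) hash iteration order harmless for this input: \baaa\b matches the whole word, while \ba\b and \baa\b cannot match inside aaa.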
Here's an answer pulled from another StackExchange site, from a fairly similar question: Replace multiple strings in a single pass.
It's slightly different in that it does the replacements in inverse order by length of target string (i.e. longest target first), but that is the only sensible order for targets which are literal strings, as appears to be the case in this question as well.
If you have tcc installed, you can use the following shell function, which processes the file of substitutions into a lex-generated scanner which it then compiles and runs using tcc's compile-and-run option.
# Call this as: substitute replacements.txt < text_to_be_substituted.txt
# Requires GNU sed because I was too lazy to write a BRE
substitute () {
tcc -run <(
{
printf %s\\n "%option 8bit noyywrap nounput" "%%"
sed -r 's/((\\\\)*)(\\?)$/\1\3\3/;
s/((\\\\)*)\\?"/\1\\"/g;
s/^((\\.|[^[:space:]])+)[[:space:]]*(.*)/"\1" {fputs("\3",yyout);}/' \
"$1"
printf %s\\n "%%" "int main(int argc, char** argv) { return yylex(); }"
} | lex -t)
}
With gcc or clang, you can use something similar to compile a substitution program from the replacement list, and then execute that program on the given text. Posix-standard c99 does not allow input from stdin, but gcc and clang are happy to do so provided you tell them explicitly that it is a C program (-x c). In order to avoid excess compilations, we use make (which needs to be gmake, Gnu make).
The following requires that the list of replacements be in a file with a .txt extension; the cached compiled executable will have the same name with a .exe extension. If the makefile were in the current directory with the name Makefile, you could invoke it as make repl (where repl is the name of the replacement file without a text extension), but since that's unlikely to be the case, we'll use a shell function to actually invoke make.
Note that in the following file, the whitespace at the beginning of each line starts with a tab character:
substitute.mak
.SECONDARY:

%: %.exe
	@$(<D)/$(<F)

%.exe: %.txt
	@{ printf %s\\n "%option 8bit noyywrap nounput" "%%"; \
	sed -r \
	's/((\\\\)*)(\\?)$$/\1\3\3/; #\
	s/((\\\\)*)\\?"/\1\\"/g; #\
	s/^((\\.|[^[:space:]])+)[[:space:]]*(.*)/"\1" {fputs("\3",yyout);}/' \
	"$<"; \
	printf %s\\n "%%" "int main(int argc, char** argv) { return yylex(); }"; \
	} | lex -t | c99 -D_POSIX_C_SOURCE=200809L -O2 -x c -o "$@" -
Shell function to invoke the above:
substitute() {
gmake -f/path/to/substitute.mak "${1%.txt}"
}
You can invoke the above command with:
substitute file
where file is the name of the replacements file. (The filename must end with .txt but you don't have to type the file extension.)
The format of the input file is a series of lines consisting of a target string and a replacement string. The two strings are separated by whitespace. You can use any valid C escape sequence in the strings; you can also \-escape a space character to include it in the target. If you want to include a literal \, you'll need to double it.
If you don't want C escape sequences and would prefer to have backslashes not be metacharacters, you can replace the sed program with a much simpler one:
sed -r 's/([\\"])/\\\1/g' "$<"; \
(The ; \ is necessary because of the way make works.)
a) Don't use getline unless you have a very specific need and fully understand all the caveats, see http://awk.info/?tip/getline
b) Don't use regexps when you want strings (yes, this means you cannot use sed).
c) The while loop needs to constantly move beyond the part of the line you've already changed or you could end up in an infinite loop.
You need something like this:
$ cat substitute.awk
NR==FNR {
if (NF==2) {
strings[++numStrings] = $1
old2new[$1] = $2
}
next
}
{
for (stringNr=1; stringNr<=numStrings; stringNr++) {
old = strings[stringNr]
new = old2new[old]
slength = length(old)
tail = $0
$0 = ""
while ( sstart = index(tail,old) ) {
$0 = $0 substr(tail,1,sstart-1) new
tail = substr(tail,sstart+slength)
}
$0 = $0 tail
}
print
}
$ echo aaa | awk -f substitute.awk table.txt -
3
$ echo aaaa | awk -f substitute.awk table.txt -
31
and adding some RE metacharacters to table.txt to show they are treated just like every other character and showing how to run it when the target text is stored in a file instead of being piped:
$ cat table.txt
aaa 3
aa 2
a 1
. 7
\ 4
* 9
$ cat foo
a.a\aa*a
$ awk -f substitute.awk table.txt foo
1714291
Your new requirement requires a solution like this:
$ cat substitute.awk
NR==FNR {
if (NF==2) {
strings[++numStrings] = $1
old2new[$1] = $2
}
next
}
{
delete news
for (stringNr=1; stringNr<=numStrings; stringNr++) {
old = strings[stringNr]
new = old2new[old]
slength = length(old)
tail = $0
$0 = ""
charPos = 0
while ( sstart = index(tail,old) ) {
charPos += sstart
news[charPos] = new
$0 = $0 substr(tail,1,sstart-1) RS
tail = substr(tail,sstart+slength)
}
$0 = $0 tail
}
numChars = split($0, olds, "")
$0 = ""
for (charPos=1; charPos <= numChars; charPos++) {
$0 = $0 (charPos in news ? news[charPos] : olds[charPos])
}
print
}
$ cat table.txt
1 a
2 b
$ echo "121212" | awk -f substitute.awk table.txt -
ababab

Bash - listing files neatly

I have an indeterminate list of file names that I would like to output to the user in a script. I don't mind if it's a paragraph or in columns (like the output of ls. How does ls manage it?). In fact I only have the following requirements:
file names need to stay on the same line (yes, that even means files with a space in their name. If someone is dumb enough to use a newline in a filename, though, they deserve what they get.)
If the output is formatted as a paragraph, I'd like to see it indented on the left and right to separate it from other text. Sort of like the way apt-get upgrade handles the list of packages to install.
I would love not to write my own function for this - at least not a complicated one. There are so many text formatting utilities in linux!
The utility should be available in the default Ubuntu install.
It should handle relatively large input, just in case. Something like 2000 characters or so?
It seems like a simple proposition, but I can't seem to get it to work. The column command is out simply because it can't handle large chunks of data. fmt and fold both don't care about delimiters. printf looks like it would work... if I wrote a script for it.
Is there a more flexible tool I've overlooked, or a simple way to do this?
Here I have a simple formatter that, it seems to me, is good enough
% ls | awk '
NR==1 {for(i=1;i<9;i++)printf "----+----%d", i; print ""
line=" " $0;l=2+length($0);next}
{if(l+1+length($0)>80){
print line; line = " " $0 ; l = 2+length($0) ; next}
{l=l+length($0)+1; line=line " " $0}}'
----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8
3inarow.py 5s.py a.csv a.not1.pdf a.pdf as de.1 asde.1 asdef.txt asde.py a.sh
a.tex auto a.wk bizarre.py board.py cc2012xyz2_5_5dp.csv cc2012xyz2_5_5dp.html
cc.py col.pdf col.sh col.sh~ col.tex com.py data data.a datazip datidisk
datizip.py dd.py doc1.pdf doc1.tex doc2 doc2.pdf doc2.tex doc3.pdf doc3.tex
e.awk Exit file file1 file2 geomedian.py group_by_1st group_by_1st.2
group_by_1st.mawk integers its.py join.py light.py listluatexfonts mask.py
mat.rix my_data nostop.py numerize.py pepp.py pepp.pyc pi.pdf pippo muore
pippo.py pi.py pi.tex pizza.py plocol.py points.csv points.py puddu puffo
%
I had to simulate input using ls because you didn't show how your list of files is accessed. The window width is arbitrary as well, but it's easy to supply a value via a -v width=... option to awk.
Edit
I added a header line (unrequested) to my awk script because I wanted to test the effectiveness of the (very simple) algorithm.
Addendum
I'd like to stress that the simple formatter above doesn't split file names containing spaces across lines, as in the following example:
% ls -d1 b*
bia nconodi
bianconodi.pdf
bianconodi.ppt
bin
b.txt
% ls | awk '
NR==1 {for(i=1;i<9;i++)printf "----+----%d", i; print ""
line=" " $0;l=2+length($0);next}
{if(l+1+length($0)>80){
print line; line = " " $0 ; l = 2+length($0) ; next}
{l=l+length($0)+1; line=line " " $0}}'
----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8
04_Tower.pdf 2plots.py 2.txt a.csv aiuole asdefff a.txt a.txt~ auto
bia nconodi bianconodi.pdf bianconodi.ppt bin Borsa Ferna.jpg b.txt
...
%
As you can see, in the first line there is enough space to print bia but not enough for the real filename bia nconodi, which is therefore printed on the second line.
Addendum 2
This is the formatter the OP eventually went with:
local margin=4
local max=10
echo -e "$filenames" | awk -v width=$(tput cols) -v margin=$margin '
NR==1 {
for (i=0; i<margin; i++) {
line = line " "
}
line = line $0;
l = margin + margin + length($0);
next;
}
{
if (l+1+length($0) > width) {
print line;
line = ""
for (i=0; i<margin; i++) line=line " "
line = line $0 ;
l = margin + margin + length($0) ;
next;
}
{
l = l + length($0) + 1;
line = line " " $0;
}
}
END {
print line;
}'
Perhaps you're looking for /usr/bin/fold?
printf '%s ' * | fold -w 77 | sed -e 's/^/ /'
Replace the * with your list, of course; if your files are in an array (they should be; storing filenames in scalar variables is lossy), that'd be "${your_array[@]}".
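For example, with the names collected in an array:
files=( * )
printf '%s ' "${files[@]}" | fold -w 77 | sed -e 's/^/ /'
(fold -s would break at spaces rather than mid-name, if that matters for your data.)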
If you have your filenames in a variable, this will create 3 columns; you can change -3 to whatever number of columns you want:
echo "$var" | pr -3 -t
or if you need to get them from the filesystem:
find . -printf "%f\n" 2>/dev/null | pr -3 -t
From what you stated in the comments, I think this may be what you are looking for. The find command prints each file or directory name followed by a newline, and you can filter the names further by piping through grep or sed before pr. The pr command is for printing: -3 requests 3 columns and -t omits headers and trailers; adjust these to your preferences.
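A quick demonstration (pr fills the columns top to bottom; spacing approximate):
$ printf '%s\n' one two three four five six | pr -3 -t
one                     three                   five
two                     four                    six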

How to get specific data from block of data based on condition

I have a file like this:
[group]
enable = 0
name = green
test = more
[group]
name = blue
test = home
[group]
value = 48
name = orange
test = out
There may be one or more spaces/tabs between the label, the =, and the value.
The number of lines may vary in every block.
I'd like to get the name, but only if enable = 0 is not present in the block.
So output should be:
blue
orange
Here is what I have managed to create:
awk -v RS="group" '!/enable = 0/ {sub(/.*name[[:blank:]]+=[[:blank:]]+/,x);print $1}'
blue
orange
There are several faults with this:
I am not able to set RS to [group]: both RS="[group]" and RS="\[group\]" fail. It will also break if name or another label contains group.
I would prefer not to use an RS with multiple characters, since that is GNU awk only.
Does anyone have another suggestion? sed or awk preferred, and not a long chain of commands.
If you know that groups are always separated by empty lines, set RS to the empty string:
$ awk -v RS="" '!/enable = 0/ {sub(/.*name[[:blank:]]+=[[:blank:]]+/,x);print $1}'
blue
orange
@devnull explained in his answer that GNU awk also accepts regular expressions in RS, so you could only split at [group] if it is on its own line:
gawk -v RS='(^|\n)[[]group]($|\n)' '!/enable = 0/ {sub(/.*name[[:blank:]]+=[[:blank:]]+/,x);print $1}'
This makes sure we're not splitting at evil names like
[group]
enable = 0
name = [group]
name = evil
test = more
Your problem seems to be:
I am not able to set RS to [group], both this fails RS="[group]" and
RS="\[group\]".
Saying:
RS="[[]group[]]"
should yield the desired result.
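Plugged into the command from the question (NF skips the empty first record that precedes the first [group]):
$ awk -v RS="[[]group[]]" 'NF && !/enable = 0/ {sub(/.*name[[:blank:]]+=[[:blank:]]+/,""); print $1}' file
blue
orange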
In these situations where there are clearly name = value statements within a record, I like to first populate an array with those mappings, e.g.:
map["<name>"] = <value>
and then just use the names to reference the values I want. In this case:
$ awk -v RS= -F'\n' '
{
delete map
for (i=1;i<=NF;i++) {
split($i,tmp,/ *= */)
map[tmp[1]] = tmp[2]
}
}
map["enable"] !~ /^0$/ {
print map["name"]
}
' file
blue
orange
If your version of awk doesn't support deleting a whole array then change delete map to split("",map).
Compared to using REs and/or sub()s., etc., it makes the solution much more robust and extensible in case you want to compare and/or print the values of other fields in future.
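For example, also printing the test value later on becomes one more lookup instead of another regexp (illustrative):
map["enable"] !~ /^0$/ {
    print map["name"], map["test"]
}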
Since your records are separated by blank lines, you should consider putting awk in paragraph mode. If you must test for the [group] identifier, simply add code to handle that. Here's some example code that should fulfill your requirements. Run like:
awk -f script.awk file.txt
Contents of script.awk:
BEGIN {
RS=""
}
{
for (i=2; i<=NF; i+=3) {
if ($i == "enable" && $(i+2) == 0) {
f = 1
}
if ($i == "name") {
r = $(i+2)
}
}
}
!(f) && r {
print r
}
{
f = 0
r = ""
}
Results:
blue
orange
This might work for you (GNU sed):
sed -n '/\[group\]/{:a;$!{N;/\n$/!ba};/enable\s*=\s*0/!s/.*name\s*=\s*\(\S\+\).*/\1/p;d}' file
Read the [group] block into the pattern space then substitute out the colour if the enable variable is not set to 0.
sed -n '...' sets sed to run in silent mode: no output unless specified, i.e. by a p or P command
/\[group\]/{...} when we have a line which contains [group] do what is found inside the curly braces.
:a;$!{N;/\n$/!ba} to loop we need a place to branch to, and :a provides it. $ is the end-of-file address, so $! means "not at the end of file", and $!{...} runs what is inside the curly braces only when we are not at the end of file. N appends a newline and the next input line to the pattern space, and /\n$/!ba branches (b) back to a while the pattern space does not yet end with an empty line. So this collects all lines from a line that contains [group] to an empty line (or end of file).
/enable\s*=\s*0/!s/.*name\s*=\s*\(\S\+\).*/\1/p if the lines collected contain enable = 0 then do not substitute out the colour. Or to put it another way, if the lines collected so far do not contain enable = 0 do substitute out the colour.
If you don't want to use the record separator, you could use a dummy variable like this:
#!/usr/bin/awk -f
function endgroup() {
if (e == 1) {
print n
}
}
$1 == "name" {
n = $3
}
$1 == "enable" && $3 == 0 {
e = 0;
}
$0 == "[group]" {
endgroup();
e = 1;
}
END {
endgroup();
}
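Run against the file from the question:
$ awk -f script.awk file
blue
orange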
You could actually use Bash for this.
n=0
while read line; do
    if [[ $line == "enable = 0" ]]; then
        n=1
        continue
    fi
    if [ $n -eq 0 ] && [[ $line =~ name[[:space:]]+=[[:space:]]([a-z]+) ]]; then
        echo ${BASH_REMATCH[1]}
    fi
    n=0
done < file
This will only work however if enable = 0 is always only one line above the line with name.

Show different context on different grep keyword?

I know -A -B -C could be used to show context around the grep keyword.
My question is, how to show different context on different keyword?
For example, how do I show -A 5 for cat, -B 4 for dog, and -C 1 for monkey:
egrep -A3 "cat|dog|monkey" <file>
// this just shows 3 trailing lines of context for each keyword.
I don't think there's any way to do it with a single grep call, but you could run it through grep once for each pattern and concatenate the output:
var=$(grep -n -A 5 cat file)$'\n'$(grep -n -B 4 dog file)$'\n'$(grep -n -C 1 monkey file)
var=$(sort -un <(echo "$var"))
Now echo "$var" will produce the same output as you would have gotten from your single command, plus line numbers and context indicators (the : prefix indicates a line that matched the pattern exactly, and the - prefix indicates a line included because of the -A, -B and/or -C options).
The reason I included the line numbers is to preserve the order of the results you would have seen had you managed to do this in one statement. If you like them, great, but if not, you can use the following line to cut them out:
var=$(cut -d: -f2- <(echo "$var") | cut -d- -f2-)
This passes it through once to cut the exact-matching lines' prefixes, then again to cut the context matches' prefixes.
Pretty? No. But it works.
I'm afraid grep won't do that. You'll have to use a different tool. Perhaps write your own program.
Something like this would do it:
awk '
BEGIN{ ARGV[ARGC++] = ARGV[1] }
function prtB(nr) { for (i=FNR-nr; i<FNR; i++) print a[i] }
function prtA(nr) { for (i=FNR+1; i<=FNR+nr; i++) print a[i] }
NR==FNR{ a[NR]=$0; next }
/cat/ { print; prtA(5) }
/dog/ { prtB(4); print }
/monkey/ { prtB(1); print; prtA(1) }
' file
check the math on the loops in the functions. You didn't say how you'd want to handle lines that contain monkey AND dog, for example.
EDIT: here's an untested solution that would print the maximum context around any match and let you specify the contexts on the command line and won't use as much memory as the above cheap and cheerful solution:
awk -v cxts="cat:0:5\ndog:4:0\nmonkey:1:1" '
BEGIN{
ARGV[ARGC++] = ARGV[1]
numCxts = split(cxts,cxtsA,RS)
for (i=1;i<=numCxts;i++) {
regex = cxtsA[i]
n = split(regex,rangeA,/:/)
sub(/:[^:]+:[^:]+$/,"",regex)
endA[regex] = rangeA[n]
startA[regex] = rangeA[n-1]
regexA[regex]
}
}
NR==FNR{
for (regex in regexA) {
if ($0 ~ regex) {
start = NR - startA[regex]
end = NR + endA[regex]
for (i=start; i<=end; i++) {
prt[i]
}
}
}
next
}
FNR in prt
' file
Separate the searched-for patterns in the cxts variable with whatever your RS value is (newline by default).
