How to get the value after a particular string is found in a text file, using shell scripting - bash

I have a text file and the contents of the file are as follows:
#SERVICE INFO:
srv id [8503]
serv rqst id xxxxxx
serv rqst len [17]
serv status [C]
#SERVICE INFO:
srv id [8501]
serv rqst id xxxxxx
serv rqst len [17]
serv status [C]
#SERVICE INFO:
srv id [8500]
serv rqst id xxxxxx
serv rqst len [17]
serv status [C]
I want to read the srv id, find its corresponding status, and use it for further validation.
For example:
for srv id 8500, serv status is C
I have tried the below awk statement:
awk '{for (I=1;I<=NF;I++) if ($I == "service id") {print $(I+1)};}' $testfile
It gives me a blank output.
Here testfile is my sample text file.
Any input is appreciated.

awk -F '[][]' '$1 ~ /srv id/ {id = $2} $1 ~ /serv status/ {print id, $2}' file
That uses [ or ] as the field separator. If the first field contains "srv id", remember the id. If the first field contains "serv status", print the id and the status value.
Output:
8503 C
8501 C
8500 C
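To look up the status for one particular id, the same idea can be extended by passing the id in as an awk variable (a sketch; `want` and `testfile` are placeholder names, not from the question):

```shell
# Sketch: print the status for a single id.
# "want" and "testfile" are placeholder names, not from the question.
awk -F '[][]' -v want=8501 '
  $1 ~ /srv id/                    { id = $2 }    # remember the most recent id
  $1 ~ /serv status/ && id == want { print $2 }   # print status only for that id
' testfile
```

Given the sample data, this prints C.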

If you don't mind Perl:
perl -00 -ne 'm{srv id.+?(\d+).+status.+\[(\w)\]}s and print "$1 $2\n"' file
This yields:
8503 C
8501 C
8500 C
The -00 switch tells Perl to read the file in "paragraph mode" where a record separator is one or more blank lines.
We match a sequence of characters that begins with "srv id" and ends with the token "status" followed by its value. A dot matches any character; + signifies one or more; and +? denotes non-greedy matching. \d matches a digit and \w a word character. Opening and closing square brackets must be escaped to mean themselves. To have a dot also match a newline character, we add the s modifier at the end of the match pattern m{...}s.
Should you want to look up an ID and print its status, simply pipe the output to grep:
perl ... | grep 8501


Search field and display next data to it

Is there an easy way to search the following data for a specific field based on the field ##id:?
This is the sample data file called sample
##id: 123 ##name: John Doe ##age: 18 ##Gender: Male
##id: 345 ##name: Sarah Benson ##age: 20 ##Gender: Female
For example, if I want to search for ID 123 and get his gender, I would do this:
Basically, this is the prototype that I want:
# search.sh
#!/bin/bash
# usage: search.sh <id> <field>
# eg: search 123 age
search="$1"
field="$2"
grep "^##id: ${search}" sample | # FILTER <FIELD>
So when I search an ID 123 like below:
search.sh 123 gender
The output would be
Male
Up until now, based on the code above, I am only able to grep one line based on ID, and I'm not sure what the best or fastest (and least complicated) method is to get the value that follows a specified field (e.g. age).
1st solution: With your shown samples, please try following bash script. This considers that you want to match exact string match.
cat script.bash
#!/bin/bash
search="$1"
field="$2"
awk -v search="$search" -v field="$field" '
match($0, "##id:[[:space:]]*" search) {
  value = ""
  match($0, "##" field ":[[:space:]]*[^#]+")
  value = substr($0, RSTART, RLENGTH)
  sub(/.*: +/, "", value)
  print value
}
' Input_file
2nd solution: In case you want to search strings(values) irrespective of their cases(lower/upper case) in each line then try following code.
cat script.bash
#!/bin/bash
search="$1"
field="$2"
awk -v search="$search" -v field="$field" '
match(tolower($0), "##id:[[:space:]]*" tolower(search)) {
  value = ""
  match(tolower($0), "##" tolower(field) ":[[:space:]]*[^#]+")
  value = substr($0, RSTART, RLENGTH)
  sub(/.*: +/, "", value)
  print value
}
' Input_file
Explanation: In brief, the Bash script expects 2 parameters when it is run and passes them as values to the awk program, which uses the match function to match the id in each line and print the value of the passed field (e.g. name, Gender, etc.).
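Assuming the question's sample file, the 1st solution can be exercised like this (note that this version matches the field name case-sensitively, so Gender must be spelled with a capital G):

```shell
# Exercise the 1st solution against the question's sample data.
# Note: the field name is matched case-sensitively here, hence "Gender".
printf '%s\n' '##id: 123 ##name: John Doe ##age: 18 ##Gender: Male' \
              '##id: 345 ##name: Sarah Benson ##age: 20 ##Gender: Female' > sample
search="123"
field="Gender"
awk -v search="$search" -v field="$field" '
match($0, "##id:[[:space:]]*" search) {
  value = ""
  match($0, "##" field ":[[:space:]]*[^#]+")
  value = substr($0, RSTART, RLENGTH)
  sub(/.*: +/, "", value)
  print value
}
' sample
```

This prints Male.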
Since you want to extract a part of each line found, different from the part you are matching against, sed or awk would be a better tool than grep. You could pipe the output of grep into one of the others, but that's wasteful because both sed and awk can do the line selection directly. I would do something like this:
#!/bin/bash
search="$1"
field="$2"
sed -n "/^##id: ${search}"'\>/ { s/.*##'"${field}"': *//i; s/ *##.*//; p }' sample
Explanation:
sed is instructed to read file sample, which it will do line by line.
The -n option tells sed to suppress its usual behavior of automatically outputting its pattern space at the end of each cycle, which is an easy way to filter out lines that don't match the search criterion.
The sed expression starts with an address, which in this case is a pattern matching lines by id, according to the script's first argument. It is much like your grep pattern, but I append \>, which matches a word boundary. That way, searches for id 123 will not also match id 1234.
The rest of the sed expression edits out the everything in the line except the value of the requested field, with the field name being matched case-insensitively, and prints the result. The editing is accomplished by the two s/// commands, and the p command is of course for "print". These are all enclosed in curly braces ({}) and separated by semicolons (;) to form a single compound associated with the given address.
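A quick check of this script on the question's sample data (GNU sed is assumed, since \> and the s///i flag are GNU extensions):

```shell
# Quick check of the sed approach (GNU sed assumed for \> and s///i).
printf '##id: 123 ##name: John Doe ##age: 18 ##Gender: Male\n' > sample
search="123"
field="gender"    # matched case-insensitively thanks to the i flag
sed -n "/^##id: ${search}"'\>/ { s/.*##'"${field}"': *//i; s/ *##.*//; p }' sample
```

This prints Male.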
Assumptions:
'label' fields have format ##<string>:
need to handle case-insensitive searches
'label' fields could be located anywhere in the line (ie, there is no set ordering of 'label' fields)
the 1st input search parameter is always a value associated with the ##id: label
the 2nd input search parameter is to be matched as a whole word (ie, no partial label matching; nam will not match against ##name:)
if there are multiple 'label' fields that match the 2nd input search parameter, we print the value associated with the 1st match found in the line
One awk idea:
awk -v search="${search}" -v field="${field}" '
BEGIN { field = tolower(field) }
{
  n = split($0, arr, "##|:")          # split current line on dual delimiters "##" and ":", placing fields into array arr[]
  found_search = 0
  found_field = 0
  for (i = 2; i <= n; i = i + 2) {    # loop through list of label fields
    label = tolower(arr[i])
    value = arr[i+1]
    sub(/^[[:space:]]+/, "", value)   # strip leading white space
    sub(/[[:space:]]+$/, "", value)   # strip trailing white space
    if ( label == "id" && value == search )
      found_search = 1
    if ( label == field && ! found_field )
      found_field = value
  }
  if ( found_search && found_field )
    print found_field
}
' sample
Sample input:
$ cat sample
##id: 123 ##name: John Doe ##age: 18 ##Gender: Male
##id: 345 ##name: Sarah Benson ##age: 20 ##Gender: Female
##name: Archibald P. Granite, III, Ph.D, M.D. ##age: 20 ##Gender: not specified ##id: 567
Test runs:
search=123 field=gender => Male
search=123 field=ID => 123
search=123 field=Age => 18
search=345 field=name => Sarah Benson
search=567 field=name => Archibald P. Granite, III, Ph.D, M.D.
search=567 field=GENDER => not specified
search=999 field=age => <no output>
For the given data format, you could set the field separator to optional spaces followed by ## to prevent trailing spaces for the printed field.
Then create a key value mapping per row (making the keys and the field to search for lowercase) and search for the key, which will be independent of the order in the string.
If the key is present, then print the value.
#!/bin/bash
search="$1"
field="$2"
awk -v search="${search}" -v field="${field}" '
BEGIN { FS = "[[:blank:]]*##" }                # set the field separator to optional spaces and ##
{
  for (i = 1; i <= NF; i++) {                  # loop over all the fields
    split($i, a, /[[:blank:]]*:[[:blank:]]*/)  # split the field on : with optional surrounding spaces
    kv[tolower(a[1])] = a[2]                   # create a key-value array from the split values
  }
  val = kv[tolower(field)]                     # get the value from kv based on the lowercase key
  if (kv["id"] == search && val) print val     # if there is a matching id and a value, print the value
}' file
And then run
./search.sh 123 gender
Output
Male

Sed conditional match and execute command with offset

I am looking for a bash command that performs a conditional replacement with an offset. The existing posts I've found cover conditional replacement without an offset or with a fixed offset.
Task: If uid contains 8964, then insert the line FORBIDDEN before DOB.
Each TXT file below represents one user, and it contains (in the following order)
some property(ies)
unique uid
some quality(ies)
unique DOB
a random lorem ipsum
I hope I can transform the following files
# file1.txt (uid doesn't match 8964)
admin: false
uid: 123456
happy
movie
DOB: 6543-02-10
lorem ipsum
seo varis lireccuni paccem noba sako
# file2.txt (uid matches 8964)
citizen: true
hasSEAcct: true
uid: 289641
joyful hearty
final debug Juno XYus
magazine
DOB: 1234-05-06
saadi torem lopez dupont
into
# file1.txt (uid doesn't match 8964)
admin: false
uid: 123456
happy
movie
DOB: 6543-02-10
lorem ipsum
seo varis lireccuni paccem noba sako
# file2.txt (uid matches 8964)
citizen: true
hasSEAcct: true
uid: 289641
joyful hearty
final debug Juno XYus
magazine
FORBIDDEN
DOB: 1234-05-06
saadi torem lopez dupont
My try:
If uid contains 8964, then do a 2nd match with DOB, and insert FORBIDDEN above DOB.
sed '/^uid: [0-9]*8964[0-9]*$/{n;/^DOB: .*$/{iFORBIDDEN}}' file*.txt
This gives me an unmatched { error.
sed: -e expression #1, char 0: unmatched `{'
I know that sed '/PAT/{n;p}' will execute {n;p} if PAT is matched, but it seems impossible to put /PAT2/{iTEXT} inside /PAT/{ }.
How can I perform such FORBIDDEN insertion?
$ awk '
/^uid/ && /8964/ {f=1} #1
/^DOB/ && f {print "FORBIDDEN"; f=0} #2
1 #3
' file
If a line starting with "uid" matches "8964", set flag
If a line starts with "DOB" and flag is set, print string and unset flag
print every line
$ awk -v RS='' '/uid: [0-9]*8964/{sub(/DOB/, "FORBIDDEN\nDOB")} 1' file
Alternatively, treat every block separated by a blank line as a single record, then sub in "FORBIDDEN\nDOB" if there's a match. I think the first one's better practice. As a very general rule, once you start thinking in terms of fields/records, it's time for awk/perl.
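The flag-based version (the first one) can be checked end to end; assuming a file holding file2-like contents:

```shell
# End-to-end check of the flag-based approach on file2-like input.
printf '%s\n' 'citizen: true' 'uid: 289641' 'magazine' \
              'DOB: 1234-05-06' 'saadi torem lopez dupont' > file
awk '
  /^uid/ && /8964/ {f=1}                 # 1: uid line matches 8964 -> set flag
  /^DOB/ && f {print "FORBIDDEN"; f=0}   # 2: flag set -> print FORBIDDEN before DOB
  1                                      # 3: print every line
' file
```

FORBIDDEN appears immediately before the DOB line; all other lines pass through unchanged.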
In my opinion, this is a good use-case for sed.
Here is a GNU sed solution with some explanation:
# script.sed
/^uid:.*8964/,/DOB/ { # Search only inside this range, if it exists.
/DOB/i FORBIDDEN # Insert FORBIDDEN before the line matching /DOB/.
}
Testing:
▶ gsed -f script.sed FILE2
citizen: true
hasSEAcct: true
uid: 289641
joyful hearty
final debug Juno XYus
magazine
FORBIDDEN
DOB: 1234-05-06
saadi torem lopez dupont
▶ gsed -f script.sed FILE1
admin: false
uid: 123456
happy
movie
DOB: 6543-02-10
lorem ipsum
seo varis lireccuni paccem noba sako
Or on one line:
▶ gsed -e '/^uid:.*8964/,/DOB/{/DOB/i FORBIDDEN' -e '}' FILE*
Tried on GNU sed:
sed -Ee '/^uid:\s*\w*8964\w*$/{n;/^DOB:/iFORBIDDEN' -e '}' file*.txt

grep for Error and print all the lines containing 2 strings above and below Error

saaa vcahJJ HKak vk
Import xxx xXXXXX xxxx
aaaa aaaa aaaa ffffff
hhhhhh hhhhhh hhh hhh hhhhhh
Error reading readStatus api
aaa hhhh aaa aaaaa
gggggggg ggggg xxxxxxxxxx
uuuu hhhhhhhh fffffffff
query run ends
qidIdih II v iQE Iqe
I want to find the 'Error' string in the file containing above logs and then print all the info available between 2 strings 'Import' and 'ends'.
How can I do this using grep/sed
I tried this, but didn't get much.
Note: I dont know how many lines will be before and after. It may vary from above sample I have provided
How about:
$ awk 'BEGIN{RS=ORS="ends\n"} /Error/' file
RS is the input record separator, which needs to be ends. ORS gets the same value for output purposes. Also, your example had /^Error/, but Error does not start the record (^).
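For example, on an abbreviated version of the log (multi-character RS is a gawk/mawk extension, not POSIX):

```shell
# Demo of RS="ends\n" record splitting (multi-char RS needs gawk or mawk).
printf '%s\n' 'Import xxx xXXXXX xxxx' 'Error reading readStatus api' \
              'query run ends' 'qidIdih II v iQE Iqe' > logfile
awk 'BEGIN{RS=ORS="ends\n"} /Error/' logfile
```

Only the record containing Error is printed, with "ends" re-appended by ORS.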
Grep's -A 1 option will give you one line after; -B 1 will give you one line before; and -C 1 combines both to give you one line both before and after.
grep -C 1 "Error" <logfile>
As per your requirement, you can use:
sed -n '/Import/,/ends/p' filename
Here you are:
out=$( sed -n '/^Import/,/ends$/p' file )
echo "$out" | grep Error >/dev/null && echo "$out"
This will capture the text between "Import" and "ends" and print it only if the extracted text contains "Error".
You can try this sed
sed '/^Import/!d;:A;N;/\nImport/{h;s/.*\n//;bA};/ends$/!bA;h;s/\nError//;tB;d;:B;x' infile
Explanation:
sed '
/^Import/!d               # delete lines until one starts with Import
:A
N                         # append the next line
/\nImport/{h;s/.*\n//;bA} # if the newly appended line starts with Import,
                          # keep only that line and return to A
/ends$/!bA                # if the last line does not end with ends, return to A
h                         # keep all the lines in the hold space
s/\nError//               # look for a line which starts with Error
tB                        # if found, jump to B
d                         # not found: delete all and start over
:B
x                         # restore all the lines and print
' infile

AWK record separator set to empty line not working

I am trying to write a simple AWK script which uses empty lines as the record separator. I reproduced on my PC the example from the GNU AWK manual Multiple-Line Records. I copy the code below:
# addrs.awk --- simple mailing list program
# Records are separated by blank lines.
# Each line is one field.
BEGIN { RS = "" ; FS = "\n" }
{
print "Name is:", $1
print "Address is:", $2
print "City and State are:", $3
print ""
}
Input is:
Jane Doe
123 Main Street
Anywhere, SE 12345-6789
John Smith
456 Tree-lined Avenue
Smallville, MW 98765-4321
Files are created on UNIX system.
Required output is:
Name is: Jane Doe
Address is: 123 Main Street
City and State are: Anywhere, SE 12345-6789
Name is: John Smith
Address is: 456 Tree-lined Avenue
City and State are: Smallville, MW 98765-4321
Instead, I get a result which is different from the expected one. What I get is:
Name is: Jane Doe
Address is: 123 Main Street
City and State are: Anywhere, SE 12345-6789
Does anybody know why I am getting the wrong result? AWK finds only 1 record instead of 2; do you know why?
This is to confirm that:
(1) the given program works properly using awk version 20070501, gawk, or mawk, provided the input file has bare newline ('\n') line endings (as opposed to CR LF).
(2) if the input is a DOS text file, then the result is as the OP stated.
Also, if the input file is a DOS text file, an alternative to dos2unix is to use tr as illustrated here:
$ tr -d '\r' < input.dos.txt | awk ....
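For example, the following shows CRLF line endings making RS="" see a single record, and tr fixing it (a sketch with two abbreviated records):

```shell
# CRLF endings make the "blank" separator line contain \r, so RS="" sees one record.
# Stripping \r with tr restores the expected paragraph splitting.
printf 'Jane Doe\r\n123 Main Street\r\n\r\nJohn Smith\r\n456 Tree-lined Avenue\r\n' > input.dos.txt
tr -d '\r' < input.dos.txt | awk 'BEGIN { RS = "" ; FS = "\n" } { print "Name is:", $1 }'
```

With the carriage returns removed, awk sees two records and prints a name for each.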

List all occurences of a particular string or number in the file : UNIX

I have multiple files containing phone numbers in the format XXX XXX XXXX, +XX XXXXXX XXXX, or maybe XXXXXXXXXX. Can I list all the phone numbers containing the particular number '845' against each file name?
Currently I am using:-
egrep -H 845 *
This might do what you want (uses GNU awk for the word-boundary operators \< and \>):
$ cat file
now is the 123 845 1234 winter of +01 238456 5432
our 1845234567 discontent.
$ cat tst.awk
{
while ( match($0,/(\<(([0-9]{3} ){2}[0-9]{4}|[0-9]{10})|[+][0-9]{2} [0-9]{6} [0-9]{4})\>/) ) {
tgt = substr($0,RSTART,RLENGTH)
$0 = substr($0,RSTART+RLENGTH)
if ( tgt ~ /845/ ) {
print tgt
}
}
}
$ awk -f tst.awk file
123 845 1234
+01 238456 5432
1845234567
If not, edit your question to provide some sample input and expected output.
You can try this:
awk '/845/ {print FILENAME,FNR,$0}' *
Or this:
grep -rF '845' *
This does not give you hits only on phone numbers, but all lines containing these numbers.
You do not say whether 845 is in the middle of a number, at the start, or at the end.
To list the surrounding numbers around the match but nothing else,
grep -Ho '[0-9]*845[0-9]*' files
If your phone numbers can contain spaces and punctuation, maybe add those to the character classes; but be careful so as not to make the regex match two adjacent phone numbers in one go. (If they are always separated by text not in the character class, you're fine.)
(There is nothing -E in this particular regex so I'm not using egrep.)
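A quick check (nums is a placeholder filename):

```shell
# -o prints each match on its own line; -H prefixes the filename.
printf '%s\n' 'now is the 123 845 1234 winter of +01 238456 5432' \
              'our 1845234567 discontent.' > nums
grep -Ho '[0-9]*845[0-9]*' nums
```

Each run of digits containing 845 is printed, one per line, with the filename prefixed.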
