Looking for exact match using grep - bash

Suppose that I have a file like this:
tst.txt
fName1 lName1-a 222
fname1 lName1-b 22
fName1 lName1 2
And I want to get the 3rd column only for "fName1 lName1", using this command:
var=`grep -i -w "fName1 lName1" tst.txt`
However, this returns every line that starts with "fName1 lName1". How can I look for the exact match?

Here you go:
#!/bin/bash
var=$(grep -Po '(?<=fName1 lName1 ).+' tst.txt)
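echo "$var"
The trick is the -o option, which makes grep print only the part of the line that matches the pattern. The -P option (a GNU grep extension) tells grep to interpret the pattern as a Perl-compatible regular expression. That is what allows the lookbehind (?<=fName1 lName1 ), which requires that text to precede the match without including it in the output.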

var=$(grep "fName1 lName1 " tst.txt |cut -d ' ' -f 3)

You can try this method:
grep -i -E "^fName1 lName1\s" tst.txt | cut -d ' ' -f3
This only works if the line starts with fName1 and there is a single space after lName1.
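The answers above all rely on a trailing space to stop "lName1" from matching "lName1-a". A minimal sketch of another way to get an exact match, assuming the columns are always separated by whitespace, is to compare the first two fields with awk instead of matching text. The temporary file is only there to make the example self-contained:

```shell
# Recreate the sample file from the question in a temporary location.
tmp=$(mktemp)
printf '%s\n' 'fName1 lName1-a 222' 'fname1 lName1-b 22' 'fName1 lName1 2' > "$tmp"

# Compare the first two whitespace-separated fields exactly (case-sensitive),
# then print the third field of the matching line.
var=$(awk '$1 == "fName1" && $2 == "lName1" { print $3 }' "$tmp")
echo "$var"

rm -f "$tmp"
```

Unlike grep -i, this comparison is case-sensitive, so "fname1" is not treated as a match.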


Bash regex: get value in conf file preceded by string with dot

I have to get my db credentials from this configuration file:
# Database settings
Aisse.LocalHost=localhost
Aisse.LocalDataBase=mydb
Aisse.LocalPort=5432
Aisse.LocalUser=myuser
Aisse.LocalPasswd=mypwd
# My other app settings
Aisse.NumDir=../../data/Num
Aisse.NumMobil=3000
# Log settings
#Aisse.Trace_AppliTpv=blabla1.tra
#Aisse.Trace_AppliCmp=blabla2.tra
#Aisse.Trace_AppliClt=blabla3.tra
#Aisse.Trace_LocalDataBase=blabla4.tra
In particular, I want to get the value mydb from line
Aisse.LocalDataBase=mydb
So far, I have developed this
mydbname=$(echo "$my_conf_file.conf" | grep "LocalDataBase=" | sed "s/LocalDataBase=//g" )
that returns
mydb #Aisse.Trace_blabla4.tra
that would be ok if it did not return also the comment string.
Then I also tried
mydbname=$(echo "$my_conf_file.conf" | grep "Aisse.LocalDataBase=" | sed "s/LocalDataBase=//g" )
which returns an empty string.
How can I get only the value that is preceded by the string "Aisse.LocalDataBase=" ?
Using sed
$ mydbname=$(sed -n 's/Aisse\.LocalDataBase=//p' input_file)
$ echo "$mydbname"
mydb
I'm afraid your question is incomplete.
You want the line containing "LocalDataBase", but not the commented-out one. Let's start with that.
A line which contains "LocalDataBase":
grep "LocalDataBase" conf.conf.txt
A line which contains "LocalDataBase" but does not start with a hash:
grep "LocalDataBase" conf.conf.txt | grep -v "^ *#"
What does grep -v "^ *#" do?
It means: don't show (-v) the lines matching:
^ : the start of the line
 * : zero or more space characters (a space followed by *)
# : a hash character
Once you have your line, you need to work with it:
You only need the part after the equals sign, so let's use that sign as a delimiter and show the second column:
cut -d '=' -f 2
All together:
grep "LocalDataBase" conf.conf.txt | grep -v "^ *#" | cut -d '=' -f 2
Are we there yet?
No, because somebody may have put a comment after the entry, something like:
LocalDataBase=mydb #some information
To handle that, you need to cut the comment off too, which you can do in a similar way as before: this time use the hash character as the delimiter and show the first column:
grep "LocalDataBase" conf.conf.txt | grep -v "^ *#" | cut -d '=' -f 2 | cut -d '#' -f 1
Have fun.
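The steps above can be wrapped into a small function. This is only a sketch: the name get_conf_value and its two arguments are made up for illustration, and the final sed (which trims the whitespace left in front of a trailing comment) is an addition to the pipeline described above. It also assumes the value itself contains neither "=" nor "#".

```shell
# Hypothetical helper: print the value stored under an exact key in a key=value file.
get_conf_value() {
    key=$1
    file=$2
    # 1. drop commented-out lines
    # 2. keep lines containing the literal text "key="
    # 3. take the text after the first "="
    # 4. drop a trailing "#comment", if any
    # 5. trim the whitespace that was in front of that comment
    grep -v '^ *#' "$file" |
        grep -F "$key=" |
        cut -d '=' -f 2 |
        cut -d '#' -f 1 |
        sed 's/[[:space:]]*$//'
}

# Small demo file with a trailing comment and a commented-out look-alike key.
conf=$(mktemp)
printf '%s\n' \
    '# Database settings' \
    'Aisse.LocalDataBase=mydb #some information' \
    '#Aisse.Trace_LocalDataBase=blabla4.tra' > "$conf"

result=$(get_conf_value 'Aisse.LocalDataBase' "$conf")
echo "$result"
rm -f "$conf"
```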
You may use this sed:
mydbname=$(sed -n 's/^[^#][^=]*LocalDataBase=//p' file)
echo "$mydbname"
mydb
RegEx Details:
^: Start
[^#]: Matches any character other than #
[^=]*: Matches 0 or more of any character that is not =
LocalDataBase=: Matches text LocalDataBase=
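To see how the [^#] at the start filters out the commented line, here is a small check on two lines taken from the question's file. The variable name conf is just for this illustration:

```shell
# One real setting and one commented-out line that also contains "LocalDataBase=".
conf='Aisse.LocalDataBase=mydb
#Aisse.Trace_LocalDataBase=blabla4.tra'

# The second line starts with "#", so [^#] cannot match its first character,
# no substitution happens, and -n suppresses the line entirely.
mydbname=$(printf '%s\n' "$conf" | sed -n 's/^[^#][^=]*LocalDataBase=//p')
echo "$mydbname"
```

Note that a comment with leading whitespace (for example " #Aisse...") would still slip through, because the first character is then a space rather than "#".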
You can use
mydbname=$(sed -n 's/^Aisse\.LocalDataBase=\(.*\)/\1/p' file)
If there can be leading whitespace you can add [[:blank:]]* after ^:
mydbname=$(sed -n 's/^[[:blank:]]*Aisse\.LocalDataBase=\(.*\)/\1/p' file)
See this online demo:
#!/bin/bash
s='# Database settings
Aisse.LocalHost=localhost
Aisse.LocalDataBase=mydb
Aisse.LocalPort=5432
Aisse.LocalUser=myuser
Aisse.LocalPasswd=mypwd
# My other app settings
Aisse.NumDir=../../data/Num
Aisse.NumMobil=3000
# Log settings
#Aisse.Trace_AppliTpv=blabla1.tra
#Aisse.Trace_AppliCmp=blabla2.tra
#Aisse.Trace_AppliClt=blabla3.tra
#Aisse.Trace_LocalDataBase=blabla4.tra'
sed -n 's/^Aisse\.LocalDataBase=\(.*\)/\1/p' <<< "$s"
Output:
mydb
Details:
-n - suppresses default line output in sed
^[[:blank:]]*Aisse\.LocalDataBase=\(.*\) - a regex that matches the start of a string (^), then zero or more whitespace characters ([[:blank:]]*), then an Aisse.LocalDataBase= string, then captures the rest of the line into Group 1
\1 - replaces the whole match with the value of Group 1
p - prints the result of the successful substitution.

Convert data from a simple JSON format to a DSV format

I have a file in Unix, with data sample like the following:
{"ID":"123", "Region":"Asia", "Location":"India"}
{"ID":"234", "Region":"APAC", "Location":"Australia"}
{"ID":"345", "Region":"Americas", "Location":"Mexio"}
{"ID":"456", "Region":"Americas", "Location":"Canada"}
{"ID":"567", "Region":"APAC", "Location":"Japan"}
The desired output is
ID|Region|Location
123|Asia|India
234|APAC|Australia
345|Americas|Mexico
456|Americas|Canada
567|APAC|Japan
I tried with a few sed commands. I could remove the following: '{', '}', ' " ', ':'
There are 2 issues with the output file
All rows from input appear in single line in the output.
Adding the pipe ('|') as delimiter.
Any pointers are highly appreciated.
I recommend the tool jq (http://stedolan.github.io/jq/); jq is a lightweight and flexible command-line JSON processor.
jq -r '"\(.ID)|\(.Region)|\(.Location)"' < infile
123|Asia|India
234|APAC|Australia
345|Americas|Mexio
456|Americas|Canada
567|APAC|Japan
Explanation
-r is --raw-output
Through awk,
awk -F'"' -v OFS="|" 'BEGIN{print "ID|Region|Location"}{print $4,$8,$12}' file
Example:
$ cat file
{"ID":"123", "Region":"Asia", "Location":"India"}
{"ID":"234", "Region":"APAC", "Location":"Australia"}
{"ID":"345", "Region":"Americas", "Location":"Mexio"}
{"ID":"456", "Region":"Americas", "Location":"Canada"}
{"ID":"567", "Region":"APAC", "Location":"Japan"}
$ awk -F'"' -v OFS="|" 'BEGIN{print "ID|Region|Location"}{print $4,$8,$12}' file
ID|Region|Location
123|Asia|India
234|APAC|Australia
345|Americas|Mexio
456|Americas|Canada
567|APAC|Japan
Explanation:
-F'"' Sets " as Field Separator value.
OFS="|" Sets | as Output Field Separator value.
At first, awk executes the BEGIN block once, before reading any input; here it prints the header line. With " as the separator, the three values land in fields $4, $8 and $12.
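To check why the values end up in fields 4, 8 and 12, it can help to print every field that awk sees for one input line. A quick sketch (the variable names line and row are only for this illustration):

```shell
line='{"ID":"123", "Region":"Asia", "Location":"India"}'

# Print each field with its index; splitting on " puts keys in the even
# fields 2, 6, 10 and values in fields 4, 8, 12.
printf '%s\n' "$line" | awk -F'"' '{ for (i = 1; i <= NF; i++) print i ": " $i }'

# The same split, keeping only the value fields joined by "|".
row=$(printf '%s\n' "$line" | awk -F'"' -v OFS='|' '{ print $4, $8, $12 }')
echo "$row"
```

This positional approach depends on every line having the same keys in the same order, with no escaped quotes inside the values.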
This sed one-liner does what you want. It's capturing the field values using parenthesized expressions, and then putting them into the output using \1, \2, and \3.
s/^{"ID":"\([^"]*\)", "Region":"\([^"]*\)", "Location":"\([^"]*\)"}$/\1|\2|\3/
Invoke it like:
$ sed -f one-liner.sed input.txt
Or you can invoke it within a Bash script, producing the header:
echo 'ID|Region|Location'
sed -e 's/^{"ID":"\([^"]*\)", "Region":"\([^"]*\)", "Location":"\([^"]*\)"}$/\1|\2|\3/' "$input"
It is a JSON file so it is best to use a JSON parser. Here is a perl implementation of it.
#!/usr/bin/perl
use strict;
use warnings;
use JSON;
open my $fh, '<', 'path/to/your/file' or die "Cannot open file: $!";
# keys of your structure
my @key = qw(ID Region Location);
print join("|", @key), "\n";
# iterate over your file, decode each line and print the values in key order
while (my $json = <$fh>) {
    my $text = decode_json($json);
    print join("|", map { $text->{$_} } @key), "\n";
}
close $fh;
Output:
ID|Region|Location
123|Asia|India
234|APAC|Australia
345|Americas|Mexio
456|Americas|Canada
567|APAC|Japan
Using sed as follows
Command line
echo "my_string" |
sed -e 's#[,:"{}]##g' -e 's#ID##g' -e "s#Region##g" -e 's#Location##g' \
-e '1 s#^.*$#ID Region Location\n&#' -e 's# #|#g'
or
sed -e 's#[,:"{}]##g' -e 's#ID##g' -e "s#Region##g" -e 's#Location##g' \
-e '1 s#^.*$#ID Region Location\n&#' -e 's# #|#g' my_file
I tried this in a terminal as follows:
echo '{"ID":"123", "Region":"Asia", "Location":"India"}
{"ID":"234", "Region":"APAC", "Location":"Australia"}
{"ID":"345", "Region":"Americas", "Location":"Mexio"}
{"ID":"456", "Region":"Americas", "Location":"Canada"}
{"ID":"567", "Region":"APAC", "Location":"Japan"}' |
sed -e 's#[,:"{}]##g' -e 's#ID##g' -e "s#Region##g" -e 's#Location##g' \
-e '1 s#^.*$#ID Region Location\n&#' -e 's# #|#g'
Output
ID|Region|Location
123|Asia|India
234|APAC|Australia
345|Americas|Mexio
456|Americas|Canada
567|APAC|Japan
Many thanks for your responses; the pointers and solutions helped a lot.
For some mysterious reason, I couldn't get any of the sed commands to work, so I devised my own solution. Although it's not elegant, it worked.
Here is the script I prepared which resolved the issue.
#!/bin/bash
# source file path.
infile=/home/exfile.txt
# remove these temp files if they already exist.
rm -f ./efile.txt ./xfile.txt ./yfile.txt ./zfile.txt
# remove the curly braces from the input file.
cut -d "{" -f2 "$infile" | cut -d "}" -f1 >> ./efile.txt
# setting input file name to different value.
infile=./efile.txt
# remove double quotes from the file.
while IFS= read -r line
do
echo $line | sed 's/\"//g' >> ./xfile.txt
done < "$infile"
# creating another temp file.
infile2=./xfile.txt
# remove colon from file.
while IFS= read -r line
do
echo $line | sed 's/\:/,/g' >> ./yfile.txt
done < "$infile2"
# set input file path to new temp file.
infile3=yfile.txt
# initialize variables to hold header column values.
t1=0
t3=0
t5=0
# read each of the line to extract header row. Exit loop after reading 1st row.
once=1
while IFS=',' read -r f1 f2 f3 f4 f5 f6
do
echo "$f1 $f2 $f3 $f4 $f5 $f6"
t1=$f1
t3=$f3
t5=$f5
if [ "$once" -eq 1 ]; then
break
fi
done < "$infile3"
# Read each of the line from input file. Write only the value to another output file.
while IFS=',' read -r f1 f2 f3 f4 f5 f6
do
echo "$f2|$f4|$f6" >> ./zfile.txt
done < "$infile3"
# insert the header column row into the file generated in the step above.
frstline="$t1|$t3|$t5"
sed -i '1i ID|Region|Location' ./zfile.txt
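For comparison, the same steps (strip braces and quotes, drop the keys, join the values with "|", add a header) can be chained in one pipeline so that no intermediate files are needed. This is a sketch rather than a drop-in replacement: it assumes every line has the three keys in the same order, and the mktemp file only stands in for the real input.

```shell
# Stand-in for the real input file.
infile=$(mktemp)
printf '%s\n' \
    '{"ID":"123", "Region":"Asia", "Location":"India"}' \
    '{"ID":"234", "Region":"APAC", "Location":"Australia"}' > "$infile"

out=$(
    # Header row first.
    echo 'ID|Region|Location'
    # Strip braces and quotes, split on ", ", then remove each "key:" prefix.
    tr -d '{}"' < "$infile" |
        awk -F', *' -v OFS='|' '{ for (i = 1; i <= NF; i++) sub(/^[^:]*:/, "", $i); print }'
)
echo "$out"

rm -f "$infile"
```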

How to delete the word combination "Name Server:" (without quotes) but keep 'Name Server:someletters/digits' in sed

I have the following lines:
Name Server:NS92.WORLDNIC.COM(or some other value)
Name Server:
Name Server:
Name Server:
Please see the screenshot for better understanding: http://imgur.com/q6Ir4lo
How do I get rid of the 'Name Server:' line but keep the line with the value?
I tried /Name Server:{0,0}/d but it deletes all lines.
Thanks
I was able to get the following two lines to work:
I believe the [[:space:]] character class is POSIX compliant:
sed '/^Name Server:[[:space:]]\?$/d' test
An alternative is simply:
sed '/^Name Server:[ \t]\?$/d' test
I've also found that in sed's default (basic) regular expressions, meta-characters such as + and ? must be escaped (\+, \?) to be recognized, and that escaped form is a GNU sed extension. With sed -E they are written unescaped.
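A short sketch of the two regex flavours side by side, using made-up sample input. The first command sticks to basic regular expressions and avoids the escaped ? altogether by using *, the second uses extended regular expressions via -E:

```shell
# One line with a value, one empty entry, one entry followed by a single space.
input='Name Server:NS92.WORLDNIC.COM
Name Server:
Name Server: '

# Basic regex: * needs no escaping and also covers any amount of trailing whitespace.
bre=$(printf '%s\n' "$input" | sed '/^Name Server:[[:space:]]*$/d')

# Extended regex (-E): ? is a meta-character without a backslash.
ere=$(printf '%s\n' "$input" | sed -E '/^Name Server:[[:space:]]?$/d')

echo "$bre"
echo "$ere"
```

Both commands keep only the line that carries a value.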
This works for me:
echo "Name Server:NS92.WORLDNIC.COM" | sed 's/^Name Server://'
cut -d ":" -f 2 < ff | sed '/^$/d'
Uses ':' as delimiter and splits the line (-d option), then selects the second field (-f option)

How to use sed to extract a string [duplicate]

I need to extract a number from the output of a command: cmd. The output is type: 1000
So my question is how to execute the command, store its output in a variable and extract 1000 in a shell script. Also how do you store the extracted string in a variable?
This question has been answered in pieces here before. It would be something like this:
line=$(sed -n '2p' myfile)
echo "$line"
if echo "$line" | grep -q 'type: 1000'; then
echo "It's there!"
fi
Store output of sed into a variable
String contains in Bash
EDIT: sed is very limited, you would need to use bash, perl or awk for what you need.
This is a typical use case for grep:
output=$(cmd | grep -o '[0-9]\+')
You can write the output of a command or even a pipeline of commands into a shell variable using so called command substitution:
variable=$(cmd);
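When the command prints exactly one line of the form "type: 1000", the number can also be taken out of the variable with shell parameter expansion, without any external tool. In this sketch the function cmd is only a stand-in for the real command:

```shell
# Hypothetical stand-in for the real command from the question.
cmd() { echo 'type: 1000'; }

# Command substitution stores the output in a variable.
output=$(cmd)

# ${var#pattern} removes the shortest matching prefix, here "type: ".
number=${output#type: }
echo "$number"
```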
In the comments it emerged that the output of cmd contains more lines than just type: 1000. In this case I would suggest sed, which prints the number from the first matching line and then quits:
output=$(cmd | sed -n '/type: [0-9]/{s/.*type: \([0-9]\+\).*/\1/p;q}')
You tagged your question as sed but your question description does not restrict other tools, so here's a solution using awk.
output=`cmd | awk -F':' '/type: [0-9]+/{print $2}'`
Alternatively, you can use the newer $( ) syntax. Some find the newer syntax preferable, and it can be conveniently nested without the need for escaping backticks. Note that there must be no spaces around the = in a shell assignment.
output=$(cmd | awk -F':' '/type: [0-9]+/{print $2}')
If the output is rigidly restricted to "type: " followed by a number, you can just use cut.
var=$(echo 'type: 1000' | cut -f 2 -d ' ')
Obviously you'll have to pipe the output of your command to cut, I'm using echo as a demo.
In addition, I'd use grep and then cut if the string you are searching is more complex. If we assume there can be all kind of numbers in the text, but only one occurrence of "type: " followed by a number, you can use the command:
>> var=$(echo "hello 12 type: 1000 foo 1001" | grep -oE "type: [0-9]+" | cut -f 2 -d ' ')
>> echo $var
1000
You can use the | operator to send the output of one command to another, like so:
printf ' 1\n 2\n 3\n' | grep "2"
This sends the lines " 1", " 2" and " 3" to the grep command, which prints the line containing 2. It sounds like you might want to do something like:
cmd | grep "type"
Here is a plain sed solution that uses a regular expression to find the number in your string:
cmd | sed 's/^.*type: \([0-9]\+\)/\1/g'
^ anchors the match at the start of the line
.* matches any sequence of characters (including none)
\([0-9]\+\) captures one or more digits
\1 refers to the captured digits and is used as the replacement for everything that matched
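If the command prints several lines, a plain substitution still passes the non-matching lines through unchanged. A sketch of one way around that, assuming the "type: " line appears once, is to combine -n with the p flag so only the substituted line is printed. The function cmd and its output are invented for the example:

```shell
# Hypothetical stand-in for a command with multi-line output.
cmd() { printf 'name: demo\ntype: 1000\nsize: 42\n'; }

# -n suppresses automatic printing; the p flag prints only lines where the
# substitution succeeded. [0-9][0-9]* is the portable spelling of "one or more digits".
value=$(cmd | sed -n 's/^.*type: \([0-9][0-9]*\).*$/\1/p')
echo "$value"
```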

modify the contents of a file without a temp file

I have the following log file which contains lines like this
1345447800561|FINE|blah#13|txReq
1345447800561|FINE|blah#13|Req
1345447800561|FINE|blah#13|rxReq
1345447800561|FINE|blah#14|txReq
1345447800561|FINE|blah#15|Req
I am trying to extract the first field from each line and, depending on whether it belongs to blah#13, blah#14 or blah#15, I am creating the corresponding files using the following script, which seems quite inefficient in terms of the number of temp files it creates. Any suggestions on how I can optimize it?
cat newLog | grep -i "org.arl.unet.maca.blah#13" >> maca13
cat newLog | grep -i "org.arl.unet.maca.blah#14" >> maca14
cat newLog | grep -i "org.arl.unet.maca.blah#15" >> maca15
cat maca10 | grep -i "txReq" >> maca10TxFrameNtf_temp
exec<blah10TxFrameNtf_temp
while read line
do
echo $line | cut -d '|' -f 1 >>maca10TxFrameNtf
done
cat maca10 | grep -i "Req" >> maca10RxFrameNtf_temp
while read line
do
echo $line | cut -d '|' -f 1 >>maca10TxFrameNtf
done
rm -rf *_temp
Something like this ?
for m in org.arl.unet.maca.blah#13 org.arl.unet.maca.blah#14 org.arl.unet.maca.blah#15
do
grep -i "$m" newLog | grep "txReq" | cut -d'|' -f1 > log.$m
done
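The loop above still reads the log once per pattern. A single-pass sketch with awk is shown below, assuming the pipe-separated layout from the sample lines (timestamp, level, blah#N, message type). The output names maca13, maca14 and so on follow the question's naming, and the temporary directory is only there to keep the example self-contained:

```shell
workdir=$(mktemp -d)
cat > "$workdir/newLog" <<'EOF'
1345447800561|FINE|blah#13|txReq
1345447800561|FINE|blah#13|Req
1345447800561|FINE|blah#13|rxReq
1345447800561|FINE|blah#14|txReq
1345447800561|FINE|blah#15|Req
EOF

# For each txReq line, take the number after "#" in field 3 and append
# field 1 to the matching maca<N> file. No temp files are involved.
awk -F'|' -v dir="$workdir" '
    $4 == "txReq" {
        n = $3
        sub(/^.*#/, "", n)
        out = dir "/maca" n
        print $1 >> out
        close(out)
    }
' "$workdir/newLog"

maca13=$(cat "$workdir/maca13")
maca14=$(cat "$workdir/maca14")
echo "$maca13 $maca14"
rm -rf "$workdir"
```

Comparing field 4 exactly also avoids the problem that a plain grep for "Req" matches txReq and rxReq lines as well.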
I've found it useful at times to use ex instead of grep/sed to modify text files in place without using temps ... saves the trouble of worrying about uniqueness and writability to the temp file and its directory etc. Plus it just seemed cleaner.
In ksh I would use a code block with the edit commands and just pipe that into ex ...
{
# Any edit command that would work at the colon prompt of a vi editor will work
# This one was just a text substitution that would replace all contents of the line
# at line number ${NUMBER} with the word DATABASE ... which strangely enough was
# necessary at one time lol
# The wq is the "write/quit" command as you would enter it at the vi colon prompt
# which are essentially ex commands.
print "${NUMBER}s/.*/DATABASE/"
print "wq"
} | ex filename > /dev/null 2>&1
