How to extract char-sequence after pattern between quotes in Bash?

I have a property file meant for Java like:
oracle {
username = "bla"
password = "blabla"
driver = "driver1"
}
postgres {
username = "pg"
password = "pg"
driver = "pg-driver"
}
When read into Java, I can extract the oracle.driver property, which returns driver1.
Now I want to extract the same string in a bash script.
I have tried something like:
grep -A5 oracle application.conf | grep -Po 'driver = ".*?"' | grep -Po '".*"'
which returns "driver1" (including the quotes). I also tried a sed substitution, but that did not yield the bare driver1 string either.
How can I retrieve only driver1?

Whenever you have name -> value mappings in your data, first creating an array to store those mappings (f[] below) and then accessing the data by its name provides the simplest, clearest and easiest-to-enhance solution:
$ awk -v RS= '$1=="oracle"{ for (i=3;i<=NF;i+=3) f[$i]=$(i+2); print f["username"]}' file
"bla"
$ awk -v RS= '$1=="oracle"{ for (i=3;i<=NF;i+=3) f[$i]=$(i+2); print f["password"]}' file
"blabla"
$ awk -v RS= '$1=="oracle"{ for (i=3;i<=NF;i+=3) f[$i]=$(i+2); print f["driver"]}' file
"driver1"
$ awk -v name="driver" -v RS= '$1=="oracle"{ for (i=3;i<=NF;i+=3) f[$i]=$(i+2); print f[name]}' file
"driver1"

With a single awk command - this will work in ANY awk implementation:
awk '/oracle/{ f=1 }f && $1=="driver"{ gsub(/"/,""); print $3; exit }' file
/oracle/{ f=1 } - on encountering a line matching the pattern oracle, set the active flag f
f && $1=="driver" - if we are inside the "active" (oracle) section and the 1st field $1 equals driver:
gsub(/"/,"") - remove the double quotes from the line
print $3 - print the 3rd field, which is the driver value
exit - exit the script immediately, avoiding redundant processing
The output:
driver1
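If the goal is to use the value later in a script, command substitution captures it into a shell variable (driver here is just an illustrative name):
driver=$(awk '/oracle/{ f=1 }f && $1=="driver"{ gsub(/"/,""); print $3; exit }' file)
echo "$driver"
driver1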

Using awk you can do this using an empty record separator:
awk -v RS= '/^[[:blank:]]*oracle/{
gsub(/.*driver[[:blank:]]*=[[:blank:]]*|\n.*$|"/, ""); print}' application.conf
driver1
An empty RS makes each run of consecutive non-empty lines a single record.
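A quick sketch of the paragraph mode this relies on: with RS= every block of lines separated by a blank line becomes one record, so NR counts blocks rather than lines:
$ printf 'a 1\nb 2\n\nc 3\n' | awk -v RS= '{print NR": "$0}'
1: a 1
b 2
2: c 3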

You can try with sed too
database='oracle'
search='driver'
sed -n '
/'"$database"'/!d
:A
n
/'"$search"'/!bA
s/[^"]*"\([^"]*\)"/\1/
p
q
' application.conf
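Since the section and key are plain shell variables, the same sed script can be reused unchanged for the other block; for instance, with
database='postgres'
search='driver'
re-running the command above should print pg-driver for the sample file.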

Related

change numerical value in file to characters via awk

I'm looking to replace the numerical values in a file with a new value that I provide. The number can be present in any part of the text; it often appears in the third position, but that is not always the case. I also want to save the result as a new version of the file.
original format
A:fdg:user#server:r
A:g:1234:xtcy
A:d:1111:xtcy
modified format
A:fdg:user#server:rxtTncC
A:g:replaced_value:xtcy
A:d:replaced_value:xtcy
A bash command line with awk:
awk -v newValue="newVALUE" 'BEGIN{FS=OFS=":"} /:.:.*:/ && $3~/^[0-9]+$/{$3=newValue} 1' original_file.txt > replaced_file.txt
You can simply use sed instead of awk:
sed -E 's/\b[0-9]+\b/replaced_value/g' /path/to/infile > /path/to/outfile
Here is an awk that asks you for replacement values for each numerical value it meets:
$ awk '
BEGIN {
FS=OFS=":" # delimiters
}
{
for(i=1;i<=NF;i++) # loop all fields
if($i~/^[0-9]+$/) { # if numerical value found
printf "Provide replacement value for %d: ",$i > "/dev/stderr"
getline $i < "/dev/stdin" # ask for a replacement
}
}1' file_in > file_out # write output to a new file
I would use GNU AWK for this task in the following way. Let file.txt content be
A:fdg:user#server:rxtTncC
A:g:1234:xtcy
A:d:1111:xtcy
then
awk 'BEGIN{newvalue="replacement"}{gsub(/[[:digit:]]+/,newvalue);print}' file.txt
output
A:fdg:user#server:rxtTncC
A:g:replacement:xtcy
A:d:replacement:xtcy
Explanation: replace one or more digits with newvalue. Disclaimer: I assumed a numerical value is something consisting solely of digits.
(tested in gawk 4.2.1)
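If the replacement should come from the shell rather than be hard-coded in BEGIN, the same one-liner can take it via -v (a sketch, with the same assumption that a number is a run of digits):
awk -v newvalue="replacement" '{gsub(/[[:digit:]]+/,newvalue);print}' file.txt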
How about
awk -F : -v OFS=: '$3 ~ /^[0-9]+$/ { $3 = "new value" } { print }' original_file >replaced_file
(the -v OFS=: keeps the colons when $3 is reassigned)
?

Extract specific substring in shell

I have a file which contains following line:
ro fstype=sd timeout=10 console=ttymxc1,115200 show=true
I'd like to extract the fstype attribute "sd" and store it in a variable.
I did the job using bash:
IFS=" " read -r -a args < file
for arg in "${args[@]}"; do
if [[ "$arg" =~ "fstype" ]]; then
id=$(cut -d "=" -f2 <<< "$arg")
echo "$id"
fi
done
and the following awk command in another shell script:
awk -F " " '{print $2}' file | cut -d '=' -f2
Because the position of the 'fstype' argument and the file content can differ, how can I do the same thing in a way that stays robust in a shell script?
Could you please try the following.
awk 'match($0,/fstype=[^ ]*/){print substr($0,RSTART+7,RLENGTH-7)}' Input_file
OR, more specifically, to handle any string before the =, try the following:
awk '
match($0,/fstype=[^ ]*/){
val=substr($0,RSTART,RLENGTH)
sub(/.*=/,"",val)
print val
val=""
}
' Input_file
With sed:
sed 's/.*fstype=\([^ ]*\).*/\1/' Input_file
awk code's explanation:
awk ' ##Starting awk program from here.
match($0,/fstype=[^ ]*/){ ##Using the match function to match the regex fstype= up to the first space in the current line.
val=substr($0,RSTART,RLENGTH) ##Creating variable val, which holds the sub-string of the current line starting at RSTART for RLENGTH characters.
sub(/.*=/,"",val) ##Removing everything up to and including = from val here.
print val ##Printing val here.
val="" ##Nullifying val here.
}
' Input_file ##mentioning Input_file name here.
Any time you have tag=value pairs in your data I find it best to start by creating an array (f[] below) that maps those tags (names) to their values:
$ awk -v tag='fstype' -F'[ =]' '{for (i=2;i<NF;i+=2) f[$i]=$(i+1); print f[tag]}' file
sd
$ awk -v tag='console' -F'[ =]' '{for (i=2;i<NF;i+=2) f[$i]=$(i+1); print f[tag]}' file
ttymxc1,115200
With the above approach you can do whatever you like with the data just by referencing it by its name as the index in the array, e.g.:
$ awk -F'[ =]' '{
for (i=2;i<NF;i+=2) f[$i]=$(i+1)
if ( (f["show"] == "true") && (f["timeout"] < 20) ) {
print f["console"], f["fstype"]
}
}' file
ttymxc1,115200 sd
If your data has more than 1 row and there can be different fields on each row (doesn't appear to be true for your data) then add delete f as the first line of the script.
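A sketch of what that looks like (note that delete f on a whole array is supported by gawk and most current awks, though not by every old implementation):
$ awk -F'[ =]' '{delete f; for (i=2;i<NF;i+=2) f[$i]=$(i+1); print f["fstype"]}' file
sd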
If the key and value can be matched by the regex fstype=[^ ]*, grep with the -o option, which extracts the matched pattern, can be used.
$ grep -o 'fstype=[^ ]*' file
fstype=sd
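If PCRE support is not available, the key= prefix can also be stripped with a second tool such as cut (a portable, if less elegant, sketch):
$ grep -o 'fstype=[^ ]*' file | cut -d= -f2
sd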
In addition, the \K escape can be used with the -P option (note that this option is only available in GNU grep).
The part of the match to the left of \K is not included in the -o output.
Therefore, the expression below extracts just the value.
$ grep -oP 'fstype=\K[^ ]*' file
sd

Using a value stored in a different file in awk

I have a value stored in a file named cutoff1
If I cat cutoff1 it will look like
0.34722
I want to use the value stored in cutoff1 inside an awk script. Something like the following:
awk '{ if ($1 >= 'cat cutoff1' print $1 }' hist1.dat >hist_oc1.dat
I think I am making some mistakes. If I do manually it will look like
awk '{ if ($1 >= 0.34722) print $1 }' hist1.dat >hist_oc1.dat
How can I use the value stored in cutoff1 file inside the above mentioned awk script?
The easiest ways to achieve this are
awk -v cutoff="$(cat cutoff1)" '($1 >= cutoff){print $1}' hist.dat
awk -v cutoff="$(< cutoff1)" '($1 >= cutoff){print $1}' hist.dat
or
awk '(NR==FNR){cutoff=$1;next}($1 >= cutoff){print $1}' cutoff1 hist.dat
or
awk '($1 >= cutoff){print $1}' cutoff="$(cat cutoff1)" hist.dat
awk '($1 >= cutoff){print $1}' cutoff="$(< cutoff1)" hist.dat
Note: thanks to Glenn Jackman for pointing to man bash on command substitution: "Bash performs the expansion by executing command and replacing the command substitution with the standard output of the command, with any trailing newlines deleted. Embedded newlines are not deleted, but they may be removed during word splitting. The command substitution $(cat file) can be replaced by the equivalent but faster $(< file)."
Since awk can read multiple files, just add the cutoff file before your data file and treat the first line specially. No need for an external variable declaration.
awk 'NR==1{cutoff=$1; next} $1>=cutoff{print $1}' cutoff data
PS: Just noticed that it's similar to @kvantour's second answer, but keeping it here as a different flavor.
You could use getline to read a value from another file at your convenience. First the main file to process:
$ cat > file
wait
wait
did you see that
nothing more to see here
And cutoff:
$ cat cutoff
0.34722
An awk script that reads a line from cutoff when it meets the string see in a record:
$ awk '/see/{if((getline val < "cutoff") > 0) print val}1' file
wait
wait
0.34722
did you see that
nothing more to see here
Explained:
$ awk '
/see/ { # when string see is in the line
if((getline val < "cutoff") > 0) # read a value from cutoff if there are any available
print val # and output the value from cutoff
}1' file # output records from file
As there was only one value, it was printed only once even though see was seen twice.
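For the original question, the same getline mechanism can load the cutoff once in a BEGIN block and then filter hist1.dat with it (a sketch using the question's file names):
awk 'BEGIN{ if ((getline cutoff < "cutoff1") <= 0) exit 1 } $1 >= cutoff { print $1 }' hist1.dat > hist_oc1.dat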

Using awk to search for a line that starts with but also contains a string

I have a file that has multiple lines that start with a keyword. I only want to modify one of them, and it's easy to distinguish the two: I want the one that is under the [dbinfo] section. The domain name is static, so I know that won't change.
awk -F '=' '$1 ~ /^dbhost/ {print $NF};' myfile.txt
myfile.txt
[ual]
path=/web/
dbhost=ez098sf
[dbinfo]
dbhost=ec0001.us-east-1.localdomain
dbname=ez098sf_default
dbpass=XXXXXX
You can use this awk command to first check for the presence of the [dbinfo] section and then modify the dbhost parameter:
awk -v h='newhost' 'BEGIN{FS=OFS="="}
$0 == "[dbinfo]" {sec=1} sec && $1 == "dbhost"{$2 = h; sec=0} 1' file
[ual]
path=/web/
dbhost=ez098sf
[dbinfo]
dbhost=newhost
dbname=ez098sf_default
dbpass=XXXXXX
You want to utilize a little bit of a state machine here:
awk -F '=' '
$0 ~ /^\[.*\]/ {in_db_info=($0=="[dbinfo]")}
$0 ~ /^dbhost/{if (in_db_info) print $2;}' myfile.txt
You can also do it with sed:
sed '/\[dbinfo\]/,/\[/s/\(^dbhost=\).*/\1domain.com/' myfile.txt
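With GNU sed you can also add -i (or -i.bak to keep a backup) to apply the change to myfile.txt in place instead of printing to stdout:
sed -i.bak '/\[dbinfo\]/,/\[/s/\(^dbhost=\).*/\1domain.com/' myfile.txt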

AWK - input variable to pattern from SHELL

I would like to extract text from a file using awk. Basically it works correctly, but I would like to make it dynamic by using a variable for the pattern I am looking for.
HOW IT SHOULD WORK:
File test_input contains (btw: extract from HP DP omnimm -show_locked_devs)
Type: Device
Name/Id: Drive1
Pid: 28405
Host: Host1
Type: Cartridge
Name/Id: Lib1
Pid: 28405
Host: Host1
Location: 47
...
get "Pid" number for Drive1 => command find pattern (Drive1) and display next line from file test_input (28405)
cat test_input | awk 'c&&!--c;/Drive1/{c=1}'| awk '{print $2}'
28405
get "Location" number => find all "Pid" numberes and display next 2 line(record) for each match then use grep for filter "Location" from output and display 2nd field (47)
cat test_input | awk 'c&&!--c;/28405/{c=2; print $0}'| grep Location | awk '{print $2}'
47
I have noticed that a double-quoted AWK script can expand SHELL variables, but when I use the SAME command in a script I get the error message "awk: The statement cannot be correctly parsed."
DRIVE=Drive1;cat test_input | awk "c&&!--c;/$DRIVE/{c=1}" | awk '{print $2}'
28405
If you have some hints on how to get SHELL variables to work here, please let me know.
Also, I know that my commands and redirections are probably overcomplicated, but yeah, I am not a script master :)
If you just need environment variables, then you can use the ENVIRON awk built-in hash. If you want to pass arguments to awk, you can use the -v option.
An example for both:
cat >inputfile <<EOT
aaa
bbbxxx
xxxccc
ddd
EOT
VAR=xxx
awk -vVAR="$VAR" '$0~VAR {print ENVIRON["USER"]":"$0}' inputfile
I added the creation of the sample inputfile.
As far as I know, in some awk versions a whitespace is needed between -v and VAR.
If I may suggest, use ' instead of " around the whole script. It makes life a bit easier if you use a lot of awk.
Output:
myuser:bbbxxx
myuser:xxxccc
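Applied to the original command, the -v form looks like this (keeping the OP's pipeline, just dropping the unneeded cat):
DRIVE=Drive1
awk -v drive="$DRIVE" 'c&&!--c; $0~drive{c=1}' test_input | awk '{print $2}'
28405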
If I understood well, you need to collect the names of all devices and the locations from the non-"Device" blocks. I assume each block starts with the tag Type and the tag order is always the same. If not, please let me know. Based on these assumptions my code looks like:
awk '$1=="Type:"{dev=$2=="Device"}
dev && $1=="Name/Id:"{name=$2}
dev && $1=="Pid:"{pids[name]=$2}
!dev && $1=="Pid:"{pid=$2}
!dev && $1=="Location:"{locs[pid]=$2}
END {
for(i in pids) {
pid = pids[i];
print i"\t"(pid in locs ? locs[pid] : "None");
}
}
' inputfile
It fills up the pids and locs hashes, then it prints all device names found in the pids hash and the location belonging to that pid (if found).
Output:
Drive1 47
Of course, if the location always comes after the device block, the line could be printed immediately when the location is found, so the END part could be dropped, as sketched below.
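A minimal sketch of that variant, assuming the location block always follows its device block (it only prints devices whose location is actually found):
awk '$1=="Type:"{dev=($2=="Device")}
 dev && $1=="Name/Id:"{name=$2}
 dev && $1=="Pid:"{names[$2]=name}
 !dev && $1=="Pid:"{pid=$2}
 !dev && $1=="Location:" && (pid in names){print names[pid]"\t"$2}
' inputfile
For the sample input this should again print Drive1 and 47.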
It's not clear what you want but maybe this:
$ cat tst.awk
BEGIN{ RS=""; FS="[:[:space:]]+" }
{
for (i=1;i<=NF;i+=2)
name2val[$i] = $(i+1)
}
(name2val[key] == val) && (tgt in name2val) {
print name2val[tgt]
}
$
$ awk -v key="Name/Id" -v val="Drive1" -v tgt="Pid" -f tst.awk file
28405
$
$ awk -v key="Pid" -v val="28405" -v tgt="Location" -f tst.awk file
47
