extract ip address from variable string - bash

I'm trying to create a bash script which will be able to change the "allow from" IP address in the phpMyAdmin Apache configuration file (which I'm still not sure is possible) and restart Apache.
I'm currently trying to extract an IP address from a variable, and after searching the web I still have no clue. Here is what I have so far...
#!/bin/bash
# bash shell script
clear
echo "Get client IP address"
ip=$(last -i)
echo $ip
exit   # script currently stops here; the lines below never run
echo "restart apache"
/etc/init.d/apache2 reload
I've tried adding the following line, with no luck:
ip=$(head -n 1 $ip)
If anyone can tell me how to extract the first instance of an IP address from the variable $ip, I would appreciate it very much.

ip=$(last -i | head -n 1 | awk '{print $3}')
Update:
ip=$(last -i | grep -Pom 1 '[0-9.]{7,15}')

You can use grep with read:
read ip < <(last -i | grep -o '[0-9]\+[.][0-9]\+[.][0-9]\+[.][0-9]\+')
read ip < <(last -i | grep -Eo '[0-9]+[.][0-9]+[.][0-9]+[.][0-9]+')
\b may also be helpful there, though I'm not sure about its compatibility across grep implementations.
And yet another:
ip=$(last -i | gawk 'BEGIN { RS = "[ \t\n]"; FS = "." } /^([0-9]+[.]){3}[0-9]+$/ && ! rshift(or(or($1, $2), or($3, $4)), 8) { print ; exit; }')
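If the bitwise calls look opaque: or() folds the four octets together, and rshift(..., 8) is non-zero exactly when some octet needs more than 8 bits, i.e. exceeds 255. A longer but equivalent way to write the same range check (my rewording, same logic):
last -i | gawk 'BEGIN { RS = "[ \t\n]+"; FS = "." }
/^([0-9]+[.]){3}[0-9]+$/ {   # token shaped like an IPv4 address
    if ($1 <= 255 && $2 <= 255 && $3 <= 255 && $4 <= 255) { print; exit }
}'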

To get the first instance you can just do:
ip=$(last -i -1 | awk '{print $3}')

I'd just do
ip=$(last -i -1 | grep -Po '(\d+\.){3}\d+')
The above uses grep with Perl Compatible Regular Expressions, which lets us use \d for digits. The regular expression looks for three repetitions of one or more digits followed by a dot (matching, for example, 123.45.123.), then another stretch of digits. The -o flag causes grep to print only the matching part of the line.
This approach has the advantage of working even when the number of fields per line changes (as is often the case, for example with system boot as the 2nd field). However, it needs GNU grep, so if you need a more portable solution, use @konsolebox's answer instead.

Using bash only:
read -ra array < <(last -i)
ip="${array[2]}"
Or:
read -ra array < <(last -1 -i)
ip="${array[2]}"

Or if you're a nitpicker (and have a grep with -P), you can try the following:
while read -r testline
do
echo "input :=$testline="
read ip < <(grep -oP '\b(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))\b' <<< "$testline")
echo "result:=${ip:=NOTFOUND}="
echo
done <<EOF
some bla bla 127.0.0.1 some
10.10.10.10
bad one 300.200.300.400
some other bla 127.0.0.1 some another 10.1.1.0
10.10.10.10 10.1.1.0
bad one 300.200.300.400 and good 192.168.1.1
above is empty and no ip here too
EOF
It skips wrong IP addresses, like 800.1.1.1, so for the above test it prints:
input :=some bla bla 127.0.0.1 some=
result:=127.0.0.1=
input :=10.10.10.10=
result:=10.10.10.10=
input :=bad one 300.200.300.400=
result:=NOTFOUND=
input :=some other bla 127.0.0.1 some another 10.1.1.0=
result:=127.0.0.1=
input :=10.10.10.10 10.1.1.0=
result:=10.10.10.10=
input :=bad one 300.200.300.400 and good 192.168.1.1=
result:=192.168.1.1=
input :==
result:=NOTFOUND=
input :=above is empty and no ip here too=
result:=NOTFOUND=
The \b is needed to avoid matching an IP like 610.10.10.10, which contains a valid IP (10.10.10.10).
The regex is taken from: https://metacpan.org/pod/Regexp::Common::net

Since I happen to have needed to do something in the same ballpark, here are a basic regular expression and an extended regular expression that loosely match an IPv4 address, making sure that there are 4 sequences of 1-3 digits delimited by 3 '.' characters.
# Basic Regular Expression to loosely match an IP address:
bre_match_ip="[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}"
# Extended Regular Expression to loosely match an IP address:
ere_match_ip="[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"
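For example, either variable can be handed straight to grep (plain grep for the BRE, grep -E for the ERE); with -o only the matched addresses are printed:
echo "connect from 10.0.0.1 port 22" | grep -o "$bre_match_ip"    # prints 10.0.0.1
echo "connect from 10.0.0.1 port 22" | grep -Eo "$ere_match_ip"   # prints 10.0.0.1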
Of course, when matching IPv4 addresses in a file (say, HTML) it's quite easy to inadvertently match a version string or a URL which contains versioning as part of its file path. The following is some Awk code I wrote a while ago, for use in a Bash script, to extract valid, unique (no duplicates) IP addresses from a file. It avoids version numbers, whether in the text or as part of a URL, and makes sure the IP numbers are in range.
I appreciate that this is overkill for the original poster and that it is not tailored to his needs, but someone doing a search may come across this answer and find the fairly comprehensive nature of the code useful. The Awk code is well commented, as it uses some slightly obscure aspects of Awk that the casual Awk user may not be familiar with.
awkExtractIPAddresses='
BEGIN {
# Regex to match an IP address like sequence (even if too long to be an IP).
# This is deliberately a loose match, the END section will check for IP
# address validity.
ipLikeSequence = "[0-9]+[.][0-9]+[.][0-9]+[.][0-9]+[0-9.]*";
# Regex to match a number sequence longer than 3 digits.
digitSequenceTooLongNotIP = "[0-9][0-9][0-9][0-9]+";
# Regex to match an IP address like sequence which is a version number.
# Equivalent to "(version|ver|v)[ .:]*" if "tolower($0)" was used.
versioningNotIP = "[Vv]([Ee][Rr]([Ss][Ii][Oo][Nn])?)?[ .:]*" ipLikeSequence;
# Regexes to match IP address like sequences next to forward slashes, to
# avoid version numbers in urls: e.g. http://web.com/libs/1.6.1.0/file.js
beginsWithFwdSlashNotIP = "[/]" ipLikeSequence;
endsWithFwdSlashNotIP = ipLikeSequence "[/]";
}
{
# Set line to the current line (more efficient than using $0 below).
line = $0;
# Replace sequences on line which will interfere with extracting genuine
# IPs. Use a replacement char and not the empty string to avoid accidentally
# creating a valid IP address from digits on either side of the removed
# sections. Use "/" as the replacement char for the 2 "FwdSlash" regexes so
# that multiple number dot slash sequences all get removed, as using "x"
# could result in inadvertently leaving such a sequence in place.
# e.g. "/lib1.6.1.0/1.2.3.4/5.6.7.8/file.js" leaves "/lib1.6.1.0xx/file.js"
gsub(digitSequenceTooLongNotIP, "x", line);
gsub(versioningNotIP, "x", line);
gsub(beginsWithFwdSlashNotIP, "/", line);
gsub(endsWithFwdSlashNotIP, "/", line);
# Loop through the current line matching IP address like sequences and
# storing them in the index of the array ipUniqueMatches. By using ipMatch
# as the array index duplicates are avoided and the values can be easily
# retrieved by the for loop in the END section. match() automatically sets
# the built in variables RSTART and RLENGTH.
while (match(line, ipLikeSequence))
{
ipMatch = substr(line, RSTART, RLENGTH);
ipUniqueMatches[ipMatch];
line = substr(line, RSTART + RLENGTH + 1);
}
}
END {
# Define some IP address related constants.
ipRangeMin = 0;
ipRangeMax = 255;
ipNumSegments = 4;
ipDelimiter = ".";
# Loop through the ipUniqueMatches array and print any valid IP addresses.
# The awk "for each" type of loop is different from the norm. It provides
# the indexes of the array and NOT the values of the array elements which
# is more usual in this type of loop.
for (ipMatch in ipUniqueMatches)
{
numSegments = split(ipMatch, ipSegments, ipDelimiter);
if (numSegments == ipNumSegments &&
ipSegments[1] >= ipRangeMin && ipSegments[1] <= ipRangeMax &&
ipSegments[2] >= ipRangeMin && ipSegments[2] <= ipRangeMax &&
ipSegments[3] >= ipRangeMin && ipSegments[3] <= ipRangeMax &&
ipSegments[4] >= ipRangeMin && ipSegments[4] <= ipRangeMax)
{
print ipMatch;
}
}
}'
# Extract valid IP addresses from $fileName, they will each be separated
# by a new line.
awkValidIpAddresses=$(awk "$awkExtractIPAddresses" < "$fileName")
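As a quick sanity check (my own example, not from the original script), a line mixing a versioned URL path and a version string with a genuine address should yield only the address:
fileName=/tmp/sample.txt
echo 'http://web.com/libs/1.6.1.0/file.js sent to 192.168.0.1 (version 2.3.4.5)' > "$fileName"
awk "$awkExtractIPAddresses" < "$fileName"    # prints: 192.168.0.1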
I hope this is of interest.

You could use Awk.
ip=$(awk '{if(NR == 1) {print $3; exit;}}' < <(last -i))
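Tying this back to the original goal (rewriting the "Allow from" line in phpMyAdmin's Apache configuration and reloading Apache), a rough sketch could look like the following; note that the config path /etc/phpmyadmin/apache.conf and the Apache 2.2-style Allow from directive are assumptions that depend on your distribution:
#!/bin/bash
# Sketch only: the config path and directive syntax below are assumptions.
conf=/etc/phpmyadmin/apache.conf
ip=$(last -i -1 | awk '{print $3}')   # first IP reported by last -i
sed -i "s/^\([[:space:]]*Allow from\).*/\1 $ip/" "$conf"
/etc/init.d/apache2 reload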

Related

Matching pairs using Linux terminal

I have a file named list.txt containing (supplier, product) pairs, and I must show the number of products from every supplier and their names, using the Linux terminal.
Sample input:
stationery:paper
grocery:apples
grocery:pears
dairy:milk
stationery:pen
dairy:cheese
stationery:rubber
And the result should be something like:
stationery: 3
stationery: paper pen rubber
grocery: 2
grocery: apples pears
dairy: 2
dairy: milk cheese
Save the input to a file and remove the empty lines. Then use GNU datamash:
datamash -s -t ':' groupby 1 count 2 unique 2 < file
Output:
dairy:2:cheese,milk
grocery:2:apples,pears
stationery:3:paper,pen,rubber
The following pipeline should do the job:
< your_input_file sort -t: -k1,1r | sed -E -n ':a;$p;N;s/([^:]*): *(.*)\n\1:/\1: \2 /;ta;P;D' | awk -F' ' '{ print $1, NF-1; print $0 }'
where
sort sorts the lines according to what's before the colon, in order to ease the successive processing
the cryptic sed joins the lines with a common supplier
awk counts the items per supplier and prints everything appropriately.
Doing it with awk only, as suggested by KamilCuk in a comment, would be a much easier job; doing it with sed only would be (for me) a nightmare. Using both is maybe silly, but I enjoyed doing it.
If you need a detailed explanation, please comment, and I'll find time to provide one.
Here's the sed script written one command per line:
:a
$p
N
s/([^:]*): *(.*)\n\1:/\1: \2 /
ta
P
D
and here's how it works:
:a is just a label where we can jump back through a test or branch command;
$p is the print command applied only to the address $ (the last line); note that all other commands are applied to every line, since no address is specified;
N reads one more line and appends it to the current pattern space, putting a \newline in between; this creates a multiline in the pattern space;
s/([^:]*): *(.*)\n\1:/\1: \2 / captures what's before the first colon on the line, ([^:]*), as well as what follows it, (.*), getting rid of excessive spaces, *;
ta tests if the previous s command was successful, and, if this is the case, transfers the control to the line labelled by a (i.e. go to step 1);
P prints the leading part of the multiline up to and including the embedded \newline;
D deletes the leading part of the multiline up to and including the embedded \newline.
This should be close to the awk-only code I was referring to:
< your_input_file awk -F: '{ count[$1] += 1; items[$1] = items[$1] " " $2 } END { for (supp in items) print supp": " count[supp], "\n"supp":" items[supp]}'
The awk script is more readable if written on several lines:
awk -F: '{ # for each line
# we use the word before the : as the key of an associative array
count[$1] += 1 # increment the count for the given supplier
items[$1] = items[$1] " " $2 # concatenate the current item to the previous ones
}
END { # after processing the whole file
for (supp in items) # iterate on the suppliers and print the result
print supp": " count[supp], "\n"supp":" items[supp]
}'
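Run against the sample input, this should print (the order of suppliers may differ, since awk's for-in traversal order is unspecified):
stationery: 3
stationery: paper pen rubber
grocery: 2
grocery: apples pears
dairy: 2
dairy: milk cheese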

Reverse DNS-style string

In a script running in a Debian environment, what is a good way to reverse a DNS-style string?
For example, if my script has:
example.org
What would be a good way to reverse it, so that the string would read:
org.example
A longer example:
www.example.org
should reverse to:
org.example.www
You could use an iterative approach to build a reversed address:
Initialize the reversed result to an empty string
While the string contains .
Extract the last segment by chopping off everything from the start up to the last dot, using parameter expansion ${var##*.}
Chop off the last segment with another parameter expansion, ${var%.*}
Append the previously saved last segment to the reversed result
Here's one way to implement using pure Bash features:
rdns() {
local s=$1
local reversed last
while [[ "$s" == *.* ]]; do
last=${s##*.}
s=${s%.*}
reversed=$reversed$last.
done
reversed=$reversed$s
echo "$reversed"
}
rdns example
rdns example.org
rdns www.example.org
Outputs:
example
org.example
org.example.www
This might do the trick; it prints org.example.www (note that tac is a GNU coreutils tool):
s='www.example.org'
echo $s | tr '.' '\n' | tac | paste -sd.
You can try the following GNU sed one-liner:
sed -E 's/\./\n/g;s/$/\n/;:A;s/([^\n]*)\n(.*)(\n)(.*)/\2\3.\1\4/;tA;s/\n//'

Replace and increment letters and numbers with awk or sed

I have a string that contains
fastcgi_cache_path /var/run/nginx-cache15 levels=1:2 keys_zone=MYSITEP:100m inactive=60m;
One of the goals of this script is to increment the two-digit number after nginx-cache, based on the value found in the previous file. To do that I used this code:
# Replace cache_path
PREV=$(ls -t /etc/nginx/sites-available | head -n1) #find the previous cache_path number
CACHE=$(grep fastcgi_cache_path $PREV | awk '{print $2}' |cut -d/ -f4) #take the string to change
SUB=$(echo $CACHE |sed "s/nginx-cache[0-9]*[0-9]/&#/g;:a {s/0#/1/g;s/1#/2/g;s/2#/3/g;s/3#/4/g;s/4#/5/g;s/5#/6/g;s/6#/7/g;s/7#/8/g;s/8#/9/g;s/9#/#0/g;t a};s/#/1/g") #increment number
sed -i "s/nginx-cache[0-9]*/$SUB/g" $SITENAME #replace number
Maybe not so elegant, but it works.
The other goal is to increment the last letter of all occurrences of MYSITEx (MYSITEP, in this case, should become MYSITEQ, and so on), and once MYSITEZ is reached, to add another letter: MYSITEAA, MYSITEAB, etc.
I thought something like:
sed -i "s/MYSITEP[A-Z]*/MYSITEGG/g" $SITENAME
but that can't work, because MYSITEGG is a static value.
How can I find the last letter, increment it to the next one, and, once the letter Z is reached, add another letter?
Thank you!
Perl's autoincrement will work on letters as well as digits, in exactly the manner you describe
We may as well tidy your nginx-cache increment while we're at it.
I assume SITENAME holds the name of the file to be modified?
It would look like this. I have to assign the capture $1 to an ordinary variable $n to increment it, as $1 is read-only.
perl -i -pe 's/nginx-cache\K(\d+)/ ++($n = $1) /e; s/MYSITE\K(\w+)/ ++($n = $1) /e;' $SITENAME
If you wish, this can be done in a single substitution, like this
perl -i -pe 's/(?:nginx-cache|MYSITE)\K(\w+)/ ++($n = $1) /ge' $SITENAME
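For instance, applied to the sample line from the question (illustration only, with -i omitted so the result goes to stdout):
echo 'fastcgi_cache_path /var/run/nginx-cache15 levels=1:2 keys_zone=MYSITEP:100m inactive=60m;' | perl -pe 's/(?:nginx-cache|MYSITE)\K(\w+)/ ++($n = $1) /ge'
# fastcgi_cache_path /var/run/nginx-cache16 levels=1:2 keys_zone=MYSITEQ:100m inactive=60m;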
Note: The solution below is needlessly complicated, because as Borodin's helpful answer demonstrates (and @stevesliva's comment on the question hinted at), Perl directly supports incrementing letters alphabetically in the manner described in the question, by applying the ++ operator to a variable containing a letter (sequence); e.g.:
$ perl -E '$letters = "ZZ"; say ++$letters'
AAA
The solution below may still be of interest as an annotated showcase of how Perl's power can be harnessed from the shell, showing techniques such as:
use of s///e to determine the replacement string with an expression.
splitting a string into a character array (split //, "....")
use of the ord and chr functions to get the codepoint of a char., and convert a(n incremented) codepoint back to a char.
string replication (x operator)
array indexing and slices:
getting an array's last element ($chars[-1])
getting all but the last element of an array (@chars[0..$#chars-1])
A perl solution (in effect a re-implementation of what ++ can do directly):
perl -pe 's/\bMYSITE\K([A-Z]+)/
@chars = split qr(), $1; $chars[-1] eq "Z" ?
"A" x (1 + scalar @chars)
:
join "", @chars[0..$#chars-1], chr (1 + ord $chars[-1])
/e' <<'EOF'
...=MYSITEP:...
...=MYSITEZP:...
...=MYSITEZZ:...
EOF
yields:
...=MYSITEQ:... # P -> Q
...=MYSITEZQ:... # ZP -> ZQ
...=MYSITEAAA:... # ZZ -> AAA
You can use perl's -i option to replace the input file with the result
(perl -i -pe '...' "$SITENAME").
As Borodin's answer demonstrates, it's not hard to solve all tasks in the question using perl alone.
The s function's /e option allows use of a Perl expression for determining the replacement string, which enables sophisticated replacements:
$1 references the current MYSITE suffix in the expression.
@chars = split qr(), $1 splits the suffix into a character array.
$chars[-1] eq "Z" tests if the last suffix char. is Z
If so: The suffix is replaced with all As, with an additional A appended
("A" x (1 + scalar @chars)).
Otherwise: The last suffix char. is replaced with the following letter in the alphabet
(join "", @chars[0..$#chars-1], chr (1 + ord $chars[-1]))

How can I retrieve the matching records from mentioned file format in bash

XYZNA0000778800Z
16123000012300321000000008000000000000000
16124000012300322000000007000000000000000
17234000012300323000000005000000000000000
17345000012300324000000004000000000000000
17456000012300325000000003000000000000000
9
XYZNA0000778900Z
16123000012300321000000008000000000000000
16124000012300322000000007000000000000000
17234000012300323000000005000000000000000
17345000012300324000000004000000000000000
17456000012300325000000003000000000000000
9
I have the above file format, from which I want to find a matching record. For example: match a number (7789) on the line starting with XYZ, and once matched, look for a matching number (7345) in the lines below starting with 1, until reaching the line starting with 9; then retrieve that entire line record. How can I accomplish this using a shell script, awk, sed, or any combination?
Expected Output:
XYZNA0000778900Z
17345000012300324000000004000000000000000
With sed one can do:
$ sed -n '/^XYZ.*7789/,/^9$/{/^1.*7345/p}' file
17345000012300324000000004000000000000000
Breakdown:
sed -n ' ' # -n disables automatic printing
/^XYZ.*7789/, # Match line starting with XYZ, and
# containing 7789
/^1.*7345/p # Print line starting with 1 and
# containing 7345, which is coming
# after the previous match
/^9$/ { } # Match line that is 9
range { stuff } will execute stuff when it's inside range, in this case the range is starting at /^XYZ.*7789/ and ending with /^9$/.
.* will match anything but newlines zero or more times.
If you want to print the whole block matching the conditions, one can use:
$ sed -n '/^XYZ.*7789/{:s;N;/\n9$/!bs;/\n1.*7345/p}' file
XYZNA0000778900Z
16123000012300321000000008000000000000000
16124000012300322000000007000000000000000
17234000012300323000000005000000000000000
17345000012300324000000004000000000000000
17456000012300325000000003000000000000000
9
This works by reading lines between ^XYZ.*7789 and ^9$ into the pattern
space, and then printing the whole thing if ^1.*7345 can be matched:
sed -n ' ' # -n disables printing
/^XYZ.*7789/{ } # Match line starting
# with XYZ that also contains 7789
:s; # Define label s
N; # Append next line to pattern space
/\n9$/!bs; # Goto s unless \n9$ matches
/\n1.*7345/p # Print whole pattern space
# if \n1.*7345 matches
I'd use awk:
awk -v rid=7789 -v fid=7345 -v RS='\n9\n' -F '\n' 'index($1, rid) { for(i = 2; i <= NF; ++i) { if(index($i, fid)) { print $i; next } } }' filename
This works as follows:
-v RS='\n9\n' is the meat of the whole thing. Awk separates its input into records (by default lines). This sets the record separator to \n9\n, which means that records are separated by lines with a single 9 on them. These records are further separated into fields, and
-F '\n' tells awk that fields in a record are separated by newlines, so that each line in a record becomes a field.
-v rid=7789 -v fid=7345 sets two awk variables rid and fid (meant by me as record identifier and field identifier, respectively; the names are arbitrary) to your search strings. You could encode these in the awk script directly, but this way makes it easier and safer to replace the values with those of shell variables (which I expect you'll want to do).
Then the code:
index($1, rid) { # In records whose first field contains rid
for(i = 2; i <= NF; ++i) {  # Walk through the fields from the second
if(index($i, fid)) { # When you find one that contains fid
print $i # Print it,
next # and continue with the next record.
} # Remove the "next" line if you want all matching
} # fields.
}
Note that multi-character record separators are not strictly required by POSIX awk, and I'm not certain if BSD awk accepts it. Both GNU awk and mawk do, though.
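If you do parameterize it from the shell, it would look something like this (the shell variable names are mine):
rid=7789; fid=7345   # e.g. taken from script arguments
awk -v rid="$rid" -v fid="$fid" -v RS='\n9\n' -F '\n' 'index($1, rid) { for(i = 2; i <= NF; ++i) { if(index($i, fid)) { print $i; next } } }' filename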
EDIT: Misread question the first time around.
An extendable awk script can be:
$ awk '/^9$/{s=0} s&&/7345/; /^XYZ/&&/7789/{s=1} ' file
Set the flag s when a line starts with XYZ and contains 7789; reset it when a line is just 9; print when the flag is set and the line contains the pattern 7345.
This might work for you (GNU sed):
sed -n '/^XYZ/h;//!H;/^9/!b;x;/^XYZ[^\n]*7789/!b;/7345/p' file
Use the option -n for the grep-like nature of sed. Gather up records beginning with XYZ and ending in 9. Reject any records which do not have 7789 in the header. Print any remaining records that contain 7345.
If the 7345 always follows the header, this could be shortened to:
sed -n '/^XYZ/h;//!H;/^9/!b;x;/^XYZ[^\n]*7789.*7345/p' file
If all records are well-formed (begin XYZ and end in 9) then use:
sed -n '/^XYZ/h;//!H;/^9/!b;x;/^[^\n]*7789.*7345/p' file

Parse out key=value pairs into variables

I have a bunch of different kinds of files I need to look at periodically, and what they have in common is that the lines have a bunch of key=value type strings. So something like:
Version=2 Len=17 Hello Var=Howdy Other
I would like to be able to reference the names directly from awk... so something like:
cat some_file | ... | awk '{print Var, $5}' # prints Howdy Other
How can I go about doing that?
The closest you can get is to parse the variables into an associative array at the start of every line. That is to say,
awk '{ delete vars; for(i = 1; i <= NF; ++i) { n = index($i, "="); if(n) { vars[substr($i, 1, n - 1)] = substr($i, n + 1) } } Var = vars["Var"] } { print Var, $5 }'
More readably:
{
delete vars; # clean up previous variable values
for(i = 1; i <= NF; ++i) { # walk through fields
n = index($i, "="); # search for =
if(n) { # if there is one:
# remember value by name. The reason I use
# substr over split is the possibility of
# something like Var=foo=bar=baz (that will
# be parsed into a variable Var with the
# value "foo=bar=baz" this way).
vars[substr($i, 1, n - 1)] = substr($i, n + 1)
}
}
# if you know precisely what variable names you expect to get, you can
# assign to them here:
Var = vars["Var"]
Version = vars["Version"]
Len = vars["Len"]
}
{
print Var, $5 # then use them in the rest of the code
}
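A quick way to check it, feeding the one-liner the sample line from the question:
echo 'Version=2 Len=17 Hello Var=Howdy Other' | awk '{ delete vars; for(i = 1; i <= NF; ++i) { n = index($i, "="); if(n) { vars[substr($i, 1, n - 1)] = substr($i, n + 1) } } Var = vars["Var"] } { print Var, $5 }'
# output: Howdy Other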
$ cat file | sed -r 's/[[:alnum:]]+=/\n&/g' | awk -F= '$1=="Var"{print $2}'
Howdy Other
Or, avoiding the useless use of cat:
$ sed -r 's/[[:alnum:]]+=/\n&/g' file | awk -F= '$1=="Var"{print $2}'
Howdy Other
How it works
sed -r 's/[[:alnum:]]+=/\n&/g'
This places each key=value pair on its own line.
awk -F= '$1=="Var"{print $2}'
This reads the key-value pairs. Since the field separator is chosen to be =, the key ends up as field 1 and the value as field 2. Thus, we just look for lines whose first field is Var and print the corresponding value.
Since discussion in commentary has made it clear that a pure-bash solution would also be acceptable:
#!/bin/bash
case $BASH_VERSION in
''|[0-3].*) echo "ERROR: Bash 4.0 required" >&2; exit 1;;
esac
while read -r -a words; do # iterate over lines of input
declare -A vars=( ) # refresh variables for each line
set -- "${words[#]}" # update positional parameters
for word; do
if [[ $word = *"="* ]]; then # if a word contains an "="...
vars[${word%%=*}]=${word#*=} # ...then set it as an associative-array key
fi
done
echo "${vars[Var]} $5" # Here, we use content read from that line.
done <<<"Version=2 Len=17 Hello Var=Howdy Other"
The <<<"Input Here" could also be <file.txt, in which case lines in the file would be iterated over.
If you wanted to use $Var instead of ${vars[Var]}, then substitute printf -v "${word%%=*}" %s "${word#*=}" in place of vars[${word%%=*}]=${word#*=}, and remove references to vars elsewhere. Note that this doesn't allow for a good way to clean up variables between lines of input, as the associative-array approach does.
I will try to explain a very generic way to do this, which you can adapt easily if you want to print out other stuff.
Assume you have a string which has a format like this:
key1=value1 key2=value2 key3=value3
or more generic
key1_fs2_value1_fs1_key2_fs2_value2_fs1_key3_fs2_value3
With fs1 and fs2 two different field separators.
You would like to make a selection or some operations with these values. To do this, the easiest is to store these in an associative array:
array["key1"] => value1
array["key2"] => value2
array["key3"] => value3
array["key1","full"] => "key1=value1"
array["key2","full"] => "key2=value2"
array["key3","full"] => "key3=value3"
This can be done with the following function in awk:
function str2map(str,fs1,fs2,map, n,tmp) {
n=split(str,map,fs1)
for (;n>0;n--) {
split(map[n],tmp,fs2);
map[tmp[1]]=tmp[2]; map[tmp[1],"full"]=map[n]
delete map[n]
}
}
So, after processing the string, you have the full flexibility to do operations in any way you like:
awk '
function str2map(str,fs1,fs2,map, n,tmp) {
n=split(str,map,fs1)
for (;n>0;n--) {
split(map[n],tmp,fs2);
map[tmp[1]]=tmp[2]; map[tmp[1],"full"]=map[n]
delete map[n]
}
}
{ str2map($0," ","=",map) }
{ print map["Var","full"] }
' file
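With the sample line from the question (Version=2 Len=17 Hello Var=Howdy Other), this should print Var=Howdy, i.e. the stored "full" form of the pair.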
The advantage of this method is that you can easily adapt your code to print any other key you are interested in, or even make selections based on this, example:
(map["Version"] < 3) { print map["var"]/map["Len"] }
The simplest and easiest way is to use string substitution, like this:
property='my.password.is=1234567890=='
name=${property%%=*}
value=${property#*=}
echo "'$name' : '$value'"
The output is:
'my.password.is' : '1234567890=='
Using bash's set command, we can split the line into positional parameters like awk.
For each word, we'll try to read a name=value pair delimited by =.
When we find a value, assign it to the variable named by $key using bash's printf -v feature.
#!/usr/bin/env bash
line='Version=2 Len=17 Hello Var=Howdy Other'
set $line
for word in "$#"; do
IFS='=' read -r key val <<< "$word"
test -n "$val" && printf -v "$key" "$val"
done
echo "$Var $5"
output
Howdy Other
SYNOPSIS
an awk-based solution that doesn't require manually checking the fields to locate the desired key pair :
approach being avoid splitting unnecessary fields or arrays - only performing regex match via function call when needed
only returning FIRST occurrence of input key value. Subsequent matches along the row are NOT returned
I just called it S() since it's the closest letter to $.
I only included an array (_) of the 3 test values for demo purposes. Those aren't needed. In fact, no state information is being kept at all
caveat being : key-match must be exact - this version of the code isn't for case-insensitive or fuzzy/agile matching
Tested and confirmed working on
- gawk 5.1.1
- mawk 1.3.4
- mawk-2/1.9.9.6
- macos nawk
CODE
# gawk profile, created Fri May 27 02:07:53 2022
{m,n,g}awk '
function S(__,_) {
return \
! match($(_=_<_), "(^|["(_="[:blank:]]")")"(__)"[=][^"(_)"*") \
? "^$" \
: substr(__=substr($-_, RSTART, RLENGTH), index(__,"=")+_^!_)
}
BEGIN { OFS = "\f" # This array is only for testing
_["Version"] _["Len"] _["Var"] # purposes. Feel free to discard at will
} {
for (__ in _) {
print __, S(__) } }'
OUTPUT
Var
Howdy
Len
17
Version
2
So either call the fields in BAU fashion ($5, $0, $NF, etc.), or call S(QUOTED_KEY_VALUE), case-sensitive, like S("Version") to get back 2.
As a safeguard, to prevent mis-interpreting null strings or invalid inputs as $0, a non-match returns ^$ instead of an empty string.
As a bonus, it can safely handle multibyte Unicode, both in values and even in keys, regardless of whether your awk is UTF-8-aware or not:
1 ✜
🤡
2 Version
2
3 Var
Howdy
4 Len
17
5 ✜=🤡 Version=2 Len=17 Hello Var=Howdy Other
I know this is particularly regarding awk, but I'm mentioning it since many people come here for ways to break down name=value pairs (with or without awk as such).
I found the way below simple, straightforward, and very effective at managing multiple spaces/commas as well:
Source: http://jayconrod.com/posts/35/parsing-keyvalue-pairs-in-bash
change="foo=red bar=green baz=blue"
#use below if var is in CSV (instead of space as delim)
change=`echo $change | tr ',' ' '`
for change in $changes; do
set -- `echo $change | tr '=' ' '`
echo "variable name == $1 and variable value == $2"
#can assign value to a variable like below
eval my_var_$1=$2;
done
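One caveat: eval executes whatever ends up in the value, which is risky with untrusted input. A safer variant of the assignment step (my suggestion, reusing bash's printf -v shown earlier in this thread) is:
for change in $changes; do
  set -- `echo $change | tr '=' ' '`
  printf -v "my_var_$1" '%s' "$2"   # assigns my_var_<name> without eval
done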
