sed to replace list of subnets at nth position in a file - shell

Quoting and comma-joining each IP address into the form "ip","ip","ip":
tf_alist=$(for i in $alist; do echo -n "\"$i\"",; done | sed 's/,$//')
echo $tf_alist
gives:
"192.168.0.216/29","192.168.92.72/30","192.168.92.70/31"
Now, I have variable.tf as shown below:
variable "allowlisted_cidrs_prod" {
type = list(string)
description = "ip ranges - allowlisted - prod instances"
default = ["192.168.241.88/32", "192.168.128.222/32", "192.168.231.150/32"]
}
variable "allowlisted_cidrs_test" {
type = list(string)
description = "ip ranges - allowlisted - test instances"
default = ["192.168.58.61/32", "192.168.3.224/32"]
}
variable "elb_cipher" {
type = string
description = "ELB cipher"
default = "ELBSecurityPolicy-TLS-1-2-2017-01"
}
In variable "allowlisted_cidrs_prod" I want to replace the list string below:
From:
default = ["192.168.241.88/32", "192.168.128.222/32", "192.168.231.150/32"]
To (as per $tf_alist):
default = ["192.168.0.216/29","192.168.92.72/30","192.168.92.70/31"]
Can you suggest a way to do it using sed? TIA!
To replace the string, I'm trying to use the below expression to capture the existing IP list:
old_ip_list=$(cat variable.tf | sed -n '/variable "allowlisted_cidrs_prod"/,$ {/^[[:blank:]]*default[[:blank:]]*=[[:blank:]]*\(.*\).*/ { s//\1/p; q; }}')
gives:
["192.168.241.88/32", "192.168.128.222/32", "192.168.231.150/32"]
Ref: https://sed.js.org/?gist=bdeddb0ed01bdc8f96b3a05952909cd7
This removes the "[" and "]" from the output so it matches the format of $tf_alist:
echo -e "\n $old_ip_list" | gsed 's/.*\[//;s/\].*//;'

The best way to edit a structured format like this is to use a tool designed for that particular format. If this is code in a programming language, consider changing the code to read the values from an external resource into a variable, so you don't have to modify the code at all.
Having said that, here is a quick refactoring.
tf_alist=$(printf ',"%s"' $alist | sed 's/^,//')
awk -v new="$tf_alist" '/^variable "/ { v=$2 }
v=="\"allowlisted_cidrs_prod\"" && $1 == "default" { sub(/\[.*/, "[" new "]") }
1' variable.tf >new_variable.tf
This simply keeps track of which variable it has last seen, and only replaces the default line when it belongs to the variable we are targeting. The final 1 is a common Awk idiom to unconditionally print all lines. sub is Awk's equivalent of sed's s/// command, and "[" new "]" is simple string concatenation: we put square brackets around the value of the variable new, which is defined in the -v option and contains the value of the shell variable $tf_alist.
Like all ad-hoc parsers, this is fairly brittle, but it works with the example you provided (demo: https://ideone.com/7kDNCU). If this is for more than a one-off, seriously think about a different approach, rather than spending more time on making this more robust (or, heavens, reimplementing it in read-only sed).
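For a quick sanity check, the whole pipeline can be exercised against a throwaway copy of the file (the file body below is a trimmed stand-in built from the question; only the lines the script looks at are kept):

```shell
# Space-separated CIDRs, as in $alist from the question
alist='192.168.0.216/29 192.168.92.72/30 192.168.92.70/31'
tf_alist=$(printf ',"%s"' $alist | sed 's/^,//')

# Throwaway stand-in for variable.tf
cat > variable.tf <<'EOF'
variable "allowlisted_cidrs_prod" {
type = list(string)
default = ["192.168.241.88/32", "192.168.128.222/32"]
}
variable "allowlisted_cidrs_test" {
type = list(string)
default = ["192.168.58.61/32"]
}
EOF

# Track the current variable block; rewrite "default" only inside the target block
awk -v new="$tf_alist" '/^variable "/ { v=$2 }
v=="\"allowlisted_cidrs_prod\"" && $1 == "default" { sub(/\[.*/, "[" new "]") }
1' variable.tf > new_variable.tf

grep 'default' new_variable.tf
```

The prod default line comes out with the new list while the test block is left untouched.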

Related

Output the value/word after one pattern has been found in string in variable (grep, awk, sed, perl etc)

I have a program that prints data into the console like so (separated by space):
variable1 value1
variable2 value2
variable3 value3
variable4 value4
EDIT: Actually the output can look like this:
data[variable1]: value1
pre[variable2] value2
variable3: value3
flag[variable4] value4
In the end I want to search for a part of the name e.g. for variable2 or variable3 but only get value2 or value3 as output.
EDIT: This single value should then be stored in a variable for further processing within the bash script.
I first tried to put all the console output into a file and process it from there with e.g.
# value3_var="$(grep "variable3" file.log | cut -d " " -f2)"
This works fine but is too slow. I need to process ~20 of these variables per run and this takes ~1-2 seconds on my system. Also I need to do this for ~500 runs. EDIT: I do not need to process all of the ~20 'searches' with one call of e.g. awk. If there is a way to do it automatically, that's fine, but ~20 separate calls in the bash script are fine here too.
Therefore I thought about putting the console output directly into a variable to remove the slow file access. But this will then eliminate the newline characters which then again makes it more complicated to process:
# console_output=$(./programm_call)
# echo $console_output
variable1 value1 variable2 value2 variable3 value3 variable4 value4
EDIT: It actually looks like this:
# console_output=$(./programm_call)
# echo $console_output
data[variable1]: value1 pre[variable2] value2 variable3: value3 flag[variable4] value4
I found solutions for this kind of string arrangement, but they only seem to work with a text file. At least I was not able to use the string stored in $console_output with these examples:
How to print the next word after a found pattern with grep,sed and awk?
So, how can I output the next word after a found pattern, when providing a (long) string as variable?
PS: grep on my system does not know the parameter -P...
I'd suggest to use awk:
$ cat ip.txt
data[variable1]: value1
pre[variable2] value2
variable3: value3
flag[variable4] value4
$ cat var_list
variable1
variable3
$ awk 'NR==FNR{a[$1]; next}
{for(k in a) if(index($1, k)) print $2}' var_list ip.txt
value1
value3
To use output of another command as input file, use ./programm_call | awk '...' var_list - where - will indicate stdin as input.
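As a self-contained check, the two-file idiom can be run against temp copies of the sample data (the file names are just for the demo):

```shell
# Recreate the sample input and key list from the answer
cat > ip.txt <<'EOF'
data[variable1]: value1
pre[variable2] value2
variable3: value3
flag[variable4] value4
EOF
printf 'variable1\nvariable3\n' > var_list

# NR==FNR is only true while the first file (var_list) is being read:
# collect its keys, then print $2 of any data line whose $1 contains a key
awk 'NR==FNR{a[$1]; next}
     {for(k in a) if(index($1, k)) print $2}' var_list ip.txt
```

This prints value1 and value3, one per line. Note that index() does substring matching, so a key like variable1 would also match a hypothetical variable10.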
This single value should then be stored in a variable for further processing within the bash script.
If you are doing further text processing, you could do it within awk and thus avoid a possible slower bash loop. See Why is using a shell loop to process text considered bad practice? for details.
Speed up suggestions:
Use LC_ALL=C awk '..' if input is ASCII (Note that as pointed out in comments, this doesn't apply for all cases, so you'll have to test it for your use case)
Use mawk if available, that is usually faster. GNU awk may still be faster for some cases, so again, you'll have to test it for your use case
Use ripgrep, which is usually faster than other grep programs.
$ ./programm_call | rg -No -m1 'variable1\S*\s+(\S+)' -r '$1'
value1
$ ./programm_call | rg -No -m1 'variable3\S*\s+(\S+)' -r '$1'
value3
Here, -o option is used to get only the matched portion. -r is used to get only the required text by replacing the matched portion with the value from the capture group. -m1 option is used to stop searching input once the first match is found. -N is used to disable line number prefix.
Exit after the first grep match, like so:
value3_var="$(grep -m1 "variable3" file.log | cut -d " " -f2)"
Or use Perl, also exiting after the first match. This eliminates the need for a pipe to another process:
value3_var="$(perl -lne 'if (/^variable3\s+(.*)/) { print $1; last }' file.log)"
If I'm understanding your requirements correctly, how about feeding the output of programm_call directly to the awk script instead of assigning it to a shell variable?
./programm_call | awk '
# the following block is invoked line by line of the input
{
a[$1] = $2
}
# the following block is executed after all lines are read
END {
# please modify the print statement depending on your required output format
print "variable1 = " a["variable1"]
print "variable3 = " a["variable3"]
}'
Output:
variable1 = value1
variable3 = value3
As you see, the script can process all (~20) variables at once.
[UPDATE]
Assumptions including the provided information:
The ./program_call prints approx. 50 pairs of "variable value"
variable and value are delimited by blank character(s)
variable may be enclosed with [ and ]
variable may be followed by :
We are interested in up to 20 variables out of the ~50 pairs
We use just one of the 20 variables at once
We don't want to invoke ./program_call whenever accessing just one variable
We want to access the variable values from within bash script
We may use an associative array to fetch the value via the variable name
Then it will be convenient to read the variable-value pairs directly within
bash script:
#!/bin/bash
declare -A hash # declare an associative array
while read -r key val; do # read key (variable name) and value
key=${key#*[} # remove leading "[" and the characters before it
key=${key%:} # remove trailing ":"
key=${key%]} # remove trailing "]"
hash["$key"]="$val" # store the key and value pair
done < <(./program_call) # feed the output of "./program_call" to the loop
# then you can access the values via the variable name here
foo="${hash["variable2"]}" # the variable "foo" is assigned to "value2"
# do something here
bar="${hash["variable3"]}" # the variable "bar" is assigned to "value3"
# do something here
Some people criticize that bash is too slow to process text lines,
but we process just about 50 lines in this case. I tested a simulation by
generating 50 lines, processing the output with the script above,
repeating the whole process 1,000 times. It completed within a few seconds. (Meaning one batch ends within a few milliseconds.)
This is how to do the job efficiently AND robustly (your approach and all other current answers will produce false matches for some inputs and some values of the variables you want to search for):
$ cat tst.sh
#!/usr/bin/env bash
vars='variable2 variable3'
awk -v vars="$vars" '
BEGIN {
split(vars,tmp)
for (i in tmp) {
tags[tmp[i]":"]
tags["["tmp[i]"]"]
tags["["tmp[i]"]:"]
}
}
$1 in tags || ( (s=index($1,"[")) && (substr($1,s) in tags) ) {
print $2
}
' "${@:--}"
$ ./tst.sh file
value2
value3
$ cat file | ./tst.sh
value2
value3
Note that the only loop is in the BEGIN section, where it populates a hash table (tags[]) with the strings from the input that could match your variable list. While processing the input it doesn't have to loop; it just does a hash lookup of the current $1, which is very efficient as well as robust (e.g. it will not fail on partial matches or even regexp metachars).
As shown, it'll work whether the input is coming from a file or a pipe. If that's not all you need then edit your question to clarify your requirements and improve your example to show a case where this does not do what you want.

Extract json value on regex on bash script

How can i get the values inner depends in bash script?
manifest.py
# Commented lines
{
'category': 'Sales/Subscription',
'depends': [
'sale_subscription',
'sale_timesheet',
],
'auto_install': True,
}
Expected response:
sale_subscription sale_timesheet
The major problem is the line breaks. I have already tried | grep depends, but I cannot get the sale_timesheet value.
I'm trying to collect these values coming from files into a variable, like:
DOWNLOADED_DEPS=($(ls -A $DOWNLOADED_APPS | while read -r file; do cat $DOWNLOADED_APPS/$file/__manifest__.py | [get depends value])
Example updated.
If this is your JSON file:
{
"category": "Sales/Subscription",
"depends": [
"sale_subscription",
"sale_timesheet"
],
"auto_install": true
}
You can get the desired result using jq like this:
jq -r '.depends | join(" ")' YOURFILE.json
This uses .depends to extract the value from the depends field, pipes it to join(" ") to join the array with a single space in between, and uses -r for raw (unquoted) output.
If it is not a JSON file but only a string, you can use the regex below to find the values. If it is a JSON file, you can use other methods like Thomas suggested.
^'depends':\s*(?:\[\s*)(.*?)(?:\])$
demo
You can use GNU grep for this as follows (-z treats the input as one record so the match can span line breaks, -P enables Perl-compatible regexes, -o prints only the match):
% grep -Pzo "'depends':\s*\[[^]]*\]" manifest.py
You can read more in the grep manual.
As #Thomas has pointed out in a comment, the OPs input data is not in JSON format:
$ cat manifest.py
# Commented lines // comments not allowed in JSON
{
'category': 'Sales/Subscription', // single quotes should be replaced by double quotes
'depends': [
'sale_subscription',
'sale_timesheet', // trailing comma at end of section not allowed
],
'auto_install': True, // trailing comma issue; should be lower case "true"
}
And while the title of the question mentions regex, there is no sign of a regex in the question. I'll leave a regex based solution for someone else to come up with and instead ...
One (quite verbose) awk solution based on the input looking exactly like what's in the question:
$ awk -F"'" ' # use single quote as field separator
/depends/ { printme=1 ; next } # if we see the string "depends" then set printme=1
printme && /]/ { printme=0 ; next} # if printme=1 and line contains a right bracket then set printme=0
printme { printf pfx $2; pfx=" " } # if printme=1 then print a prefix + field #2;
# first time around pfx is undefined;
# subsequent passes will find pfx set to a space;
# since using "printf" with no "\n" in sight, all output will stay on a single line
END { print "" } # add a linefeed on the end of our output
' json.dat
This generates:
sale_subscription sale_timesheet
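The same idea compresses into a one-liner; here it is run against a throwaway copy of the sample manifest (a sketch for this exact shape of input, not a general Python-literal parser):

```shell
cat > manifest.py <<'EOF'
# Commented lines
{
    'category': 'Sales/Subscription',
    'depends': [
        'sale_subscription',
        'sale_timesheet',
    ],
    'auto_install': True,
}
EOF

# Same logic as above: turn printing on after "depends", off at "]",
# and take field 2 with a single quote as the field separator
awk -F"'" '/depends/{p=1;next} p&&/]/{p=0;next} p{printf "%s%s",s,$2;s=" "} END{print ""}' manifest.py
```

This prints: sale_subscription sale_timesheet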

Parse out key=value pairs into variables

I have a bunch of different kinds of files I need to look at periodically, and what they have in common is that the lines have a bunch of key=value type strings. So something like:
Version=2 Len=17 Hello Var=Howdy Other
I would like to be able to reference the names directly from awk... so something like:
cat some_file | ... | awk '{print Var, $5}' # prints Howdy Other
How can I go about doing that?
The closest you can get is to parse the variables into an associative array first thing every line. That is to say,
awk '{ delete vars; for(i = 1; i <= NF; ++i) { n = index($i, "="); if(n) { vars[substr($i, 1, n - 1)] = substr($i, n + 1) } } Var = vars["Var"] } { print Var, $5 }'
More readably:
{
delete vars; # clean up previous variable values
for(i = 1; i <= NF; ++i) { # walk through fields
n = index($i, "="); # search for =
if(n) { # if there is one:
# remember value by name. The reason I use
# substr over split is the possibility of
# something like Var=foo=bar=baz (that will
# be parsed into a variable Var with the
# value "foo=bar=baz" this way).
vars[substr($i, 1, n - 1)] = substr($i, n + 1)
}
}
# if you know precisely what variable names you expect to get, you can
# assign to them here:
Var = vars["Var"]
Version = vars["Version"]
Len = vars["Len"]
}
{
print Var, $5 # then use them in the rest of the code
}
$ cat file | sed -r 's/[[:alnum:]]+=/\n&/g' | awk -F= '$1=="Var"{print $2}'
Howdy Other
Or, avoiding the useless use of cat:
$ sed -r 's/[[:alnum:]]+=/\n&/g' file | awk -F= '$1=="Var"{print $2}'
Howdy Other
How it works
sed -r 's/[[:alnum:]]+=/\n&/g'
This places each key,value pair on its own line.
awk -F= '$1=="Var"{print $2}'
This reads the key-value pairs. Since the field separator is chosen to be =, the key ends up as field 1 and the value as field 2. Thus, we just look for lines whose first field is Var and print the corresponding value.
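Both stages can be checked on a single sample line from the question (GNU sed is assumed for the \n in the replacement):

```shell
line='Version=2 Len=17 Hello Var=Howdy Other'

# Stage 1: break the line so each key=value pair starts a new line
printf '%s\n' "$line" | sed -r 's/[[:alnum:]]+=/\n&/g'

# Stage 2: with "=" as field separator, keep the pair whose key is "Var"
printf '%s\n' "$line" |
  sed -r 's/[[:alnum:]]+=/\n&/g' |
  awk -F= '$1=="Var"{print $2}'    # -> Howdy Other
```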
Since discussion in commentary has made it clear that a pure-bash solution would also be acceptable:
#!/bin/bash
case $BASH_VERSION in
''|[0-3].*) echo "ERROR: Bash 4.0 required" >&2; exit 1;;
esac
while read -r -a words; do # iterate over lines of input
declare -A vars=( ) # refresh variables for each line
set -- "${words[@]}" # update positional parameters
for word; do
if [[ $word = *"="* ]]; then # if a word contains an "="...
vars[${word%%=*}]=${word#*=} # ...then set it as an associative-array key
fi
done
echo "${vars[Var]} $5" # Here, we use content read from that line.
done <<<"Version=2 Len=17 Hello Var=Howdy Other"
The <<<"Input Here" could also be <file.txt, in which case lines in the file would be iterated over.
If you wanted to use $Var instead of ${vars[Var]}, then substitute printf -v "${word%%=*}" %s "${word#*=}" in place of vars[${word%%=*}]=${word#*=}, and remove references to vars elsewhere. Note that this doesn't allow for a good way to clean up variables between lines of input, as the associative-array approach does.
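A minimal sketch of that printf -v variant, assuming bash (the sample line is from the question):

```shell
#!/usr/bin/env bash
line='Version=2 Len=17 Hello Var=Howdy Other'

for word in $line; do                 # unquoted on purpose: split into words
  if [[ $word = *"="* ]]; then
    # assign the value to a variable named after the key
    printf -v "${word%%=*}" %s "${word#*=}"
  fi
done

echo "$Var $Version $Len"    # -> Howdy 2 17
```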
I will try to explain a very generic way to do this, which you can easily adapt if you want to print out other stuff.
Assume you have a string which has a format like this:
key1=value1 key2=value2 key3=value3
or more generic
key1_fs2_value1_fs1_key2_fs2_value2_fs1_key3_fs2_value3
With fs1 and fs2 two different field separators.
You would like to make a selection or perform some operations with these values. The easiest way to do this is to store them in an associative array:
array["key1"] => value1
array["key2"] => value2
array["key3"] => value3
array["key1","full"] => "key1=value1"
array["key2","full"] => "key2=value2"
array["key3","full"] => "key3=value3"
This can be done with the following function in awk:
function str2map(str,fs1,fs2,map, n,tmp) {
n=split(str,map,fs1)
for (;n>0;n--) {
split(map[n],tmp,fs2);
map[tmp[1]]=tmp[2]; map[tmp[1],"full"]=map[n]
delete map[n]
}
}
So, after processing the string, you have the full flexibility to do operations in any way you like:
awk '
function str2map(str,fs1,fs2,map, n,tmp) {
n=split(str,map,fs1)
for (;n>0;n--) {
split(map[n],tmp,fs2);
map[tmp[1]]=tmp[2]; map[tmp[1],"full"]=map[n]
delete map[n]
}
}
{ str2map($0," ","=",map) }
{ print map["Var","full"] }
' file
The advantage of this method is that you can easily adapt your code to print any other key you are interested in, or even make selections based on this, example:
(map["Version"] < 3) { print map["Var"]/map["Len"] }
The simplest and easiest way is to use the string substitution like this:
property='my.password.is=1234567890=='
name=${property%%=*}
value=${property#*=}
echo "'$name' : '$value'"
The output is:
'my.password.is' : '1234567890=='
Using bash's set command, we can split the line into positional parameters like awk.
For each word, we'll try to read a name value pair delimited by =.
When we find a value, assign it to the variable named $key using bash's printf -v feature.
#!/usr/bin/env bash
line='Version=2 Len=17 Hello Var=Howdy Other'
set -- $line
for word in "$@"; do
IFS='=' read -r key val <<< "$word"
test -n "$val" && printf -v "$key" "$val"
done
echo "$Var $5"
output
Howdy Other
SYNOPSIS
an awk-based solution that doesn't require manually checking the fields to locate the desired key pair:
the approach avoids splitting unnecessary fields or arrays - a regex match via function call is only performed when needed
only the FIRST occurrence of the input key's value is returned; subsequent matches along the row are NOT returned
I just called it S() because it's the closest letter to $
I only included an array (_) of the 3 test values for demo purposes. They aren't needed; in fact, no state information is kept at all
the caveat is that the key match must be exact - this version of the code isn't for case-insensitive or fuzzy/agile matching
Tested and confirmed working on
- gawk 5.1.1
- mawk 1.3.4
- mawk-2/1.9.9.6
- macos nawk
CODE
# gawk profile, created Fri May 27 02:07:53 2022
{m,n,g}awk '
function S(__,_) {
return \
! match($(_=_<_), "(^|["(_="[:blank:]]")")"(__)"[=][^"(_)"*") \
? "^$" \
: substr(__=substr($-_, RSTART, RLENGTH), index(__,"=")+_^!_)
}
BEGIN { OFS = "\f" # This array is only for testing
_["Version"] _["Len"] _["Var"] # purposes. Feel free to discard at will
} {
for (__ in _) {
print __, S(__) } }'
OUTPUT
Var
Howdy
Len
17
Version
2
So either call the fields in BAU fashion ($5, $0, $NF, etc.), or call S(QUOTED_KEY_VALUE), case-sensitive, like S("Version") to get back 2. As a safeguard, to prevent mis-interpreting null strings or invalid inputs as $0, a non-match returns ^$ instead of an empty string.
As a bonus, it can safely handle multibyte Unicode, both for values and even for keys, regardless of whether your awk is UTF-8-aware or not:
1 ✜
🤡
2 Version
2
3 Var
Howdy
4 Len
17
5 ✜=🤡 Version=2 Len=17 Hello Var=Howdy Other
I know this question is specifically about awk, but I'm mentioning this since many people come here for solutions to break down name=value pairs (with or without awk as such). I found the way below simple, straightforward, and very effective at handling multiple spaces/commas as well:
Source: http://jayconrod.com/posts/35/parsing-keyvalue-pairs-in-bash
changes="foo=red bar=green baz=blue"
# use below if the list is CSV (instead of space as delimiter)
changes=$(echo $changes | tr ',' ' ')
for change in $changes; do
set -- `echo $change | tr '=' ' '`
echo "variable name == $1 and variable value == $2"
#can assign value to a variable like below
eval my_var_$1=$2;
done

extract ip address from variable string

I'm trying to create a bash script which will be able to change the "allow from" IP address in the phpmyadmin command file (which I'm still not sure is possible) and restart Apache.
I'm currently trying to extract an IP address from a variable, and after searching the web I still have no clue. Here is what I have so far:
#!/bin/bash
# bash shell script
clear
echo "Get client IP address"
ip=$(last -i)
echo $ip
exit
echo "restart apache"
/etc/init.d/apache2 reload
I've tried adding the following line with no luck
ip=$(head -n 1 $ip)
If anyone can tell me how to extract the first instance of an IP address from the variable $ip I would appreciate it very much.
ip=$(last -i | head -n 1 | awk '{print $3}')
Update:
ip=$(last -i | grep -Pom 1 '[0-9.]{7,15}')
You can use grep with read:
read ip < <(last -i | grep -o '[0-9]\+[.][0-9]\+[.][0-9]\+[.][0-9]\+')
read ip < <(last -i | grep -Eo '[0-9]+[.][0-9]+[.][0-9]+[.][0-9]+')
\b may also be helpful there. Just not sure about its compatibility.
And yet another:
ip=$(last -i | gawk 'BEGIN { RS = "[ \t\n]"; FS = "." } /^([0-9]+[.]){3}[0-9]+$/ && ! rshift(or(or($1, $2), or($3, $4)), 8) { print ; exit; }')
To get the first instance you can just do:
ip=$(last -i -1 | awk '{print $3}')
I'd just do
ip=$(last -i -1 | grep -Po '(\d+\.){3}\d+')
The above uses grep with Perl Compatible Regular Expressions, which lets us use \d for digits. The regular expression looks for three repetitions of "digits followed by a dot" (e.g. 123.45.123.), then another stretch of digits. The -o flag causes grep to print only the matching part of the line.
This approach has the advantage of working even when the number of fields per line changes (as is often the case, for example with system boot as the 2nd field). However, it needs GNU grep, so if you need a more portable solution, use @konsolebox's answer instead.
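The regex can be exercised without last by feeding it a canned line (the line content is made up for the demo; GNU grep is assumed for -P):

```shell
# A fabricated line shaped like typical `last -i` output
sample='alice   pts/0        192.168.1.100    Mon Nov  4 21:09'

# (\d+\.){3}\d+ : three "digits then a dot" groups, then a final run of digits
ip=$(printf '%s\n' "$sample" | grep -Po '(\d+\.){3}\d+' | head -n 1)
echo "$ip"    # -> 192.168.1.100
```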
Using bash only :
read -ra array < <(last -i)
ip="${array[2]}"
Or :
read -ra array < <(last -1 -i)
ip="${array[2]}"
Or if you're a nitpicker (and have a grep with -P), you can test the next:
while read -r testline
do
echo "input :=$testline="
read ip < <(grep -oP '\b(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))\b' <<< "$testline")
echo "result:=${ip:=NOTFOUND}="
echo
done <<EOF
some bla bla 127.0.0.1 some
10.10.10.10
bad one 300.200.300.400
some other bla 127.0.0.1 some another 10.1.1.0
10.10.10.10 10.1.1.0
bad one 300.200.300.400 and good 192.168.1.1
above is empty and no ip here too
EOF
It skips invalid IP addresses, like 800.1.1.1, so for the above test it prints:
input :=some bla bla 127.0.0.1 some=
result:=127.0.0.1=
input :=10.10.10.10=
result:=10.10.10.10=
input :=bad one 300.200.300.400=
result:=NOTFOUND=
input :=some other bla 127.0.0.1 some another 10.1.1.0=
result:=127.0.0.1=
input :=10.10.10.10 10.1.1.0=
result:=10.10.10.10=
input :=bad one 300.200.300.400 and good 192.168.1.1=
result:=192.168.1.1=
input :==
result:=NOTFOUND=
input :=above is empty and no ip here too=
result:=NOTFOUND=
The \b is needed to avoid matching an IP inside something like 610.10.10.10, which contains a valid IP (10.10.10.10).
The regex is taken from: https://metacpan.org/pod/Regexp::Common::net
Since I happen to have needed to do something in the same ballpark, here is a basic regular expression and an extended regular expression to loosely match an IP address (v4), making sure that there are 4 sequences of 1-3 digits delimited by 3 '.' characters.
# Basic Regular Expression to loosely match an IP address:
bre_match_ip="[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}"
# Extended Regular Expression to loosely match an IP address:
ere_match_ip="[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"
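Both patterns can be smoke-tested with grep -o (the sample text is made up):

```shell
# Loose IPv4 matchers, as defined above
bre_match_ip="[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}"
ere_match_ip="[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"

echo "host 10.0.0.1 responded" | grep -o  "$bre_match_ip"   # -> 10.0.0.1
echo "host 10.0.0.1 responded" | grep -Eo "$ere_match_ip"   # -> 10.0.0.1
```

Being loose matchers, they will also happily match inside over-long digit runs and out-of-range octets; the Awk code that follows adds the range checks.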
Of course when matching IP (v4) addresses from a file (say HTML) it's quite easy to inadvertently match a version string or an url which contains versioning as part of its file path. The following is some Awk code I wrote a while ago for use in a Bash script to extract valid unique (no duplicates) IP addresses from a file. It avoids version numbers whether in the text or as part of an url and makes sure the IP numbers are in range.
I appreciate that this is overkill for the original poster and that it is not tailored for his needs but someone doing a search may come across this answer and find the fairly comprehensive nature of the code useful. The Awk code is thankfully well commented as it uses some slightly obscure aspects of Awk that the casual Awk user would probably not be familiar with.
awkExtractIPAddresses='
BEGIN {
# Regex to match an IP address like sequence (even if too long to be an IP).
# This is deliberately a loose match, the END section will check for IP
# address validity.
ipLikeSequence = "[0-9]+[.][0-9]+[.][0-9]+[.][0-9]+[0-9.]*";
# Regex to match a number sequence longer than 3 digits.
digitSequenceTooLongNotIP = "[0-9][0-9][0-9][0-9]+";
# Regex to match an IP address like sequence which is a version number.
# Equivalent to "(version|ver|v)[ .:]*" if "tolower($0)" was used.
versioningNotIP = "[Vv]([Ee][Rr]([Ss][Ii][Oo][Nn])?)?[ .:]*" ipLikeSequence;
# Regexes to match IP address like sequences next to forward slashes, to
# avoid version numbers in urls: e.g. http://web.com/libs/1.6.1.0/file.js
beginsWithFwdSlashNotIP = "[/]" ipLikeSequence;
endsWithFwdSlashNotIP = ipLikeSequence "[/]";
}
{
# Set line to the current line (more efficient than using $0 below).
line = $0;
# Replace sequences on line which will interfere with extracting genuine
# IPs. Use a replacement char and not the empty string to avoid accidentally
# creating a valid IP address from digits on either side of the removed
# sections. Use "/" as the replacement char for the 2 "FwdSlash" regexes so
# that multiple number dot slash sequences all get removed, as using "x"
# could result in inadvertently leaving such a sequence in place.
# e.g. "/lib1.6.1.0/1.2.3.4/5.6.7.8/file.js" leaves "/lib1.6.1.0xx/file.js"
gsub(digitSequenceTooLongNotIP, "x", line);
gsub(versioningNotIP, "x", line);
gsub(beginsWithFwdSlashNotIP, "/", line);
gsub(endsWithFwdSlashNotIP, "/", line);
# Loop through the current line matching IP address like sequences and
# storing them in the index of the array ipUniqueMatches. By using ipMatch
# as the array index duplicates are avoided and the values can be easily
# retrieved by the for loop in the END section. match() automatically sets
# the built in variables RSTART and RLENGTH.
while (match(line, ipLikeSequence))
{
ipMatch = substr(line, RSTART, RLENGTH);
ipUniqueMatches[ipMatch];
line = substr(line, RSTART + RLENGTH + 1);
}
}
END {
# Define some IP address related constants.
ipRangeMin = 0;
ipRangeMax = 255;
ipNumSegments = 4;
ipDelimiter = ".";
# Loop through the ipUniqueMatches array and print any valid IP addresses.
# The awk "for each" type of loop is different from the norm. It provides
# the indexes of the array and NOT the values of the array elements which
# is more usual in this type of loop.
for (ipMatch in ipUniqueMatches)
{
numSegments = split(ipMatch, ipSegments, ipDelimiter);
if (numSegments == ipNumSegments &&
ipSegments[1] >= ipRangeMin && ipSegments[1] <= ipRangeMax &&
ipSegments[2] >= ipRangeMin && ipSegments[2] <= ipRangeMax &&
ipSegments[3] >= ipRangeMin && ipSegments[3] <= ipRangeMax &&
ipSegments[4] >= ipRangeMin && ipSegments[4] <= ipRangeMax)
{
print ipMatch;
}
}
}'
# Extract valid IP addresses from $fileName, they will each be separated
# by a new line.
awkValidIpAddresses=$(awk "$awkExtractIPAddresses" < "$fileName")
I hope this is of interest.
You could use Awk.
ip=$(awk '{if(NR == 1) {print $3; exit;}}' < <(last -i))

Reading java .properties file from bash

I am thinking of using sed for reading a .properties file, but was wondering if there is a smarter way to do that from a bash script?
This would probably be the easiest way: grep + cut
# Usage: get_property FILE KEY
function get_property
{
grep "^$2=" "$1" | cut -d'=' -f2
}
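A quick check against a throwaway file (the file name and keys are made up for the demo):

```shell
# Usage: get_property FILE KEY
get_property() {
    grep "^$2=" "$1" | cut -d'=' -f2
}

cat > demo.properties <<'EOF'
app.name=myapp
database.url=jdbc:mysql://localhost/db
EOF

get_property demo.properties app.name    # -> myapp
```

One caveat: cut -d'=' -f2 keeps only the text between the first and second '=', so values that themselves contain '=' need -f2- instead.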
The solutions mentioned above will work for the basics. I don't think they cover multi-line values though. Here is an awk program that will parse Java properties from stdin and produce shell environment variables to stdout:
BEGIN {
FS="=";
print "# BEGIN";
n="";
v="";
c=0; # Not a line continuation.
}
/^\#/ { # The line is a comment. Breaks line continuation.
c=0;
next;
}
/\\$/ && (c==0) && (NF>=2) { # Name value pair with a line continuation...
e=index($0,"=");
n=substr($0,1,e-1);
v=substr($0,e+1,length($0) - e - 1); # Trim off the backslash.
c=1; # Line continuation mode.
next;
}
/^[^\\]+\\$/ && (c==1) { # Line continuation. Accumulate the value.
v= "" v substr($0,1,length($0)-1);
next;
}
((c==1) || (NF>=2)) && !/^[^\\]+\\$/ { # End of line continuation, or a single line name/value pair
if (c==0) { # Single line name/value pair
e=index($0,"=");
n=substr($0,1,e-1);
v=substr($0,e+1,length($0) - e);
} else { # Line continuation mode - last line of the value.
c=0; # Turn off line continuation mode.
v= "" v $0;
}
# Make sure the name is a legal shell variable name
gsub(/[^A-Za-z0-9_]/,"_",n);
# Remove newlines from the value.
gsub(/[\n\r]/,"",v);
print n "=\"" v "\"";
n = "";
v = "";
}
END {
print "# END";
}
As you can see, multi-line values make things more complex. To see the values of the properties in shell, just source in the output:
cat myproperties.properties | awk -f readproperties.awk > temp.sh
source temp.sh
The variables will have '_' in the place of '.', so the property some.property will be some_property in shell.
If you have ANT properties files that have property interpolation (e.g. '${foo.bar}') then I recommend using Groovy with AntBuilder.
Here is my wiki page on this very topic.
I wrote a script to solve the problem and put it on my github.
See properties-parser
One option is to write a simple Java program to do it for you - then run the Java program in your script. That might seem silly if you're just reading properties from a single properties file. However, it becomes very useful when you're trying to get a configuration value from something like a Commons Configuration CompositeConfiguration backed by properties files. For a time, we went the route of implementing what we needed in our shell scripts to get the same behavior we were getting from CompositeConfiguration. Then we wisened up and realized we should just let CompositeConfiguration do the work for us! I don't expect this to be a popular answer, but hopefully you find it useful.
If you want to use sed to parse -any- .properties file, you may end up with a quite complex solution, since the format allows line breaks, unquoted strings, unicode, etc: http://en.wikipedia.org/wiki/.properties
One possible workaround would be using java itself to preprocess the .properties file into something bash-friendly, then source it. E.g.:
.properties file:
line_a : "ABC"
line_b = Line\
With\
Breaks!
line_c = I'm unquoted :(
would be turned into:
line_a="ABC"
line_b=`echo -e "Line\nWith\nBreaks!"`
line_c="I'm unquoted :("
Of course, that would yield worse performance, but the implementation would be simpler/clearer.
In Perl:
while(<STDIN>) {
($prop,$val)=split(/[=: ]/, $_, 2);
# and do stuff for each prop/val
}
Not tested, and should be more tolerant of leading/trailing spaces, comments etc., but you get the idea. Whether you use Perl (or another language) over sed is really dependent upon what you want to do with the properties once you've parsed them out of the file.
Note that (as highlighted in the comments) Java properties files can have multiple forms of delimiters (although I've not seen anything used in practice other than colons). Hence the split uses a choice of characters to split upon.
Ultimately, you may be better off using the Config::Properties module in Perl, which is built to solve this specific problem.
I have some shell scripts that need to look up some .properties and use them as arguments to programs I didn't write. The heart of the script is a line like this:
dbUrlFile=$(grep database.url.file etc/zocalo.conf | sed -e "s/.*: //" -e "s/#.*//")
Effectively, that greps for the key, then strips everything up to the colon and anything after a hash.
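That one-liner generalizes into a small function; `get_prop` and the sample file are illustrative names, and a third sed expression trims the trailing blanks left behind after stripping comments:

```shell
#!/usr/bin/env bash
# get_prop KEY FILE: print KEY's value, dropping the "key:" prefix,
# '#' comments, and trailing blanks
get_prop() {
    grep "^$1" "$2" | sed -e 's/.*:[[:blank:]]*//' -e 's/#.*//' -e 's/[[:blank:]]*$//'
}

cat > /tmp/zocalo.conf <<'EOF'
database.url.file: /etc/db.url   # path to the URL file
EOF

get_prop database.url.file /tmp/zocalo.conf
```

Keep in mind the greedy `.*:` strips up to the *last* colon, so this variant assumes the value itself contains no colons.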
If you want to stay in the shell, the best tool for parsing files with proper programming control is (g)awk. Use sed only for simple substitutions.
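In that spirit, a minimal awk sketch (file name and keys are made up) that splits each line on the first `=` or `:` and trims the surrounding blanks:

```shell
#!/usr/bin/env bash
cat > /tmp/app.properties <<'EOF'
db.host = localhost
db.url: jdbc:mysql://localhost/app
EOF

# Split on the first '=' or ':' only; take the value from $0 so later colons survive
parsed=$(awk -F'[=:]' '{
    key = $1
    val = substr($0, length($1) + 2)
    gsub(/^[ \t]+|[ \t]+$/, "", key)
    gsub(/^[ \t]+|[ \t]+$/, "", val)
    print key "=" val
}' /tmp/app.properties)
echo "$parsed"
```

Because the value is taken from `$0` by position rather than from `$2`, delimiter characters inside the value are not lost.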
I have sometimes just sourced the properties file into the bash script. This will lead to environment variables being set in the script with the names and contents from the file. Maybe that is enough for you, too. If you have to do some "real" parsing, this is not the way to go, of course.
Hmm, I just ran into the same problem today. This is a poor man's solution, admittedly more straightforward than clever ;)
decl=`ruby -ne 'puts chomp.sub(/=(.*)/,%q{="\1";}).gsub(".","_")' my.properties`
eval $decl
Then a property 'my.java.prop' can be accessed as $my_java_prop.
This can be done with sed or whatever, but I finally went with ruby for its 'irb' which was handy for experimenting.
It's quite limited (dots should be replaced only before '=',no comment handling), but could be a starting point.
@Daniel, I tried to source it, but Bash didn't like dots in variable names.
I have had some success with
PROPERTIES_FILE=project.properties
function source_property {
    local name=$1
    eval "$name=\"$(sed -n '/^'"$name"'=/,/^[A-Z]\+_*[A-Z]*=/p' $PROPERTIES_FILE|sed -e 's/^'"$name"'=//g' -e 's/"/\\"/g'|head -n -1)\""
}
source_property 'SOME_PROPERTY'
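A worked example of `source_property` (the function is repeated so the snippet runs standalone; note that `head -n -1` is GNU-specific, and the range pattern expects another ALL_CAPS key after the value to terminate it):

```shell
#!/usr/bin/env bash
PROPERTIES_FILE=/tmp/project.properties
cat > "$PROPERTIES_FILE" <<'EOF'
SOME_PROPERTY=first line
second line without a key
OTHER_PROPERTY=x
EOF

# Function from the answer above, repeated so this runs standalone
function source_property {
    local name=$1
    eval "$name=\"$(sed -n '/^'"$name"'=/,/^[A-Z]\+_*[A-Z]*=/p' $PROPERTIES_FILE|sed -e 's/^'"$name"'=//g' -e 's/"/\\"/g'|head -n -1)\""
}

source_property 'SOME_PROPERTY'
echo "$SOME_PROPERTY"
```

The sed range prints from the matching key through the next ALL_CAPS assignment, and `head -n -1` then drops that terminating line, so multi-line values come through intact.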
This is a solution that properly parses quotes and terminates at a space when not given quotes. It is safe: no eval is used.
I use this code in my .bashrc and .zshrc for importing variables from shell scripts:
# Usage: _getvar VARIABLE_NAME [sourcefile...]
# Echos the value that would be assigned to VARIABLE_NAME
_getvar() {
local VAR="$1"
shift
awk -v Q="'" -v QQ='"' -v VAR="$VAR" '
function loc(text) { return index($0, text) }
function unquote(d) { $0 = substr($0, eq+2) d; print substr($0, 1, loc(d)-1) }
{ sub(/^[ \t]+/, ""); eq = loc("=") }
substr($0, 1, eq-1) != VAR { next } # assignment is not for VAR: skip
loc("=" QQ) == eq { unquote(QQ); exit }
loc("=" Q) == eq { unquote( Q); exit }
{ print substr($1, eq + 1); exit }
    ' "$@"
}
This saves the desired variable name and then shifts the argument array so the rest can be passed as files to awk.
Because it's so hard to call shell variables and refer to quote characters inside awk, I'm defining them as awk variables on the command line. Q is a single quote (apostrophe) character, QQ is a double quote, and VAR is that first argument we saved earlier.
For further convenience, there are two helper functions. The first returns the location of the given text in the current line, and the second prints the content between the first two quotes in the line using quote character d (for "delimiter"). There's a stray d concatenated to the first substr as a safety against multi-line strings (see "Caveats" below).
While I wrote the code for POSIX shell syntax parsing, that appears to differ from your format only in whether there is white space around the assignment. You can add that functionality to the above code by adding sub(/[ \t]*=[ \t]*/, "="); before the sub(…) on awk's line 4 (note: line 1 is blank).
The fourth line strips off leading white space and saves the location of the first equals sign. Please verify that your awk supports \t as tab; this is not guaranteed on ancient UNIX systems.
The substr line compares the text before the equals sign to VAR. If that doesn't match, the line is assigning a different variable, so we skip it and move to the next line.
Now we know we've got the requested variable assignment, so it's just a matter of unraveling the quotes. We do this by searching for the first location of =" (line 6) or =' (line 7) or no quotes (line 8). Each of those lines prints the assigned value.
Caveats: If there is an escaped quote character, we'll return a value truncated to it. Detecting this is a bit nontrivial and I decided not to implement it. There's also a problem of multi-line quotes, which get truncated at the first line break (this is the purpose of the "stray d" mentioned above). Most solutions on this page suffer from these issues.
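A quick usage example (the function is repeated verbatim so this snippet is self-contained; file and variable names are made up):

```shell
#!/usr/bin/env bash
# _getvar from above, repeated so this snippet runs on its own
_getvar() {
    local VAR="$1"
    shift
    awk -v Q="'" -v QQ='"' -v VAR="$VAR" '
    function loc(text) { return index($0, text) }
    function unquote(d) { $0 = substr($0, eq+2) d; print substr($0, 1, loc(d)-1) }
    { sub(/^[ \t]+/, ""); eq = loc("=") }
    substr($0, 1, eq-1) != VAR { next } # assignment is not for VAR: skip
    loc("=" QQ) == eq { unquote(QQ); exit }
    loc("=" Q) == eq { unquote( Q); exit }
    { print substr($1, eq + 1); exit }
    ' "$@"
}

cat > /tmp/demo_vars.sh <<'EOF'
GREETING="hello world"
NAME='alice'
PORT=8080
EOF

_getvar GREETING /tmp/demo_vars.sh
_getvar NAME /tmp/demo_vars.sh
_getvar PORT /tmp/demo_vars.sh
```

Each call prints just the assigned value, with double quotes, single quotes, and bare assignments all handled.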
In order to let Java do the tricky parsing, here's a solution using jrunscript to print the keys and values in a bash read-friendly way (key, tab character, value, null character):
#!/usr/bin/env bash
jrunscript -e '
p = new java.util.Properties();
p.load(java.lang.System.in);
p.forEach(function(k,v) { out.format("%s\t%s\000", k, v); });
' < /tmp/test.properties \
| while IFS=$'\t' read -d $'\0' -r key value; do
key=${key//./_}
printf -v "$key" %s "$value"
printf '=> %s = "%s"\n' "$key" "$value"
done
I found printf -v in this answer by @david-foerster.
To quote jrunscript: Warning: Nashorn engine is planned to be removed from a future JDK release
