Parsing an .ini-like file using bash, sed, awk

I have a file like this:
[User1] <- unique id
name= <- values can be empty
pwd=
...
<- empty line
[User2]
name=
pwd=
..
[User3]
name=
pwd=
..
I need the ability:
to get the field values for User2
to change a field value (e.g. pwd).
P.S. Using bash, sed, or awk is preferable.

You can do it with three rules like this (nawk compatible):
awk -F= '
/^\[/ { user=$1; gsub("[][]","",user) }
user == "User2" && $1 == "pwd" { $0=$1"=some_pwd" }
1
'
Output:
[User1]
name=
pwd=
...
[User2]
name=
pwd=some_pwd
..
[User3]
name=
pwd=
..
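
Reading a value works the same way: track the current section, then print the part after the "=" when the key matches. A minimal sketch along the same lines, covering the "get the field values for User2" half of the question (input-file is a placeholder name):
awk -F= '
/^\[/ { user=$1; gsub("[][]","",user) }
user == "User2" && $1 == "name" { print $2 }
' input-file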

Here's a simple solution to change the value of pwd. This will add an extra newline to the end of the record if pwd is the last field.
awk '/^\[User2\]/ { sub("\npwd=[^\n]*(\n|$)", "\npwd=newvalue\n") } 1' \
    ORS='\n\n' RS= input-file > output-file
mv output-file input-file
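The RS= trick (paragraph mode) also makes lookups easy, because each blank-line-separated block becomes a single record. A hedged sketch that prints just the [User2] block:
awk -v RS= -v ORS='\n\n' '/^\[User2\]\n/' input-file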

This one is a clear win for Python vs. AWK, as Python comes with a built-in module for just this sort of problem.
The module's name changed from Python 2.x to Python 3.x; the try block at the top should allow this to work with either Python 2.x or Python 3.x (and I tested it with both on my computer).
EDIT: I just slightly improved the answer. Instead of writing a new file directly, it now writes a temp file, and on success it deletes the original file and renames the temp file to the original file name. On non-Windows systems, the step of removing the original file first is optional.
import os
import sys

try:
    import ConfigParser as cp   # Python 2.x module name
except ImportError:
    import configparser as cp   # Python 3.x module name

try:
    _, fname = sys.argv
except Exception:
    print("Usage: configedit <filename>")
    sys.exit(1)   # bail out if no filename was given

temp_file = fname + ".tempfile"

c = cp.ConfigParser()
c.read(fname)
c.set("User2", "pwd", "XkcdApprovedLongerPassword")

with open(temp_file, "w") as f:
    c.write(f)

os.remove(fname)
os.rename(temp_file, fname)
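
A hypothetical invocation, assuming the script above is saved as configedit.py and the data file is users.ini (both names are placeholders):
python configedit.py users.ini
grep -A 2 '^\[User2\]' users.ini    # spot-check the rewritten block (GNU grep's -A flag)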

As requested via sed:
sed -i "/\[User2\]/,/^$/{s/\(^pwd\)\=.*$/\1\=password/}" input-file
Change "password" in the line above to whatever you want to change the password to within the file.
This command searches from "[User2]" to the next blank line.
It then finds the line starting with "pwd=" and replaces everything after the "=".
For those looking closely: the "=" sign is deliberately left outside the capture group, for the requester's readability.
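To preview the change without touching the file, drop -i so sed writes the edited copy to standard output as a dry run:
sed "/\[User2\]/,/^$/{s/\(^pwd\)\=.*$/\1\=password/}" input-file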

Related

Update version number in property file using bash

I am new to bash scripting and need help with awk. I have a property file with a version number inside, and I want to update it.
version=1.1.1.0
and I use awk to do that
file="version.properties"
awk -F'["]' -v OFS='"' '/version=/{
split($4,a,".");
$4=a[1]"."a[2]"."a[3]"."a[4]+1
}
;1' $file > newFile && mv newFile $file
but I am getting a strange result: version="1.1.1.0""...1
Could someone please help me with this?
You mentioned in your comment you want to update the file in place. You can do that in a one-liner with perl:
perl -pe '/^version=/ and s/(\d+\.\d+\.\d+\.)(\d+)/$1 . ($2+1)/e' -i version.properties
Explanation
-e is followed by a script to run. With -p and -i, the effect is to run that script on each line, and modify the file in place if the script changes anything.
The script itself, broken down for explanation, is:
/^version=/ and # Do the following on lines starting with `version=`
s/ # Make a replacement on those lines
(\d+\.\d+\.\d+\.)(\d+)/ # Match x.y.z.w, and set $1 = `x.y.z.` and $2 = `w`
$1 . ($2+1)/ # Replace x.y.z.w with a copy of $1, followed by w+1
e # This tells Perl the replacement is Perl code rather
# than a text string.
Example run
$ cat foo.txt
version=1.1.1.2
$ perl -pe '/^version=/ and s/(\d+\.\d+\.\d+\.)(\d+)/$1 . ($2+1)/e' -i foo.txt
$ cat foo.txt
version=1.1.1.3
This is not the best way, but here's one fix.
Test case
I am assuming the input file has at least one line that is exactly version=1.1.1.0.
$ awk -F'["]' -v OFS='"' '/version=/{
> split($4,a,".");
> $4=a[1]"."a[2]"."a[3]"."a[4]+1
> }
> ;1' <<<'version=1.1.1.0'
Output:
version=1.1.1.0"""...1
The """ is because you are assigning to field 4 ($4). When you do that, awk adds field separators (OFS) between fields 1 and 2, 2 and 3, and 3 and 4. Three OFS => """, in your example.
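You can see that rebuild effect in isolation with a minimal sketch:
$ awk -F'"' -v OFS='"' '{ $4 = "x"; print }' <<<'version=1.1.1.0'
version=1.1.1.0"""x
Assigning to $4 forces awk to rejoin every field with OFS, inventing empty fields 2 and 3 along the way.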
Minimal change
$ awk -F'["]' -v OFS='"' '/version=/{
split($1,a,".");
$1=a[1]"."a[2]"."a[3]"."a[4]+1;
print
}
' <<<'version=1.1.1.0'
version=1.1.1.1
Two changes:
Change $4 to $1
Since the input field separator (-F) is ["], $4 is whatever would be after the third " (if there were any in the input). Therefore, split($4, ...) splits an empty field. The contents of the line, before the first " (if any), are in $1.
print at the end instead of ;1
The 1 after the closing curly brace is the next condition, and there is no action specified. The default action is to print the current line, as modified, so the 1 triggers printing. Instead, just print within your action when you are done processing. That way your action is self-contained. (Of course, if you needed to do other processing, you might want to print later, after that processing.)
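An alternative sketch sidesteps the quote delimiter entirely: treat "." as the field separator and bump the last field on the version line (assuming the file really contains a bare version=1.1.1.0, as posted):
awk -F. -v OFS=. '/^version=/ { $NF = $NF + 1 } 1' version.properties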
You can use the = as the delimiter, like this:
awk -F= -v v=1.0.1 '$1=="version"{printf "version=\"%s\"\n", v; next}1' file.properties
(the next and the trailing 1 keep every other line of the file intact).

To extract a string from filename and insert it into the file

I want to write a bash script for extracting a string from the file name and insert that string into a specific location in the same file.
For example:
Under /root dir there are different date directories 20160201, 20160202, 20160203 and under each directory there is a file abc20160201.dat, abc20160202.dat, abc20160203.dat.
My requirement is that I need to extract the date from each file name first, and then insert that date into the second column of each record in the file.
For extracting the date I am using
f=abc20160201.dat
s=`echo $f | cut -c 4-11`
echo "$f -> $s"
and for inserting the date I am using
awk 'BEGIN { OFS = "~"; ORS = "\n" ; date="20160201" ; IFS = "~"} { $1=date"~"$1 ; print } ' file > tempdate
But in my awk command the date is coming in the first column. Please let me know what I am doing wrong here.
The file on which this operation is being done is a delimited file with fields separated by ~ characters.
Or if anybody has a better solution for this, please let me know.
The variable for the input field separator is FS, not IFS. Consequently, the input line is not being split at all, hence when you add the date after field 1, it appears at the end of the line.
You should be able to use:
f=abc20160201.dat
s=$(echo $f | cut -c 4-11)
awk -v date="$s" 'BEGIN { FS = OFS = "~" } { $1 = $1 OFS date; print }' $f
That generates the modified output to standard output. AFAIK, awk doesn't have an overwrite option, so if you want to modify the files 'in place', you'll write the output of the script to a temporary file, and then copy or move the temporary file over the original (removing the temporary if you copied). Copying preserves both hard links and symbolic links (and owner, group, permissions); moving doesn't. If the file names are neither symlinks nor linked files, moving is simpler. (Copying always 'works', but the copy takes longer than a move, requires the remove, and there's a longer window while the over-writing copy could leave you with an incomplete file if interrupted.)
Generalizing a bit:
for file in /root/2016????/*.dat
do
tmp=$(mktemp "$(dirname "$file")/tmp.XXXXXX")
awk -v date="$(basename "$file" | cut -c 4-11)" \
'BEGIN { FS = OFS = "~" } { $1 = $1 OFS date; print }' "$file" >"$tmp"
mv "$tmp" "$file"
done
One of the reasons for preferring $(…) over back-quotes is that it is much easier to manage nested operations and quoting with $(…). The mktemp command creates the temporary file in the same directory as the source file; you could legitimately decide to use mktemp "${TMPDIR:-/tmp}/tmp.XXXXXX" instead. A still more general script would iterate over "$@" (the arguments it is passed), but it might need to validate that the base name of each file matches the format you require/expect.
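As an aside, the echo | cut pipeline can also be written with bash substring expansion, assuming the abcYYYYMMDD.dat naming convention holds:
f=abc20160201.dat
s=${f:3:8}    # 8 characters starting after "abc", i.e. 20160201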
Adding code to deal with cleaning up on interrupts, or selecting between copy and move, is left as an exercise for the reader. Note that the script makes no attempt to detect whether it has been run on a file before. If you run it three times on the same file, you'll end up with columns 2-4 all containing the date.
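One hedged way to guard against double-processing is to skip records whose second field already carries the date, for example:
awk -v date="$s" 'BEGIN { FS = OFS = "~" }
    $2 != date { $1 = $1 OFS date }
    { print }' "$f"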

how to find the position of a string in a file in unix shell script

Can you please help me solve this puzzle? I am trying to print the location of a string (i.e., its line number) in a file, first to standard output, and then capture that value in a variable to be used later. The string is "my string", and the file name is "myFile", which is defined as follows:
this is first line
this is second line
this is my string on the third line
this is fourth line
the end
Now, when I use this command directly at the command prompt:
% awk 's=index($0, "my string") { print "line=" NR, "position= " s}' myFile
I get exactly the result I want:
% line= 3, position= 9
My question is: if I define a variable VAR="my string", why can't I get the same result when I do this:
% awk 's=index($0, $VAR) { print "line=" NR, "position= " s}' myFile
It just won't work! I even tried putting the $VAR in quotation marks, to no avail. I tried using VAR (without the $ sign); no luck. I tried everything I could possibly think of ... Am I missing something?
awk variables are not the same as shell variables. You need to define them with the -v flag
For example:
$ awk -v var="..." '$0~var{print NR}' file
will print the line number(s) of pattern matches. Or for your case with the index
$ awk -v var="$Var" 'p=index($0,var){print NR,p}' file
Using all-uppercase names is not good convention, since you may accidentally overwrite other (e.g. environment) variables.
to capture the output into a shell variable
$ info=$(awk ...)
for multi line output assignment to shell array, you can do
$ values=( $(awk ...) ); echo ${values[0]}
However, if the output contains more than one field, each field will be assigned its own array index. You can change that by setting the IFS variable, such as
$ IFS=$(echo -en "\n\b"); values=( $(awk ...) )
which will capture the complete lines as the array values.
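Since this question's output is a single line with two fields, process substitution plus read is a compact way to capture both values (a sketch):
read lineno pos < <(awk -v var="my string" 'p=index($0, var) { print NR, p; exit }' myFile)
echo "line=$lineno position=$pos"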

Adding file information to an AWK comparison

I'm using awk to perform a file comparison against a file listing in found.txt
while read line; do
awk 'FNR==NR{a[$1]++;next}$1 in a' $line compare.txt >> $CHECKFILE
done < found.txt
found.txt contains full path information to a number of files that may contain the data. While I am able to determine that data exists in both files and output that data to $CHECKFILE, I wanted to be able to put the line from found.txt (the filename) where the line was found.
In other words I end up with something like:
File "/xxxx/yyy/zzz/data.txt" contains the following lines in found.txt $line
just not sure how to get the /xxxx/yyy/zzz/data.txt information into the stream.
Appended for clarification:
The file found.txt contains the full path information to several files on the system
/path/to/data/directory1/file.txt
/path/to/data/directory2/file2.txt
/path/to/data/directory3/file3.txt
each of the files has a list of parameters that need to be checked for existence before appending additional information to them later in the script.
so for example, file.txt contains the following fields
parameter1 = true
parameter2 = false
...
parameter35 = true
the compare.txt file contains a number of parameters as well.
So if parameter35 (or any other parameter) shows up in one of the three files, its output gets dropped into the check file.
Both of the scripts (yours and the one I posted) give me that output, but I would also like to echo the line being read at that point in the loop. It sounds like I should just be able to pipe it in somehow, but my awk expertise is limited.
It's not really clear what you want but try this (no shell loop required):
awk '
ARGIND==1 { ARGV[ARGC] = $0; ARGC++; next }
ARGIND==2 { keys[$1]; next }
$1 in keys { print FILENAME, $1 }
' found.txt compare.txt > "$CHECKFILE"
ARGIND is gawk-specific, if you don't have it add FNR==1{ARGIND++}.
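With that tweak applied, a portable sketch of the same idea (for awks that lack gawk's ARGIND):
awk '
FNR==1 { ARGIND++ }                          # emulate gawk ARGIND by hand
ARGIND==1 { ARGV[ARGC] = $0; ARGC++; next }  # queue each file listed in found.txt
ARGIND==2 { keys[$1]; next }                 # collect the keys to look for
$1 in keys { print FILENAME, $1 }
' found.txt compare.txt > "$CHECKFILE"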
Pass the name into awk inside a variable like this:
awk -v file="$line" '{... print "File: " file }'
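Dropped into the original while-loop, that looks like this (a sketch; the message format is illustrative):
while read -r line; do
    awk -v file="$line" '
        FNR==NR { a[$1]++; next }
        $1 in a { print "File " file " contains: " $1 }
    ' "$line" compare.txt >> "$CHECKFILE"
done < found.txt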

Grep search strings with line breaks

How can I use grep to output occurrences of the string 'export to excel' in the input files given below? Specifically, how do I handle line breaks that happen in the middle of the search string? Is there a switch in grep that can do this, or some other command?
Input files:
File a.txt:
blah blah ... export to
excel ...
blah blah..
File b.txt:
blah blah ... export to excel ...
blah blah..
Do you just want to find files that contain the pattern, ignoring linebreaks, or do you want to actually see the matching lines?
If the former, you can use tr to convert newlines to spaces:
tr '\n' ' ' < file | grep 'export to excel'
If the latter you can do the same thing, but you may want to use the -o flag to only print the actual match. You'll then want to adjust your regex to include any extra context you want.
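For example, to surface each match with a word of context on either side (a sketch, using the question's a.txt):
tr '\n' ' ' < a.txt | grep -o '[^ ]* *export to excel *[^ ]*'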
I don't know how to do this in grep. I checked the man page for egrep(1) and it can't match with a newline in the middle either.
I like the solution @Laurence Gonsalves suggested, of using tr(1) to wipe out the newlines. But as he noted, it will be a pain to print the matching lines if you do it that way.
If you want to match despite a newline and then print the matching line(s), I can't think of a way to do it with grep, but it would be not too hard in any of Python, AWK, Perl, or Ruby.
Here's a Python script that solves the problem. I decided that, for lines that only match when joined to the previous line, I would print a --> arrow before the second line of the match. Lines that match outright are always printed without the arrow.
This is written assuming that /usr/bin/python is Python 2.x. You can trivially change the script to work under Python 3.x if desired.
#!/usr/bin/python
import re
import sys

s_pat = "export\s+to\s+excel"
pat = re.compile(s_pat)

def print_ete(fname):
    try:
        f = open(fname, "rt")
    except IOError:
        sys.stderr.write('print_ete: unable to open file "%s"\n' % fname)
        sys.exit(2)
    prev_line = ""
    i_last = -10
    for i, line in enumerate(f):
        # is ete within current line?
        if pat.search(line):
            print "%s:%d: %s" % (fname, i+1, line.strip())
            i_last = i
        else:
            # construct extended line that includes the previous one
            # note newline is stripped
            s = prev_line.strip("\n") + " " + line
            # is ete within extended line?
            if pat.search(s):
                # matched ete in extended line, so we want both lines printed
                # did we print prev line?
                if not i_last == (i - 1):
                    # no, so print it now
                    print "%s:%d: %s" % (fname, i, prev_line.strip())
                # print cur line with special marker
                print "--> %s:%d: %s" % (fname, i+1, line.strip())
                i_last = i
        # make sure we don't match ete twice
        prev_line = re.sub(pat, "", line)

try:
    if sys.argv[1] in ("-h", "--help"):
        raise IndexError  # print help
except IndexError:
    sys.stderr.write("print_ete <filename>\n")
    sys.stderr.write('grep-like tool to print lines matching "%s"\n' %
                     "export to excel")
    sys.exit(1)

print_ete(sys.argv[1])
EDIT: added comments.
I went to some trouble to make it print the correct line number on each line, using a format similar to what you would get with grep -Hn.
It could be much shorter and simpler if you don't need line numbers, and you don't mind reading in the whole file at once into memory:
#!/usr/bin/python
import re
import sys

# This pattern is not compiled with re.MULTILINE on purpose.
# We *want* the \s pattern to match a newline here so it can
# match across multiple lines.
# Note the match group that gathers text around the ete pattern uses a
# character class that matches anything but "\n", to grab text around ete.
s_pat = "([^\n]*export\s+to\s+excel[^\n]*)"
pat = re.compile(s_pat)

def print_ete(fname):
    try:
        text = open(fname, "rt").read()
    except IOError:
        sys.stderr.write('print_ete: unable to open file "%s"\n' % fname)
        sys.exit(2)
    for s_match in re.findall(pat, text):
        print s_match

try:
    if sys.argv[1] in ("-h", "--help"):
        raise IndexError  # print help
except IndexError:
    sys.stderr.write("print_ete <filename>\n")
    sys.stderr.write('grep-like tool to print lines matching "%s"\n' %
                     "export to excel")
    sys.exit(1)

print_ete(sys.argv[1])
grep -A1 "export to" filename | grep -B1 "excel"
I have tested this a little and it seems to work:
sed -n '$b; /export to excel/{p; b}; N; /export to\nexcel/{p; b}; D' filename
You can allow for some extra white space at the end and beginning of the lines like this:
sed -n '$b; /export to excel/{p; b}; N; /export to\s*\n\s*excel/{p; b}; D' filename
Use gawk: set the record separator to "excel", then check for "export to".
gawk -vRS="excel" '/export.*to/{print "found export to excel at record: "NR}' file
or
gawk '/export.*to.*excel/{print}
/export to/&&!/excel/{
s=$0
getline line
if (line~/excel/){
printf "%s\n%s\n",s,line
}
}' file
