extract xml using a variable in awk - bash

I want to search an XML file and extract the specific XML block containing this string: 58B338939C5B1970E1008000AC10E225_HCA_13
I am able to do it via the following command:
awk 'BEGIN{RS="<[/]?WorkResponseMessage>"} /58B338939C5B1970E1008000AC10E225_HCA_13/{print $0,"</WorkResponseMessage>"}' ag1.xml > ag2.xml
I want to pass the search string in a variable from the command line and use that variable in the search, for example:
awk 'BEGIN{RS="<[/]?WorkResponseMessage>"} /$m/{print $0,"</WorkResponseMessage>"}' ag1.xml > ag2.xml
Here 'm' is my variable. I am able to get the value inside 'm', but it doesn't seem to work with the awk command. I have also tried quoting m (with both "" and ''), and that doesn't work either. The awk -v option also doesn't work with this.

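Inside single quotes the shell does not expand $m, and awk reads /$m/ as a regex constant rather than a variable reference. Pass the value in with -v and match against it with ~ instead:
m="58B338939C5B1970E1008000AC10E225_HCA_13"
echo "$m"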
awk -v m="$m" 'BEGIN{RS="<[/]?WorkResponseMessage>"} $0 ~ m {print $0,"</WorkResponseMessage>"}' ag1.xml > ag2.xml
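Because `$0 ~ m` treats the value of m as a regular expression, a search string containing characters such as `.` or `[` may match more than intended. A minimal sketch of a literal-match variant using index() follows. The sample file contents and the `<id>` element are made up for illustration, printing the opening tag is my own choice (the original prints only the closing one), and a regex RS assumes an awk that supports it, such as GNU awk.

```shell
#!/usr/bin/env bash
# Hypothetical sample data; only the WorkResponseMessage tag comes from the question.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<WorkResponseMessage><id>AAA_1</id></WorkResponseMessage>
<WorkResponseMessage><id>BBB_2</id></WorkResponseMessage>
EOF

m="BBB_2"

# index() is a plain substring test, so regex metacharacters in m are harmless.
result=$(awk -v m="$m" 'BEGIN { RS = "<[/]?WorkResponseMessage>" }
    index($0, m) { print "<WorkResponseMessage>" $0 "</WorkResponseMessage>" }' "$tmp")

echo "$result"
rm -f "$tmp"
```

Note that -v still interprets backslash escapes in the assigned value, so a search string containing backslashes would need extra care.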

Update version number in property file using bash

I am new to bash scripting and I need help with awk. I have a property file with a version inside and I want to update it.
version=1.1.1.0
and I use awk to do that
file="version.properties"
awk -F'["]' -v OFS='"' '/version=/{
split($4,a,".");
$4=a[1]"."a[2]"."a[3]"."a[4]+1
}
;1' $file > newFile && mv newFile $file
but I am getting a strange result: version="1.1.1.0""...1
Could someone please help me with this?
You mentioned in your comment you want to update the file in place. You can do that in a one-liner with perl:
perl -pe '/^version=/ and s/(\d+\.\d+\.\d+\.)(\d+)/$1 . ($2+1)/e' -i version.properties
Explanation
-e is followed by a script to run. With -p and -i, the effect is to run that script on each line, and modify the file in place if the script changes anything.
The script itself, broken down for explanation, is:
/^version=/ and # Do the following on lines starting with `version=`
s/ # Make a replacement on those lines
(\d+\.\d+\.\d+\.)(\d+)/ # Match x.y.z.w, and set $1 = `x.y.z.` and $2 = `w`
$1 . ($2+1)/ # Replace x.y.z.w with a copy of $1, followed by w+1
e # This tells Perl the replacement is Perl code rather
# than a text string.
Example run
$ cat foo.txt
version=1.1.1.2
$ perl -pe '/^version=/ and s/(\d+\.\d+\.\d+\.)(\d+)/$1 . ($2+1)/e' -i foo.txt
$ cat foo.txt
version=1.1.1.3
This is not the best way, but here's one fix.
Test case
I am assuming the input file has at least one line that is exactly version=1.1.1.0.
$ awk -F'["]' -v OFS='"' '/version=/{
> split($4,a,".");
> $4=a[1]"."a[2]"."a[3]"."a[4]+1
> }
> ;1' <<<'version=1.1.1.0'
Output:
version=1.1.1.0"""...1
The """ is because you are assigning to field 4 ($4). When you do that, awk adds field separators (OFS) between fields 1 and 2, 2 and 3, and 3 and 4. Three OFS => """, in your example.
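The record-rebuilding behavior described above can be seen in isolation with a one-field input. This is a small sketch with made-up input values, not part of the original answer.

```shell
#!/usr/bin/env bash
# One input field; assigning to $4 creates empty $2 and $3 and rebuilds $0 with OFS.
rebuilt=$(echo 'a' | awk -v OFS='"' '{ $4 = "x"; print }')
echo "$rebuilt"

# Assigning to an existing field also rebuilds the record, here swapping ":" for "-".
joined=$(echo 'a:b:c' | awk -F: -v OFS='-' '{ $1 = $1; print }')
echo "$joined"
```

The first command prints a, three OFS characters, then x, which mirrors the `"""` seen in the question's output.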
Minimal change
$ awk -F'["]' -v OFS='"' '/version=/{
split($1,a,".");
$1=a[1]"."a[2]"."a[3]"."a[4]+1;
print
}
' <<<'version=1.1.1.0'
version=1.1.1.1
Two changes:
Change $4 to $1
Since the input field separator (-F) is ["], $4 is whatever would be after the third " (if there were any in the input). Therefore, split($4, ...) splits an empty field. The contents of the line, before the first " (if any), are in $1.
print at the end instead of ;1
The 1 after the closing curly brace is the next condition, and there is no action specified. The default action is to print the current line, as modified, so the 1 triggers printing. Instead, just print within your action when you are done processing. That way your action is self-contained. (Of course, if you needed to do other processing, you might want to print later, after that processing.) Note that with print inside the action, only lines matching /version=/ are printed; if the file holds other properties that must be preserved, keep the trailing 1 and drop the explicit print.
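As an alternative sketch, splitting on the dot itself lets awk increment the last component without split(), while the trailing 1 keeps every other line. This assumes the value is unquoted (version=1.1.1.0, as in the test case) and that the line ends with the numeric component; the name=demo line is hypothetical sample data.

```shell
#!/usr/bin/env bash
tmp=$(mktemp)
printf 'name=demo\nversion=1.1.1.0\n' > "$tmp"

# With "." as both FS and OFS, $NF is the last version component.
# Assigning to $NF rebuilds the line with dots; the trailing 1 prints every line.
updated=$(awk -F. -v OFS=. '/^version=/ { $NF = $NF + 1 } 1' "$tmp")

echo "$updated"
rm -f "$tmp"
```

To update the file itself, the output can be written to a temporary file and moved over the original, as the question already does.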
You can use the = as the delimiter, like this:
awk -F= -v v=1.0.1 '$1=="version"{printf "version=\"%s\"\n", v}' file.properties

Use awk to separate text file into multiple files

I've read a couple of other questions about this, but none of them seem to be working. I'm currently trying to split something like file A.txt using the delimiter "STOPHERE".
This is the code:
#!/bin/bash
awk 'BEGIN{
RS = "STOPHERE"
file = 0}
{
file++
print $0 > ("sepf" file)
}' A.txt
File A:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa lwdjnuqqfqaaaaaaaaaa qlknfqek fkgnl efekfnwegelflfne
ldnwefne f STOPHEREsdfnkjnf nnnnnnnnnnnnnnnnnnnnnnnasd fefffffffffffffflllo
aldn3orn STOPHERE
fknjke bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbowqff STOPHERE i
asfjfenf STOPHERE
Into these:
sepf1:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa lwdjnuqqfqaaaaaaaaaa qlknfqek fkgnl efekfnwegelflfne
ldnwefne f
sepf2:
sdfnkjnf nnnnnnnnnnnnnnnnnnnnnnnasd fefffffffffffffflllo
aldn3orn
sepf3:
#line starts here
fknjke bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbowqff
sepf4:
i
asfjfenf
So basically, the formatting has to stay exactly the same between the STOPHERE.
But for some reason, this is the kind of output I'm getting in some of the files:
Eg: sepf2
TOPHEREsdfnkjnf nnnnnnnnnnnnnnnnnnnnnnnasd fefffffffffffffflllo
aldn3orn
Any ideas as to why the "TOPHERE" remains??
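GNU awk allows RS to be a regex, so a multi-character string works as the record separator. POSIX leaves the behavior of a multi-character RS unspecified, and some awk implementations use only its first character; with RS effectively "S", the rest of the delimiter (TOPHERE) stays in the record. Your code can also be simplified, since awk treats an unset variable as 0 in numeric context.
So this will generate a separate file for each record:
awk -v RS="STOPHERE" '{print $0 > ("sepf" ++file)}' A.txt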
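A slightly more defensive sketch of the same idea is shown below. It assumes an awk that accepts a multi-character RS (GNU awk, for example), writes each record without appending a newline so the text between delimiters stays unchanged, skips whitespace-only records such as the one after the final STOPHERE, and closes each file to avoid running out of file descriptors on large inputs. The sample input and the `dir` variable are made up for illustration.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
printf 'one STOPHEREtwo\nlines STOPHERE\n' > "$workdir/A.txt"

awk -v RS='STOPHERE' -v dir="$workdir" '
    /[^ \t\n]/ {                 # skip records that contain only whitespace
        n++
        out = dir "/sepf" n
        printf "%s", $0 > out    # no newline added, so the record is written as-is
        close(out)
    }' "$workdir/A.txt"

ls "$workdir"
```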

how to find the position of a string in a file in unix shell script

Can you please help me solve this puzzle? I am trying to print the location of a string (i.e., line #) in a file, first to the std output, and then capture that value in a variable to be used later. The string is “my string”, the file name is “myFile” which is defined as follows:
this is first line
this is second line
this is my string on the third line
this is fourth line
the end
Now, when I use this command directly at the command prompt:
% awk 's=index($0, "my string") { print "line=" NR, "position= " s}' myFile
I get exactly the result I want:
line=3 position= 9
My question is: if I define a variable VAR="my string", why can’t I get the same result when I do this:
% awk 's=index($0, $VAR) { print "line=" NR, "position= " s}' myFile
It just won’t work! I even tried putting $VAR in quotation marks, to no avail. I also tried VAR without the $ sign, with no luck. I tried everything I could think of. Am I missing something?
awk variables are not the same as shell variables. You need to define them with the -v flag.
For example:
$ awk -v var="..." '$0~var{print NR}' file
will print the line number(s) of pattern matches. Or for your case with the index
$ awk -v var="$VAR" 'p=index($0,var){print NR,p}' file
Using all-uppercase names for your own shell variables may not be a good convention, since you may accidentally overwrite environment variables.
To capture the output in a shell variable:
$ info=$(awk ...)
To assign multi-line output to a shell array, you can do:
$ values=( $(awk ...) ); echo "${values[0]}"
However, if the output contains more than one field per line, each field will be assigned its own array index. You can change that by setting the IFS variable, such as
$ IFS=$(echo -en "\n\b"); values=( $(awk ...) )
which will capture complete lines as the array values.
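Putting these pieces together, here is a minimal sketch that passes the search string with -v, captures a single result into shell variables, and reads multi-line output into an array. It assumes bash 4 or later for mapfile, which is swapped in here as an alternative to changing IFS; the variable names are made up, and the file contents are copied from the question.

```shell
#!/usr/bin/env bash
myfile=$(mktemp)
printf '%s\n' 'this is first line' 'this is second line' \
    'this is my string on the third line' 'this is fourth line' 'the end' > "$myfile"

needle="my string"

# index() returns 0 (false) when the string is absent, so only matching lines print.
info=$(awk -v s="$needle" '(p = index($0, s)) { print NR, p }' "$myfile")
read -r line pos <<< "$info"
echo "line=$line position=$pos"

# mapfile stores one output line per array element, regardless of spaces in the line.
mapfile -t rows < <(awk '/line/ { print NR ": " $0 }' "$myfile")
echo "${#rows[@]} rows, first: ${rows[0]}"

rm -f "$myfile"
```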

Remove the newline character in awk

I am trying to remove the new line character for a date function and have it include spaces. I am saving the variables using this:
current_date=$(date "+%m/%d/ AT %y%H:%M:%S" )
I can see that this is the right format I need by doing an echo $current_date.
However, when I need to use this variable it does not act the way I would like it.
awk '(++n==47) {print "1\nstring \nblah '$current_date' blah 2; n=0} (/blah/) {n=0} {print}' input file > output file
I need the date to stay in the current line of text and continue with no newline unless specified.
Thanks in advance.
Rather than attempting to insert the variable into the command string as you are doing, you can pass it to awk like this:
awk -v date="$(date "+%m/%d/ AT %y%H:%M:%S")" '# your awk one-liner here' input_file
You can then use the variable date as an awk variable within the script:
print "1\nstring \nblah " date " blah 2";
As an aside, it looks like your original print statement was broken, as there were double quotes missing from the end of it.
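One likely reason the original form misbehaves is that '$current_date' closes the single-quoted awk program and expands the variable unquoted, so the spaces in the date split the program into several shell arguments. A small sketch of the -v approach follows, using a fixed sample date and a three-line input so the result is reproducible; the NR == 2 condition is a stand-in for the question's counter logic.

```shell
#!/usr/bin/env bash
# Fixed sample value; in practice this would be $(date "+%m/%d/ AT %y%H:%M:%S").
current_date="03/14/ AT 2409:26:53"

# The date arrives as one awk variable, spaces included, and is concatenated
# into the output line without introducing a newline.
out=$(printf 'a\nb\nc\n' |
    awk -v date="$current_date" 'NR == 2 { print "blah " date " blah 2" } { print }')

echo "$out"
```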

Assign a variable the value of a string in a file

I have a file called info.log which contains the line:
/home/jax/Main_X_1_A
X, 1 and A are meaningful and they can change. However "Main" and the underscores remain the same.
Is it possible to use a utility to assign a shell variable a value based on the information in info.log?
E.g.
MY_VERSION="?_?_?";
Where the question marks represent the single characters that are found in those locations.
For example if info.log contained this line:
/home/jax/Main_1_2_3
And we used that data to initialise a shell variable:
MY_VERSION=...
echo $MY_VERSION
The output would be:
1_2_3
Updating question with better example:
Info.log
MODULE=TEST
QUICK_BUILD_DIR=/usr/apps/Main_1_2_3
ANT_FILE=build.xml
FANCE=/usr/apps/test/Main_1_2_3
I want to be able to take these three numbers (1, 2 and 3) from this line:
QUICK_BUILD_DIR=/usr/apps/Main_1_2_3
And assign them to variables.
Note: 1, 2 and 3 are just example numbers and they can change.
Can you try this?
var="/home/jax/Main_1_3_2"
version=$(echo "$var" | sed 's/.*Main_\(.*\)/\1/') # version will be 1_3_2
This uses bash and sed.
A GNU Awk Solution
$ MY_VERSION=$(awk -F/ '/Main_/ { sub(/Main_/, "", $NF); print $NF }' info.log)
$ echo "$MY_VERSION"
X_1_A
You can use this awk command:
cat file
/home/jill/Main_1_2_4
/home/jax/Main_1_2_3
/home/john/Main_X_1_A
awk -v u=jax -F '/' '$3==u{sub(/^Main_/, "", $4); print $4}' file
1_2_3
Here you can pass any username in u variable to awk (as jax is being passed here) and version will be picked from that particular line.
No need for external utilities. Bash can do the string manipulation for you:
$ cat info.log
/home/jax/Main_X_1_A
$ read -r a < info.log
$ b="${a#*_}"
$ echo "$b"
X_1_A
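For the updated key=value example, the same pure-bash idea can be sketched as below. It picks the QUICK_BUILD_DIR line, strips everything up to and including Main_ (which is safer than ${a#*_} if the path itself contains an underscore), and splits the remainder into three variables. The names major, minor, and patch are hypothetical, and the here-string assumes bash.

```shell
#!/usr/bin/env bash
log=$(mktemp)
cat > "$log" <<'EOF'
MODULE=TEST
QUICK_BUILD_DIR=/usr/apps/Main_1_2_3
ANT_FILE=build.xml
FANCE=/usr/apps/test/Main_1_2_3
EOF

while IFS='=' read -r key value; do
    if [ "$key" = "QUICK_BUILD_DIR" ]; then
        version=${value##*Main_}                          # strip through the last "Main_"
        IFS='_' read -r major minor patch <<< "$version"  # split on underscores
    fi
done < "$log"

echo "$version $major $minor $patch"
rm -f "$log"
```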
