grep result gets split on spaces when saved to a variable - bash

I need your help with this. I have a simple XML file that looks like this:
<Entity ID="12345" Record="1">
<Info>
<Type>Individual</Type>
<Name>Test</Name>
</Info>
<Entity Record="2">
<Info>
<Type>Individual</Type>
<Name>Test2</Name>
</Info>
What I want to do is grep the attributes and their values for each Entity node.
This is my code:
entities=($(grep -oP '(?<=<Entity ).*(?=>)' "abc.xml"))
for j in ${!entities[*]}
do
echo "${entities[$j]}"
((count++))
done
echo "Total Count: $count"
Output:
ID="12345"
Record="1"
Record="2"
Total Count: 3
However, my desired result is supposed to be:
ID="12345" Record="1"
Record="2"
Total Count: 2
When I save the grep result to a variable, it somehow gets split wherever there is a space. Could anyone help me with this? Thank you in advance.

I would highly recommend using an XML parser; for example, you could use xmlstarlet.
Assuming this is your valid XML file:
<?xml version="1.0" encoding="utf-8"?>
<foo>
<Entity ID="12345" Record="1">
<Info>
<Type>Individual</Type>
<Name>Test</Name>
</Info>
</Entity>
<Entity ID="123456" Record="1">
<Info>
<Type>Individual</Type>
<Name>Test</Name>
</Info>
</Entity>
</foo>
To extract the fields, a starting point could be:
xmlstarlet sel -T -t -m //Entity -o ID= -v "@ID" -o " Record=" -v "@Record" -n your.xml
This will print:
ID=12345 Record=1
ID=123456 Record=1
To count the number of elements:
xmlstarlet sel -t -c "count(//Entity)" your.xml
These are just the basics, but I hope they give you an idea.

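Your IFS is wrong. By default it contains a space, so the unquoted command substitution is split on spaces as well as on newlines. Set it to a newline only: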
#!/bin/bash
ifs_ini="$IFS"
IFS=$'\n'
entities=( $(grep -oP '(?<=<Entity ).*(?=>)' "abc.xml") )
for j in ${!entities[@]}; do
echo "${entities[$j]}"
((count++))
done
echo "Total Count: $count"
IFS="$ifs_ini"
output:
ID="12345" Record="1"
Record="2"
Total Count: 2
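As a possible alternative to changing IFS for the whole script, here is a minimal sketch that reads the matches line by line. The function sample_matches is a hypothetical stand-in for the grep command from the question, so the example runs without abc.xml; the rest is plain bash.

```shell
#!/usr/bin/env bash
# sample_matches is a hypothetical stand-in for:
#   grep -oP '(?<=<Entity ).*(?=>)' abc.xml
sample_matches() {
  printf '%s\n' 'ID="12345" Record="1"' 'Record="2"'
}

entities=()
# IFS= applies only to this read, so each line is kept whole and the
# shell-wide IFS is left untouched.
while IFS= read -r line; do
  entities+=("$line")
done < <(sample_matches)

for entity in "${entities[@]}"; do
  echo "$entity"
done
echo "Total Count: ${#entities[@]}"
```

Because the loop reads from a process substitution rather than a pipe, the array is filled in the current shell and is still available after the loop.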

Related

Retrieve values of multiple attributes of an XML element using shell script

I have an XML file in the format below.
<products>
<product name="A" version="1" location="tmp">
<product name="B" version="1.2" location="tmp">
<product name="C" version="2" location="tmp">
</products>
I need the values of the name and version attributes for each product element. Below is the desired output:
Product Version
A 1
B 1.2
The command below gives me only the product name.
echo 'cat //products/product/@name' | xmllint --shell envinfo.xml | awk -F\" 'NR % 2 == 0 { print $2 }'
Output
A
B
C
Is there any way to get multiple attribute values from each element?
Thanks in advance.

How can a BPEL variable be put into a shell variable?

A BPEL process creates an XML document whose structure is defined by an XSD file, and I want to parse that BPEL variable with xmllint or xmlstarlet from a Unix shell command line. Is that possible at all?
How can I put the BPEL variable into a shell variable so that I can parse it with xmllint, for instance?
INPUT:
<?xml version="1.0"?>
<ns:ItemList xmlns:ns="http:///blabla">
<GenericItem>
<ns2:LocalItem xmlns:ns2="http:///blabla">
<ItemSource> </ItemSource>
<ConcItemSource>
<name></name>
<requirements/>
<strategy/>
</ConcItemSource>
<dataFormat/>
<directory></directory>
<file/>
</ns2:LocalItem>
</GenericItem>
<GenericItem>
<ns2:LocalItem xmlns:ns2="http:///blabla">
<ItemSource>
</ItemSource>
<ConcItemSource>
<name></name>
<requirements/>
<strategy/>
</ConcItemSource>
<dataFormat/>
<directory></directory>
<file/>
</ns2:LocalItem>
</GenericItem>
</ns:ItemList>
Using xmlstarlet :
$ cat bpel.xml
<?xml version="1.0"?>
<ns:ItemList xmlns:ns="http:///blabla">
<GenericItem>
<ns2:LocalItem xmlns:ns2="http:///blabla">
<ItemSource> </ItemSource>
<ConcItemSource>
<name></name>
<requirements/>
<strategy/>
</ConcItemSource>
<dataFormat/>
<directory>d1</directory>
<file/>
</ns2:LocalItem>
</GenericItem>
<GenericItem>
<ns2:LocalItem xmlns:ns2="http:///blabla">
<ItemSource>
</ItemSource>
<ConcItemSource>
<name></name>
<requirements/>
<strategy/>
</ConcItemSource>
<dataFormat/>
<directory>d2</directory>
<file/>
</ns2:LocalItem>
</GenericItem>
</ns:ItemList>
command line :
$ dir1=$(xmlstarlet sel -t -v '(//directory)[1]/text()' bpel.xml)
$ echo "$dir1"
d1
Using a for loop :
$ count=$(xmlstarlet sel -t -v 'count(//directory)' bpel.xml)
$ for ((i=1; i<=count; i++)); do
xmlstarlet sel -t -v "(//directory)[$i]/text()" -n bpel.xml >> newfile
done
But you can simply do:
$ xmlstarlet sel -t -v "//directory/text()" bpel.xml >> newfile
xmlstarlet from STDIN :
command_producing_xml | xmlstarlet sel -t -v "//directory/text()" -
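To hold the document in a shell variable first, something like the following sketch might work. The function produce_xml and its simplified sample document are hypothetical placeholders for whatever command hands over the BPEL variable; the xmlstarlet call is guarded because the tool may not be installed.

```shell
#!/usr/bin/env bash
# produce_xml is a hypothetical stand-in for the command that emits the document.
produce_xml() {
  cat <<'EOF'
<?xml version="1.0"?>
<ItemList>
  <GenericItem><directory>d1</directory></GenericItem>
  <GenericItem><directory>d2</directory></GenericItem>
</ItemList>
EOF
}

# Capture the whole document in a shell variable.
xml_doc=$(produce_xml)

# Feed the variable back to the parser on STDIN with a here-string.
if command -v xmlstarlet >/dev/null 2>&1; then
  directories=$(xmlstarlet sel -t -v '//directory' - <<< "$xml_doc")
  echo "$directories"
else
  echo "xmlstarlet not found; captured ${#xml_doc} characters of XML" >&2
fi
```

Quoting "$xml_doc" in the here-string keeps the line breaks and whitespace of the document intact.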

How to parse xml using xmllint and store in arrays

In a shell script, I have an XML file, p.xml, shown below. I want to parse it and get the values into two arrays. I am trying to use xmllint but could not get the desired data.
<?xml version="1.0" encoding="UTF-8"?>
<Share_Collection>
<Share id="data/Backup" resource-id="data/Backup" resource-type="SimpleShare" share-name="Backup" protocols="cifs,afp"/>
<Share id="data/Documents" resource-id="data/Documents" resource-type="SimpleShare" share-name="Documents" protocols="cifs,afp"/>
<Share id="data/Music" resource-id="data/Music" resource-type="SimpleShare" share-name="Music" protocols="cifs,afp"/>
<Share id="data/OwnCloud" resource-id="data/OwnCloud" resource-type="SimpleShare" share-name="OwnCloud" protocols="cifs,afp"/>
<Share id="data/Pictures" resource-id="data/Pictures" resource-type="SimpleShare" share-name="Pictures" protocols="cifs,afp"/>
<Share id="data/Videos" resource-id="data/Videos" resource-type="SimpleShare" share-name="Videos" protocols="cifs,afp"/>
</Share_Collection>
I want to get one array of all share ids and one array of share-names. So the two arrays would look like this:
share-ids-array = ["data/Backup", "data/Documents", "data/Music", "data/OwnCloud", "data/Pictures", "data/Videos"]
share-names-array = ["Backup", "Documents", "Music", "OwnCloud", "Pictures", "Videos"]
I started as follows:
xmllint --xpath '//Share/@id' p.xml
xmllint --xpath '//Share/@share-name' p.xml
that gives me
id="data/Backup"
id="data/Documents" id="data/Music" id="data/OwnCloud" id="data/Pictures" id="data/Videos"
Any help to build those two arrays will be appreciated.
Here is one solution with grep (and tr); sed or awk are other alternatives. By the way, you cannot use hyphens in variable names in bash.
share_ids=($( xmllint --xpath '//Share/@id' p.xml | grep -Po '".*?"' | tr -d \" ))
share_names=($( xmllint --xpath '//Share/@share-name' p.xml | grep -Po '".*?"' | tr -d \" ))
Example:
$ echo ${share_names[@]}
Backup Documents Music OwnCloud Pictures Videos
Using xmlstarlet is probably better, though:
share_names=($( xmlstarlet sel -T -t -m '//Share/@share-name' -v '.' -n p.xml ))
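The array=( $(...) ) form splits on any whitespace, so a value containing a space would end up in two elements. A minimal sketch of a more defensive variant is shown below; the sample string and the value "My Documents" are invented for illustration and are not in the original p.xml.

```shell
#!/usr/bin/env bash
# Hypothetical sample of what xmllint prints for the attribute query;
# "My Documents" is invented here to show a value containing a space.
xpath_output='id="data/Backup" id="data/My Documents" id="data/Music"'

# grep -o prints each quoted value on its own line, tr strips the quotes,
# and mapfile -t stores one line per array element (bash 4+).
mapfile -t share_ids < <(grep -o '"[^"]*"' <<< "$xpath_output" | tr -d '"')

printf '%s\n' "${share_ids[@]}"
echo "Count: ${#share_ids[@]}"
```

In a real script the here-string would be replaced by the xmllint or xmlstarlet command itself.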

how to output multiple lines using xmllint and xpath

I'm writing a simple bash script to parse some xml. I was using sed and awk but I think xmllint is better suited.
Unfortunately I'm completely new to xpath so I'm really battling.
I'm trying to take the following xml:
<?xml version="1.0" encoding="UTF-8"?>
<releaseNote>
<name>APPLICATION_ercc2</name>
<change>
<date hour="11" day="10" second="21" year="2013" month="0" minute="47"/>
<submitter>Automatically Generated</submitter>
<description>ReleaseNote Created</description>
</change>
<change>
<version>2</version>
<date hour="11" day="10" second="25" year="2013" month="1" minute="47"/>
<submitter>fred.bloggs</submitter>
<description> first version</description>
<install/>
</change>
<change>
<version>3</version>
<date hour="12" day="10" second="34" year="2013" month="1" minute="2"/>
<submitter>fred.bloggs</submitter>
<description> tweaks</description>
<install/>
</change>
<change>
<version>4</version>
<date hour="15" day="10" second="52" year="2013" month="1" minute="38"/>
<submitter>fred.bloggs</submitter>
<description> fixed missing image, dummy user, etc</description>
<install/>
</change>
<change>
<version>5</version>
<date hour="17" day="10" second="31" year="2013" month="1" minute="40"/>
<submitter>fred.bloggs</submitter>
<description> fixed auth filter and added multi opco stuff</description>
<install/>
</change>
.....
and process it to pass in '3' as the variable to an xpath script, and output something like this:
4 fred.bloggs 10/1/2013 15:38 fixed missing image, dummy user, etc
5 fred.bloggs 10/1/2013 17:40 fixed auth filter and added multi opco stuff
In other words, a complex combination of the contents of each node, where the value of version is greater than, for example, 3.
One tool you might find useful for this sort of thing is xmlstarlet, although using an xpath tool might be less idiosyncratic.
With xmlstarlet, the following works (I added a close tag for releaseNote to your example):
$ summary() {
xmlstarlet sel -t -m "//change[version > $2]" \
-v submitter -o $'\t' \
-v date/@day -o '/' -v date/@month -o '/' -v date/@year -o ' ' \
-v date/@hour -o ':' -v date/@minute -o $'\t' \
-v description -n "$1"
}
$ summary test.xml 3
fred.bloggs 10/1/2013 15:38 fixed missing image, dummy user, etc
fred.bloggs 10/1/2013 17:40 fixed auth filter and added multi opco stuff
$
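Since the threshold is pasted straight into the XPath expression, it may be worth checking that it really is a number before using it. The helper below, build_change_filter, is a hypothetical name introduced only for this sketch; it is not part of xmlstarlet.

```shell
#!/usr/bin/env bash
# build_change_filter is a hypothetical helper, not part of xmlstarlet.
# It prints an XPath expression selecting <change> elements whose version
# exceeds the given threshold, and fails if the threshold is not a plain integer.
build_change_filter() {
  local min_version=$1
  if [[ ! $min_version =~ ^[0-9]+$ ]]; then
    echo "error: version must be a non-negative integer: $min_version" >&2
    return 1
  fi
  printf '//change[version > %s]\n' "$min_version"
}

build_change_filter 3
```

The printed expression could then be passed to the -m option, for example -m "$(build_change_filter "$2")" inside a function like summary.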

Iterate through XML with xmlstarlet

I have the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<test-report>
<testsuite>
<test name="RegisterConnection1Tests">
<testcase name="testRregisterConnection001"></testcase>
<testcase name="testRegisterConnection002"></testcase>
</test>
<test name="RegisterConnection2Tests">
<testcase name="testRregisterConnection001"></testcase>
<testcase name="testRegisterConnection002"></testcase>
</test>
</testsuite>
</test-report>
And I want the output:
RegisterConnection1Tests,testRregisterConnection001
RegisterConnection1Tests,testRregisterConnection002
RegisterConnection2Tests,testRregisterConnection001
RegisterConnection2Tests,testRregisterConnection002
I'm confused about how to show the children. I expected
xmlstarlet sel -t -m 'test-report/testsuite/test' -v '@name' -v '//testcase/@name' -n $1
to work, but it only outputs:
RegisterConnection1TeststestRregisterConnection001
RegisterConnection2TeststestRregisterConnection001
To add the missing comma, you can add another -v "','".
In your second column you are selecting with an absolute XPath expression (the leading double slash searches from the document root), not relative to the element matched by the template. Since you want one line per testcase, I would iterate over the testcase elements and add the name attribute of the parent element, like this:
xmlstarlet sel -t -m 'test-report/testsuite/test/testcase' -v '../@name' -v "','" -v '@name' -n "$1"

Resources