XPath - Find specific element, print all elements of that node

Given the following Xpath to an element
/std:Batch/BatchSection/ContractPartner/Contractor/Contract/contractNumber
How can I print out all subelements of the Contract node
where sequenceNumber = 12345?
I tried:
xmllint --xpath "string(/std:Batch/BatchSection/ContractPartner/Contractor/Contract/contractNumber[contractNumber='12345'])" test.xml
However, that is an invalid XPath expression. How to fix that?
Example input:
<std:Batch xmlns:std="http://www.test.com/contractBatch" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<year>2020</year>
<batchType>3</batchType>
<runDate>2020-04-11</runDate>
<text>Datatest</text>
<jobInfo>Test</jobInfo>
<BatchSection>
<addedAtDate>2020-04-11</addedAtDate>
<ContractPartner>
<contractDealerAG>44444</contractDealerAG>
<contractorType/>
<isoCountry>NL</isoCountry>
<language>EN</language>
<Contractor>
<contractor>44444</contractor>
<Contract>
<contractor>44444</contractor>
<sequenceNumber>12345</sequenceNumber>
<info1>abcd</info1>
</Contract>
</Contractor>
</ContractPartner>
</BatchSection>
</std:Batch>
Desired output (where sequenceNumber=12345):
<Contract>
<contractor>44444</contractor>
<sequenceNumber>12345</sequenceNumber>
<info1>abcd</info1>
</Contract>

You have to deal with the dreaded namespaces, unfortunately... Try it like this:
xmllint --xpath "//*[local-name()='Contract'] [.//*[local-name()='sequenceNumber'][./text()='12345']]" test.xml
and see if it works.

I'm assuming you mean sequenceNumber, as in the XML example. If so, something like this should return the Contract node (note the single quotes inside the double-quoted shell argument):
xmllint --xpath "//sequenceNumber[.='12345']/.." test.xml
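For readers who would rather do the same lookup from a script, here is a minimal sketch using Python's standard-library ElementTree instead of xmllint. The trimmed sample document, the second Contract entry, and the helper name find_contract are assumptions made for illustration. Because only the root element carries the std prefix, the unqualified child names can be used directly in the path.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the example input; the second Contract is a made-up
# entry added only to show that the filter really selects one node.
XML = """<std:Batch xmlns:std="http://www.test.com/contractBatch">
<BatchSection>
<ContractPartner>
<Contractor>
<Contract><contractor>44444</contractor><sequenceNumber>12345</sequenceNumber><info1>abcd</info1></Contract>
<Contract><contractor>44444</contractor><sequenceNumber>99999</sequenceNumber><info1>efgh</info1></Contract>
</Contractor>
</ContractPartner>
</BatchSection>
</std:Batch>"""


def find_contract(xml_text, sequence_number):
    """Return the first Contract whose sequenceNumber child matches, or None."""
    root = ET.fromstring(xml_text)
    # Only the root is in the std namespace; its descendants are unqualified,
    # so plain element names work in the ElementTree path.
    return root.find(f".//Contract[sequenceNumber='{sequence_number}']")


contract = find_contract(XML, "12345")
# tostring() also emits the whitespace that follows the element, so strip it.
contract_xml = ET.tostring(contract, encoding="unicode").strip()
print(contract_xml)
```

ElementTree only supports a small subset of XPath, so this is a substitute for the xmllint commands above, not an equivalent of them.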

Related

Get the position of an element with specific attribute value

I'm trying to get with xPath the position only of the first element which has the attribute value true.
<?xml version="1.0" encoding="UTF-8"?>
<elements>
<element attribute="false"/>
<element attribute="true"/>
<element attribute="true"/>
</elements>
What I have so far is:
head(/elements/element[@attribute='true']/position())
Result:
1
But it should be:
2
What am I doing wrong?
position() returns the position of the element in the node list created by the predicate, i.e. with the false element excluded. Instead of position(), you can e.g. count the number of preceding elements.
For example, this works even in XPath 1.0:
1+count(/elements/element[@attribute="true"][1]/preceding-sibling::element)
I think it's (with XPath 3):
head(index-of(/elements/element/@attribute, 'true'))
saxon-lint --xpath 'count(//element[@attribute="true"]/position())' file.xml
From Michael's answer:
saxon-lint --xpath 'head(index-of(/elements/element/@attribute, "true"))' file.xml
Output
2
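The same idea, finding the position among all sibling elements rather than among the filtered ones, can be sketched outside XPath as well. This is a small illustration with Python's standard-library ElementTree; the function name first_true_position is hypothetical.

```python
import xml.etree.ElementTree as ET

XML = """<elements>
<element attribute="false"/>
<element attribute="true"/>
<element attribute="true"/>
</elements>"""


def first_true_position(xml_text):
    """Return the 1-based position of the first element with attribute="true"."""
    root = ET.fromstring(xml_text)
    # Number every <element> child, not just the matching ones, so the
    # result is the position in the unfiltered list.
    for position, element in enumerate(root.findall("element"), start=1):
        if element.get("attribute") == "true":
            return position
    return None  # no element has attribute="true"


first_position = first_true_position(XML)
print(first_position)
```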

Is there a way to use two @* in an xpath selector to select an element?

I have an HTML element I would like to select that looks like this:
<button data-button-id="close" class="modal__cross modal__cross-web"></button>
Now clearly I can use this XPath selector:
//button[(contains(@data-button-id,'close')) and (contains(@class,'modal'))]
to select the element. However, I would really like to select buttons that have both close and modal contained in any attributes. So I can generalize the selector and say:
//button[(contains(@*,'close')) and (contains(@class,'modal'))]
and that works. What I'd love to do is extend it to this:
//button[(contains(@*,'close')) and (contains(@*,'modal'))]
but that doesn't return any results. So clearly that doesn't mean what I'd like it to mean. Is there a way to do it correctly?
Thanks,
Craig
It looks like you're using XPath 1.0: in 1.0, if you supply a node-set as the first argument to contains(), it takes the first node in the node-set. The order of attributes is completely unpredictable, so there's no way of knowing whether contains(@*, 'close') will succeed or not. In 2.0+, this gives you an error.
In both 1.0 and 2.0, @*[contains(., 'close')] selects every attribute whose value contains "close" as a substring; used as a predicate, it is true if at least one such attribute exists.
This expression works:
//button[attribute::*[contains(.,"close")] and attribute::*[contains(.,"modal")]]
Given this html
<button data-button-id="close" class="modal__cross modal__cross-web"></button>
<button key="close" last="xyz_modal"></button>
Testing with xmllint
echo -e 'cat //button[attribute::*[contains(.,"close")] and attribute::*[contains(.,"modal")]]\nbye' | xmllint --html --shell test.html
/ > cat //button[attribute::*[contains(.,"close")] and attribute::*[contains(.,"modal")]]
-------
<button data-button-id="close" class="modal__cross modal__cross-web"></button>
-------
<button key="close" last="xyz_modal"></button>
/ > bye
Try this one to select required element:
//button[@*[contains(., 'close')] and @*[contains(., 'modal')]]
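To make the "any attribute contains" logic concrete, here is a rough Python sketch of the same test using only the standard library. The wrapping div, the third button, and the helper name any_attribute_contains are assumptions added for illustration; real HTML would usually need an HTML parser rather than ElementTree.

```python
import xml.etree.ElementTree as ET

# The first two buttons come from the answers above; the third is a made-up
# non-matching case ("close" is present, "modal" is not).
XML = """<div>
<button data-button-id="close" class="modal__cross modal__cross-web"></button>
<button key="close" last="xyz_modal"></button>
<button data-button-id="close" class="plain"></button>
</div>"""


def any_attribute_contains(element, needle):
    """True if at least one attribute value contains needle as a substring."""
    return any(needle in value for value in element.attrib.values())


root = ET.fromstring(XML)
# Each condition is checked against every attribute independently, so the
# two substrings may live in different attributes.
matches = [
    button
    for button in root.iter("button")
    if any_attribute_contains(button, "close")
    and any_attribute_contains(button, "modal")
]
print(len(matches))
```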

xpath multiple nodes query with custom strings

I have a working multiple node xpath query and I want to add some custom strings between the results.
<FooBar>
<Foo>
<Fooid>A</Fooid>
<Booid>222</Booid>
<Wooid>Z</Wooid>
</Foo>
<Foo>
<Fooid>B</Fooid>
<Booid>333</Booid>
<Wooid>Y</Wooid>
</Foo>
<Foo>
<Fooid>C</Fooid>
<Booid>444</Booid>
<Wooid>X</Wooid>
</Foo>
</FooBar>
I have tried different combinations of string-join and/or concat, but the result was always wrong or ended in a syntax error. I'm using XPath 2.0.
//Foo/Fooid | //Foo/Booid | Foo/Wooid
The above xpath results in:
A
222
Z
My preferred result would be:
(A)
{222}
[Z]
What is the correct usage of string-join in order to get the brackets around the three ids?
After doing some research and with your comments, I was able to achieve the desired solution with this line:
//Foo/concat('(', Fooid, ')'), //Foo/concat('{', Booid, '}'),Foo/concat('[', Wooid, ']')
The '|' was replaced by a comma.
To concatenate these characters, use their HTML entities instead.
concat('&lpar;', //Fooid, '&rpar;')
for parentheses use
&lpar;
&rpar;
for brackets
&lbrack;
&rbrack;
for braces
&lbrace;
&rbrace;
See full character entity sets here
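If the wrapping can be done after the query rather than inside it, plain string formatting is enough. The following is a minimal sketch with Python's standard-library ElementTree; the helper name wrap_ids is hypothetical, and it emits the three wrapped ids for every Foo, which goes slightly beyond the three lines shown in the question.

```python
import xml.etree.ElementTree as ET

XML = """<FooBar>
<Foo><Fooid>A</Fooid><Booid>222</Booid><Wooid>Z</Wooid></Foo>
<Foo><Fooid>B</Fooid><Booid>333</Booid><Wooid>Y</Wooid></Foo>
<Foo><Fooid>C</Fooid><Booid>444</Booid><Wooid>X</Wooid></Foo>
</FooBar>"""


def wrap_ids(foo):
    """Return the three ids of one Foo wrapped in (), {} and []."""
    return [
        f"({foo.findtext('Fooid')})",
        # Literal braces must be doubled inside an f-string.
        f"{{{foo.findtext('Booid')}}}",
        f"[{foo.findtext('Wooid')}]",
    ]


root = ET.fromstring(XML)
lines = [line for foo in root.findall("Foo") for line in wrap_ids(foo)]
print("\n".join(lines))
```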

XPath 1.0 lowest value regardless of ordering

I have this data, and I'm looking for the lowest bid.
<root>
<current_bid>$1.00</current_bid>
<current_bid>$2.00</current_bid>
<current_bid>$3.00</current_bid>
<current_bid>$4.00</current_bid>
<current_bid>$5.00</current_bid>
</root>
This is my XPath 1.0 attempt:
//current_bid[not(translate (., '$,.','') > translate(//current_bid, '$,.',''))]
And it works fine (returns only the $1.00 bid) with the data above, but if I change the ordering of the data to let's say this here:
<root>
<current_bid>$5.00</current_bid>
<current_bid>$1.00</current_bid>
<current_bid>$2.00</current_bid>
<current_bid>$3.00</current_bid>
<current_bid>$4.00</current_bid>
</root>
Then it gives a wrong output (returns all values).
Shouldn't the order be irrelevant when I use //current_bid, since it queries the whole document?
Also: how would I go about getting the second lowest bid?
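XPath 1.0 processes nodes in document order, so there's no way to sort them with pure XPath. It can be done with XSL processing.
When a node-set is passed to a string function such as translate(), XPath 1.0 uses only the first node in document order, so translate(//current_bid, ...) always yields the first bid. The approach below therefore works only if the minimum is at the first position.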
Xpath:
'//current_bid[(position()<=last()) and not(translate (., "$,.","") > translate(//current_bid, "$,.",""))]'
Sample:
<root>
<current_bid>$1.00</current_bid>
<current_bid>$5.00</current_bid>
<current_bid>$2.00</current_bid>
<current_bid>$4.00</current_bid>
<current_bid>$3.00</current_bid>
</root>
Testing on command line with xmllint
xmllint --xpath '//current_bid[(position()<=last()) and not(translate (., "$,.","") > translate(//current_bid, "$,.",""))]' test.xml ; echo
Result:
<current_bid>$1.00</current_bid>
If the number of nodes is known in advance perhaps it could be done with nested conditions but would give a very complex XPath expression.
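When post-processing outside XPath is an option, sorting the bids makes both the lowest and the second lowest easy to pick, regardless of document order. This is a sketch using Python's standard library, not an XPath 1.0 solution; the helper name bid_amount is hypothetical, and "second lowest" here means the second element after sorting, so tied bids are not collapsed.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

XML = """<root>
<current_bid>$5.00</current_bid>
<current_bid>$1.00</current_bid>
<current_bid>$2.00</current_bid>
<current_bid>$3.00</current_bid>
<current_bid>$4.00</current_bid>
</root>"""


def bid_amount(element):
    """Parse a bid such as "$1,234.50" into a Decimal."""
    # Drop the currency symbol and thousands separators, keep the decimal point.
    return Decimal(element.text.strip().replace("$", "").replace(",", ""))


root = ET.fromstring(XML)
# Sort by numeric value so the result does not depend on document order.
bids_ascending = sorted(root.iter("current_bid"), key=bid_amount)
lowest = bids_ascending[0].text
second_lowest = bids_ascending[1].text
print(lowest, second_lowest)
```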

Compare multiple strings in XPath

Is there any better way to write things like [node/text()="a" or node/text()="b"] like this [contains(arraytype("a", "b"), node/text())]? Does the array type exist in XPath and can I use it inside the contains function to write more readable code?
Thanks in advance. :)
Code:
Requires XPath 2.0 or later:
saxon-lint --xpath '/students/student/name[text()=("A", "B")]' file.xml
Output :
<name>A</name>
Files :
<?xml version="1.0"?>
<students>
<student>
<stuId>1</stuId>
<name>A</name>
<mark>75</mark>
<result></result>
</student>
</students>
Check
saxon-lint (my own project)
Thanks @Andersson for the tip
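For comparison, the "value is one of several strings" test reads naturally as a set membership check when done in a script. This is a small sketch with Python's standard-library ElementTree; the second and third student entries are made up for illustration and are not part of the file shown above.

```python
import xml.etree.ElementTree as ET

# Student 1 is from the example file; students 2 and 3 are assumed extras
# so that the filter has something to exclude.
XML = """<students>
<student><stuId>1</stuId><name>A</name><mark>75</mark></student>
<student><stuId>2</stuId><name>C</name><mark>60</mark></student>
<student><stuId>3</stuId><name>B</name><mark>82</mark></student>
</students>"""

WANTED = {"A", "B"}

root = ET.fromstring(XML)
# Keep a name only if its text equals one of the wanted strings, which is
# what the general comparison text()=("A", "B") does in XPath 2.0.
matching_names = [
    name.text for name in root.findall("./student/name") if name.text in WANTED
]
print(matching_names)
```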
