XPath OR operator for different nodes - xpath

How can I do this with XPath:
//bookstore/book/title or //bookstore/city/zipcode/title
Just //title won't work because I also have //bookstore/magazine/title
p.s. I saw a lot of or examples but mainly with attributes or single node structure.

All title nodes with zipcode or book node as parent:
Version 1:
//title[parent::zipcode|parent::book]
Version 2:
//bookstore/book/title|//bookstore/city/zipcode/title
Version 3: (results are sorted based on source data rather than the order of book then zipcode)
//title[../../../*[book] or ../../../../*[city/zipcode]]
or - a Boolean operator in XPath, used where a true/false result is expected (for example, inside a predicate).
| - the union operator in XPath, which combines the node-set selected by the expression on its left with the node-set selected by the expression on its right.
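As a rough illustration of what the union in Version 2 selects, here is a minimal Python sketch. Python's standard-library xml.etree.ElementTree supports only a subset of XPath (no | and no or), so the union is emulated by running both paths and merging the results in document order. The sample document and the helper name union_titles are assumptions made up for this example; only the element names come from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the element names come from the question.
XML = """
<bookstore>
  <magazine><title>Weekly News</title></magazine>
  <city><zipcode><title>Downtown Branch</title></zipcode></city>
  <book><title>Learning XPath</title></book>
</bookstore>
"""

def union_titles(root):
    """Emulate //bookstore/book/title | //bookstore/city/zipcode/title."""
    # Run each path separately and collect the matched elements.
    wanted = set(root.findall("./book/title"))
    wanted.update(root.findall("./city/zipcode/title"))
    # Walk the tree once so the merged result comes back in document order.
    return [el for el in root.iter("title") if el in wanted]

root = ET.fromstring(XML)  # root is the <bookstore> element
titles = [el.text for el in union_titles(root)]
print(titles)
```

The magazine title is left out, and the remaining titles follow the order of the source document rather than the order in which the two paths were written.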

If you want to select only one of the two nodes with the union operator, you can use this solution:
(//bookstore/book/title | //bookstore/city/zipcode/title)[1]

If the element can be matched by two different XPaths, you can combine them like below:
xpath1 | xpath2
Eg:
//input[@name="username"] | //input[@id="wm_login-username"]

Related

Can I condense an XPath expression that checks for conditional values of an attribute?

I have the following XML payload:
<fizz>
<buzz class="foo">
<whatever/>
</buzz>
</fizz>
The value of the /fizz/buzz[@class]/@class attribute can be foo, bar or whistlefeather. I'm trying to write an efficient XPath expression that covers all three scenarios. The best I have is:
/fizz/buzz[@class]/@class = 'foo' |
/fizz/buzz[@class]/@class = 'bar' |
/fizz/buzz[@class]/@class = 'whistlefeather'
Is there some "shorthand" way to make this more condensed/efficient (less verbose)?
Use this (any XPath version):
/fizz/buzz[@class='foo' or @class='bar' or @class='whistlefeather']
Using XPath >= 2:
/fizz/buzz[@class=("foo", "bar", "whistlefeather")]
Correcting the answer from @GillesQuenot:
Using any XPath version:
/fizz/buzz[@class='foo' or @class='bar' or @class='whistlefeather']
Using XPath 2.0 or later:
/fizz/buzz[@class=("foo", "bar", "whistlefeather")]
(Note, this returns the selected buzz elements. It's unclear what you actually want the expression to return.)
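For comparison, a small Python sketch of the same "attribute is one of several values" test. It uses only the standard-library xml.etree.ElementTree, which does not support or inside predicates, so the membership check is done in Python. The extra buzz elements in the sample and the name ALLOWED are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# The three values named in the question.
ALLOWED = {"foo", "bar", "whistlefeather"}

# The question's payload, extended with hypothetical extra <buzz> elements.
XML = """
<fizz>
  <buzz class="foo"><whatever/></buzz>
  <buzz class="other"/>
  <buzz/>
  <buzz class="bar"/>
</fizz>
"""

root = ET.fromstring(XML)  # root is the <fizz> element

# ./buzz[@class] keeps only buzz children that carry a class attribute;
# the set lookup plays the role of the chained "or" comparisons.
matches = [b for b in root.findall("./buzz[@class]") if b.get("class") in ALLOWED]
classes = [b.get("class") for b in matches]
print(classes)
```

Like the XPath expressions above, this returns the matching buzz elements themselves, not a Boolean.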

How to select a specific category value in this Xpath expression

I have a feed here. I'm trying to create an XPath expression that returns items that have a category equal to Bananas. Due to the limitations in my XML parser, I can't use namespaces directly to select items.
The expression /rss/channel/item//*[name()='itunes:category'] returns this:
Element='<itunes:category
xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
text="Apples"/>'
Element='<itunes:category
xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
text="Bananas"/>'
...
And /rss/channel/item//*[name()='itunes:category']/@text returns this:
Attribute='text=Apples'
Attribute='text=Bananas'
...
But I can't figure out how to limit the response to just a single category (e.g., Bananas)?
I want some kind of expression like this:
/rss/channel/item//*[name()='itunes:category' and contains(., 'Bananas')]
But this doesn't work. It's not syntactically valid. What would be the right XPath expression syntax to just return Bananas?
Do you mean you want to filter by an attribute of a child of item, but still return the item node?
/rss/channel/item/*[name()='itunes:category' and contains(@text,'Apples')]/parent::item
or, simpler:
/rss/channel/item[*[name()='itunes:category' and @text='Apples']]
I used Apples in the example because, with your example XML file, there are 0 results for Bananas.
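The same filter can be sketched in Python with the standard-library xml.etree.ElementTree, assuming a parser that does resolve namespaces. The feed below is a hypothetical stand-in for the real one (the item titles are invented); the namespace URI is the one shown in the question's output.

```python
import xml.etree.ElementTree as ET

ITUNES = "http://www.itunes.com/dtds/podcast-1.0.dtd"

# Hypothetical feed; the real feed from the question is not reproduced here.
XML = """
<rss xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
  <channel>
    <item><title>Episode 1</title><itunes:category text="Apples"/></item>
    <item><title>Episode 2</title><itunes:category text="Bananas"/></item>
    <item><title>Episode 3</title><itunes:category text="Apples"/></item>
  </channel>
</rss>
"""

def items_in_category(root, category):
    """Return <item> elements that have an itunes:category child with the given text attribute."""
    tag = f"{{{ITUNES}}}category"  # ElementTree spells qualified names as {uri}local
    return [
        item
        for item in root.findall("./channel/item")
        if any(cat.get("text") == category for cat in item.findall(tag))
    ]

root = ET.fromstring(XML)  # root is the <rss> element
apple_titles = [item.findtext("title") for item in items_in_category(root, "Apples")]
print(apple_titles)
```

As in the second XPath expression above, the filter looks at the category child but the result is the item element.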

xpath expression to read value based on value of sibling

I have the XML below and would like to read the value of the 'Value' tag whose sibling 'Name' is 'test2'. I'm using the XPath below, but it did not work. Can someone help?
/*[ local-name()='OutputData']/*[ local-name()='OutputDataItem']/*[ local-name()='Name'][normalize-space(.) = 'test2']//*[local-name()='Value']/text()
<get:OutputData>
<get:OutputDataItem>
<get:Name>test1</get:Name>
<get:Value/>
</get:OutputDataItem>
<get:OutputDataItem>
<get:Name>test2</get:Name>
<get:Value>B5B4</get:Value>
</get:OutputDataItem>
<get:OutputDataItem>
<get:Name>test3</get:Name>
<get:Value/>
</get:OutputDataItem>
<get:OutputDataItem>
<get:Name>OP_VCscEncrptCd_VAR</get:Name>
<get:Value/>
</get:OutputDataItem>
</get:OutputData>
Thanks
You were close, but because get:Name and get:Value are siblings, you need to adjust your XPath a little.
Your XPath was attempting to address get:Value elements that were descendants of get:Name, rather than siblings of it. Move the criterion that filters on get:Name into a predicate on get:OutputDataItem, then step down into get:Value:
/*[ local-name()='OutputData']/*[ local-name()='OutputDataItem']
[*[ local-name()='Name'][normalize-space(.) = 'test2']]/*[local-name()='Value']/text()
You could also combine the criteria of the predicate filter on get:Name and use an and:
/*[ local-name()='OutputData']/*[ local-name()='OutputDataItem']
[*[ local-name()='Name' and normalize-space(.) = 'test2']]/*[local-name()='Value']/text()
This should work I think:
//*[local-name()="Name" and text()="test2"]/following-sibling::*[local-name()="Value"]/text()
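To make the "filter on one sibling, read the other" idea concrete, here is a minimal Python sketch using the standard-library xml.etree.ElementTree. The question does not show the namespace bound to the get prefix, so the URI urn:example:get is a placeholder assumption, and strip() is used as a rough stand-in for normalize-space().

```python
import xml.etree.ElementTree as ET

# Placeholder URI: the question does not show what the "get" prefix is bound to.
NS = {"get": "urn:example:get"}

XML = """
<get:OutputData xmlns:get="urn:example:get">
  <get:OutputDataItem><get:Name>test1</get:Name><get:Value/></get:OutputDataItem>
  <get:OutputDataItem><get:Name>test2</get:Name><get:Value>B5B4</get:Value></get:OutputDataItem>
  <get:OutputDataItem><get:Name>test3</get:Name><get:Value/></get:OutputDataItem>
</get:OutputData>
"""

def value_for(root, name):
    """Return the Value text of the first OutputDataItem whose Name matches, else None."""
    for item in root.findall("get:OutputDataItem", NS):
        # Test the Name child, then read its sibling Value from the shared parent.
        if item.findtext("get:Name", default="", namespaces=NS).strip() == name:
            return item.findtext("get:Value", namespaces=NS)
    return None

root = ET.fromstring(XML)  # root is the <get:OutputData> element
value = value_for(root, "test2")
print(value)
```

The loop variable plays the role of the predicate on get:OutputDataItem: the condition is checked on the parent, and only then is the sibling read.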

Select attribute and text() in the same query

I would like to select an attribute and the text() value of a node in one query, e.g. I have
<Tag1 myattr='test'>MyText</Tag1>
and I am interested in getting "test" and "MyText" with one query.
The obvious
//Tag1/@myattr | //Tag1/text()
fails due to the fact that unions are only allowed over node-sets.
Any ideas?
I think, given XPath 2.0, you want a sequence of string values which you get with //Tag1/(@myattr, .)/string(). If you want a single string then use //Tag1/string-join((@myattr, .), ' ').
BTW, your path //Tag1/@myattr | //Tag1/text() would select a sequence containing an attribute value and a text node. I don't see how that would fail.
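If the query is run from a host language rather than pure XPath, both values can be read from the same element in one pass. A minimal sketch with Python's standard-library xml.etree.ElementTree, assuming the attribute is written as plain myattr and that a hypothetical root wrapper element surrounds Tag1:

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper document around the element from the question.
XML = "<root><Tag1 myattr='test'>MyText</Tag1></root>"

root = ET.fromstring(XML)

# One pass over every Tag1: pair the attribute value with the element's text,
# similar in spirit to //Tag1/(@myattr, .)/string() in XPath 2.0.
pairs = [(el.get("myattr"), el.text) for el in root.iter("Tag1")]
joined = [" ".join(pair) for pair in pairs]
print(pairs, joined)
```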

How to get H1,H2,H3,... using a single xpath expression

How can I get H1,H2,H3 contents in one single xpath expression?
I know I could do this.
//html/body/h1/text()
//html/body/h2/text()
//html/body/h3/text()
and so on.
Use:
/html/body/*[self::h1 or self::h2 or self::h3]/text()
The following expression is incorrect:
//html/body/*[local-name() = "h1"
or local-name() = "h2"
or local-name() = "h3"]/text()
because it may select text nodes that are children of unwanted:h1, different:h2, someWeirdNamespace:h3.
Another recommendation: Always avoid using // when the structure of the XML document is statically known. Using // most often results in significant inefficiencies because it causes the complete document (sub)tree rooted in the context node to be traversed.
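A minimal Python sketch of the same selection, using the standard-library xml.etree.ElementTree. It does not support the self:: axis or or, so the name test is done in Python over the direct children of body. The sample page is a hypothetical, well-formed document without namespaces, and el.text only returns the text before an element's first child, so it is a simplification of text().

```python
import xml.etree.ElementTree as ET

HEADINGS = {"h1", "h2", "h3"}

# Hypothetical well-formed page with no namespace declarations.
XML = """
<html>
  <body>
    <h1>Title</h1>
    <p>Intro</p>
    <h2>Section</h2>
    <h3>Subsection</h3>
    <h4>Detail</h4>
  </body>
</html>
"""

root = ET.fromstring(XML)  # root is the <html> element
body = root.find("body")   # explicit step, no // scan of the whole tree

# Iterating an element yields its direct children in document order,
# mirroring /html/body/*[self::h1 or self::h2 or self::h3].
headings = [el.text for el in body if el.tag in HEADINGS]
print(headings)
```

Because ElementTree stores namespaced tags as {uri}local, comparing the full tag against plain h1, h2, h3 also skips same-named elements from other namespaces, which is the concern raised about the local-name() version.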
