immediately preceding-sibling must contain attribute - xpath

Here is my XML file:
<w type="fruit-hard">apple</w>
<w type="fruit-soft">orange</w>
<w type="vegetable">carrot</w>
I need to find carrot's immediately preceding sibling whose type is fruit-soft. In Chrome (locally loaded XML file), when I try
$x("//w[@type='vegetable']/preceding-sibling::w[1]")
I get "orange" element node like I want, but how do I require that its type be "fruit-soft"? My attempt (below) returns "false."
$x("//w[@type='vegetable']/preceding-sibling::w[1] and preceding-sibling::w[@type='fruit-soft']")

Your original XPath ...
//w[@type='vegetable']/preceding-sibling::w[1]
... is equivalent to
//w[@type='vegetable']/preceding-sibling::w[position()=1]
You can add further criteria to the predicate as needed:
//w[@type='vegetable']/preceding-sibling::w[position()=1 and @type='fruit-soft']
Or you can add a separate predicate:
//w[@type='vegetable']/preceding-sibling::w[1][@type='fruit-soft']
Note that this attempt:
//w[@type='vegetable']/preceding-sibling::w[1] and preceding-sibling::w[@type='fruit-soft']
returns false because the parts on either side of the and are evaluated separately, converted to type boolean, and combined to yield the final result. Supposing that the context node against which that is evaluated is the document root, there will never be a node matching preceding-sibling::w[@type='fruit-soft'], because the root has no siblings. Moreover, even if there were such a node, that expression does not require the nodes matching the first part to be the same ones that match the second part.
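The two-predicate form can be emulated in a few lines of Python to make the order of the checks visible. This is only a sketch under assumptions: it uses the standard library's xml.etree.ElementTree (which has no preceding-sibling axis, hence the manual loop), a hypothetical <root> wrapper around the sample, and a hypothetical helper name. It looks only at the <w> children of one parent rather than at every <w> in the document.

```python
import xml.etree.ElementTree as ET

# The sample has no root element, so a hypothetical <root> wrapper is assumed.
XML = """<root>
<w type="fruit-hard">apple</w>
<w type="fruit-soft">orange</w>
<w type="vegetable">carrot</w>
</root>"""


def immediately_preceding(parent, target_type, required_type):
    """Emulate w[@type=target_type]/preceding-sibling::w[1][@type=required_type]."""
    siblings = parent.findall("w")  # <w> children in document order
    matches = []
    for i, node in enumerate(siblings):
        if node.get("type") != target_type or i == 0:
            continue
        previous = siblings[i - 1]  # preceding-sibling::w[1]
        if previous.get("type") == required_type:  # second predicate, applied afterwards
            matches.append(previous)
    return matches


root = ET.fromstring(XML)
print([w.text for w in immediately_preceding(root, "vegetable", "fruit-soft")])  # ['orange']
print([w.text for w in immediately_preceding(root, "vegetable", "fruit-hard")])  # []
```

The second call returns nothing because the nearest preceding sibling is chosen first and only then tested, which is the behaviour asked for in the question.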

Related

What does Camel Splitter actually do with XML Document when splitting with xpath?

I have a document with an order and a number of lines. I need to break the order into lines, so I have a Camel splitter set to xpath with the order line as its value. This works fine.
However, what I get going forward is an element for the order line, which is what I want, but when converting it I need information from the order element - but if I try to get the parent element via xpath following the split, this doesn't work.
Does Camel create copies of the nodes returned by the xpath expression, or return a list of nodes within the parent document? If the former, can I make it the latter? If the latter, any ideas why a "../*" expression would return nothing?
Thanks!
Screwtape.
Look at the split options that are available when using a Tokenizer:
http://camel.apache.org/splitter.html
You have four different modes (i, w, u, t), and the 'w' mode keeps the ancestor context. In that case, the parent node (the thing you apparently need) will be repeated in each sub-message.
Default:
<m:order><id>123</id><date>2014-02-25</date></m:order>
'w' mode:
<m:orders>
<m:order><id>123</id><date>2014-02-25</date>...</m:order>
</m:orders>

XPath query for matching any attribute with name starting with X

From this Stack Overflow answer, I can see how to match on any attribute values that start with a particular string. For example, finding a list of elements that have any attribute with a value starting with the letter h:
//*[@*[starts-with(., 'h')]]
My question is how to match attribute names that start with a particular string?
As I understand it, the @* indicates matching any attribute within the wider //* collection of any elements. The starts-with function takes two parameters, the haystack and the needle, and the . haystack indicates the attribute value. I feel like I'm 90% of my way there, but I can't figure out what I need to match on the attribute name, rather than value.
Example matches:
<example>
<first match-me="123">MATCH!</first>
<second do-not-match-me="456">NO MATCH!</second>
<third match-me-yes-please="789">MATCH!</third>
<fourth>
<fifth match-me:oh-yes>MATCH!</fifth>
</fourth>
</example>
"I feel like I'm 90% of my way there" - yes, the last phase is specifying attribute name method:
//*[@*[starts-with(name(), 'match-me')]]
Note that simple attribute name matching can run into trouble with attribute names like match-me:oh-yes (colon inside). In such cases, you'll probably get an error such as: The prefix "match-me" for attribute "match-me:oh-yes" associated with an element type "fifth" is not bound. For a "valueless" attribute, the error is instead: Attribute name "match-me:oh-yes" associated with an element type "fifth" must be followed by the ' = ' character.
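The same name test can be tried out in Python. This is a minimal sketch, assuming the standard library's xml.etree.ElementTree, which does not support name() in its XPath subset, so the check is written as a plain loop. The helper name is hypothetical, and the <fifth> element is left out because, as noted, it is not well-formed XML.

```python
import xml.etree.ElementTree as ET

XML = """<example>
  <first match-me="123">MATCH!</first>
  <second do-not-match-me="456">NO MATCH!</second>
  <third match-me-yes-please="789">MATCH!</third>
</example>"""


def elements_with_attribute_prefix(root, prefix):
    """Emulate //*[@*[starts-with(name(), prefix)]] for attributes without a namespace."""
    return [
        element
        for element in root.iter()  # root and all descendants, in document order
        if any(name.startswith(prefix) for name in element.attrib)
    ]


root = ET.fromstring(XML)
matched = elements_with_attribute_prefix(root, "match-me")
print([element.tag for element in matched])  # ['first', 'third']
```

ElementTree stores namespaced attribute names as {uri}local, so this sketch would need adjusting for prefixed attributes.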

BaseX XQuery error: root(): no context value bound

I am trying to run the following XQuery expression in BaseX to extract elements between two succeeding headings. (as an article section).
xquery for $x in doc("test.xq")//h2,
$y in $x/following-sibling::h2[1]
return //*[$x/following::* and $y/preceding::*]
But it gives the error
Error:
Stopped at D:/Program Files/BaseX/data/test.xq, 1/74:
[XPDY0002] root(): no context value bound.
By the expression I mean: if $x is a heading and $y is the first heading following $x, then select the elements common to $x/following::* and $y/preceding::*
I am not sure my expression works, but my question here is how I can execute my intended query without the error.
If you also have an expression that does what I need, it is welcome.
[...] to extract elements between two succeeding headings [...]
You need something more like:
for $x in doc("test.xq")//h2
return $x/following-sibling::*[preceding-sibling::h2[1] is $x]
but on its own it won't give you anything useful because the XPath and XQuery data model only has flat sequences, not "multi-dimensional arrays". When you have a for that returns a sequence of values for each "iteration", the overall result of the for expression is the concatenation of all the result sequences, so as written above this expression will simply return all the elements in every "section" in a single flat list. If you want to group the elements by section then you'd need to construct a new XML element for each group:
for $x in doc("test.xq")//h2
return
<section>{$x/following-sibling::*[preceding-sibling::h2[1] is $x]}</section>
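The grouping idea can also be sketched outside XQuery. The following is a minimal Python sketch using only the standard library's xml.etree.ElementTree; the sample document and the helper name sections are assumptions made for illustration. It walks the children of one parent and starts a new group at each h2. One difference is worth noting: following-sibling::*[preceding-sibling::h2[1] is $x], as written, would also pick up the next h2 (its nearest preceding h2 is $x), whereas the sketch leaves headings out of the groups.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the original question does not show its input.
XML = """<body>
<p>preface</p>
<h2>Intro</h2><p>a</p><p>b</p>
<h2>Usage</h2><p>c</p>
</body>"""


def sections(parent, heading="h2"):
    """Group sibling elements under the nearest preceding heading element."""
    groups = []
    current = None  # stays None until the first heading is seen
    for child in parent:
        if child.tag == heading:
            current = []
            groups.append((child.text, current))
        elif current is not None:
            current.append(child)
    return groups


root = ET.fromstring(XML)
print([(title, [el.text for el in elements]) for title, elements in sections(root)])
# [('Intro', ['a', 'b']), ('Usage', ['c'])]
```

Each tuple plays the role of one constructed section element: the result stays grouped instead of collapsing into a single flat list.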
The error (as documented here) comes from this expression:
//*[$x/following::* and $y/preceding::*]
which begins with //. The abbreviation // stands for /descendant-or-self::node()/, which of course begins with /. The XPath standard says:
A / by itself selects the root node of the document containing the
context node. If it is followed by a relative location path, then the
location path selects the set of nodes that would be selected by the
relative location path relative to the root node of the document
containing the context node.
But from what you've shown us, there is nothing indicating that you've established a context node. So XPath doesn't have any way to know what document contains the context node. That's what the error message is referring to when it says
root(): no context value bound
To fix the error, you could precede the // with an explicit doc(...) or any other explicit way to set the context:
doc("test.xq")//*[$x/following::* and $y/preceding::*]
or
root($x)//*[$x/following::* and $y/preceding::*]
This should get rid of the error, but as Ian Roberts has written, it won't give you the result you want. See his answer for that.

How to select a node based on its child's text value?

I want to select a node based on the text value of a child.
My structure is as follows (sorry for the German element names):
<InspizierteAbwassertechnischeAnlage>
<Objektbezeichnung>10502002</Objektbezeichnung>
<Anlagentyp>1</Anlagentyp>
</InspizierteAbwassertechnischeAnlage>
How can I select the <InspizierteAbwassertechnischeAnlage> node where e.g. <Objektbezeichnung> = 10502002?
Why your solution didn't work
ancestor:://*[text()='10502002'] is syntactically incorrect, it's not valid XPath. I'm not sure what you tried to do with the axes here.
//*[text()='10502002'] itself would just select the Objektbezeichnung itself and not its parent. It would also select any other element with such a value, regardless of its name. In case of this document, nothing redundant would be returned but you have to be careful when using wildcards (*)
The solution
It's quite simple: you have to use a predicate to inspect the content of the child element:
//InspizierteAbwassertechnischeAnlage[Objektbezeichnung = '10502002']
Note the double slash (//): it is the abbreviated syntax for /descendant-or-self::node()/. The above expression expands to:
/descendant-or-self::node()/child::InspizierteAbwassertechnischeAnlage[Objektbezeichnung = '10502002']
Or in plain English
In the set of all descendants of the document's root, find InspizierteAbwassertechnischeAnlage elements that contain at least one Objektbezeichnung element with a value of 10502002
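To try the predicate outside an XPath console, here is a minimal Python sketch using the standard library's xml.etree.ElementTree, whose limited XPath subset includes the [child='text'] predicate form. The <Anlagen> wrapper and the second record are assumptions added so that the filter has something to exclude.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element and second record, added for illustration.
XML = """<Anlagen>
  <InspizierteAbwassertechnischeAnlage>
    <Objektbezeichnung>10502002</Objektbezeichnung>
    <Anlagentyp>1</Anlagentyp>
  </InspizierteAbwassertechnischeAnlage>
  <InspizierteAbwassertechnischeAnlage>
    <Objektbezeichnung>10502003</Objektbezeichnung>
    <Anlagentyp>2</Anlagentyp>
  </InspizierteAbwassertechnischeAnlage>
</Anlagen>"""

root = ET.fromstring(XML)

# The predicate tests the child's text, but the parent element is what gets selected.
path = ".//InspizierteAbwassertechnischeAnlage[Objektbezeichnung='10502002']"
hits = root.findall(path)

print(len(hits), hits[0].findtext("Anlagentyp"))  # 1 1
```

The leading . is needed because ElementTree evaluates paths relative to an element rather than to the document node.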
As for German element names, at least it's not Hottentottenstottertrottelmutterbeutelrattenlattengitterkofferattentäter or Rhababerbarbarabarbarbarenbartbarbierbierbarbärbel

Modify XPath to return second of two values

I have an XPath that returns two items. I want to modify it so that it returns only the second, or the last if there are more than 2.
//a[@rel='next']
I tried
//a[@rel='next'][2]
but that doesn't return anything at all. How can I rewrite the xpath so I get only the 2nd link?
Found the answer in
XPATH : finding an attribute node (and only one)
In my case the right XPath would be
(//a[@rel='next'])[last()]
EDIT (by Tomalak) - Explanation:
This selects all a[@rel='next'] nodes, and takes the last of the entire set:
(//a[@rel='next'])[last()]
This selects all a[@rel='next'] nodes that are the respective last a[@rel='next'] of the parent context each of them is in:
//a[@rel='next'][last()] NOT equiv!: //a[@rel='next' and position()=last()]
This selects all a[@rel='next'] nodes that are the second a[@rel='next'] of the parent context each of them is in (in your case, each parent context had only one a[@rel='next'], that's why you did not get anything back):
//a[@rel='next'][2] NOT equiv!: //a[@rel='next' and position()=2]
In the single-predicate forms, position() and last() count all a siblings rather than only those with @rel='next', so the two spellings give the same result when every sibling a has @rel='next', but not in general.
For the sake of completeness: This selects all a nodes that are the last of the parent context each of them is in, and of them only those that have @rel='next' (XPath predicates are applied from left to right!):
//a[last()][@rel='next'] equivalent: //a[position()=last() and @rel='next']
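The difference between indexing the whole result and indexing per parent can be reproduced in Python. This is a minimal sketch assuming the standard library's xml.etree.ElementTree and a hypothetical page with two blocks that each contain one rel="next" link, which mirrors the situation described in the question. The per-parent case is written as an explicit loop so that the order of filtering and indexing is visible.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: each <div> holds exactly one rel="next" link.
XML = """<html>
  <div><a rel="prev">p1</a><a rel="next">n1</a></div>
  <div><a rel="prev">p2</a><a rel="next">n2</a></div>
</html>"""

root = ET.fromstring(XML)

# (//a[@rel='next'])[last()]: collect every match first, then index the whole list.
all_next = root.findall(".//a[@rel='next']")
last_overall = all_next[-1]

# //a[@rel='next'][2]: filter by @rel within each parent, then take the second one.
second_per_parent = []
for parent in root.iter():
    nexts = [child for child in parent if child.tag == "a" and child.get("rel") == "next"]
    second_per_parent.extend(nexts[1:2])  # empty when a parent has fewer than two

print(last_overall.text)  # n2
print([a.text for a in second_per_parent])  # []
```

No parent contains a second rel="next" link, so the per-parent form comes back empty, while the parenthesized form still finds the last link on the page.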
