From this Stack Overflow answer, I can see how to match on any attribute values that start with a particular string. For example, finding a list of elements that have any attribute with a value starting with the letter h:
//*[@*[starts-with(., 'h')]]
My question is: how do I match attribute names that start with a particular string?
As I understand it, the @* indicates matching any attribute within the wider //* collection of any elements. The starts-with function takes two parameters, the haystack and the needle, and the . haystack indicates the attribute value. I feel like I'm 90% of my way there, but I can't figure out what I need to match on the attribute name, rather than value.
Example matches:
<example>
<first match-me="123">MATCH!</first>
<second do-not-match-me="456">NO MATCH!</second>
<third match-me-yes-please="789">MATCH!</third>
<fourth>
<fifth match-me:oh-yes>MATCH!</fifth>
</fourth>
</example>
"I feel like I'm 90% of my way there" - yes, the last phase is specifying attribute name method:
//*[@*[starts-with(name(), 'match-me')]]
Note that simple attribute-name matching can run into trouble with attribute names such as match-me:oh-yes (with a colon inside). In such cases the XML parser will probably report an error: either The prefix "match-me" for attribute "match-me:oh-yes" associated with an element type "fifth" is not bound, or, for a "valueless" attribute, Attribute name "match-me:oh-yes" associated with an element type "fifth" must be followed by the ' = ' character.
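For readers working outside an XPath 1.0 engine, here is a minimal Python sketch of the same idea. It assumes the standard-library xml.etree.ElementTree module, whose XPath subset has no starts-with() or name(), so the attribute-name test is written as plain Python. The helper names (elements_with_attr_prefix, is_well_formed) are made up for illustration, and the sample leaves out the colon-named attribute because it would make the document ill-formed.

```python
import xml.etree.ElementTree as ET

# Sample based on the question's example, without the colon-named attribute.
SAMPLE = """
<example>
  <first match-me="123">MATCH!</first>
  <second do-not-match-me="456">NO MATCH!</second>
  <third match-me-yes-please="789">MATCH!</third>
</example>
"""

def elements_with_attr_prefix(root, prefix):
    """Return elements that have at least one attribute whose name starts with prefix."""
    return [el for el in root.iter()
            if any(name.startswith(prefix) for name in el.attrib)]

def is_well_formed(xml_text):
    """Report whether the parser accepts the text at all."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

root = ET.fromstring(SAMPLE)
matched_tags = [el.tag for el in elements_with_attr_prefix(root, "match-me")]
print(matched_tags)
```

The is_well_formed helper can be used to check up front whether a document with a colon-named or valueless attribute would be rejected by the parser before any matching happens.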
Related
I am trying to run the following XQuery expression in BaseX to extract elements between two succeeding headings. (as an article section).
xquery for $x in doc("test.xq")//h2,
$y in $x/following-sibling::h2[1]
return //*[$x/following::* and $y/preceding::*]
But it gives the error
Error:
Stopped at D:/Program Files/BaseX/data/test.xq, 1/74:
[XPDY0002] root(): no context value bound.
By the expression I mean if $x is heading and $y is the first heading following $x, then select the common text for $x/following::* and $y/preceding::*
I am not sure my expression works, but my question here is: how can I execute my intended query without error?
If you also have an expression that does what I need, that is welcome.
[...] to extract elements between two succeeding headings [...]
You need something more like:
for $x in doc("test.xq")//h2
return $x/following-sibling::*[preceding-sibling::h2[1] is $x]
but on its own it won't give you anything useful, because the XPath and XQuery data model only has flat sequences, not "multi-dimensional arrays". When you have a for that returns a sequence of values for each "iteration", the overall result of the for expression is the concatenation of all the result sequences, so as written above this expression will simply return all the elements in every "section" in a single flat list. If you want to group the elements by section, then you'd need to construct a new XML element for each group:
for $x in doc("test.xq")//h2
return
<section>{$x/following-sibling::*[preceding-sibling::h2[1] is $x]}</section>
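The grouping idea can also be sketched in Python with the standard-library xml.etree.ElementTree module, assuming all h2 headings are children of one parent element. The sample document and the function name sections_by_h2 are hypothetical; this is an illustration of the flat-list versus grouped-result distinction, not a translation of the XQuery.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: headings and content are siblings under <body>.
SAMPLE = """
<body>
  <h1>Title</h1>
  <h2>Intro</h2>
  <p>a</p>
  <p>b</p>
  <h2>Details</h2>
  <ul/>
  <h2>Empty</h2>
</body>
"""

def sections_by_h2(parent):
    """Group the children of parent into (heading text, [elements]) pairs.

    Elements before the first h2 are ignored; each h2 starts a new group.
    """
    sections = []
    current = None
    for child in parent:
        if child.tag == "h2":
            current = (child.text, [])
            sections.append(current)
        elif current is not None:
            current[1].append(child)
    return sections

root = ET.fromstring(SAMPLE)
result = [(title, [el.tag for el in members])
          for title, members in sections_by_h2(root)]
print(result)
```

One difference worth noting: this sketch leaves the headings themselves out of each group, whereas the predicate preceding-sibling::h2[1] is $x is also true for the next h2 (its nearest preceding h2 is $x), so the XQuery version includes that heading at the end of each group unless it is filtered out, for example with an extra [not(self::h2)].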
The error (as documented here) comes from this expression:
//*[$x/following::* and $y/preceding::*]
which begins with //. The abbreviation // stands for /descendant-or-self::node()/, which of course begins with /. The XPath standard says:
A / by itself selects the root node of the document containing the
context node. If it is followed by a relative location path, then the
location path selects the set of nodes that would be selected by the
relative location path relative to the root node of the document
containing the context node.
But from what you've shown us, there is nothing indicating that you've established a context node. So XPath doesn't have any way to know what document contains the context node. That's what the error message is referring to when it says
root(): no context value bound
To fix the error, you could precede the // with an explicit doc(...) or any other explicit way to set the context:
doc("test.xq")//*[$x/following::* and $y/preceding::*]
or
root($x)//*[$x/following::* and $y/preceding::*]
This should get rid of the error, but as Ian Roberts has written, it won't give you the result you want. See his answer for that.
Here is my XML file:
<w type="fruit-hard">apple</w>
<w type="fruit-soft">orange</w>
<w type="vegetable">carrot</w>
I need to find carrot's immediately preceding sibling whose type is fruit-soft. In Chrome (locally loaded XML file), when I try
$x("//w[@type='vegetable']/preceding-sibling::w[1]")
I get the "orange" element node like I want, but how do I require that its type be "fruit-soft"? My attempt (below) returns "false."
$x("//w[@type='vegetable']/preceding-sibling::w[1] and preceding-sibling::w[@type='fruit-soft']")
Your original XPath ...
//w[@type='vegetable']/preceding-sibling::w[1]
... is equivalent to
//w[@type='vegetable']/preceding-sibling::w[position()=1]
You can add additional criteria to the predicate as needed:
//w[@type='vegetable']/preceding-sibling::w[position()=1 and @type='fruit-soft']
Or you can add a separate predicate:
//w[@type='vegetable']/preceding-sibling::w[1][@type='fruit-soft']
Note that this attempt:
//w[@type='vegetable']/preceding-sibling::w[1] and preceding-sibling::w[@type='fruit-soft']
returns false because the parts on either side of the and are evaluated separately, converted to type boolean, and combined to yield the final result. Supposing that the context node against which that is evaluated is the document root, there will never be a node matching preceding-sibling::w[@type='fruit-soft']. Moreover, even if there were such a node, that expression does not require nodes matching the first part to be the same ones that match the second part.
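To make the two-predicate logic concrete, here is a small Python sketch using the standard-library xml.etree.ElementTree module. ElementTree has no preceding-sibling axis, so the sibling lookup is done by hand, and only for the w children of a single parent. The words wrapper element and the function name soft_fruit_before_vegetable are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# The question's three <w> elements, wrapped in a hypothetical <words> root.
SAMPLE = """
<words>
  <w type="fruit-hard">apple</w>
  <w type="fruit-soft">orange</w>
  <w type="vegetable">carrot</w>
</words>
"""

def soft_fruit_before_vegetable(parent):
    """Mimic //w[@type='vegetable']/preceding-sibling::w[1][@type='fruit-soft']
    for the <w> children of one parent element."""
    words = [child for child in parent if child.tag == "w"]
    hits = []
    for i, w in enumerate(words):
        if w.get("type") != "vegetable" or i == 0:
            continue
        nearest = words[i - 1]                    # preceding-sibling::w[1]
        if nearest.get("type") == "fruit-soft":   # the added [@type='fruit-soft']
            hits.append(nearest)
    return hits

root = ET.fromstring(SAMPLE)
found = [w.text for w in soft_fruit_before_vegetable(root)]
print(found)
```

The point of the sketch is that both conditions are applied to the same node: first the nearest preceding w is picked, then its type is checked.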
I want to select a node based on the text value of a child.
My structure is as follows (sorry for german nodes):
<InspizierteAbwassertechnischeAnlage>
<Objektbezeichnung>10502002</Objektbezeichnung>
<Anlagentyp>1</Anlagentyp>
</InspizierteAbwassertechnischeAnlage>
How can I select the <InspizierteAbwassertechnischeAnlage> node where e.g. <Objektbezeichnung> = 10502002?
Why your solution didn't work
ancestor:://*[text()='10502002'] is syntactically incorrect; it's not valid XPath. I'm not sure what you tried to do with the axes here.
//*[text()='10502002'] itself would just select the Objektbezeichnung element and not its parent. It would also select any other element with such a value, regardless of its name. In the case of this document nothing unwanted would be returned, but you have to be careful when using wildcards (*).
The solution
It's quite simple, you have to use a predicate to inspect the content of the child element
//InspizierteAbwassertechnischeAnlage[Objektbezeichnung = '10502002']
Note the double slash (//): it is the abbreviated syntax for /descendant-or-self::node()/. The above expression expands to:
/descendant-or-self::node()/child::InspizierteAbwassertechnischeAnlage[Objektbezeichnung = '10502002']
Or in plain English:
In the set of all descendants of the document's root, find InspizierteAbwassertechnischeAnlage elements that contain at least one Objektbezeichnung element with a value of 10502002
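As a quick way to try the predicate, here is a minimal Python sketch with the standard-library xml.etree.ElementTree module, whose limited XPath subset does support a child-value test of the form [tag='text']. The Daten wrapper element and the second record are assumptions added so that there is something to filter out; note that ElementTree paths are relative, hence the leading dot.

```python
import xml.etree.ElementTree as ET

# The question's structure inside a hypothetical <Daten> root, plus a second
# record that should not match.
SAMPLE = """
<Daten>
  <InspizierteAbwassertechnischeAnlage>
    <Objektbezeichnung>10502002</Objektbezeichnung>
    <Anlagentyp>1</Anlagentyp>
  </InspizierteAbwassertechnischeAnlage>
  <InspizierteAbwassertechnischeAnlage>
    <Objektbezeichnung>10502003</Objektbezeichnung>
    <Anlagentyp>2</Anlagentyp>
  </InspizierteAbwassertechnischeAnlage>
</Daten>
"""

root = ET.fromstring(SAMPLE)

# Select the parent element by the text of its child element.
matches = root.findall(
    ".//InspizierteAbwassertechnischeAnlage[Objektbezeichnung='10502002']"
)
anlagentypen = [m.findtext("Anlagentyp") for m in matches]
print(len(matches), anlagentypen)
```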
As for German element names, at least it's not Hottentottenstottertrottelmutterbeutelrattenlattengitterkofferattentäter or Rhababerbarbarabarbarbarenbartbarbierbierbarbärbel
I am trying to find the XPath of an element which has no attributes. It can only be identified by its parent's attribute. However, the parent does not have a unique attribute either.
E.g.: //*[@id="btn"][1]/ul/li[2]/a/span
Here there are 2 elements with id=btn. How do I get the 2nd element? The above syntax gives me the 1st element. However, if I use:
//*[@id="btn"][2]/ul/li[2]/a/span
I get an error message:
"The xpath expression '//*[@id="btn"][2]/ul/li[2]/a/span' cannot be evaluated or does not result in a WebElement "
Try this: select both matches first, then wrap the whole expression in parentheses and index the result.
(//*[@id="btn"]/ul/li[2]/a/span)[2]
By the way, it's not good practice to have multiple elements sharing the same id. If you are the developer, consider changing them.
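The difference between indexing inside a step and indexing the whole result can be sketched in Python with the standard-library xml.etree.ElementTree module. The markup below is a hypothetical page in which each id="btn" element is the only such child of its parent, which is one situation where //*[@id="btn"][2] selects nothing: the [2] counts matches per parent, not across the document. Here findall returns all matches in document order, and ordinary list indexing plays the role of the outer (...)[2].

```python
import xml.etree.ElementTree as ET

# Hypothetical page: two elements share id="btn", each under its own parent.
SAMPLE = """
<body>
  <div>
    <div id="btn"><ul><li><a><span>A1</span></a></li><li><a><span>A2</span></a></li></ul></div>
  </div>
  <div>
    <div id="btn"><ul><li><a><span>B1</span></a></li><li><a><span>B2</span></a></li></ul></div>
  </div>
</body>
"""

root = ET.fromstring(SAMPLE)

# All spans reached through the second <li> of any id="btn" element,
# in document order.
spans = root.findall(".//*[@id='btn']/ul/li[2]/a/span")

# Python lists are 0-based, so index 1 corresponds to XPath's (...)[2].
second = spans[1]
print([s.text for s in spans], second.text)
```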
What is the XPath to find only ONE node (whichever) having a certain attribute? (Actually, I'm interested in the attribute, not the node.) For example, in my XML, I have several tags having a lang attribute. I know all of them must have the same value. I just want to get any of them.
Right now, I do this: //*[1][@lang]/@lang, but it seems not to work properly, for an unknown reason.
My tries have led me to things ranging from a concatenation of all the @lang values ('en en en en...') to nothing, with sometimes in between what I want, but not on all XML.
EDIT :
Actually //@lang[1] cannot work, because the function position() is called before the test on a lang attribute presence. So it always takes the very first element found in the XML. It worked best at the time because many, many times the lang attribute was on the root element.
After some more tinkering, here is a working solution:
(//@lang)[1]
Parentheses are needed to separate the [1] from the attribute name; otherwise the position() function is applied within the parent element of the attribute (which is useless since there can be only one attribute of a certain name within a tag: that's why //@lang[2] always selects nothing).
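The intent of "first lang attribute anywhere in the document" can also be sketched in Python with the standard-library xml.etree.ElementTree module. ElementTree cannot select attribute nodes, so the sketch walks the elements in document order and returns the first lang value it meets. The sample document and the function name first_lang are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document where lang first appears a few levels down.
SAMPLE = """
<doc>
  <head><title>t</title></head>
  <body>
    <p lang="en">one</p>
    <div><p lang="en">two</p></div>
  </body>
</doc>
"""

def first_lang(root):
    """Return the first lang attribute value in document order, or None."""
    for el in root.iter():          # document order, root element included
        if "lang" in el.attrib:
            return el.get("lang")
    return None

lang = first_lang(ET.fromstring(SAMPLE))
print(lang)
```

The loop stops at the first hit, which mirrors taking position 1 of the whole //@lang result rather than position 1 within each parent element.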
Did you try this?
//@lang[1]
here you can see an example.
The following XPath seems to do what you want:
//*[@lang][1]/attribute::lang