What would an XPath expression look like that retrieves all attribute names (not attribute values!) of a given node or XML tag?
Assume the following XML document:
<bookstore>
<book>
<title lang="eng">Harry Potter</title>
<price>29.99</price>
</book>
<book>
<title lang="fr" type="easyreading">Monsieur Claude</title>
<price>39.95</price>
</book>
</bookstore>
The XPath //title/@* would select "eng, fr, easyreading", but which XPath would select "lang, lang, type"?
Give this a try (it needs XPath 2.0 or later, because XPath 1.0 does not allow a function call as a path step):
//@*/name()
returns
String='lang'
String='lang'
String='type'
See the XPath function reference regarding the name() function.
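If no XPath 2.0 processor is at hand, the same idea can be sketched in Python with the standard-library xml.etree.ElementTree module. It has no name() function, but it exposes each element's attributes as a dictionary, so the names are simply the keys. The variable names here are my own, not part of the question.

```python
import xml.etree.ElementTree as ET

BOOKSTORE = """
<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="fr" type="easyreading">Monsieur Claude</title>
    <price>39.95</price>
  </book>
</bookstore>
"""

root = ET.fromstring(BOOKSTORE)

# Rough equivalent of //@*/name(): walk every element in document order
# and collect the keys (attribute names) of its attribute dictionary.
attribute_names = [name for element in root.iter() for name in element.attrib]

# Rough equivalent of //title/@*: the attribute values of the title elements.
title_values = [
    value for title in root.iter("title") for value in title.attrib.values()
]

print(attribute_names)
print(title_values)
```

This only mirrors the two expressions for this particular document; it is not a general XPath replacement.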
Write an expression that selects the ISBN and title of every item whose return date is "3/12/2017".
XML:
<itemlist>
<item>
<title>
The Bonfire of the Vanities
</title>
<type>Book</type>
<authors>
<author>Wolfe, Tom</author>
</authors>
<subjects>
<subject>New York</subject>
<subject>Race Relations</subject>
</subjects>
<isbn>0374115370</isbn>
<location>Adult</location>
<collection>Fiction</collection>
<status return="3/12/2017">Checked Out</status>
</item>
</itemlist>
//itemlist/item[status/@return='3/12/2017']/(isbn|title)
This finds the item elements whose status child element has a return attribute equal to "3/12/2017", then takes the isbn and title children of those items.
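To make the two steps (filter the items, then pick two kinds of children) concrete, here is a minimal sketch in Python using the standard-library xml.etree.ElementTree module. ElementTree does not support the union step, so the filtering is written out by hand. The second item is a made-up addition so that the predicate has something to reject.

```python
import xml.etree.ElementTree as ET

ITEMLIST = """
<itemlist>
  <item>
    <title>
      The Bonfire of the Vanities
    </title>
    <isbn>0374115370</isbn>
    <status return="3/12/2017">Checked Out</status>
  </item>
  <item>
    <title>Hypothetical second item</title>
    <isbn>0000000000</isbn>
    <status return="4/1/2017">Checked Out</status>
  </item>
</itemlist>
"""

root = ET.fromstring(ITEMLIST)

selected = []
for item in root.findall("item"):
    # Predicate [status/@return='3/12/2017']: true if any status child matches.
    due = any(
        status.get("return") == "3/12/2017" for status in item.findall("status")
    )
    if due:
        # Step (isbn|title): keep only those children, in document order.
        for child in item:
            if child.tag in ("isbn", "title"):
                selected.append((child.tag, (child.text or "").strip()))

print(selected)
```

The strip() call only tidies the line breaks around the title text; XPath itself would return the element with its whitespace intact.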
This is my HTML:
<book>
<div id="name"></div>
<span id="age"></span>
<p id="contact_number"></p>
...
...
(more attributes)
</book>
I need to extract all the text() nodes inside <book></book> except those in the p with id="contact_number",
so basically I need //book//text() except //book//p[@id="contact_number"]//text()
How can I do this in a single XPath query?
There might be a better way if you can state the requirement differently. Anyway, to answer the question as asked, you can try this:
//book//text()[not(ancestor::p/@id='contact_number')]
or, if the text is always a direct child of the p, use parent::p instead of ancestor::p:
//book//text()[not(parent::p/@id='contact_number')]
Add [normalize-space()] at the end if you need to filter out whitespace-only text nodes.
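For comparison, here is a rough sketch of the ancestor-based variant in Python with the standard-library xml.etree.ElementTree module. ElementTree has no text nodes or ancestor axis, so the tree is walked by hand and the contact_number subtree is skipped. The element contents and the extra p element are invented for the example, since the question shows only empty elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical content; the markup in the question is empty.
BOOK = """
<book>
  <div id="name">Jane Doe</div>
  <span id="age">42</span>
  <p id="contact_number">555-0100</p>
  <p id="note">likes <b>XPath</b></p>
</book>
"""


def is_contact(element):
    """True for the p element whose id is contact_number."""
    return element.tag == "p" and element.get("id") == "contact_number"


def visible_texts(element):
    """Yield non-blank text below element, skipping contact_number subtrees."""
    if element.text and element.text.strip():
        yield element.text.strip()
    for child in element:
        if not is_contact(child):
            yield from visible_texts(child)
        # The tail follows the child's end tag, so it belongs to the parent
        # and is kept even when the child itself is skipped.
        if child.tail and child.tail.strip():
            yield child.tail.strip()


book = ET.fromstring(BOOK)
texts = list(visible_texts(book))
print(texts)
```

The blank-text check plays the role of [normalize-space()]; unlike XPath, this sketch also strips the surrounding whitespace of the text it keeps.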
Try the following:
//*[not(self::p[@id = 'contact_number'])]/text()[normalize-space()]
I'm trying to use xpath to get the raw value of an element. The element is a description and it can contain raw text or xhtml.
So it can be as follows:
<description>asdasdasd <a>Item1</a> asd <a> Price </a></description>
Based on the above XML, I just need this:
asdasdasd Item1 asd Price
I've tried //description/text(), //description/descendant::*/text() and some others with no success. Any suggestion?
Just use:
//description
The string value of an element is the concatenation of all its descendant text nodes.
Or, if it must be a string and there is just one such element:
string(//description)
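The "string value" idea can be checked quickly in Python. This is a minimal sketch with the standard-library xml.etree.ElementTree module, where itertext() yields the descendant text in document order; the whitespace clean-up at the end is my own addition to match the output the question asks for.

```python
import xml.etree.ElementTree as ET

description = ET.fromstring(
    "<description>asdasdasd <a>Item1</a> asd <a> Price </a></description>"
)

# Concatenate all descendant text in document order, like string(//description).
string_value = "".join(description.itertext())

# Collapse runs of whitespace, similar in effect to normalize-space().
normalized = " ".join(string_value.split())

print(repr(string_value))
print(repr(normalized))
```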
The XPath bookstore/book[1] selects the first book node under bookstore.
How can I select the first node that matches a more complicated condition, e.g. the first node that matches /bookstore/book[@location='US']?
Use:
(/bookstore/book[@location='US'])[1]
This will first get the book elements with the location attribute equal to 'US'. Then it will select the first node from that set. Note the use of parentheses, which are required by some implementations.
Note, this is not the same as /bookstore/book[1][@location='US'] unless the first element also happens to have that location attribute.
/bookstore/book[@location='US'][1] works only with a simple structure.
Add a bit more structure and things break.
With:
<bookstore>
<category>
<book location="US">A1</book>
<book location="FIN">A2</book>
</category>
<category>
<book location="FIN">B1</book>
<book location="US">B2</book>
</category>
</bookstore>
/bookstore/category/book[@location='US'][1] yields
<book location="US">A1</book>
<book location="US">B2</book>
which is not "the first node that matches a more complicated condition". Likewise, /bookstore/category/book[@location='US'][2] returns nothing.
With parentheses you can get the result the original question asked for:
(/bookstore/category/book[@location='US'])[1] gives
<book location="US">A1</book>
and (/bookstore/category/book[@location='US'])[2] works as expected.
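The difference comes down to where the index is applied: inside each parent, or once on the combined result. Here is a small Python sketch of that, using the standard-library xml.etree.ElementTree module only to parse the document and find the US books per category. The indexing itself is plain list handling, because ElementTree does not accept parenthesised paths and its position predicate does not behave like full XPath after another predicate.

```python
import xml.etree.ElementTree as ET

BOOKSTORE = """<bookstore>
  <category>
    <book location="US">A1</book>
    <book location="FIN">A2</book>
  </category>
  <category>
    <book location="FIN">B1</book>
    <book location="US">B2</book>
  </category>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)

# One list of US books per category, i.e. per parent element.
us_by_category = [
    category.findall("book[@location='US']")
    for category in root.findall("category")
]

# /bookstore/category/book[@location='US'][1]: index applied inside each parent.
first_per_category = [books[0].text for books in us_by_category if books]
# .../book[@location='US'][2]: no category has a second US book.
second_per_category = [books[1].text for books in us_by_category if len(books) > 1]

# (/bookstore/category/book[@location='US'])[1] and [2]:
# flatten in document order first, then index the combined list once.
all_us = [book for books in us_by_category for book in books]
first_overall = all_us[0].text
second_overall = all_us[1].text

print(first_per_category, second_per_category, first_overall, second_overall)
```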
As an explanation to Jonathan Fingland's answer:
multiple conditions in the same predicate ([position()=1 and @location='US']) must be true as a whole
multiple conditions in consecutive predicates ([position()=1][@location='US']) must be true one after another
this implies that [position()=1][@location='US'] != [@location='US'][position()=1]
while [position()=1 and @location='US'] == [@location='US' and position()=1]
hint: a lone [position()=1] can be abbreviated to [1]
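The rules above can be mimicked with ordinary list filters, where each predicate is a function that narrows the current node list. This is only an illustrative Python sketch: the Book tuple, the sample data and the helper names are all made up for the purpose.

```python
from typing import NamedTuple


class Book(NamedTuple):
    title: str
    location: str


# Hypothetical sibling book elements, in document order.
books = [Book("A1", "FIN"), Book("A2", "US"), Book("A3", "US")]


def at_position_1(nodes):
    """[position()=1]: keep the first node of the current list."""
    return nodes[:1]


def located_in_us(nodes):
    """[@location='US']: keep the nodes whose location is US."""
    return [node for node in nodes if node.location == "US"]


# [position()=1][@location='US']: take the first book, then test its location.
position_then_location = located_in_us(at_position_1(books))

# [@location='US'][position()=1]: keep US books, then take the first of those.
location_then_position = at_position_1(located_in_us(books))

# [position()=1 and @location='US']: both tests against the original position.
single_predicate = [
    book
    for position, book in enumerate(books, start=1)
    if position == 1 and book.location == "US"
]

print(position_then_location, location_then_position, single_predicate)
```

Consecutive predicates renumber positions after each filter, which is why the order matters there and not inside a single predicate.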
You can build complex expressions in predicates with the Boolean operators "and" and "or", and with the Boolean XPath functions not(), true() and false(). Plus you can wrap sub-expressions in parentheses.
The easiest way to find the first US book node in the whole document, even in a more complicated XML structure such as:
<bookstore>
<category>
<book location="US">A1</book>
<book location="FIN">A2</book>
</category>
<category>
<book location="FIN">B1</book>
<book location="US">B2</book>
</category>
</bookstore>
is the XPath expression:
/descendant::book[@location='US'][1]
<bookstore>
<book location="US">A1</book>
<category>
<book location="US">B1</book>
<book location="FIN">B2</book>
</category>
<section>
<book location="FIN">C1</book>
<book location="US">C2</book>
</section>
</bookstore>
Given the above, you can select the first book with
(//book[@location='US'])[1]
This finds the first book anywhere in the document that has location US. [A1]
//book[@location='US']
returns the node set of all books with location US. [A1,B1,C2]
(//category/book[@location='US'])[1]
returns the first book with location US that is a child of a category anywhere in the document. [B1]
(/bookstore//book[@location='US'])[1]
returns the first book with location US anywhere under the root element bookstore, which makes the /bookstore part redundant. [A1]
In direct answer:
/bookstore/book[@location='US'][1]
returns the first book element with location US that is a direct child of bookstore. [A1]
Incidentally, to find the first US book in this example that is not a direct child of bookstore:
(/bookstore/*//book[@location='US'])[1]
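These results can be reproduced approximately in Python with the standard-library xml.etree.ElementTree module. It supports only a subset of XPath: paths are relative to the root element, and the outer (...)[1] is replaced here by taking index 0 of the returned list. Treat it as a sketch of the selections, not as the same expressions.

```python
import xml.etree.ElementTree as ET

BOOKSTORE = """<bookstore>
  <book location="US">A1</book>
  <category>
    <book location="US">B1</book>
    <book location="FIN">B2</book>
  </category>
  <section>
    <book location="FIN">C1</book>
    <book location="US">C2</book>
  </section>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)  # root is the bookstore element

# //book[@location='US']
all_us = [book.text for book in root.findall(".//book[@location='US']")]

# //category/book[@location='US']
in_category = [
    book.text for book in root.findall(".//category/book[@location='US']")
]

# /bookstore/book[@location='US']
direct_children = [book.text for book in root.findall("book[@location='US']")]

# /bookstore/*//book[@location='US']
nested_only = [book.text for book in root.findall("./*//book[@location='US']")]

# The parenthesised (...)[1] forms correspond to the first list entry.
print(all_us[0], in_category[0], direct_children[0], nested_only[0])
```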
Use an index to get the desired node if the XPath is complicated or if more than one node matches the same XPath.
Example:
(//bookstore[@location = 'US'])[index]
Replace index with the number of the node you want.
If the given XML declares a namespace, it is better to use this:
(/*[local-name() ='bookstore']/*[local-name()='book'][@location='US'])[1]
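The same namespace-agnostic matching can be sketched in Python with the standard-library xml.etree.ElementTree module, which stores tag names as {namespace}local. The namespace URI, the sample data and the local_name helper are placeholders of mine, not anything from the question.

```python
import xml.etree.ElementTree as ET

# The namespace URI is a made-up placeholder.
BOOKSTORE = """<bookstore xmlns="urn:example:books">
  <book location="FIN">A1</book>
  <book location="US">A2</book>
  <book location="US">A3</book>
</bookstore>"""


def local_name(element):
    """Drop the '{namespace}' prefix that ElementTree puts before tag names."""
    return element.tag.rsplit("}", 1)[-1]


root = ET.fromstring(BOOKSTORE)

matches = []
if local_name(root) == "bookstore":
    # Child elements named book (in any namespace) with location US.
    matches = [
        child
        for child in root
        if local_name(child) == "book" and child.get("location") == "US"
    ]

# The trailing [1] of the XPath corresponds to the first list entry.
first_match = matches[0].text
print(root.tag, first_match)
```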
For example, given
<input b="demo">
use
(input[@b='demo'])[1]
With help of an online xpath tester I'm writing this answer...
For this:
<table id="t2"><tbody>
<tr><td>123</td><td>other</td></tr>
<tr><td>foo</td><td>columns</td></tr>
<tr><td>bar</td><td>are</td></tr>
<tr><td>xyz</td><td>ignored</td></tr>
</tbody></table>
the following xpath:
id("t2") / tbody / tr / td[1]
outputs:
123
foo
bar
xyz
since td[1] selects every td element that is the first td child of its own parent.
But the following xpath:
(id("t2") / tbody / tr / td)[1]
outputs:
123
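This one can be tried directly with the standard-library xml.etree.ElementTree module in Python, since it supports a position predicate right after a tag name. It does not support id() or parenthesised paths, so in this sketch the parsed table element stands in for id("t2") and the outer [1] becomes index 0 of the result list.

```python
import xml.etree.ElementTree as ET

TABLE = """<table id="t2"><tbody>
<tr><td>123</td><td>other</td></tr>
<tr><td>foo</td><td>columns</td></tr>
<tr><td>bar</td><td>are</td></tr>
<tr><td>xyz</td><td>ignored</td></tr>
</tbody></table>"""

table = ET.fromstring(TABLE)  # stands in for id("t2")

# tbody/tr/td[1]: the first td of every row.
first_cells = [td.text for td in table.findall("tbody/tr/td[1]")]

# (tbody/tr/td)[1]: collect every td, then take the first of the whole list.
all_cells = table.findall("tbody/tr/td")
first_cell_overall = all_cells[0].text

print(first_cells)
print(first_cell_overall)
```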