XPath query: Query not working as expected - xpath

I was learning XPath using the following xml document: http://www.w3schools.com/xpath/xpath_examples.asp
Now, when I execute the query:
bookstore/book/author[contains(.,'G')]
I get the result: Giada De Laurentiis, James McGovern as expected. Now, since contains() returns a boolean value, I expected the following query to return all authors:
bookstore/book/author[true]
however, it returns an empty set. Can somebody explain?

you need bookstore/book/author
UPDATE: to pass true into XPATH you have to use bookstore/book/author[true()]
author[true] just means that you want all author elements that have a child element named true.
You can check it yourself, try expressions
bookstore/book[author1] vs bookstore/book[author]
The first one returns nothing, because no book element has an author1 child. The second one returns every book element that has an author child, which here is all of them. But if you remove the author child from some of the book nodes, you'll get only the books that still have one.
So if you take xml like this
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
</book>
</bookstore>
then
bookstore/book[author] returns
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
and bookstore/book[title] returns
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
</book>
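The same behaviour can be reproduced outside an XPath tester. Below is a minimal sketch in Python, assuming the standard-library xml.etree.ElementTree module, whose limited XPath subset also reads a bare name inside a predicate as a child-element test. The XML is the shortened document above; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

XML = """
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
  </book>
</bookstore>
"""

bookstore = ET.fromstring(XML)  # the root element is <bookstore>

# book[author]: only the books that have an <author> child
with_author = [b.findtext("title") for b in bookstore.findall("book[author]")]

# book[title]: every book in this document has a <title> child
with_title = [b.findtext("title") for b in bookstore.findall("book[title]")]

# author[true]: authors that have a child element named "true" (there are none)
with_true_child = bookstore.findall("book/author[true]")

print(with_author)
print(with_title)
print(with_true_child)
```

The last query comes back empty for the same reason as in the question: the predicate looks for an element called true, not for a boolean.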

I expected the following query to return all authors:
bookstore/book/author[true]
however, it returns an empty set. Can somebody explain?
The above expression selects all bookstore/book/author elements that have a child element named true. In the provided XML document no author element has a child named true -- therefore the XPath expression selects nothing.
In a comment the OP asks:
But why does bookstore/book/author[true] not work, since it is
similar to the situation when the contains() always returns true
contains() never returns (the string) "true" -- it returns the boolean value true() -- this is different from the string "true".
Explanation:
You are confused by the fact that the serialization of the boolean value true() to a string is the string "true".
This fact doesn't mean that the string "true" and the boolean value true() are identical.

Related

Getting attribute name (not attribute value) with Xpath

What would an XPath expression look like that retrieves all attribute names (not attribute values!) for a given node or XML tag?
Assume the following XML document:
<bookstore>
<book>
<title lang="eng">Harry Potter</title>
<price>29.99</price>
</book>
<book>
<title lang="fr" type="easyreading">Monsieur Claude</title>
<price>39.95</price>
</book>
</bookstore>
The XPath //title/@* would select "eng, fr, easyreading", but which XPath would select "lang, lang, type"?
Give this a try:
//@*/name()
returns
String='lang'
String='lang'
String='type'
See here regarding the name() function.
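If no XPath processor is at hand, the same list can be produced by walking the tree. The following is only a sketch in Python, assuming the standard-library xml.etree.ElementTree module; its XPath subset cannot select attribute nodes, so the attribute names are read from each element's attrib mapping instead. The variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

XML = """
<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="fr" type="easyreading">Monsieur Claude</title>
    <price>39.95</price>
  </book>
</bookstore>
"""

bookstore = ET.fromstring(XML)

# Visit every element in document order and collect its attribute names
attribute_names = [name for elem in bookstore.iter() for name in elem.attrib]

# For comparison, the attribute values of the same elements
attribute_values = [value for elem in bookstore.iter() for value in elem.attrib.values()]

print(attribute_names)
print(attribute_values)
```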

Marklogic - Xpath using get attribute value

Sample XML data is shown below. If the title's lang attribute is "it", I want to get the book's category attribute value.
<book category="CLASSICS">
<title lang="it">Purgatorio</title>
<author>Dante Alighieri</author>
<year>1308</year>
<price>30.00</price>
</book>
"If title lang ="it" then i want to get category attribute value ?"
The XPath should be straightforward:
//book[title/@lang='it']/@category
You can also use the following XPath expression:
doc("XML-URI")/book[title/@lang/string() eq "it"]/@category
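To check the logic of that lookup without MarkLogic, here is a small sketch in Python using the standard-library xml.etree.ElementTree module. It is not the MarkLogic API: the filter on the title's lang attribute is written in plain Python rather than as an XPath predicate, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

XML = """
<book category="CLASSICS">
  <title lang="it">Purgatorio</title>
  <author>Dante Alighieri</author>
  <year>1308</year>
  <price>30.00</price>
</book>
"""

root = ET.fromstring(XML)

# Keep the category of every <book> that has a <title> child with lang="it".
# root.iter("book") also yields the root itself when it is a <book>.
categories = [
    book.get("category")
    for book in root.iter("book")
    if any(title.get("lang") == "it" for title in book.findall("title"))
]

print(categories)
```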

Selecting attributes of a filtered nodeset in XPath

I'm trying to learn XPath syntax. I'm using the w3schools example here:
http://www.w3schools.com/xsl/tryit.asp?filename=try_xpath_select_pricenodes_high
..which is based on the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
</bookstore>
The example selects the title of books having a price greater than 35. I wanted to play with this example and select the category names instead of titles. So, I tried this:
/bookstore/book[price>35]/@category
And, as you can see by testing it yourself on that site, it produces no output. What am I getting wrong?
Thanks for your time.
The query is OK. But since you're dealing with attribute nodes and not elements, you have to adjust the code that prints the result. Change the lines
document.write(nodes[i].childNodes[0].nodeValue);
document.write(result.childNodes[0].nodeValue);
to
document.write(nodes[i].nodeValue);
document.write(result.nodeValue);
and you'll get the expected result.

Get node value from two dependent xml files

I am a beginner in XQuery, and I have a query that needs to combine data from two different XML files.
Book.xml
<library>
<Book>
<title>Title1</title>
<author>Ellizabith</author>
</Book>
<Book>
<title>Title2</title>
<author>Sam</author>
</Book>
<Book>
<title>Title3</title>
<author>Ryan</author>
</Book>
</library>
author.xml
<authorRoot>
<author>
<Name>Rayan</Name>
<Location>Yahoo</Location>
</author>
<author>
<Name>Syan</Name>
<Location>Google</Location>
</author>
<author>
<Name>Sam</Name>
<Location>Bing</Location>
</author>
</authorRoot>
I need a query that shows the location of all the authors of books whose title contains the word "Title2".
This is my code :
for $p in doc("C:\Users\User\Desktop\Book.xml")//library/book/[title/contains(., 'Title1')]
for $a in doc("C:\Users\User\Desktop\author.xml")//authorRoot
let $p := $p/author/text()
let $d := $a/author
let $f := $d/text()=$p/Location/text()
return $f
There are several smaller problems with your code.
The names in the authors and books XML files do not match. I guess these are only typos.
Predicates belong to an axis step, they cannot stand on their own (remove the slash after book in line one).
XML and XQuery are case-sensitive! <Book/> uses a capital B, so do the same in your XQuery, again in line one.
In line two, you're looping over all <authorRoot/> elements. Use authorRoot/author instead.
In line three you're shadowing the book $p with its author's name for the rest of this FLWOR expression, but you want to use the book again in line five. Better use another variable name.
Avoid the descendant-or-self step // if you don't need it (lines one and two). It hurts performance.
I don't get what your idea for filtering was in lines three to five. Compare your code with this working solution. Additionally, I used descriptive variable names; don't confuse yourself with unnecessarily short ones.
Replace $books and $authors with the respective doc(...) calls.
for $book in $books//library/Book[title/contains(., 'Title1')]
for $author in $authors//authorRoot/author
where $book/author = $author/Name
return $author/Location/text()
If you want to have a list of distinct places, wrap distinct-values(...) around all four lines.
An alternative without explicit loops:
$authors/authorRoot/author[
Name = $books/library/Book[contains(title, 'Title1')]/author
]/Location
The second solution is also valid XPath 1.0 (given bindings for the two variables); the first uses a FLWOR expression with a where clause and therefore requires an XQuery processor.
If you use
let $books := <library>
<Book>
<title>Title1</title>
<author>Ellizabith</author>
</Book>
<Book>
<title>Title2</title>
<author>Sam</author>
</Book>
<Book>
<title>Title3</title>
<author>Ryan</author>
</Book>
</library>
let $authors := <authorRoot>
<author>
<Name>Rayan</Name>
<Location>Yahoo</Location>
</author>
<author>
<Name>Syan</Name>
<Location>Google</Location>
</author>
<author>
<Name>Sam</Name>
<Location>Bing</Location>
</author>
<author>
<Name>Ellizabith</Name>
<Location>Apple</Location>
</author>
</authorRoot>
return $authors/author[Name = $books/Book[contains(title, 'Title1')]/author]/Location/string()
the result is Apple.
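The join itself is easy to verify outside an XQuery processor. The sketch below redoes it in Python with the standard-library xml.etree.ElementTree module, using the same inline data as above (including the added Ellizabith entry). It is an illustration of the logic only, not a replacement for the XQuery; the variable names are assumptions.

```python
import xml.etree.ElementTree as ET

books = ET.fromstring("""
<library>
  <Book><title>Title1</title><author>Ellizabith</author></Book>
  <Book><title>Title2</title><author>Sam</author></Book>
  <Book><title>Title3</title><author>Ryan</author></Book>
</library>
""")

authors = ET.fromstring("""
<authorRoot>
  <author><Name>Rayan</Name><Location>Yahoo</Location></author>
  <author><Name>Syan</Name><Location>Google</Location></author>
  <author><Name>Sam</Name><Location>Bing</Location></author>
  <author><Name>Ellizabith</Name><Location>Apple</Location></author>
</authorRoot>
""")

# Authors of all books whose title contains 'Title1'
wanted_names = {
    book.findtext("author")
    for book in books.findall("Book")
    if "Title1" in book.findtext("title", default="")
}

# Locations of the author records whose Name matches one of those authors
locations = [
    author.findtext("Location")
    for author in authors.findall("author")
    if author.findtext("Name") in wanted_names
]

print(locations)
```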

How to select the first element with a specific attribute using XPath

The XPath bookstore/book[1] selects the first book node under bookstore.
How can I select the first node that matches a more complicated condition, e.g. the first node that matches /bookstore/book[@location='US']
Use:
(/bookstore/book[@location='US'])[1]
This will first get the book elements with the location attribute equal to 'US'. Then it will select the first node from that set. Note the use of parentheses, which are required by some implementations.
Note, this is not the same as /bookstore/book[1][@location='US'] unless the first element also happens to have that location attribute.
/bookstore/book[@location='US'][1] works only with a simple structure.
Add a bit more structure and things break.
With
<bookstore>
<category>
<book location="US">A1</book>
<book location="FIN">A2</book>
</category>
<category>
<book location="FIN">B1</book>
<book location="US">B2</book>
</category>
</bookstore>
/bookstore/category/book[@location='US'][1] yields
<book location="US">A1</book>
<book location="US">B2</book>
not "the first node that matches a more complicated condition". /bookstore/category/book[@location='US'][2] returns nothing.
With parentheses you can get the result the original question was for:
(/bookstore/category/book[@location='US'])[1] gives
<book location="US">A1</book>
and (/bookstore/category/book[@location='US'])[2] works as expected.
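The difference comes down to where the position is counted: per parent without parentheses, over the whole result with them. The sketch below imitates both behaviours in plain Python on the XML above. It is an emulation, not an XPath engine: xml.etree.ElementTree is used only for parsing and simple lookups, and the helper us_books is a made-up name.

```python
import xml.etree.ElementTree as ET

XML = """
<bookstore>
  <category>
    <book location="US">A1</book>
    <book location="FIN">A2</book>
  </category>
  <category>
    <book location="FIN">B1</book>
    <book location="US">B2</book>
  </category>
</bookstore>
"""

bookstore = ET.fromstring(XML)


def us_books(parent):
    """Return the US books that are direct children of one parent element."""
    return [b for b in parent.findall("book") if b.get("location") == "US"]


# Like category/book[@location='US'][1]: the position restarts in every category
first_per_category = [
    us_books(category)[0].text
    for category in bookstore.findall("category")
    if us_books(category)
]

# Like (category/book[@location='US'])[n]: the position runs over all matches
all_us = [b for category in bookstore.findall("category") for b in us_books(category)]
first_overall = all_us[0].text
second_overall = all_us[1].text

print(first_per_category)
print(first_overall, second_overall)
```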
As an explanation to Jonathan Fingland's answer:
multiple conditions in the same predicate ([position()=1 and @location='US']) must be true as a whole
multiple conditions in consecutive predicates ([position()=1][@location='US']) must be true one after another
this implies that [position()=1][@location='US'] != [@location='US'][position()=1]
while [position()=1 and @location='US'] == [@location='US' and position()=1]
hint: a lone [position()=1] can be abbreviated to [1]
You can build complex expressions in predicates with the Boolean operators "and" and "or", and with the Boolean XPath functions not(), true() and false(). Plus you can wrap sub-expressions in parentheses.
The easiest way to find the first US book node (in the whole document), even in a more complicated structured XML file like:
<bookstore>
<category>
<book location="US">A1</book>
<book location="FIN">A2</book>
</category>
<category>
<book location="FIN">B1</book>
<book location="US">B2</book>
</category>
</bookstore>
is xpath expression:
/descendant::book[@location='US'][1]
<bookstore>
<book location="US">A1</book>
<category>
<book location="US">B1</book>
<book location="FIN">B2</book>
</category>
<section>
<book location="FIN">C1</book>
<book location="US">C2</book>
</section>
</bookstore>
So, given the above, you can select the first book with
(//book[@location='US'])[1]
And this will find the first one anywhere that has a location US. [A1]
//book[@location='US']
Would return the node set with all books with location US. [A1,B1,C2]
(//category/book[@location='US'])[1]
Would return the first book with location US that exists in a category anywhere in the document. [B1]
(/bookstore//book[@location='US'])[1]
will return the first book with location US that exists anywhere under the root element bookstore; making the /bookstore part redundant really. [A1]
In direct answer:
/bookstore/book[@location='US'][1]
Will return the first book element with location US that is directly under bookstore. [A1]
Incidentally, if you wanted, in this example, to find the first US book that was not a direct child of bookstore:
(/bookstore/*//book[@location='US'])[1]
Use an index to get the desired node if the XPath is complicated or matches more than one node.
Ex:
(//bookstore[@location = 'US'])[index]
Replace index with the number of the node you want.
If the XML declares a namespace, it's better to use this:
(/*[local-name() ='bookstore']/*[local-name()='book'][@location='US'])[1]
For example, given
<input b="demo">
use
(input[@b='demo'])[1]
With help of an online xpath tester I'm writing this answer...
For this:
<table id="t2"><tbody>
<tr><td>123</td><td>other</td></tr>
<tr><td>foo</td><td>columns</td></tr>
<tr><td>bar</td><td>are</td></tr>
<tr><td>xyz</td><td>ignored</td></tr>
</tbody></table>
the following xpath:
id("t2") / tbody / tr / td[1]
outputs:
123
foo
bar
xyz
since td[1] selects every td element that is the first td child of its own direct parent.
But the following xpath:
(id("t2") / tbody / tr / td)[1]
outputs:
123

Resources