How to select the first element with a specific attribute using XPath - xpath

The XPath bookstore/book[1] selects the first book node under bookstore.
How can I select the first node that matches a more complicated condition, e.g. the first node that matches /bookstore/book[@location='US']?

Use:
(/bookstore/book[@location='US'])[1]
This will first get the book elements with the location attribute equal to 'US'. Then it will select the first node from that set. Note the use of parentheses, which are required by some implementations.
Note that this is not the same as /bookstore/book[1][@location='US'], which selects the first book only if it also happens to have that location attribute value.

/bookstore/book[@location='US'][1] works only with a simple structure.
Add a bit more structure and things break.
With:
<bookstore>
<category>
<book location="US">A1</book>
<book location="FIN">A2</book>
</category>
<category>
<book location="FIN">B1</book>
<book location="US">B2</book>
</category>
</bookstore>
/bookstore/category/book[@location='US'][1] yields
<book location="US">A1</book>
<book location="US">B2</book>
not "the first node that matches a more complicated condition", because [1] is applied within each category. /bookstore/category/book[@location='US'][2] returns nothing.
With parentheses you can get the result the original question asked for:
(/bookstore/category/book[@location='US'])[1] gives
<book location="US">A1</book>
and (/bookstore/category/book[@location='US'])[2] works as expected.
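To make the difference concrete, here is a minimal sketch in Python using only the standard library. ElementTree implements just a subset of XPath, so the positional step is emulated with ordinary list indexing; the variable names are my own.

```python
import xml.etree.ElementTree as ET

XML = """
<bookstore>
  <category>
    <book location="US">A1</book>
    <book location="FIN">A2</book>
  </category>
  <category>
    <book location="FIN">B1</book>
    <book location="US">B2</book>
  </category>
</bookstore>
"""

root = ET.fromstring(XML)

# Emulates /bookstore/category/book[@location='US'][1]:
# the [1] is evaluated separately inside each <category>.
first_per_category = []
for category in root.findall("category"):
    us_books = category.findall("book[@location='US']")
    if us_books:
        first_per_category.append(us_books[0].text)

# Emulates (/bookstore/category/book[@location='US'])[1]:
# collect every match in document order, then take the first one.
all_us_books = root.findall("category/book[@location='US']")
first_overall = all_us_books[0].text

print(first_per_category)  # ['A1', 'B2']
print(first_overall)       # A1
```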

As an explanation to Jonathan Fingland's answer:
multiple conditions in the same predicate ([position()=1 and @location='US']) must be true as a whole
multiple conditions in consecutive predicates ([position()=1][@location='US']) must be true one after another
this implies that [position()=1][@location='US'] != [@location='US'][position()=1]
while [position()=1 and @location='US'] == [@location='US' and position()=1]
hint: a lone [position()=1] can be abbreviated to [1]
You can build complex expressions in predicates with the Boolean operators "and" and "or", and with the Boolean XPath functions not(), true() and false(). Plus you can wrap sub-expressions in parentheses.
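The ordering rules above can be sketched with plain Python list operations. This is only an emulation of the predicate semantics, not an XPath engine, and the two-book sample (X1, X2) is a hypothetical document chosen so that the first book is not a US book.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the first book is FIN, the second is US.
root = ET.fromstring(
    '<bookstore>'
    '<book location="FIN">X1</book>'
    '<book location="US">X2</book>'
    '</bookstore>'
)
books = root.findall("book")


def is_us(book):
    return book.get("location") == "US"


# book[position()=1][@location='US']: take the first book, then test it.
first_then_us = [b.text for b in books[:1] if is_us(b)]

# book[@location='US'][position()=1]: keep the US books, then take the first.
us_then_first = [b.text for b in books if is_us(b)][:1]

# book[position()=1 and @location='US']: both tests use the original position.
both_at_once = [
    b.text for pos, b in enumerate(books, start=1) if pos == 1 and is_us(b)
]

print(first_then_us)  # []
print(us_then_first)  # ['X2']
print(both_at_once)   # []
```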

The easiest way to find the first US book node (in the whole document), given a more complicated XML structure such as:
<bookstore>
<category>
<book location="US">A1</book>
<book location="FIN">A2</book>
</category>
<category>
<book location="FIN">B1</book>
<book location="US">B2</book>
</category>
</bookstore>
is the XPath expression:
/descendant::book[@location='US'][1]

<bookstore>
<book location="US">A1</book>
<category>
<book location="US">B1</book>
<book location="FIN">B2</book>
</category>
<section>
<book location="FIN">C1</book>
<book location="US">C2</book>
</section>
</bookstore>
Given the above, you can select the first book with
(//book[@location='US'])[1]
This finds the first book anywhere in the document that has location US. [A1]
//book[@location='US']
returns the node set of all books with location US. [A1,B1,C2]
(//category/book[@location='US'])[1]
returns the first book with location US that is a child of a category anywhere in the document. [B1]
(/bookstore//book[@location='US'])[1]
returns the first book with location US that exists anywhere under the root element bookstore, which makes the /bookstore part redundant really. [A1]
In direct answer:
/bookstore/book[@location='US'][1]
returns the first book element with location US that is a direct child of bookstore. [A1]
Incidentally, if you wanted, in this example, to find the first US book that is not a direct child of bookstore:
(/bookstore/*//book[@location='US'])[1]

Use an index to get the desired node if the XPath is complicated or matches more than one node.
Example:
(//bookstore[@location = 'US'])[index]
Replace index with the 1-based position of the node you want.

If the XML declares a namespace, it's better to use this:
(/*[local-name() ='bookstore']/*[local-name()='book'][@location='US'])[1]
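As an alternative to local-name(), most APIs let you bind a prefix to the namespace. A minimal sketch with Python's standard library follows; the namespace URI urn:example:books, the prefix b and the sample books N1 to N3 are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaced document; the URI is an assumption for this sketch.
XML = """
<bookstore xmlns="urn:example:books">
  <book location="FIN">N1</book>
  <book location="US">N2</book>
  <book location="US">N3</book>
</bookstore>
"""

root = ET.fromstring(XML)

# Bind a prefix of our own choosing to the document's default namespace.
ns = {"b": "urn:example:books"}

# All US books in document order; index 0 plays the role of (...)[1].
us_books = root.findall("b:book[@location='US']", ns)
first_us = us_books[0].text

print(first_us)  # N2
```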

For example, given
<input b="demo">
use
(input[@b='demo'])[1]

With help of an online xpath tester I'm writing this answer...
For this:
<table id="t2"><tbody>
<tr><td>123</td><td>other</td></tr>
<tr><td>foo</td><td>columns</td></tr>
<tr><td>bar</td><td>are</td></tr>
<tr><td>xyz</td><td>ignored</td></tr>
</tbody></table>
the following xpath:
id("t2") / tbody / tr / td[1]
outputs:
123
foo
bar
xyz
This is because td[1] selects every td element that is the first td child of its own parent.
But the following xpath:
(id("t2") / tbody / tr / td)[1]
outputs:
123
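The same two results can be reproduced with Python's standard library, as a rough sketch. ElementTree has no id() function, so the table element itself is used as the starting point, and the parenthesised form is emulated by indexing the result list.

```python
import xml.etree.ElementTree as ET

table = ET.fromstring(
    '<table id="t2"><tbody>'
    '<tr><td>123</td><td>other</td></tr>'
    '<tr><td>foo</td><td>columns</td></tr>'
    '<tr><td>bar</td><td>are</td></tr>'
    '<tr><td>xyz</td><td>ignored</td></tr>'
    '</tbody></table>'
)

# tbody/tr/td[1]: the first <td> of every <tr>.
first_cell_per_row = [td.text for td in table.findall("tbody/tr/td[1]")]

# (tbody/tr/td)[1]: the first of all <td> elements in document order.
first_cell_overall = table.findall("tbody/tr/td")[0].text

print(first_cell_per_row)  # ['123', 'foo', 'bar', 'xyz']
print(first_cell_overall)  # 123
```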

Related

XPath Expression referencing a node

I am trying to reference a node in an expression. Take this simple example:
<?xml version="1.0" encoding="UTF-8" ?>
<homelist>
<homes>
<home>
<hname>house</hname>
<location>hell</location>
<url>wee</url>
<cID>1234</cID>
</home>
</homes>
<contacts>
<contactdetails cID="1234">
<cname>John Smith</cname>
<phone>0123234</phone>
<email>test@gmail.com</email>
</contactdetails>
</contacts>
</homelist>
I basically want to select a node if its value appears somewhere else in the tree.
For example, I want to display the url of homes that have cID of John Smith. I tried this but it doesn't work, what is wrong with it:
homelist/homes/home[ancestor::homelist/contacts/contactdetails[cname="John Smith"]/url
"/homelist/homes/home[cID = /homelist/contacts/contactdetails[cname='John Smith']/@cID]/url"
You want to find the <home> whose <cID> child's text content equals that of the cID= attribute of the <contactdetails> whose <cname> contains 'John Smith', then return its <url> child.
Note that I've written this as an absolute path, from the root, since you didn't tell us what the context node was going to be for this XPath.
There are certainly other ways of writing the same concept; this is just the first one that occurred to me offhand.
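For instance, the same lookup can be done in two steps from a host language. The following is a minimal sketch with Python's standard library (not a single XPath expression); it first collects the cID attributes of the matching contacts and then filters the homes by their cID child.

```python
import xml.etree.ElementTree as ET

XML = """
<homelist>
  <homes>
    <home>
      <hname>house</hname>
      <location>hell</location>
      <url>wee</url>
      <cID>1234</cID>
    </home>
  </homes>
  <contacts>
    <contactdetails cID="1234">
      <cname>John Smith</cname>
      <phone>0123234</phone>
    </contactdetails>
  </contacts>
</homelist>
"""

root = ET.fromstring(XML)

# Step 1: the cID attribute of every contact named John Smith.
contact_ids = {
    contact.get("cID")
    for contact in root.findall("contacts/contactdetails[cname='John Smith']")
}

# Step 2: the <url> of every home whose <cID> child is one of those ids.
urls = [
    home.findtext("url")
    for home in root.findall("homes/home")
    if home.findtext("cID") in contact_ids
]

print(urls)  # ['wee']
```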
If you preferred to use ancestor or parent, you could say
"/homelist/homes/home[cID = ancestor::homelist/contacts/contactdetails[cname='John Smith']/@cID]/url"

Selecting attributes of a filtered nodeset in XPath

I'm trying to learn XPath syntax. I'm using the w3schools example here:
http://www.w3schools.com/xsl/tryit.asp?filename=try_xpath_select_pricenodes_high
..which is based on the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
</bookstore>
The example selects the title of books having a price greater than 35. I wanted to play with this example and select the category names instead of titles. So, I tried this:
/bookstore/book[price>35]/@category
And, as you can see by testing it yourself on that site, it produces no output. What am I getting wrong?
Thanks for your time.
The query is OK. But since you're dealing with attribute nodes and not elements, you have to adjust the code that prints the result. Change the lines
document.write(nodes[i].childNodes[0].nodeValue);
document.write(result.childNodes[0].nodeValue);
to
document.write(nodes[i].nodeValue);
document.write(result.nodeValue);
and you'll get the expected result.
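The same distinction shows up outside the browser: an attribute carries its value directly, while an element's text sits in a child text node. Below is a small sketch with Python's standard library. ElementTree's XPath subset cannot select attribute nodes, so the attribute is read with get(); the threshold of 29.99 is my own assumption, chosen because the two-book excerpt above contains no price greater than 35.

```python
import xml.etree.ElementTree as ET

XML = """
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <price>29.99</price>
  </book>
</bookstore>
"""

root = ET.fromstring(XML)

# Assumed threshold for this excerpt (the original question used 35).
THRESHOLD = 29.99

# Element text is read from the <price> child; the attribute value is read
# straight from the <book> element itself.
categories = [
    book.get("category")
    for book in root.findall("book")
    if float(book.findtext("price")) > THRESHOLD
]

print(categories)  # ['COOKING']
```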

Get node value from two dependent xml files

I am a beginner in XQuery, and I have a query that needs to relate the data in two different XML files.
Book.xml
<library>
<Book>
<title>Title1</title>
<author>Ellizabith</author>
</Book>
<Book>
<title>Title2</title>
<author>Sam</author>
</Book>
<Book>
<title>Title3</title>
<author>Ryan</author>
</Book>
</library>
author.xml
<authorRoot>
<author>
<Name>Rayan</Name>
<Location>Yahoo</Location>
</author>
<author>
<Name>Syan</Name>
<Location>Google</Location>
</author>
<author>
<Name>Sam</Name>
<Location>Bing</Location>
</author>
</authorRoot>
I want a query that shows the location of all the authors of books whose title contains the word "Title2".
This is my code :
for $p in doc("C:\Users\User\Desktop\Book.xml")//library/book/[title/contains(., 'Title1')]
for $a in doc("C:\Users\User\Desktop\author.xml")//authorRoot
let $p := $p/author/text()
let $d := $a/author
let $f := $d/text()=$p/Location/text()
return $f
There are multiple smaller problems with your code.
Names in the authors and books XML files do not match. I guess these are only typos.
Predicates belong to an axis step, not on their own (remove the slash after book in line one).
XML and XQuery are case sensitive! <Book/> uses a capital B, so do the same in your XQuery, again in line one.
In line two, you're looping over all <authorRoot/> elements. Use authorRoot/author instead.
In line three you're hiding the book $p with its name for the rest of this FLWOR expression, but you want to use the book again in line five. Better use another variable name.
Don't use the descendant-or-self step // if you don't need it (lines one and two). It decreases performance.
I don't get what your idea for filtering was in lines three to five. Compare your attempt with this working solution. Additionally I used descriptive variable names; don't confuse yourself with unnecessarily short ones.
Replace $books and $authors with the respective doc(...) calls.
for $book in $books//library/Book[title/contains(., 'Title1')]
for $author in $authors//authorRoot/author
where $book/author = $author/Name
return $author/Location/text()
If you want to have a list of distinct places, wrap distinct-values(...) around all four lines.
An alternative without explicit loops:
$authors/authorRoot/author[
Name = $books/library/Book[contains(title, 'Title1')]/author
]/Location
The second solution is also valid XPath 1.0, the first requires an XPath 3.0 or XQuery processor.
If you use
let $books := <library>
<Book>
<title>Title1</title>
<author>Ellizabith</author>
</Book>
<Book>
<title>Title2</title>
<author>Sam</author>
</Book>
<Book>
<title>Title3</title>
<author>Ryan</author>
</Book>
</library>
let $authors := <authorRoot>
<author>
<Name>Rayan</Name>
<Location>Yahoo</Location>
</author>
<author>
<Name>Syan</Name>
<Location>Google</Location>
</author>
<author>
<Name>Sam</Name>
<Location>Bing</Location>
</author>
<author>
<Name>Ellizabith</Name>
<Location>Apple</Location>
</author>
</authorRoot>
return $authors/author[Name = $books/Book[contains(title, 'Title1')]/author]/Location/string()
the result is Apple.
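For readers more familiar with a general-purpose language, the same join can be sketched in Python with the standard library. This is only an illustration of the logic (filter the books, then match authors by name), not a replacement for the XQuery; the data is abbreviated from the sample above.

```python
import xml.etree.ElementTree as ET

books = ET.fromstring("""
<library>
  <Book><title>Title1</title><author>Ellizabith</author></Book>
  <Book><title>Title2</title><author>Sam</author></Book>
  <Book><title>Title3</title><author>Ryan</author></Book>
</library>
""")

authors = ET.fromstring("""
<authorRoot>
  <author><Name>Rayan</Name><Location>Yahoo</Location></author>
  <author><Name>Syan</Name><Location>Google</Location></author>
  <author><Name>Sam</Name><Location>Bing</Location></author>
  <author><Name>Ellizabith</Name><Location>Apple</Location></author>
</authorRoot>
""")

# Authors of every book whose title contains 'Title1'.
wanted_names = {
    book.findtext("author")
    for book in books.findall("Book")
    if "Title1" in book.findtext("title", default="")
}

# Locations of the author records whose Name is one of those authors.
locations = [
    author.findtext("Location")
    for author in authors.findall("author")
    if author.findtext("Name") in wanted_names
]

print(locations)  # ['Apple']
```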

Selecting a XML node with LINQ, and modifying

I've got the following XML:
<Config>
<Book>
<Name> Book Name #1 </Name>
<Available In>
<Country>US</Country>
<Country>Canada</Country>
</Available In>
</Book>
</Config>
I need to find all instances of Book which are available in a specific country, and then introduce a node underneath "Available In". My selection statement fails anytime I add the where statement:
XElement xmlFile = XElement.Load(xmlFileLocation);
var q = (from c in xmlFile.Elements(“Book”)
where c.Elements(Country).Value == "Canada"
select c;
.Value can't be resolved, and ToString gives me the entire subnode in string form. I need to select all books in a particular country so that I can then update them all to include a new locale node, e.g.:
<Config>
<Book>
<Name> Book Name #1 </Name>
<Available In>
<Country>US</Country>
<Country>Canada</Country>
</Available In>
<LocaleIDs>
<LocaleID> 3066 </LocaleID>
</LocaleIDs>
</Book>
</Config>
Thanks for your help!
You're trying to use Value on the result of calling Elements which returns a sequence of elements. That's not going to work - it doesn't make any sense. You want to call it on a single element at a time.
Additionally, you're trying to look for direct children of Book, which ignores the Available In element, which isn't even a valid element name...
I suspect you want something like:
var query = xmlFile.Elements("Book")
.Where(book => book.Descendants("Country")
.Any(country => (string) country == "Canada"));
In other words, find Book elements where any of the descendant Country elements has a text value of "Canada".
You'll still need to fix your XML to use valid element names though...
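For comparison, here is a rough Python analogue of the whole task (select, then modify), using only the standard library. It is not LINQ; it assumes the invalid Available In element has been renamed to AvailableIn, and the second book is a hypothetical entry added to show that non-matching books are left alone.

```python
import xml.etree.ElementTree as ET

# Assumption: "Available In" renamed to the valid name "AvailableIn".
config = ET.fromstring("""
<Config>
  <Book>
    <Name>Book Name #1</Name>
    <AvailableIn>
      <Country>US</Country>
      <Country>Canada</Country>
    </AvailableIn>
  </Book>
  <Book>
    <Name>Book Name #2</Name>
    <AvailableIn>
      <Country>US</Country>
    </AvailableIn>
  </Book>
</Config>
""")

# Books where any descendant <Country> has the text "Canada".
canadian_books = [
    book
    for book in config.findall("Book")
    if any(country.text == "Canada" for country in book.iter("Country"))
]

# Add <LocaleIDs><LocaleID>3066</LocaleID></LocaleIDs> to each match.
for book in canadian_books:
    locale_ids = ET.SubElement(book, "LocaleIDs")
    ET.SubElement(locale_ids, "LocaleID").text = "3066"

updated_names = [book.findtext("Name") for book in canadian_books]
print(updated_names)  # ['Book Name #1']
```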

XPath query: Query not working as expected

I was learning XPath using the following xml document: http://www.w3schools.com/xpath/xpath_examples.asp
Now, when I execute the query:
bookstore/book/author[contains(.,'G')]
I get the result: Giada De Laurentiis, James McGovern as expected. Now, since contains() returns a boolean value, I expected the following query to return all authors:
bookstore/book/author[true]
however, it returns an empty set. Can somebody explain?
You need bookstore/book/author
UPDATE: to pass true into XPath you have to use bookstore/book/author[true()]
author[true] just means that you want to get all author elements which have a true subelement.
You can check it yourself; try the expressions
bookstore/book[author1] vs bookstore/book[author]
The first one returns nothing, because there's no book element with an author1 subelement. The second one returns all book elements. But if you remove the author subnodes from some of the book nodes, you'll get only those that still have an author subnode.
So if you take xml like this
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
</book>
</bookstore>
then
bookstore/book[author] returns
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
and bookstore/book[title] returns
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
</book>
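You can check the "a bare name in a predicate is a child element test" behaviour with Python's standard library as well. A minimal sketch follows; note that ElementTree supports only a subset of XPath and has no true() function, so only the name-test forms are shown.

```python
import xml.etree.ElementTree as ET

XML = """
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
  </book>
</bookstore>
"""

root = ET.fromstring(XML)

# book[author]: books that have at least one <author> child.
with_author = [b.get("category") for b in root.findall("book[author]")]

# book[title]: books that have at least one <title> child.
with_title = [b.get("category") for b in root.findall("book[title]")]

# author[true]: authors that have a child element named <true>; there are none.
authors_with_true_child = root.findall("book/author[true]")

print(with_author)              # ['COOKING']
print(with_title)               # ['COOKING', 'CHILDREN']
print(authors_with_true_child)  # []
```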
I expected the following query to return all authors:
bookstore/book/author[true]
however, it returns an empty set. Can somebody explain?
The above expression selects all bookstore/book/author elements that have a child element named true. In the provided XML document no author element has a child named true -- therefore the XPath expression selects nothing.
In a comment the OP asks:
But why does bookstore/book/author[true] not work, since it is
similar to the situation when contains() always returns true
contains() never returns (the string) "true" -- it returns the boolean value true() -- this is different from the string "true".
Explanation:
You are confused by the fact that the serialization of the boolean value true() to a string is the string "true".
This fact doesn't mean that the string "true" and the boolean value true() are identical.

Resources