Get node value from two dependent xml files - xpath

I am a beginner in XQuery, and I have two XML files where the data in one depends on the other.
Book.xml
<library>
<Book>
<title>Title1</title>
<author>Ellizabith</author>
</Book>
<Book>
<title>Title2</title>
<author>Sam</author>
</Book>
<Book>
<title>Title3</title>
<author>Ryan</author>
</Book>
</library>
author.xml
<authorRoot>
<author>
<Name>Rayan</Name>
<Location>Yahoo</Location>
</author>
<author>
<Name>Syan</Name>
<Location>Google</Location>
</author>
<author>
<Name>Sam</Name>
<Location>Bing</Location>
</author>
</authorRoot>
I want a query that shows the location of all the authors of books whose title contains the word "Title2".
This is my code :
for $p in doc("C:\Users\User\Desktop\Book.xml")//library/book/[title/contains(., 'Title1')]
for $a in doc("C:\Users\User\Desktop\author.xml")//authorRoot
let $p := $p/author/text()
let $d := $a/author
let $f := $d/text()=$p/Location/text()
return $f

There are several smaller problems with your code.
The names in the author and book XML files do not match. I guess these are just typos.
Predicates belong to an axis step and cannot stand on their own (remove the slash after book in line one).
XML and XQuery are case-sensitive! <Book/> uses a capital B, so do the same in your XQuery, again in line one.
In line two, you're looping over all <authorRoot/> elements. Use authorRoot/author instead.
In line three, you're shadowing the book $p with its author name for the rest of this FLWOR expression, but you want to use the book again in line five. Better use another variable name.
Avoid the descendant-or-self step // if you don't need it (lines one and two), as it can hurt performance.
I don't understand what your idea for filtering was in lines three to five. Compare your code with this working solution. I also used descriptive variable names; don't confuse yourself with unnecessarily short ones.
Replace $books and $authors with the respective doc(...) calls.
for $book in $books//library/Book[title/contains(., 'Title1')]
for $author in $authors//authorRoot/author
where $book/author = $author/Name
return $author/Location/text()
If you want to have a list of distinct places, wrap distinct-values(...) around all four lines.
An alternative without explicit loops:
$authors/authorRoot/author[
Name = $books/library/Book[contains(title, 'Title1')]/author
]/Location
The second solution is also valid XPath 1.0, the first requires an XPath 3.0 or XQuery processor.

If you use
let $books := <library>
<Book>
<title>Title1</title>
<author>Ellizabith</author>
</Book>
<Book>
<title>Title2</title>
<author>Sam</author>
</Book>
<Book>
<title>Title3</title>
<author>Ryan</author>
</Book>
</library>
let $authors := <authorRoot>
<author>
<Name>Rayan</Name>
<Location>Yahoo</Location>
</author>
<author>
<Name>Syan</Name>
<Location>Google</Location>
</author>
<author>
<Name>Sam</Name>
<Location>Bing</Location>
</author>
<author>
<Name>Ellizabith</Name>
<Location>Apple</Location>
</author>
</authorRoot>
return $authors/author[Name = $books/Book[contains(title, 'Title1')]/author]/Location/string()
the result is Apple.
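For readers without an XQuery processor at hand, the same join can be sketched in Python with the standard library's xml.etree.ElementTree. This is only an illustration of the logic (filter books by title, then match author names), not a replacement for the XQuery above; the function name author_locations and the inlined sample strings are my own assumptions.

```python
import xml.etree.ElementTree as ET

BOOKS_XML = """<library>
  <Book><title>Title1</title><author>Ellizabith</author></Book>
  <Book><title>Title2</title><author>Sam</author></Book>
  <Book><title>Title3</title><author>Ryan</author></Book>
</library>"""

AUTHORS_XML = """<authorRoot>
  <author><Name>Rayan</Name><Location>Yahoo</Location></author>
  <author><Name>Syan</Name><Location>Google</Location></author>
  <author><Name>Sam</Name><Location>Bing</Location></author>
  <author><Name>Ellizabith</Name><Location>Apple</Location></author>
</authorRoot>"""


def author_locations(books_xml, authors_xml, title_part):
    """Locations of the authors of all books whose title contains title_part."""
    books = ET.fromstring(books_xml)
    authors = ET.fromstring(authors_xml)
    # Author names of the books whose title contains the search text.
    wanted = {
        book.findtext("author")
        for book in books.findall("Book")
        if title_part in book.findtext("title", default="")
    }
    # Walk the authors in document order and keep the matching ones.
    return [
        author.findtext("Location")
        for author in authors.findall("author")
        if author.findtext("Name") in wanted
    ]


print(author_locations(BOOKS_XML, AUTHORS_XML, "Title1"))  # ['Apple']
```

Searching for "Title3" returns an empty list, because "Ryan" in the book file and "Rayan" in the author file do not match.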

Related

Getting attribute name (not attribute value) with Xpath

What would an XPath expression look like that retrieves all attribute names (not attribute values!) for a given node or XML tag?
Assume the following XML document:
<bookstore>
<book>
<title lang="eng">Harry Potter</title>
<price>29.99</price>
</book>
<book>
<title lang="fr" type="easyreading">Monsieur Claude</title>
<price>39.95</price>
</book>
</bookstore>
The XPath //title/@* would select "eng, fr, easyreading", but which XPath would select "lang, lang, type"?
Give this a try:
//@*/name()
returns
String='lang'
String='lang'
String='type'
See here regarding the name() function.
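If you want to check the distinction between attribute names and attribute values outside an XPath tool, here is a small Python sketch using xml.etree.ElementTree. It does not evaluate the XPath above; it just walks the title elements and reads their attribute dictionaries. The variable names are my own.

```python
import xml.etree.ElementTree as ET

BOOKSTORE_XML = """<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="fr" type="easyreading">Monsieur Claude</title>
    <price>39.95</price>
  </book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE_XML)

# Each element exposes its attributes as a dict: keys are names, values are values.
attribute_names = [name for title in root.iter("title") for name in title.attrib]
attribute_values = [value for title in root.iter("title") for value in title.attrib.values()]

print(attribute_names)   # ['lang', 'lang', 'type']
print(attribute_values)  # ['eng', 'fr', 'easyreading']
```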

Selecting attributes of a filtered nodeset in XPath

I'm trying to learn XPath syntax. I'm using the w3schools example here:
http://www.w3schools.com/xsl/tryit.asp?filename=try_xpath_select_pricenodes_high
..which is based on the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
</bookstore>
The example selects the title of books having a price greater than 35. I wanted to play with this example and select the category names instead of titles. So, I tried this:
/bookstore/book[price>35]/@category
And, as you can see by testing it yourself on that site, it produces no output. What am I getting wrong?
Thanks for your time.
The query is OK. But since you're dealing with attribute nodes and not elements, you have to adjust the code that prints the result. Change the lines
document.write(nodes[i].childNodes[0].nodeValue);
document.write(result.childNodes[0].nodeValue);
to
document.write(nodes[i].nodeValue);
document.write(result.nodeValue);
and you'll get the expected result.
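The same element-versus-attribute difference shows up in other APIs too. As a rough Python sketch with xml.etree.ElementTree: an element's text lives in its content, while an attribute value is read directly from the element that carries it. The sample data below is an assumption on my part (a shortened bookstore where one book costs more than 35), and the price filter is done in plain Python because ElementTree's XPath subset has no numeric comparison.

```python
import xml.etree.ElementTree as ET

# Assumed sample data: two books, only the second one costs more than 35.
SAMPLE_XML = """<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <price>30.00</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>"""

root = ET.fromstring(SAMPLE_XML)

# Plain-Python stand-in for the predicate [price>35].
expensive = [book for book in root.findall("book") if float(book.findtext("price")) > 35]

# Element value: read from the text content of the child element.
titles = [book.find("title").text for book in expensive]
# Attribute value: read directly, there is no child node to descend into.
categories = [book.get("category") for book in expensive]

print(titles)      # ['Learning XML']
print(categories)  # ['WEB']
```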

Selecting a XML node with LINQ, and modifying

I've got the following XML:
<Config>
<Book>
<Name> Book Name #1 </Name>
<Available In>
<Country>US</Country>
<Country>Canada</Country>
</Available In>
</Book>
</Config>
I need to find all instances of Book which are available in a specific country, and then introduce a node underneath "Available In". My selection statement fails anytime I add the where statement:
XElement xmlFile = XElement.Load(xmlFileLocation);
var q = from c in xmlFile.Elements("Book")
where c.Elements("Country").Value == "Canada"
select c;
.Value can't be resolved, and ToString() gives me the entire subnode in string form. I need to select all books available in a particular country so that I can then update them all to include a new locale node, e.g.:
<Config>
<Book>
<Name> Book Name #1 </Name>
<Available In>
<Country>US</Country>
<Country>Canada</Country>
</Available In>
<LocaleIDs>
<LocaleID> 3066 </LocaleID>
</LocaleIDs>
</Book>
</Config>
Thanks for your help!
You're trying to use Value on the result of calling Elements which returns a sequence of elements. That's not going to work - it doesn't make any sense. You want to call it on a single element at a time.
Additionally, you're trying to look for direct children of Book, which ignores the Available In element, which isn't even a valid element name...
I suspect you want something like:
var query = xmlFile.Elements("Book")
.Where(book => book.Descendants("Country")
.Any(country => (string) country == "Canada"));
In other words, find Book elements where any of the descendant Country elements has a text value of "Canada".
You'll still need to fix your XML to use valid element names though...
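To illustrate the whole select-then-modify step in one place, here is a sketch in Python with xml.etree.ElementTree rather than LINQ. It assumes the XML has been fixed to use a valid element name (AvailableIn is my assumption), and the helper name add_locale and the second sample book are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed input: "Available In" renamed to the valid element name AvailableIn.
CONFIG_XML = """<Config>
  <Book>
    <Name>Book Name #1</Name>
    <AvailableIn>
      <Country>US</Country>
      <Country>Canada</Country>
    </AvailableIn>
  </Book>
  <Book>
    <Name>Book Name #2</Name>
    <AvailableIn>
      <Country>US</Country>
    </AvailableIn>
  </Book>
</Config>"""


def add_locale(config, country, locale_id):
    """Append <LocaleIDs><LocaleID> to every Book available in the given country."""
    updated = []
    for book in config.findall("Book"):
        # Test each descendant Country element individually, one value at a time.
        if any(c.text == country for c in book.iter("Country")):
            locale_ids = ET.SubElement(book, "LocaleIDs")
            ET.SubElement(locale_ids, "LocaleID").text = locale_id
            updated.append(book)
    return updated


config = ET.fromstring(CONFIG_XML)
updated = add_locale(config, "Canada", "3066")
print([book.findtext("Name") for book in updated])  # ['Book Name #1']
```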

XQuery ancestor axis doesn't work, but explicit XPath does

Consider the following XML snippet:
<doc>
<chapter id="1">
<item>
<para>some text here</para>
</item>
</chapter>
</doc>
In XQuery, I have a function that needs to do some things based on the ancestor chapter of a given "para" element that is passed in as a parameter, as shown in the stripped down example below:
declare function doSomething($para){
let $chapter := $para/ancestor::chapter
return "some stuff"
};
In that example, $chapter keeps coming up empty. However, if I write the function similar to the following (i.e., without using the ancestor axis), I get the desired "chapter" element:
declare function doSomething($para){
let $chapter := $para/../..
return "some stuff"
};
The problem is that I cannot use explicit paths as in the latter example because the XML I will be searching is not guaranteed to have the "chapter" element as a grandparent every time. It may be a great-grandparent or great-great-grandparent, and so on, as shown below:
<doc>
<chapter id="1">
<item>
<subItem>
<para>some text here</para>
</subItem>
</item>
</chapter>
</doc>
Does anyone have an explanation as to why the axis doesn't work, while the explicit XPath does? Also, does anyone have any suggestions on how to solve this problem?
Thank you.
SOLUTION:
The mystery is now solved.
The node in question was re-created in another function, which had the result of stripping it of all of its ancestor information. Unfortunately, the previous developer did not document this wonderful, little function and has cost us all a good deal of time.
So, the ancestor axis worked exactly as it should - it was just being applied to a deceptive node.
I thank all of you for your efforts in answering my questions.
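For anyone who runs into the same thing: the effect of a re-created node can be imitated in Python. This is only an analogy, not XQuery. xml.etree.ElementTree keeps no parent pointers, so the sketch builds a child-to-parent map from the tree; a copied element is not in that map and therefore has no ancestors. The helper name ancestors is my own.

```python
import copy
import xml.etree.ElementTree as ET

DOC_XML = """<doc>
  <chapter id="1">
    <item>
      <para>some text here</para>
    </item>
  </chapter>
</doc>"""


def ancestors(root, node):
    """Ancestors of node within root's tree, nearest first."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    chain = []
    while node in parent_of:
        node = parent_of[node]
        chain.append(node)
    return chain


doc = ET.fromstring(DOC_XML)
para = doc.find(".//para")
rebuilt = copy.deepcopy(para)  # same content, but no longer part of the tree

print([a.tag for a in ancestors(doc, para)])     # ['item', 'chapter', 'doc']
print([a.tag for a in ancestors(doc, rebuilt)])  # []
```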
The ancestor axis does work fine. I suspect your problem is namespaces. The example you showed and that I ran (below) has XML without any namespaces. If your XML has a namespace, then you would need to provide that in the ancestor XPath, like this: $para/ancestor::foo:chapter, where the prefix foo is bound to the correct namespace for the chapter element.
let $doc := <doc>
<chapter id="1">
<item>
<para>some text here</para>
</item>
</chapter>
</doc>
let $para := $doc//para
return $para/ancestor::chapter
RESULT:
<?xml version="1.0" encoding="UTF-8"?>
<chapter id="1">
<item>
<para>some text here</para>
</item>
</chapter>
These things almost always boil down to namespaces! As a diagnostic to confirm 100% that namespaces are not the issue, can you try:
declare function local:doSomething($para) {
let $chapter := $para/ancestor::*[local-name() = 'chapter']
return $chapter
};
This seems surprising to me; which XQuery implementation are you using? With BaseX, the following query...
declare function local:doSomething($para) {
let $chapter := $para/ancestor::chapter
return $chapter
};
let $xml :=
<doc>
<chapter id="1">
<item>
<para>some text here</para>
</item>
</chapter>
</doc>
return local:doSomething($xml//para)
...returns...
<chapter id="1">
<item>
<para>some text here</para>
</item>
</chapter>
I suspect namespaces too. If $para/../.. works but $para/parent::item/parent::chapter turns up empty, then you know it's a question of namespaces.
Look for an xmlns declaration at the top of your content, e.g.:
<doc xmlns="http://example.com">
...
</doc>
In your XQuery, you then need to bind that namespace to a prefix and use that prefix in your XQuery/XPath expressions, like this:
declare namespace my="http://example.com";
declare function doSomething($para){
let $chapter := $para/ancestor::my:chapter
return "some stuff"
};
What prefix you use doesn't matter. The important thing is that the namespace URI (http://example.com in the above example) matches up.
It makes sense that ../.. selects the element you want, because .. is short for parent::node() which selects the parent node regardless of its name (or namespace). Whereas ancestor::chapter will only select <chapter> elements that are not in a namespace (unless you have declared a default element namespace, which is usually not a good idea in XQuery because it affects both your input and your output).
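The name-versus-namespace point can be tried out in Python as well. The sketch below uses xml.etree.ElementTree, which writes qualified names as {uri}local; the helper nearest_ancestor is hypothetical, and the sample document uses the example URI from above with the deeper subItem nesting. Walking up by position always works, while matching by name only works with the namespace-qualified name.

```python
import xml.etree.ElementTree as ET

NS_DOC = """<doc xmlns="http://example.com">
  <chapter id="1">
    <item>
      <subItem>
        <para>some text here</para>
      </subItem>
    </item>
  </chapter>
</doc>"""


def nearest_ancestor(root, node, tag):
    """Closest ancestor of node whose (qualified) tag equals tag, or None."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    while node in parent_of:
        node = parent_of[node]
        if node.tag == tag:
            return node
    return None


doc = ET.fromstring(NS_DOC)
para = doc.find(".//{http://example.com}para")

# The unqualified name does not match an element that lives in a namespace.
unqualified = nearest_ancestor(doc, para, "chapter")
# The namespace-qualified name matches at any depth.
qualified = nearest_ancestor(doc, para, "{http://example.com}chapter")

print(unqualified)          # None
print(qualified.get("id"))  # 1
```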

How to select the first element with a specific attribute using XPath

The XPath bookstore/book[1] selects the first book node under bookstore.
How can I select the first node that matches a more complicated condition, e.g. the first node that matches /bookstore/book[@location='US']
Use:
(/bookstore/book[@location='US'])[1]
This will first get the book elements with the location attribute equal to 'US'. Then it will select the first node from that set. Note the use of parentheses, which are required by some implementations.
Note, this is not the same as /bookstore/book[1][@location='US'] unless the first element also happens to have that location attribute.
/bookstore/book[@location='US'][1] works only with simple structure.
Add a bit more structure and things break.
With-
<bookstore>
<category>
<book location="US">A1</book>
<book location="FIN">A2</book>
</category>
<category>
<book location="FIN">B1</book>
<book location="US">B2</book>
</category>
</bookstore>
/bookstore/category/book[@location='US'][1] yields
<book location="US">A1</book>
<book location="US">B2</book>
not "the first node that matches a more complicated condition". /bookstore/category/book[@location='US'][2] returns nothing.
With parentheses you can get the result the original question was for:
(/bookstore/category/book[@location='US'])[1] gives
<book location="US">A1</book>
and (/bookstore/category/book[@location='US'])[2] works as expected.
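To make the difference concrete, here is a Python sketch that imitates both expressions on the XML above. It deliberately does the "first" step in plain Python instead of relying on the XPath subset in xml.etree.ElementTree, so the two readings are spelled out: first match per parent versus first match overall. Variable names are my own.

```python
import xml.etree.ElementTree as ET

BOOKSTORE_XML = """<bookstore>
  <category>
    <book location="US">A1</book>
    <book location="FIN">A2</book>
  </category>
  <category>
    <book location="FIN">B1</book>
    <book location="US">B2</book>
  </category>
</bookstore>"""

root = ET.fromstring(BOOKSTORE_XML)

# Like /bookstore/category/book[@location='US'][1]:
# the [1] is applied separately inside each category.
first_per_category = []
for category in root.findall("category"):
    us_books = category.findall("book[@location='US']")
    first_per_category.extend(us_books[:1])

# Like (/bookstore/category/book[@location='US'])[1] and (...)[2]:
# collect all matches in document order first, then index the combined list.
all_us_books = root.findall("category/book[@location='US']")
first_overall = all_us_books[:1]
second_overall = all_us_books[1:2]

print([b.text for b in first_per_category])  # ['A1', 'B2']
print([b.text for b in first_overall])       # ['A1']
print([b.text for b in second_overall])      # ['B2']
```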
As an explanation to Jonathan Fingland's answer:
multiple conditions in the same predicate ([position()=1 and @location='US']) must be true as a whole
multiple conditions in consecutive predicates ([position()=1][@location='US']) must be true one after another
this implies that [position()=1][@location='US'] != [@location='US'][position()=1]
while [position()=1 and @location='US'] == [@location='US' and position()=1]
hint: a lone [position()=1] can be abbreviated to [1]
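The rules above can be imitated with plain Python lists, which may make the ordering effect easier to see. This is only a model of predicate evaluation, not an XPath engine; the sibling data and the helper is_us are made up for illustration.

```python
# Hypothetical sibling <book> elements as (name, location) pairs in document order.
siblings = [("A1", "FIN"), ("A2", "US"), ("A3", "US")]


def is_us(book):
    """Stand-in for the predicate [@location='US']."""
    return book[1] == "US"


# [position()=1][@location='US']: take the first sibling, then test it.
first_then_us = [book for book in siblings[:1] if is_us(book)]

# [@location='US'][position()=1]: filter first, then take the first survivor.
us_then_first = [book for book in siblings if is_us(book)][:1]

# [position()=1 and @location='US']: both conditions are checked against the
# original position, so the order inside the predicate does not matter.
both_at_once = [
    book for position, book in enumerate(siblings, start=1)
    if position == 1 and is_us(book)
]

print(first_then_us)  # []
print(us_then_first)  # [('A2', 'US')]
print(both_at_once)   # []
```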
You can build complex expressions in predicates with the Boolean operators "and" and "or", and with the Boolean XPath functions not(), true() and false(). Plus you can wrap sub-expressions in parentheses.
The easiest way to find the first English book node (in the whole document), taking into account a more complicated XML structure such as:
<bookstore>
<category>
<book location="US">A1</book>
<book location="FIN">A2</book>
</category>
<category>
<book location="FIN">B1</book>
<book location="US">B2</book>
</category>
</bookstore>
is xpath expression:
/descendant::book[@location='US'][1]
<bookstore>
<book location="US">A1</book>
<category>
<book location="US">B1</book>
<book location="FIN">B2</book>
</category>
<section>
<book location="FIN">C1</book>
<book location="US">C2</book>
</section>
</bookstore>
Given the above, you can select the first book with
(//book[@location='US'])[1]
And this will find the first one anywhere that has a location US. [A1]
//book[@location='US']
Would return the node set with all books with location US. [A1,B1,C2]
(//category/book[@location='US'])[1]
Would return the first book with location US that exists in a category anywhere in the document. [B1]
(/bookstore//book[@location='US'])[1]
will return the first book with location US that exists anywhere under the root element bookstore, which makes the /bookstore part redundant really. [A1]
In direct answer:
/bookstore/book[@location='US'][1]
Will return the first book element with location US that is a direct child of bookstore. [A1]
Incidentally, if you wanted in this example to find the first US book that was not a direct child of bookstore:
(/bookstore/*//book[@location='US'])[1]
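A quick way to double-check the bracketed results is a Python sketch with xml.etree.ElementTree. ElementTree only implements a subset of XPath, so the "first of all matches" step is done by indexing the result list in Python; the variable names are my own.

```python
import xml.etree.ElementTree as ET

BOOKSTORE_XML = """<bookstore>
  <book location="US">A1</book>
  <category>
    <book location="US">B1</book>
    <book location="FIN">B2</book>
  </category>
  <section>
    <book location="FIN">C1</book>
    <book location="US">C2</book>
  </section>
</bookstore>"""

root = ET.fromstring(BOOKSTORE_XML)

# Like //book[@location='US']: every US book, in document order.
all_us = [book.text for book in root.iter("book") if book.get("location") == "US"]

# Like (//book[@location='US'])[1]: first match anywhere.
first_anywhere = all_us[0]

# Like (//category/book[@location='US'])[1]: first match inside any category.
first_in_category = root.findall(".//category/book[@location='US']")[0].text

# Like /bookstore/book[@location='US'][1]: first match among direct children.
first_direct_child = root.findall("book[@location='US']")[0].text

print(all_us)              # ['A1', 'B1', 'C2']
print(first_anywhere)      # A1
print(first_in_category)   # B1
print(first_direct_child)  # A1
```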
Use an index to get the desired node if the XPath is complicated or more than one node matches the same XPath.
Ex:
(//bookstore[@location = 'US'])[index]
You can give the number of the node you want.
If the given XML declares a namespace, it's better to use this:
(/*[local-name() ='bookstore']/*[local-name()='book'][@location='US'])[1]
for ex.
<input b="demo">
And
(input[@b='demo'])[1]
With help of an online xpath tester I'm writing this answer...
For this:
<table id="t2"><tbody>
<tr><td>123</td><td>other</td></tr>
<tr><td>foo</td><td>columns</td></tr>
<tr><td>bar</td><td>are</td></tr>
<tr><td>xyz</td><td>ignored</td></tr>
</tbody></table>
the following xpath:
id("t2") / tbody / tr / td[1]
outputs:
123
foo
bar
xyz
Since [1] here means: select every td element that is the first td child of its own direct parent.
But the following xpath:
(id("t2") / tbody / tr / td)[1]
outputs:
123
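The same two results can be reproduced with Python's xml.etree.ElementTree as a rough check. ElementTree has no id() function, so this sketch parses the table itself as the root and starts the paths from there; the second case indexes the result list in Python to stand in for the parenthesized (...)[1].

```python
import xml.etree.ElementTree as ET

TABLE_XML = """<table id="t2"><tbody>
<tr><td>123</td><td>other</td></tr>
<tr><td>foo</td><td>columns</td></tr>
<tr><td>bar</td><td>are</td></tr>
<tr><td>xyz</td><td>ignored</td></tr>
</tbody></table>"""

table = ET.fromstring(TABLE_XML)

# td[1]: the position is counted within each tr, so every row contributes one cell.
first_cell_per_row = [td.text for td in table.findall("tbody/tr/td[1]")]

# (... / td)[1]: gather all cells first, then take the first of the whole list.
first_cell_overall = table.findall("tbody/tr/td")[0].text

print(first_cell_per_row)  # ['123', 'foo', 'bar', 'xyz']
print(first_cell_overall)  # 123
```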

Resources