I have an .xml file with entries like the following:
<country>
<province id="prov-cid-cia-Greece-3" country="GR">
<name>Attiki</name>
<area>3808</area>
<population>3522769</population>
<city id="cty-Greece-Athens" is_country_cap="yes" country="GR" province="prov-cid-cia-Greece-3">
<name>Athens</name>
<longitude>23.7167</longitude>
<latitude>37.9667</latitude>
<population year="81">885737</population>
<located_at watertype="sea" sea="sea-Mittelmeer"/>
</city>
</province>
</country>
However, there are also city nodes that do not have a province as their parent:
<country>
<city id="stadt-Shkoder-AL-AL" country="AL">
<name>Shkoder</name>
<longitude>19.2</longitude>
<latitude>42.2</latitude>
<population year="87">62000</population>
<located_at watertype="lake" lake="lake-Skutarisee"/>
</city>
</country>
I want to select all city nodes, but my current query only selects cities that do not have a province as their parent:
query = f"//country/city[@is_country_cap = \"yes\" and ./located_at[@watertype]]/name/text()"
How can I include country/province/city in my query? I have tried:
query = f"//country/(city | ./province/city)[@is_country_cap = \"yes\" and ./located_at[@watertype]]/name/text()"
without any success; I get an error.
You can match for all <city> elements that have a parent of <country> or <province>. Then, in a second predicate, add your other requirements like this:
//city[parent::country or parent::province][@is_country_cap = 'yes' and located_at[@watertype]]/name
Or, closer to your code:
query = f"//city[parent::country or parent::province][@is_country_cap = \"yes\" and located_at[@watertype]]/name/text()"
Maybe this is of some help to you.
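The error comes from the parenthesized union in the middle of the path: XPath 1.0 does not accept (a | b) as a step, although XPath 2.0 does. Also keep in mind that in XPath the | operator means "merge node-sets" and is not a logical "OR" like in C; inside a predicate, use the keyword or.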
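To illustrate, here is a minimal Python sketch using only the standard library. It assumes the two fragments from the question are wrapped in a hypothetical <mondial> root element, which the question does not show. Note that xml.etree.ElementTree supports only a subset of XPath (no parent:: axis, no and/or inside predicates), so the sketch uses .//city to reach cities at any depth and checks the located_at condition in Python. A full XPath 1.0 engine such as lxml should accept the expression above as written.

```python
import xml.etree.ElementTree as ET

# Sample data from the question; the <mondial> wrapper is an assumption.
XML = """<mondial>
  <country>
    <province id="prov-cid-cia-Greece-3" country="GR">
      <name>Attiki</name>
      <city id="cty-Greece-Athens" is_country_cap="yes" country="GR"
            province="prov-cid-cia-Greece-3">
        <name>Athens</name>
        <located_at watertype="sea" sea="sea-Mittelmeer"/>
      </city>
    </province>
  </country>
  <country>
    <city id="stadt-Shkoder-AL-AL" country="AL">
      <name>Shkoder</name>
      <located_at watertype="lake" lake="lake-Skutarisee"/>
    </city>
  </country>
</mondial>"""

root = ET.fromstring(XML)

# './/city' matches <city> at any depth, so both country/city and
# country/province/city are covered. The second condition is tested
# in Python because ElementTree predicates cannot be combined with 'and'.
capitals = [
    city.findtext("name")
    for city in root.findall(".//city[@is_country_cap='yes']")
    if city.find("located_at[@watertype]") is not None
]

print(capitals)  # ['Athens']
```

Unlike the parent::country or parent::province predicate, .//city would also match a city nested under some other element, so this is slightly less strict than the XPath above.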
<Cities>
<city>
<name />
<country />
<population asof = "2019" />
<total> 2918695</total>
<Average_age> 28 </Average_age>
</city>
<city>
<name />
<country />
<population asof = "2020" />
<total> 78805467 </total>
<Average_age> 32 </Average_age>
</city>
</Cities>
I want to build an XPath query that returns the total population of the cities where asof is higher than 2018.
Try this XPath-1.0 expression:
sum(/Cities/city[population/@asof > 2018]/total)
Or, another, less specific, version:
sum(//city[population/@asof > 2018]/total)
The expression to select population elements with an asof attribute greater than 2018 would be:
//population[@asof > '2018']
In your sample, <total> is a sibling of <population> (despite the indentation), so append following-sibling::total to the expression. If <total> were a child of <population>, you would use /total instead.
Following the first approach, the XPath continues as:
//population[@asof > '2018']/following-sibling::total
Add /text() at the end to get the text inside each matching <total> element. If you want the sum of the populations, put the whole expression inside the sum() function. The expression inside sum() would then be:
//population[@asof > '2018']/following-sibling::total/text()
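For comparison, the same calculation can be sketched in Python with the standard library. xml.etree.ElementTree cannot evaluate sum() or numeric comparisons in a predicate, so in this sketch the filter and the addition are done in Python; the data is the sample from the question.

```python
import xml.etree.ElementTree as ET

# Sample data from the question.
XML = """<Cities>
  <city>
    <name/>
    <country/>
    <population asof="2019"/>
    <total> 2918695</total>
    <Average_age> 28 </Average_age>
  </city>
  <city>
    <name/>
    <country/>
    <population asof="2020"/>
    <total> 78805467 </total>
    <Average_age> 32 </Average_age>
  </city>
</Cities>"""

root = ET.fromstring(XML)

# Equivalent of sum(/Cities/city[population/@asof > 2018]/total):
# keep the cities whose <population asof="..."> is above 2018 and add
# up their <total> values (int() tolerates the surrounding whitespace).
total_population = sum(
    int(city.findtext("total"))
    for city in root.findall("city")
    if int(city.find("population").get("asof")) > 2018
)

print(total_population)  # 81724162
```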
I am trying to fetch two nodes from XML as a combined result using an OR condition.
Nodes where name="John" or name="Jim" should both be returned. So basically I expect the following result:
<person name="John"></person>
<person name="Jim"></person>
I have tried the XPath expression //person[@name="John"] or //person[@name="Jim"]
but it gives me only one node.
How should I construct the XPath expression in this case?
regards,
Venky
I would use the predicate person[@name = ('John', 'Jim')], if we assume Saxon means a Saxon 9 version where XPath 2 or 3 is supported. Of course, the right place for your or expression would be inside the square brackets: person[@name = 'Jim' or @name = 'John'].
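As a rough Python analogue of the person[@name = ('John', 'Jim')] predicate, the sketch below tests the attribute against a set of allowed values. The <people> wrapper and the extra "Ann" entry are hypothetical, added only to show that non-matching nodes are skipped. The standard library's ElementTree has no or inside predicates, so the membership test is written in Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; only the John and Jim entries come from the question.
XML = """<people>
  <person name="John"></person>
  <person name="Jim"></person>
  <person name="Ann"></person>
</people>"""

root = ET.fromstring(XML)

wanted = {"John", "Jim"}

# One pass over all <person> elements; the condition sits where the
# XPath predicate would sit, so both matching nodes are returned.
matches = [p for p in root.iter("person") if p.get("name") in wanted]

names = [p.get("name") for p in matches]
print(names)  # ['John', 'Jim']
```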
Suppose the two elements below do not appear at the same time:
<a title='a' />
<b title='b' />
I want to check whether either of them is present.
Does XPath support an 'or' function? I just want to write it in one line:
//a[@title='a'] or .. @title='b' ??
XPath Operators
Select either matching nodes (your case here):
//a[@title='a'] | //b[@title='b']
Select one element with either matching attribute:
//a[@title='a' or @title='b']
If you want to match either <a/> elements with a @title='a' attribute or <b/> elements with a @title='b' attribute, you can also match all elements and perform a test on their name:
//*[local-name(.) = 'a' and @title='a' or local-name(.) = 'b' and @title='b']
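To make the difference between selecting nodes and checking for presence concrete, here is a small Python sketch using the standard library. The <root> wrapper is hypothetical, and only <b> is present to mimic the "not at the same time" situation. ElementTree has no | operator, so the two results are simply concatenated; that is a fair stand-in here because the two expressions can never select the same element.

```python
import xml.etree.ElementTree as ET

# Hypothetical document in which only <b title='b'/> is present.
XML = "<root><b title='b'/></root>"
root = ET.fromstring(XML)

# Stand-in for //a[@title='a'] | //b[@title='b'].
either = root.findall(".//a[@title='a']") + root.findall(".//b[@title='b']")

# The presence check asked for in the question: true if at least
# one of the two elements was found.
one_is_present = len(either) > 0

print([el.tag for el in either])  # ['b']
print(one_is_present)             # True
```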
I have an XPath expression like this:
element[@attr="a"] | element[@attr="b"] | element[@attr="c"] | … which is an »or« statement. Can I create an expression that guarantees the results appear in the order given in the query, even if the elements appear in a different order in the document?
For example, a document fragment in this order:
<doc>
<element attr="c" />
<element attr="b" />
<element attr="a" />
.
.
.
</doc>
and a result list ordered like this:
[0] <element attr="a" />
[1] <element attr="b" />
[2] <element attr="c" />
.
.
.
The | operator computes the union of its operands. With XPath 1.0 you simply get a set of nodes whose order is undefined, though most XPath APIs then return the result in document order, or allow you to say which order you want or whether order matters (see for instance http://www.w3.org/TR/DOM-Level-3-XPath/xpath.html#XPathResult).
With XPath 2.0 the union is a sequence of nodes in document order. If you want the order of your subexpressions instead, you need to use the comma operator rather than the union operator, i.e. element[@attr="a"] , element[@attr="b"] , element[@attr="c"].
can I create an expression that guarantees the result to appear in the
order as in the query, even if the elements appear in a different
order in the document?
Not with any XPath 1.0 engine -- they return the resulting XmlNodeList in document order.
With XPath 2.0 one can specify that a sequence is to be returned, using the comma , operator, like this:
element[@attr="a"] , element[@attr="b"] , element[@attr="c"]
Finally, if you are limited to an XPath 1.0 implementation, one way of getting the results in the desired order is to evaluate these three XPath expressions separately:
element[@attr="a"]
element[@attr="b"]
element[@attr="c"]
Then you can access the first result first, the second result second, and the third result third.
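The "evaluate each expression separately" approach could look roughly like this in Python, using the standard library's ElementTree and the document fragment from the question (without the elided part). The results of the individual queries are concatenated in query order.

```python
import xml.etree.ElementTree as ET

# Document fragment from the question: c, b, a in document order.
XML = """<doc>
  <element attr="c" />
  <element attr="b" />
  <element attr="a" />
</doc>"""

doc = ET.fromstring(XML)

# Run one query per attribute value and append the results in the
# order of the queries rather than in document order.
ordered = []
for value in ("a", "b", "c"):
    ordered.extend(doc.findall(f"element[@attr='{value}']"))

result_order = [el.get("attr") for el in ordered]
print(result_order)  # ['a', 'b', 'c']
```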
Below is the XML file:
<Countries>
<Country>
<Name>India</Name>
<Capital>New Delhi</Capital>
</Country>
<Country>
<Name>USA</Name>
<Capital>Washington DC</Capital>
</Country>
<Country>
<Name>England</Name>
<Capital>London</Capital>
</Country>
<Country>
<Name>Japan</Name>
<Capital>Tokyo</Capital>
</Country>
<Country>
<Name>Srilanka</Name>
<Capital>Colombo</Capital>
</Country>
</Countries>
I have stored it in BaseX, an XML database. If I had stored it in a plain DB, I would have written a simple select statement to retrieve the data from the table. For example:
select name, capital from country
and I would have got the rows back. How can this be done using XQuery?
In a relational database (which you graphically describe as a "plain database") every query takes tables as its input and produces a table as its output. In an XML database, the input is XML and the output is XML. So you need to describe the XML you want to produce. Once you have done that, the answer to your question is yes: you can certainly write an XQuery to produce that output.
Seems you want a sequence of all name and capital elements:
/Countries/Country/(Name|Capital)
The result produced by this query is:
<Name>India</Name>
<Capital>New Delhi</Capital>
<Name>USA</Name>
<Capital>Washington DC</Capital>
<Name>England</Name>
<Capital>London</Capital>
<Name>Japan</Name>
<Capital>Tokyo</Capital>
<Name>Srilanka</Name>
<Capital>Colombo</Capital>
If you are expecting the output without the elements, then try this one:
for $x in doc("Country")/Countries/Country
return string-join( ($x/Name, $x/Capital), " - ")
The output will be:
India - New Delhi USA - Washington DC England - London Japan - Tokyo Srilanka - Colombo
If you want them on separate lines, this is what you will have to use:
for $x in doc("Country")/Countries/Country
return <li>{string-join( ($x/Name, $x/Capital), " - ")}</li>
Instead of <li> you can use any other relevant tag.