Select children but exclude a specific one - XPath

I have this XML structure:
<?version="1.0" encoding="ISO-8859-1" ?>
<feed xmlns="http://www.w3.org/2005/Atom" >
<companyinfo>
<addresses>
<address type="mailing" >
<city>NEW YORK</city>
</address>
<address type="business" >
<city>NEW YORK</city>
</address>
<node1>node1</node1>
<node2>node2</node2>
<node3>nod3</node3>
</addresses>
</companyinfo>
</feed>
I want to select all children of <companyinfo> but exclude addresses from the result. Meaning my selection becomes all the <nodeX>.
After reading around and looking at related threads, I came up with the following:
//companyinfo[not(addresses)] # does not work
//companyinfo/*[not(addresses)] # does not work
Am I misunderstanding how not(expr) works?
Am I actually trying to select companyinfo IF addresses node is not present?

Your XML is invalid, but if you fix it (and the typo), this expression, though convoluted,
//*[local-name()="companyinfo"]//*[local-name()="addresses"]//*[not(ancestor-or-self::*[local-name()="address"])]
should output
node1
node2
node3
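The same selection can be checked outside an XPath engine. The following is a minimal Python sketch, assuming the XML has been corrected (declaration dropped, `nod3` fixed to `node3`). Because the standard library's ElementTree does not support `not()`, the exclusion is done with a plain Python filter rather than in the path itself; the function name `children_except_address` is made up for this illustration.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

# Corrected version of the question's XML (an assumption about the intended fix).
XML = """<feed xmlns="http://www.w3.org/2005/Atom">
  <companyinfo>
    <addresses>
      <address type="mailing"><city>NEW YORK</city></address>
      <address type="business"><city>NEW YORK</city></address>
      <node1>node1</node1>
      <node2>node2</node2>
      <node3>node3</node3>
    </addresses>
  </companyinfo>
</feed>"""


def children_except_address(xml_text):
    """Return the text of every child of <addresses> that is not an <address>."""
    root = ET.fromstring(xml_text)
    # The default Atom namespace applies to every element, so bind it to a prefix.
    addresses = root.find("a:companyinfo/a:addresses", {"a": ATOM})
    excluded_tag = f"{{{ATOM}}}address"
    return [child.text for child in addresses if child.tag != excluded_tag]


print(children_except_address(XML))
```

In a full XPath 1.0 engine, a shorter expression with the same intent would be `//*[local-name()="addresses"]/*[not(self::*[local-name()="address"])]`.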

Related

Can't sort data alphabetically by the Spanish alphabet when selecting it in a query

I have a set of assets which have a property "name".
I want to get a dynamic number of those assets, sorted alphabetically by that "name" property.
I use this query:
type=dam:Asset
path=/content/dam/en/foobar/contacts/
orderby=@jcr:content/data/master/@name
orderby.sort=asc
p.limit=3
and this is working, so in a set of names:
[Paloma, Abel, José, Eduardo]
it retrieves:
Abel, Eduardo, José.
The problem is with the Spanish alphabet, in which Á is the same letter as A.
So in a set of:
[Paloma, Abel, José, Álvaro, Eduardo]
it retrieves:
Abel, Eduardo, José.
Álvaro is excluded because it is not among the first 3 elements after ordering, when it should be the second. The query should retrieve:
Abel, Álvaro, Eduardo.
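To illustrate the difference between the two orderings (this is not an Oak or Lucene configuration), here is a small Python sketch. It assumes that a plain code-point sort is what produces the first result and that folding accents before comparing produces the desired one; `fold_accents` is a hypothetical helper name.

```python
import unicodedata


def fold_accents(text):
    """Strip combining marks so that 'Á' compares like 'A'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


names = ["Paloma", "Abel", "José", "Álvaro", "Eduardo"]

# Code-point order: 'Á' (U+00C1) sorts after every unaccented ASCII letter.
by_code_point = sorted(names)[:3]

# Accent-folded order: 'Álvaro' is compared as 'alvaro'.
by_folded = sorted(names, key=lambda n: fold_accents(n).lower())[:3]

print(by_code_point)
print(by_folded)
```

Note that this simple folding also turns ñ into n, which does not match traditional Spanish collation, so a real solution would need a locale-aware collator rather than this shortcut.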
So, to fix that, I've created a custom Oak Lucene index like the one below:
<?xml version="1.0" encoding="UTF-8"?>
<jcr:root xmlns:oak="http://jackrabbit.apache.org/oak/ns/1.0" xmlns:jcr="http://www.jcp.org/jcr/1.0" xmlns:nt="http://www.jcp.org/jcr/nt/1.0" xmlns:rep="internal"
jcr:mixinTypes="[rep:AccessControllable]"
jcr:primaryType="nt:unstructured">
<socialLucene/>
<workflowDataLucene/>
<slingeventJob/>
<jcrLanguage/>
<versionStoreIndex/>
<repMembers/>
<cqReportsLucene/>
<commerceLucene/>
<counter/>
<authorizables/>
<enablementResourceName/>
<externalPrincipalNames/>
<cmLucene/>
<foobarCFIndexFilter
jcr:primaryType="oak:QueryIndexDefinition"
async="[async,nrt]"
evaluatePathRestrictions="{Boolean}true"
includedPaths="[/content/dam/es/foobar,/content/dam/en/foobar]"
queryPaths="[/content/dam/es/foobar,/content/dam/en/foobar]"
reindex="{Boolean}false"
reindexCount="{Long}24"
seed="{Long}3850652403740003290"
type="lucene">
<analyzers jcr:primaryType="nt:unstructured">
<default jcr:primaryType="nt:unstructured">
<filters jcr:primaryType="nt:unstructured">
<Synonym
jcr:primaryType="nt:unstructured"
format="solr"
synonyms="synonyms.txt">
<synonyms.txt/>
</Synonym>
</filters>
<tokenizer
jcr:primaryType="nt:unstructured"
name="Classic"/>
</default>
</analyzers>
<indexRules jcr:primaryType="nt:unstructured">
<nt:base jcr:primaryType="nt:unstructured">
<properties jcr:primaryType="nt:unstructured">
<title
jcr:primaryType="nt:unstructured"
analyzed="{Boolean}true"
isRegexp="{Boolean}false"
name="jcr:content/data/master/title"
nodeScopeIndex="{Boolean}true"
ordered="{Boolean}true"
propertyIndex="{Boolean}true"
type="String"/>
<date
jcr:primaryType="nt:unstructured"
name="jcr:content/data/master/date"
ordered="{Boolean}true"
propertyIndex="{Boolean}true"/>
<sectors
jcr:primaryType="nt:unstructured"
name="jcr:content/data/master/sectors"
propertyIndex="{Boolean}true"/>
<contentFragment
jcr:primaryType="nt:unstructured"
name="jcr:content/contentFragment"
propertyIndex="{Boolean}true"/>
<model
jcr:primaryType="nt:unstructured"
name="cq:model"
propertyIndex="{Boolean}true"/>
<name
jcr:primaryType="nt:unstructured"
analyzed="{Boolean}true"
isRegexp="{Boolean}false"
name="jcr:content/data/master/name"
nodeScopeIndex="{Boolean}true"
ordered="{Boolean}true"
propertyIndex="{Boolean}true"
type="String"/>
</properties>
</nt:base>
</indexRules>
</foobarCFIndexFilter>
<cqProjectLucene/>
<ntFolderDamLucene/>
<acPrincipalName/>
<uuid/>
<damAssetLucene/>
<rep:policy/>
<cqPayloadPath/>
<nodetypeLucene/>
<nodetype/>
<ntBaseLucene/>
<reference/>
<principalName/>
<cqTagLucene/>
<lucene/>
<repTokenIndex/>
<externalId/>
<authorizableId/>
<cqPageLucene/>
</jcr:root>
where synonyms.txt contains:
á, a
Á, A
and so on.
I also tried a charFilter with a mapping of equivalent characters.
I have made sure, using the Query Performance Diagnosis tool, that my custom Oak index is the one my query uses.
But nothing works: after reindexing, the query results are the same.
How can I solve this?

Xpath function to loop through repeating nodes

What XPath function works to loop through repeating XML nodes?
This is my Source XML:
<?xml version="1.0" encoding="UTF-8"?>
<Record>
<Type>V</Type>
<Address>
<Qual>A</Qual>
<ID>A1</ID>
</Address>
<Address>
<Qual>A</Qual>
<ID>B2</ID>
</Address>
<Address>
<Qual>C</Qual>
<ID>C2</ID>
</Address>
<Category>
<EL>PO</EL>
</Category>
<Category>
<EL>DP</EL>
</Category>
</Record>
I don't want to process the data if Qual = A and ID = B2, Category = DP and Type = V.
My XPath does not work because of the repeating nodes:
(concat(Xpath./Type,Xpath./Record/Address/Qual,Xpath./Record/Address/ID,Xpath./Record/Category/EL) != "VAB2DP"
so I tried
choose((concat(Xpath./Type,Xpath./Record/Address/Qual,Xpath./Record/Address/ID,Xpath./Record/Category/EL) != "VAB2DP"),'true','false'
It still does not work.
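One likely reason the concatenation fails: in XPath 1.0, `concat()` converts each node-set argument to the string value of its first node only, so the repeating Address and Category elements contribute `A`, `A1` and `PO`, giving `VAA1PO` instead of `VAB2DP`. Testing each condition with a predicate avoids this. Below is a minimal Python sketch of that idea using the standard library's limited XPath support; the function name `should_process` is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

RECORD = """<Record>
  <Type>V</Type>
  <Address><Qual>A</Qual><ID>A1</ID></Address>
  <Address><Qual>A</Qual><ID>B2</ID></Address>
  <Address><Qual>C</Qual><ID>C2</ID></Address>
  <Category><EL>PO</EL></Category>
  <Category><EL>DP</EL></Category>
</Record>"""


def should_process(xml_text):
    """Return False when Type=V, some Address has Qual=A and ID=B2, and some Category has EL=DP."""
    record = ET.fromstring(xml_text)
    is_type_v = record.findtext("Type") == "V"
    # Both predicates must hold on the SAME <Address> element.
    has_a_b2 = record.find("Address[Qual='A'][ID='B2']") is not None
    has_dp = record.find("Category[EL='DP']") is not None
    return not (is_type_v and has_a_b2 and has_dp)


print(should_process(RECORD))
```

In plain XPath 1.0 the same test could be written as `not(/Record[Type='V'][Address[Qual='A' and ID='B2']][Category/EL='DP'])`, assuming the tool accepts a boolean expression there.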

XPath results based on two nodes

I have XML that has a lot of duplicated values. I'd like to select all the rows with a specific section ("sec") and section tag ("sec_tag"), but I can't seem to get the XPath correct.
Here's a small snippet of the XML:
<root>
<record>
<sec>5</sec>
<sec_tag>919</sec_tag>
<nested_tag>
<info>Info</info>
<types>
<type>1</type>
<type>2</type>
<type>3</type>
</types>
</nested_tag>
<flags>00000000</flags>
</record>
<record>
<sec>5</sec>
<sec_tag>930</sec_tag>
<nested_tag>
<info>Info</info>
<types>
<type>1</type>
<type>2</type>
<type>3</type>
</types>
</nested_tag>
<flags>00000000</flags>
</record>
<record>
<sec>7</sec>
<sec_tag>919</sec_tag>
<nested_tag>
<info>Info</info>
<types>
<type>1</type>
<type>2</type>
<type>3</type>
</types>
</nested_tag>
<flags>00000000</flags>
</record>
</root>
I want the node that has <sec>5</sec> and <sec_tag>919</sec_tag>.
I tried something like this:
//sec[text(), "5"] and //sec_tag[text(), "919"]
Obviously that's not the correct syntax there, I just need to find the correct XPath expression.
You can use the following XPath expression to return the record elements whose child sec equals 5 and whose child sec_tag equals 919:
//record[sec = 5 and sec_tag = 919]
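As a quick way to try this, here is a minimal Python sketch against a shortened copy of the sample XML (the nested_tag content is omitted for brevity). ElementTree has no `and` operator, so the two conditions are expressed as two chained predicates, and values are compared as strings rather than numbers; `matching_records` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

ROOT = """<root>
  <record><sec>5</sec><sec_tag>919</sec_tag><flags>00000000</flags></record>
  <record><sec>5</sec><sec_tag>930</sec_tag><flags>00000000</flags></record>
  <record><sec>7</sec><sec_tag>919</sec_tag><flags>00000000</flags></record>
</root>"""


def matching_records(xml_text, sec, sec_tag):
    """Return <record> elements whose <sec> and <sec_tag> children both match."""
    root = ET.fromstring(xml_text)
    # Chained predicates behave like a logical AND on the same <record>.
    return root.findall(f"record[sec='{sec}'][sec_tag='{sec_tag}']")


records = matching_records(ROOT, "5", "919")
print([(r.findtext("sec"), r.findtext("sec_tag")) for r in records])
```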

Jackrabbit - Select node with maximum property value

Let's say I have several file nodes with a property called foo. In Jackrabbit the xpath query I use to find those nodes by a property value is as follows:
/jcr:root/content/*[jcr:uuid='9b3d22fc-2354-49a6-afd0-9b672ae5a553']//file[foo = 10] order by @score
An oversimplified and raw representation of my repository as XML would look like this:
<content>
<formNode jcrUuid="9b3d22fc-2354-49a6-afd0-9b672ae5a553">
<year>
<month>
<day>
<hour>
<min>
<file foo="4"></file>
<file foo="10"></file>
</min>
</hour>
</day>
<day>
<hour>
<min>
<file foo="10"></file>
</min>
<min>
<file foo="5"></file>
</min>
</hour>
<hour>
<min>
<file foo="6"></file>
</min>
</hour>
</day>
</month>
</year>
</formNode>
</content>
Now, how can I find all the file nodes with the maximum value of foo? Does anyone know how to do this using either XPath or JCR_SQL2?
I've tried the following queries without success:
Returns all the file nodes under the provided jcr:uuid
/jcr:root/content/*[jcr:uuid='9b3d22fc-2354-49a6-afd0-9b672ae5a553']//file[not(../file/foo > foo)] order by @score
Throws an Exception
jcr:root/content/*[jcr:uuid='9b3d22fc-2354-49a6-afd0-9b672ae5a553']//file[not(//file/foo > foo)] order by @score
Exception:
javax.jcr.query.InvalidQueryException: Unsupported root level query node: org.apache.jackrabbit.spi.commons.query.RelationQueryNode@8fedc
I've also tried the function fn:max. But AFAIK this is an XPath 2.0 feature, which is not supported by Jackrabbit 2.2.13, and I'm forced to use this version of Jackrabbit.
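If the query language cannot express a maximum, one fallback is to fetch the candidate nodes and pick the maximum in client code. The sketch below shows that two-step idea in Python against a condensed copy of the simplified XML above. It is not Jackrabbit API code, and it assumes that post-filtering on the client is acceptable; `files_with_max_foo` is a made-up name.

```python
import xml.etree.ElementTree as ET

CONTENT = """<content>
  <formNode jcrUuid="9b3d22fc-2354-49a6-afd0-9b672ae5a553">
    <year><month>
      <day><hour><min><file foo="4"/><file foo="10"/></min></hour></day>
      <day>
        <hour><min><file foo="10"/></min><min><file foo="5"/></min></hour>
        <hour><min><file foo="6"/></min></hour>
      </day>
    </month></year>
  </formNode>
</content>"""


def files_with_max_foo(xml_text):
    """Return every <file> element whose foo attribute equals the overall maximum."""
    root = ET.fromstring(xml_text)
    files = root.findall(".//file")
    if not files:
        return []
    # Step 1: find the maximum; step 2: keep all nodes that share it.
    top = max(int(f.get("foo")) for f in files)
    return [f for f in files if int(f.get("foo")) == top]


print([f.get("foo") for f in files_with_max_foo(CONTENT)])
```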

How to use multiple conditions in XPath?

I'm new to XPath. I was trying to use the XML Task in SSIS to load some values, using Microsoft's sample XML inventory shown below.
How can I load the first-name value in bookstore/book where style is novel and award = 'Pulitzer'?
//book[@style='novel' and ./author/award/text()='Pulitzer'] is what I am trying. It returns the whole element. What should I modify to get just the first-name value?
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="myfile.xsl" ?>
<bookstore specialty="novel">
<book style="autobiography">
<author>
<first-name>Joe</first-name>
<last-name>Bob</last-name>
<award>Trenton Literary Review Honorable Mention</award>
</author>
<price>12</price>
</book>
<book style="textbook">
<author>
<first-name>Mary</first-name>
<last-name>Bob</last-name>
<publication>Selected Short Stories of
<first-name>Mary</first-name>
<last-name>Bob</last-name>
</publication>
</author>
<editor>
<first-name>Britney</first-name>
<last-name>Bob</last-name>
</editor>
<price>55</price>
</book>
<magazine style="glossy" frequency="monthly">
<price>2.50</price>
<subscription price="24" per="year"/>
</magazine>
<book style="novel" id="myfave">
<author>
<first-name>Toni</first-name>
<last-name>Bob</last-name>
<degree from="Trenton U">B.A.</degree>
<degree from="Harvard">Ph.D.</degree>
<award>Pulitzer</award>
<publication>Still in Trenton</publication>
<publication>Trenton Forever</publication>
</author>
<price intl="Canada" exchange="0.7">6.50</price>
<excerpt>
<p>It was a dark and stormy night.</p>
<p>But then all nights in Trenton seem dark and
stormy to someone who has gone through what
<emph>I</emph> have.</p>
<definition-list>
<term>Trenton</term>
<definition>misery</definition>
</definition-list>
</excerpt>
</book>
<my:book xmlns:my="uri:mynamespace" style="leather" price="29.50">
<my:title>Who's Who in Trenton</my:title>
<my:author>Robert Bob</my:author>
</my:book>
</bookstore>
I got an answer:
//book[@style='novel' and ./author/award/text()='Pulitzer']//first-name
Use:
/*/book[@style='novel']/author[award = 'Pulitzer']/first-name
This selects any first-name element whose author parent has an award child with the string value 'Pulitzer', where that author's parent is a book whose style attribute has the value "novel" and whose parent is the top element of the XML document.
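For a quick check of this path, here is a minimal Python sketch against a trimmed copy of the bookstore XML. It assumes the award value of the novel's author is 'Pulitzer', as the question implies. ElementTree paths are relative to the root element, so the leading `/*/` is dropped; `pulitzer_novel_first_names` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

BOOKSTORE = """<bookstore specialty="novel">
  <book style="autobiography">
    <author>
      <first-name>Joe</first-name>
      <award>Trenton Literary Review Honorable Mention</award>
    </author>
    <price>12</price>
  </book>
  <book style="novel" id="myfave">
    <author>
      <first-name>Toni</first-name>
      <award>Pulitzer</award>
    </author>
    <price>6.50</price>
  </book>
</bookstore>"""


def pulitzer_novel_first_names(xml_text):
    """Return first-name values for authors with a Pulitzer award in novel-style books."""
    root = ET.fromstring(xml_text)
    path = "book[@style='novel']/author[award='Pulitzer']/first-name"
    return [el.text for el in root.findall(path)]


print(pulitzer_novel_first_names(BOOKSTORE))
```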
A similar question in the same context: how can I do the reverse? Suppose I want to find the id of all books whose price is greater than 20. I know I am being a nudge, but I really want to clear up my understanding.
Here is the needed XPath:
//book/price[text() > 20]/..
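The expression above returns the matching book elements themselves; appending `/@id` instead of relying on the element would return the id attribute where one exists. The Python sketch below illustrates the same filter on a trimmed copy of the sample data. ElementTree has no numeric comparison in its path syntax, so the price test is done in Python; `books_over` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

BOOKS = """<bookstore>
  <book style="autobiography"><price>12</price></book>
  <book style="textbook"><price>55</price></book>
  <book style="novel" id="myfave"><price>6.50</price></book>
</bookstore>"""


def books_over(xml_text, threshold):
    """Return <book> elements whose <price> child is numerically above threshold."""
    root = ET.fromstring(xml_text)
    matches = []
    for book in root.findall(".//book"):
        price = book.findtext("price")
        # Skip books without a price instead of failing on float(None).
        if price is not None and float(price) > threshold:
            matches.append(book)
    return matches


expensive = books_over(BOOKS, 20)
# In this sample the only match has no id attribute, so get("id") yields None.
print([(b.get("style"), b.get("id")) for b in expensive])
```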

Resources