great ancestor & great great ancestor - xpath

1/ I have a rule checker that forbids elements depending on an XPath expression.
2/ Every "test" element can contain a "test" element recursively.
3/ I want to forbid the "non-usage" of an attribute for the first 3 "test" elements.
Example:
<test targetAttribute="level 1">
<test targetAttribute="level 2">
<test targetAttribute="level 3">
<test targetAttribute="level 4">
<test targetAttribute="level 5">
</test>
</test>
</test>
</test>
</test>
The targetAttribute attribute is mandatory for the first 3 levels only; for all descendant elements below level 3, targetAttribute is optional.
Here are my XPath expressions:
//test[not(targetAttribute)]/ancestor[1]::test (level1)
//test[not(targetAttribute)]/ancestor[2]::test (level2)
//test[not(targetAttribute)]/ancestor[3]::test (level3)
But it doesn't work! I also tried, without success:
//test/ancestor[1]::test[not(targetAttribute)]
I'm going crazy #_# Can someone help me, please?

In order to select an attribute you need to use @ before the attribute name. So
//test[not(targetAttribute)]
should be changed to
//test[not(@targetAttribute)]
and it will get all the test elements that do not have a targetAttribute attribute.
Second thing.
When you want to select the closest test ancestor, you should put the index after the test, like so:
/ancestor::test[1]
This will select the closest ancestor that is a test element (the immediate parent in this case).
/ancestor::test[2]
will give you the grandparent and 3 will give you the great-grandparent.
Also, you should probably filter out ancestors that don't have @targetAttribute.
I'm not sure what exactly you are trying to accomplish, but here is a try:
//test[not(@targetAttribute)]/ancestor::test[@targetAttribute][3] (level1)
//test[not(@targetAttribute)]/ancestor::test[@targetAttribute][2] (level2)
//test[not(@targetAttribute)]/ancestor::test[@targetAttribute][1] (level3)
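If the rule can also be checked from a script, the "mandatory on the first 3 levels" condition can be sketched outside XPath. This is only a minimal sketch in Python, assuming the standard-library xml.etree.ElementTree (it has no ancestor axis, so the nesting level is tracked by hand). The function name missing_levels and the sample document are made up for illustration. In full XPath 1.0 the same condition should be expressible as //test[not(@targetAttribute)][count(ancestor::test) < 3], which is not what the answer above proposes.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the level 2 and level 5 elements lack targetAttribute.
SAMPLE = """
<test targetAttribute="level 1">
  <test>
    <test targetAttribute="level 3">
      <test targetAttribute="level 4">
        <test/>
      </test>
    </test>
  </test>
</test>
"""


def missing_levels(root, required_levels=3):
    """Return the 1-based nesting levels of <test> elements that lack
    targetAttribute within the first `required_levels` levels."""
    found = []
    stack = [(root, 1)]
    while stack:
        elem, level = stack.pop()
        is_test = elem.tag == "test"
        if is_test and level <= required_levels and "targetAttribute" not in elem.attrib:
            found.append(level)
        # Only nested <test> elements increase the level.
        child_level = level + 1 if is_test else level
        stack.extend((child, child_level) for child in elem)
    return sorted(found)


violations = missing_levels(ET.fromstring(SAMPLE))
print(violations)
```

Only level 2 is reported for this sample; the missing attribute on level 5 is allowed because it is deeper than the third level.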

Related

Xpath query to take nth occurrence from last occurrence

How do I select the third-from-last occurrence with XPath in the XML below?
The query below does not work. The expected result is "Last Third Occurence".
XPath I tried:
/LINK/TEST[last(3)]/NAME
XML:
<LINK>
<N_Number_of_TESTs>
<NAME>N_Number_of_Names</NAME>
</N_Number_of_TESTs>
<TEST>
<NAME>Last Third Occurence</NAME>
</TEST>
<TEST>
<NAME>Last Second Occurence</NAME>
</TEST>
<TEST>
<NAME>Last Occurence</NAME>
</TEST>
</LINK>
last() takes no arguments, so last(3) is an error; subtract from it instead. You can use /LINK/TEST[last() - 2]/NAME.
The following will select the third-from-last TEST element:
//TEST[count(following-sibling::TEST)=2]
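To try the last() offset outside an XPath tool, here is a minimal sketch in Python. It assumes the standard-library xml.etree.ElementTree, whose limited XPath subset accepts last()-N written without spaces; the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

XML = """<LINK>
<N_Number_of_TESTs><NAME>N_Number_of_Names</NAME></N_Number_of_TESTs>
<TEST><NAME>Last Third Occurence</NAME></TEST>
<TEST><NAME>Last Second Occurence</NAME></TEST>
<TEST><NAME>Last Occurence</NAME></TEST>
</LINK>"""

root = ET.fromstring(XML)  # root is the LINK element

# Path is relative to LINK, so it starts at TEST.
third_from_last = root.find("TEST[last()-2]/NAME").text

# The same element picked with plain list indexing, as a cross-check.
same_by_index = root.findall("TEST")[-3].find("NAME").text

print(third_from_last)
```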

XPath remove single node (via Saxon CLI)

I want to remove a node from an XML file (using SaxonHE9-8-0-11J):
<project name="Build">
<property name="src" value="src/main/resources" />
<property name="target" value="target/classes" />
<condition property="target.exists">
<available file="target" />
</condition>
</project>
Apparently there are two ways I can do this:
XPath 1: using a not() function
XPath 2: using an except clause
But both simply return the entire node-set.
With a not function:
saxonb-xquery -s:test.xml -qs:'*[not(local-name()="condition")]'
With an except clause:
saxonb-xquery -s:test.xml -qs:'* except condition'
With -explain switch the queries are:
<query>
<body>
<filterExpression>
<axis name="child" nodeTest="element()"/>
<operator op="ne (on empty return true())">
<functionCall name="local-name">
<dot/>
</functionCall>
<literal value="condition" type="xs:string"/>
</operator>
</filterExpression>
</body>
</query>
and
<query>
<body>
<operator op="except">
<axis name="child" nodeTest="element()"/>
<path>
<root/>
<axis name="descendant" nodeTest="element(condition, xs:anyType)"/>
</path>
</operator>
</body>
</query>
In general, XPath selects nodes from one or more input documents; it doesn't allow you to construct new ones. For that you need XSLT or XQuery. Removing the condition child of the project root, if that is what you want to achieve, is therefore something you need XSLT or XQuery for. With XPath, even if you use /*/(* except condition), you get all children except the condition element, but as a sequence, not wrapped into a root.
So with XQuery you could use
/*/element {node-name()} { * except condition }
as a compact but generic way to reconstruct any root with all child elements except the condition: https://xqueryfiddle.liberty-development.net/948Fn5b
Whether you can get such an expression through a command-line shell is a different problem. On Windows, in both a PowerShell window and the cmd shell, it works for me to use
-qs:"/*/element {node-name()} { * except condition }"
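The same point (removing a node means modifying or rebuilding a tree, not just selecting nodes) can be illustrated outside Saxon. This is only a minimal sketch in Python using the standard-library xml.etree.ElementTree, which is an assumption here and not something the question uses.

```python
import xml.etree.ElementTree as ET

XML = """<project name="Build">
  <property name="src" value="src/main/resources" />
  <property name="target" value="target/classes" />
  <condition property="target.exists">
    <available file="target" />
  </condition>
</project>"""

root = ET.fromstring(XML)

# Selecting nodes changes nothing by itself; the condition children
# have to be removed from the tree explicitly.
for condition in root.findall("condition"):
    root.remove(condition)

remaining_tags = [child.tag for child in root]
print(ET.tostring(root, encoding="unicode"))
```

The project root and its attributes stay in place; only the condition child is gone from the serialized result.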

Xpath Query to Loop Multiple Child Node

<Sections>
<Classes>
<Class>
<ClassStd>VI</ClassStd>
<ClassName>XYZ</ClassName>
</Class>
<Class>
<ClassStd>VII</ClassStd>
<ClassName>ABC</ClassName>
</Class>
</Classes>
<Classes>
<Class>
<ClassStd>VIII</ClassStd>
<ClassName>EFG</ClassName>
</Class>
<Class>
<ClassStd>IX</ClassStd>
<ClassName>MNO</ClassName>
</Class>
</Classes>
</Sections>
I want to get the ClassName values (XYZ, ABC, EFG, MNO) using XPath. I tried using
//Sections/Classes/Class/*/ClassName/text() and other XPath queries, but I'm not getting the desired results. I want to loop through every Classes and every Class element and get the ClassName values. Since the number of Classes or Class elements is not fixed, I have to loop to the end to get the values. How can I construct such a loop in XPath?
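No loop is needed: a path expression already visits every matching element. The extra /*/ step in your expression looks for ClassName one level too deep (element names are also case-sensitive). Change your XPath to:
//Sections/Classes/Class/ClassName/text()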
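A quick way to check that one path returns all four values without an explicit loop over Classes and Class is the following minimal sketch in Python. It assumes the standard-library xml.etree.ElementTree, which has no text() step, so the text is read from each matched element instead.

```python
import xml.etree.ElementTree as ET

XML = """<Sections>
  <Classes>
    <Class><ClassStd>VI</ClassStd><ClassName>XYZ</ClassName></Class>
    <Class><ClassStd>VII</ClassStd><ClassName>ABC</ClassName></Class>
  </Classes>
  <Classes>
    <Class><ClassStd>VIII</ClassStd><ClassName>EFG</ClassName></Class>
    <Class><ClassStd>IX</ClassStd><ClassName>MNO</ClassName></Class>
  </Classes>
</Sections>"""

root = ET.fromstring(XML)  # root is the Sections element

# The path is relative to Sections and matches every ClassName,
# however many Classes and Class elements there are.
class_names = [name.text for name in root.findall("Classes/Class/ClassName")]

print(class_names)
```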

Number of ancestors of a precise node

As usual, my question is simple, but I don't seem to be able to do what I want:
<test targetAttribute="level 1">
<test targetAttribute="level 2">
<test targetAttribute="level 3">
<test targetAttribute="level 4">
<test targetAttribute="level 5">
</test>
</test>
</test>
</test>
</test>
I want to know how many ancestors the //test/@targetAttribute="level 5" node has.
I have been trying a thousand things, and nothing is working for me:
count(//test/@targetAttribute="level 5"/ancestor::*)
//test/@targetAttribute="level 5"/count(ancestor::*)
count(ancestor::*[//test/@targetAttribute="level 5"])
...
I just don't seem to be able to find what I am looking for on Google...
The notion of an ancestor applies to elements, so first you need to find the target element that has @targetAttribute="level 5":
//test[@targetAttribute='level 5']
From here, you can extend the XPath above to return the count of ancestor elements of the target element:
count(//test[@targetAttribute='level 5']/ancestor::*)
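The expected count for this document can be cross-checked with a minimal sketch in Python. It assumes the standard-library xml.etree.ElementTree, which has neither count() nor an ancestor axis, so the sketch walks a child-to-parent map instead; the helper name count_ancestors is made up for illustration.

```python
import xml.etree.ElementTree as ET

XML = """<test targetAttribute="level 1">
  <test targetAttribute="level 2">
    <test targetAttribute="level 3">
      <test targetAttribute="level 4">
        <test targetAttribute="level 5">
        </test>
      </test>
    </test>
  </test>
</test>"""

root = ET.fromstring(XML)

# ElementTree elements do not know their parent, so build a lookup.
parent_of = {child: parent for parent in root.iter() for child in parent}


def count_ancestors(elem):
    """Count element ancestors by walking up the parent map."""
    count = 0
    while elem in parent_of:
        elem = parent_of[elem]
        count += 1
    return count


target = root.find(".//test[@targetAttribute='level 5']")
ancestor_count = count_ancestors(target)
print(ancestor_count)
```

For the sample above the result is 4: the level 1 to level 4 elements.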

Ruby + Nokogiri + Xpath navigate Node_Set

<Item id="item0">
<Links>
<FirstLink id="link1" target="one"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content</String>
</Data>
</Item>
<Item id="item1">
<Links>
<FirstLink id="link1" target="two"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content</String>
</Data>
</Item>
I have created a Nokogiri-NodeSet with this structure, i.e. a list of items with links and data children.
How can I filter any items that don't match a certain value in the 'target'-attribute of <FirstLink>?
Actually, what I want in the end is to extract the <Data><String> content of every <Item> that matches a certain value in its <FirstLink> target attribute.
I've tried several approaches already, but I'm at a loss as to how to identify an element by an attribute of its grandchild and then extract the content of this grandchild's parent's sibling. X(
We can build up an XPath expression to do this. Assuming we are starting from the whole XML document, rather than the node-set you already have, something like
//Item
will select all <Item> elements (I’m guessing you already have something like that to get this node-set).
Next, to select only those <Item> elements which have <Links><FirstLink> where FirstLink has a target attribute value of one:
//Item[Links/FirstLink[@target='one']]
and finally to select the Data/String children of those nodes:
//Item[Links/FirstLink[@target='one']]/Data/String
So with Nokogiri you could use something like this (where doc is your parsed document):
doc.xpath("//Item[Links/FirstLink[@target='one']]/Data/String")
or if you want to use the node-set you already have you can use a relative expression:
nodeset.xpath("self::Item[Links/FirstLink[@target='one']]/Data/String")
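The filter-then-extract logic can also be tried without Nokogiri. This is only a minimal sketch in Python, assuming the standard-library xml.etree.ElementTree, which does not support nested predicates, so the Item test is done in a list comprehension. The <Items> wrapper element and the distinct content1/content2 values are additions for illustration.

```python
import xml.etree.ElementTree as ET

# <Items> is a hypothetical wrapper so the fragment parses as one document.
XML = """<Items>
  <Item id="item0">
    <Links>
      <FirstLink id="link1" target="one"/>
      <SecondLink id="link2" target="two"/>
    </Links>
    <Data><String>content1</String></Data>
  </Item>
  <Item id="item1">
    <Links>
      <FirstLink id="link1" target="two"/>
      <SecondLink id="link2" target="two"/>
    </Links>
    <Data><String>content2</String></Data>
  </Item>
</Items>"""

root = ET.fromstring(XML)

# Keep the Data/String text of every Item whose FirstLink targets "one".
matches = [
    item.findtext("Data/String")
    for item in root.iter("Item")
    if item.find("Links/FirstLink[@target='one']") is not None
]

print(matches)
```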
I didn't fully understand what your goal is, but based on a guess, here is how to proceed in this case:
require 'nokogiri'
# The Items wrapper is an addition: an XML document can have only one root element.
doc = Nokogiri::XML <<-xml
<Items>
<Item id="item0">
<Links>
<FirstLink id="link1" target="one"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content1</String>
</Data>
</Item>
<Item id="item1">
<Links>
<FirstLink id="link1" target="two"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content2</String>
</Data>
</Item>
</Items>
xml
The #xpath method with the expression "//Item" selects all the Item nodes. Those Item nodes are then passed to #select, which keeps only the nodes whose Links/FirstLink child has a target attribute matching "one". Only FirstLink is checked here; SecondLink is ignored.
node.at_xpath("Links/FirstLink")['target'] gives you the value of the target attribute of the FirstLink node as a string: "one" for the first Item node, then "two" for the second. The trailing ['one'] in node.at_xpath("Links/FirstLink")['target']['one'] is a call to the String#[] method, which returns the matched substring or nil. The paths are relative on purpose: an expression starting with // searches the whole document, not just the current node.
Remember that this approach gives you the flexibility to use a regular expression too.
nodeset = doc.xpath("//Item").select do |node|
  node.at_xpath("Links/FirstLink")['target']['one']
end
Now nodeset contains only the required Item nodes. Next, #map passes each Item node to the block to collect the content of its String node: #at_xpath with the relative expression Data/String selects the String node, and #text returns its content.
nodeset.map { |n| n.at_xpath('Data/String').text } # => ["content1"]
