Xquery get node with specific child element - xpath

I am using XQuery 1.0 and have the following problem.
My input message:
<Body>
<album>
<contents>
<content>correct</content>
<content>hardcore</content>
</contents>
</album>
<album>
<contents>
<content>incorrect</content>
<content>punk</content>
</contents>
</album>
<album>
<contents>
<content>incorrect</content>
<content>rock</content>
</contents>
</album>
</Body>
Desired result:
I would like to search for the <album> node that contains the child element <content>correct</content>, and once that node has been found, I would like to pick the element <content>hardcore</content> from it. Note that the order of the album nodes is subject to change, so a first() or [1] will not be sufficient.
What I tried:
if (body/album/contents/content[text()='correct']) then ???

If I understand you correctly, you probably don't need xquery for that.
//contents/content[.="correct"]/following-sibling::content
should be enough.
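For readers who want to try the idea without an XPath 1.0 engine, here is a minimal Python sketch using only the standard library. It is an illustration under assumptions, not part of the answer above: xml.etree.ElementTree supports only a subset of XPath (it has no following-sibling axis), so the sibling step is done in Python, and the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

# The input message from the question.
xml_text = """
<Body>
  <album><contents><content>correct</content><content>hardcore</content></contents></album>
  <album><contents><content>incorrect</content><content>punk</content></contents></album>
  <album><contents><content>incorrect</content><content>rock</content></contents></album>
</Body>
"""

root = ET.fromstring(xml_text)

# Select every <contents> that has a <content> child whose text is "correct".
# ElementTree supports the [tag='text'] predicate but not following-sibling.
matching = root.findall(".//contents[content='correct']")

# Collect the other <content> values from each matching block.
others = [
    c.text
    for contents in matching
    for c in contents.findall("content")
    if c.text != "correct"
]

print(others)
```

Unlike the following-sibling expression, this sketch also picks up a sibling that comes before the "correct" element, which may or may not be what you want.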

Related

xpath: filter selected nodes based on type of parent node

Here's a sample of the XML I'm dealing with:
<subchapter>
<section>
</section>
</subchapter>
<part>
<section>
</section>
</part>
<part>
<section>
</section>
</part>
<quotedContent>
<section>
</section>
</quotedContent>
I'm trying to filter out certain nodes based on the type of their parent nodes. In other words, I want to find all the <section> nodes NOT in <quotedContent> nodes. There are various other parent nodes in addition to <part> and <subchapter> that I want to be included in my end result, so it's a matter of excluding just the <quotedContent> nodes. I'm pretty sure it's just a matter of getting the XPath string correct.
I'm using R's xml2 package, specifically the xml_find_all() function, as follows:
xml_find_all(ustc, "..//d1:section[parent='part']", ns = xml_ns(ustc))
Based on the above XML example, I would expect to get two nodes -- the first two, not the last one inside the <quotedContent>.
Use not(parent::quotedContent) in the predicate e.g. //section[not(parent::quotedContent)]. Or //*[not(self::quotedContent)]/section.
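The second expression can be mimicked in plain Python as a rough sketch. The assumptions here: the <root> wrapper is added only to make the sample well-formed, namespaces (the d1 prefix in the question) are ignored, and the filtering is done in Python because xml.etree.ElementTree has no parent axis and no not() function.

```python
import xml.etree.ElementTree as ET

# A <root> wrapper is added (an assumption) so the sample is well-formed.
xml_text = """
<root>
  <subchapter><section/></subchapter>
  <part><section/></part>
  <part><section/></part>
  <quotedContent><section/></quotedContent>
</root>
"""

root = ET.fromstring(xml_text)

# Equivalent in spirit to //*[not(self::quotedContent)]/section:
# visit every element, skip <quotedContent>, and take its <section> children.
sections = [
    (parent.tag, child)
    for parent in root.iter()
    if parent.tag != "quotedContent"
    for child in parent.findall("section")
]

parent_tags = [tag for tag, _ in sections]
print(parent_tags)
```

With this sample the three sections under <subchapter> and <part> are kept and the one under <quotedContent> is skipped.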

How to get parent element with attribute using xpath

I have posted sample XML and the expected output below. Kindly help me get the result.
Sample XML
<root>
<A id="1">
<B id="2"/>
<C id="2"/>
</A>
</root>
Expected output:
<A id="1"/>
You can formulate this query in several ways:
Find elements that have a matching attribute, only descending all the time:
//*[@id=1]
Find the attribute, then ascend a step:
//@id[.=1]/..
Use the fn:id($id) function, given the document is validated and the ID attribute is defined as such:
/id('1')
I don't think what you're after is possible with XPath alone. XPath cannot select a node without its children, so the result would always include the nodes B and C in your case.
You could achieve this using XQuery. I'm not sure if this is what you want, but here's an example that creates a new node based on an existing node stored in the $doc variable.
declare variable $doc := <root><A id="1"><B id="2"/><C id="2"/></A></root>;
element {fn:node-name($doc/*)} {$doc/*/@*}
The above returns <A id="1"></A>.
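The same "copy the element, keep the attributes, drop the children" idea can be sketched in Python with the standard library. This is only an illustration under assumptions: the names a and shallow are made up, and xml.etree.ElementTree is used in place of an XQuery processor.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring('<root><A id="1"><B id="2"/><C id="2"/></A></root>')

# Find the element whose id attribute is "1" (ElementTree compares strings).
a = doc.find(".//*[@id='1']")

# Build a new element with the same name and attributes but no children.
shallow = ET.Element(a.tag, dict(a.attrib))

print(ET.tostring(shallow, encoding="unicode"))
```

The original <A> element is left untouched; only the new element lacks the B and C children.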
Is that what you are looking for?
//*[@id='1']/parent::* , which is equivalent to //*[@id='1']/..
If you want to verify that the parent is root:
//*[@id='1']/parent::root
https://en.wikipedia.org/wiki/XPath
If you need not just the parent but an enclosing element further up with some attribute, read about axis specifiers and use the ancestor:: axis =)

Ruby + Nokogiri + Xpath navigate Node_Set

<Item id="item0">
<Links>
<FirstLink id="link1" target="one"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content</String>
</Data>
</Item>
<Item id="item1">
<Links>
<FirstLink id="link1" target="two"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content</String>
</Data>
</Item>
I have created a Nokogiri NodeSet with this structure, i.e. a list of items with links and data children.
How can I filter out any items that don't match a certain value in the 'target' attribute of <FirstLink>?
Actually, what I want in the end is to extract the <Data><String> content of every <Item> that matches a certain value in its <FirstLink> 'target' attribute.
I've tried several approaches already, but I'm at a loss as to how to identify an element by an attribute of its grandchild and then extract the content of this grandchild's parent's sibling.
We can build up an XPath expression to do this. Assuming we are starting from the whole XML document, rather than the node-set you already have, something like
//Item
will select all <Item> elements (I’m guessing you already have something like that to get this node-set).
Next, to select only those <Item> elements which have <Links><FirstLink> where FirstLink has a target attribute value of one:
//Item[Links/FirstLink[@target='one']]
and finally to select the Data/String children of those nodes:
//Item[Links/FirstLink[@target='one']]/Data/String
So with Nokogiri you could use something like this (where doc is your parsed document):
doc.xpath("//Item[Links/FirstLink[@target='one']]/Data/String")
or if you want to use the node-set you already have you can use a relative expression:
nodeset.xpath("self::Item[Links/FirstLink[@target='one']]/Data/String")
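The same step-by-step filter can be sketched in Python with the standard library, for comparison. Assumptions: the <Items> wrapper and the function name strings_for_target are invented for the example, the String contents are numbered so the two items can be told apart, and the nested predicate is split into a loop because xml.etree.ElementTree only supports a subset of XPath.

```python
import xml.etree.ElementTree as ET

# An <Items> wrapper is added (an assumption) so the sample has one root.
xml_text = """
<Items>
  <Item id="item0">
    <Links><FirstLink id="link1" target="one"/><SecondLink id="link2" target="two"/></Links>
    <Data><String>content1</String></Data>
  </Item>
  <Item id="item1">
    <Links><FirstLink id="link1" target="two"/><SecondLink id="link2" target="two"/></Links>
    <Data><String>content2</String></Data>
  </Item>
</Items>
"""

root = ET.fromstring(xml_text)


def strings_for_target(root, target):
    """Return Data/String text of each Item whose FirstLink has this target."""
    results = []
    for item in root.iter("Item"):
        # Relative path: only this item's own Links/FirstLink is inspected.
        if item.find(f"Links/FirstLink[@target='{target}']") is not None:
            results.append(item.findtext("Data/String"))
    return results


print(strings_for_target(root, "one"))
print(strings_for_target(root, "two"))
```

Only FirstLink is tested, so the second item is not selected for "one" even though nothing else distinguishes its SecondLink from the first item's.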
I didn't completely understand what your goal is, but taking a guess, here is how I would proceed in this case:
require 'nokogiri'
# The <Items> wrapper is an addition: an XML document needs a single root element.
doc = Nokogiri::XML <<-xml
<Items>
<Item id="item0">
<Links>
<FirstLink id="link1" target="one"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content1</String>
</Data>
</Item>
<Item id="item1">
<Links>
<FirstLink id="link1" target="two"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content2</String>
</Data>
</Item>
</Items>
xml
The #xpath method with the expression "//Item" selects all the Item nodes. Those nodes are then passed to #select, which keeps only the Item nodes whose FirstLink has the target attribute value "one". Only FirstLink is checked; the target of SecondLink is ignored.
node.at_xpath("Links/FirstLink")['target'] gives you the value of the target attribute of that item's FirstLink: "one" for the first Item node, "two" for the second. The path is relative to the item on purpose, because an expression starting with // searches the whole document from its root instead of the current node. The trailing ['one'] in node.at_xpath("Links/FirstLink")['target']['one'] is a call to String#[], which returns the matched substring, or nil when there is no match.
This approach also gives you the flexibility to use a regular expression, since String#[] accepts one as well.
nodeset = doc.xpath("//Item").select do |node|
node.at_xpath("Links/FirstLink")['target']['one']
end
Now nodeset contains only the required Item nodes. I use #map to pass each Item node in and collect the content of its String node: #at_xpath with the relative expression Data/String selects the String node, and #text gives you its content.
nodeset.map { |n| n.at_xpath('Data/String').text } # => ["content1"]

getting XmlSearch to return siblings only, not children

I'm getting a SOAP response that looks like this:
<Activity>
<Id>A</Id>
<Subject>foo</Subject>
<Activity>Task</Activity>
</Activity>
<Activity>
<Id>B</Id>
<Subject>bar</Subject>
<Activity>Appointment</Activity>
</Activity>
<Activity>
<Id>C</Id>
<Subject>snafu</Subject>
<Activity>Task</Activity>
</Activity>
In Coldfusion, I was trying to parse out the Activity nodes with this:
<cfset arrMainNodes = XmlSearch(soapResponse, "//*[name()='Activity']") />
The problem is, instead of getting an array with three elements, I get an array with six: 3 of the parent, and 3 of the children.
I can't for the life of me figure out the XPath statement that will find siblings only, and not children.
Please Help.
Use:
//*[name()='Activity' and not(ancestor::*[name()='Activity' ])]
This selects all elements in the document, whose name is "Activity" and that do not have an ancestor with name "Activity".
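To make the "no ancestor with the same name" condition concrete, here is a small Python sketch using only the standard library. Assumptions: the <Response> wrapper and the helper name outermost are invented for the example, and the elements are taken to be in no namespace. Because xml.etree.ElementTree has no ancestor axis, the sketch walks the tree and stops descending as soon as it finds a match.

```python
import xml.etree.ElementTree as ET

# A <Response> wrapper is added (an assumption) so the sample has one root.
xml_text = """
<Response>
  <Activity><Id>A</Id><Subject>foo</Subject><Activity>Task</Activity></Activity>
  <Activity><Id>B</Id><Subject>bar</Subject><Activity>Appointment</Activity></Activity>
  <Activity><Id>C</Id><Subject>snafu</Subject><Activity>Task</Activity></Activity>
</Response>
"""


def outermost(element, tag):
    """Yield descendants named `tag` that have no ancestor named `tag`."""
    for child in element:
        if child.tag == tag:
            # Do not descend further: nested matches are excluded.
            yield child
        else:
            yield from outermost(child, tag)


root = ET.fromstring(xml_text)
all_matches = root.findall(".//Activity")      # parents and children alike
activities = list(outermost(root, "Activity"))  # parents only
ids = [a.findtext("Id") for a in activities]

print(len(all_matches), ids)
```

The plain .//Activity search reproduces the six-element result from the question, while the filtered walk returns only the three outer elements.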

XQuery ancestor axis doesn't work, but explicit XPath does

Consider the following XML snippet:
<doc>
<chapter id="1">
<item>
<para>some text here</para>
</item>
</chapter>
</doc>
In XQuery, I have a function that needs to do some things based on the ancestor chapter of a given "para" element that is passed in as a parameter, as shown in the stripped down example below:
declare function doSomething($para){
let $chapter := $para/ancestor::chapter
return "some stuff"
};
In that example, $chapter keeps coming up empty. However, if I write the function similar to the following (i.e., without using the ancestor axis), I get the desired "chapter" element:
declare function doSomething($para){
let $chapter := $para/../..
return "some stuff"
};
The problem is that I cannot use explicit paths as in the latter example because the XML I will be searching is not guaranteed to have the "chapter" element as a grandparent every time. It may be a great-grandparent or great-great-grandparent, and so on, as shown below:
<doc>
<chapter id="1">
<item>
<subItem>
<para>some text here</para>
</subItem>
</item>
</chapter>
</doc>
Does anyone have an explanation as to why the axis doesn't work, while the explicit XPath does? Also, does anyone have any suggestions on how to solve this problem?
Thank you.
SOLUTION:
The mystery is now solved.
The node in question was re-created in another function, which had the result of stripping it of all of its ancestor information. Unfortunately, the previous developer did not document this wonderful, little function and has cost us all a good deal of time.
So, the ancestor axis worked exactly as it should - it was just being applied to a deceptive node.
I thank all of you for your efforts in answering my questions.
The ancestor axis does work fine. I suspect your problem is namespaces. The example you showed and that I ran (below) has XML without any namespaces. If your XML has a namespace, then you would need to provide that in the ancestor XPath, like this: $para/ancestor::foo:chapter, where in this case the prefix foo is bound to the correct namespace for the chapter element.
let $doc := <doc>
<chapter id="1">
<item>
<para>some text here</para>
</item>
</chapter>
</doc>
let $para := $doc//para
return $para/ancestor::chapter
RESULT:
<?xml version="1.0" encoding="UTF-8"?>
<chapter id="1">
<item>
<para>some text here</para>
</item>
</chapter>
These things almost always boil down to namespaces! As a diagnostic to confirm 100% that namespaces are not the issue, can you try:
declare function local:doSomething($para) {
let $chapter := $para/ancestor::*[local-name() = 'chapter']
return $chapter
};
This seems surprising to me; which XQuery implementation are you using? With BaseX, the following query...
declare function local:doSomething($para) {
let $chapter := $para/ancestor::chapter
return $chapter
};
let $xml :=
<doc>
<chapter id="1">
<item>
<para>some text here</para>
</item>
</chapter>
</doc>
return local:doSomething($xml//para)
...returns...
<chapter id="1">
<item>
<para>some text here</para>
</item>
</chapter>
I suspect namespaces too. If $para/../.. works but $para/parent::item/parent::chapter turns up empty, then you know it's a question of namespaces.
Look for an xmlns declaration at the top of your content, e.g.:
<doc xmlns="http://example.com">
...
</doc>
In your XQuery, you then need to bind that namespace to a prefix and use that prefix in your XQuery/XPath expressions, like this:
declare namespace my="http://example.com";
declare function local:doSomething($para){
let $chapter := $para/ancestor::my:chapter
return "some stuff"
};
What prefix you use doesn't matter. The important thing is that the namespace URI (http://example.com in the above example) matches up.
It makes sense that ../.. selects the element you want, because .. is short for parent::node() which selects the parent node regardless of its name (or namespace). Whereas ancestor::chapter will only select <chapter> elements that are not in a namespace (unless you have declared a default element namespace, which is usually not a good idea in XQuery because it affects both your input and your output).
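The difference between a name test and a plain parent step can be reproduced in a short Python sketch with the standard library. Assumptions: the prefix my and the helper ancestor are arbitrary names chosen for the example, and since xml.etree.ElementTree has no ancestor axis, the lookup is done with a child-to-parent map.

```python
import xml.etree.ElementTree as ET

xml_text = """
<doc xmlns="http://example.com">
  <chapter id="1">
    <item><subItem><para>some text here</para></subItem></item>
  </chapter>
</doc>
"""

root = ET.fromstring(xml_text)
ns = {"my": "http://example.com"}  # the prefix is arbitrary, the URI must match

# Unprefixed name: no match, because <chapter> is in a namespace.
unqualified = root.findall(".//chapter")
# Prefixed name bound to the right URI: matches.
qualified = root.findall(".//my:chapter", ns)

# ElementTree has no ancestor axis, so build a child -> parent map.
parents = {child: parent for parent in root.iter() for child in parent}


def ancestor(node, qualified_tag):
    """Return the nearest ancestor with the given '{uri}local' tag, or None."""
    current = parents.get(node)
    while current is not None and current.tag != qualified_tag:
        current = parents.get(current)
    return current


para = root.find(".//my:para", ns)
chapter = ancestor(para, "{http://example.com}chapter")

print(len(unqualified), len(qualified), chapter.get("id"))
```

Asking for the bare name "chapter" finds nothing at any depth, while the namespace-qualified name finds the chapter even though it is a great-grandparent here.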

Resources