xpath get first element based on condition - xpath

I need my XPath expression to select only the first child element of an XML file that meets a condition, say the first one having field1=B.
I use the expression below, but it returns the entry with field1=A.
<root>
<entry>
<field1>A</field1>
<field2>10</field2>
</entry>
<entry>
<field1>B</field1>
<field2>20</field2>
</entry>
</root>
/root/entry[//field1='B' or 'C'][1]
How can I do it?

It should be
/root/entry[.//field1='B' or .//field1='C'][1]
Note that entry[//field1='B'][1] means "return the first entry node if a field1 node with value 'B' exists anywhere in the XML", while entry[.//field1='B'][1] means "return the first entry node that has a descendant field1 node with value 'B'". In addition, a bare non-empty string literal such as 'C' converts to true, so a predicate of the form [... or 'C'] holds for every entry.
You can also simplify the expression to
/root/entry[field1='B' or field1='C'][1]
if field1 always appears as a direct child of entry.
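To see the difference outside an XPath engine, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. Its XPath subset supports neither `or` nor absolute paths inside predicates, so both predicates are emulated in plain Python. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

XML = """<root>
<entry><field1>A</field1><field2>10</field2></entry>
<entry><field1>B</field1><field2>20</field2></entry>
</root>"""

root = ET.fromstring(XML)
entries = root.findall("entry")

# Emulates entry[//field1='B']: the test looks at the whole document,
# so it gives the same answer for every entry.
doc_has_b = any(f.text == "B" for f in root.iter("field1"))
wrong = [e for e in entries if doc_has_b][0]

# Emulates entry[field1='B' or field1='C'][1]: the test is per entry.
right = next(e for e in entries if e.findtext("field1") in ("B", "C"))

print(wrong.findtext("field1"), right.findtext("field1"))  # A B
```

The document-wide test keeps every entry, so taking the first one yields the entry with field1=A, which is the behavior described in the question.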

Related

Predicates: how is the expression nodeName='text' evaluated?

In this xpath:
/A/B[C='hello']
Is C="hello" some kind of syntactic shortcut for C[text()='hello']? Is it documented anywhere?
Edit: Okay, I discovered one difference: C= returns all the text nodes in C and C's children, while C[text()= returns only the text nodes in C.
Now, suppose I have the XML:
<root>
<A>
<B>
<C>hello<E>EEE</E>world</C>
<D>world</D>
</B>
<B>
<C>goodbye</C>
<D>mars</D>
</B>
</A>
</root>
How would I choose the B node containing the first C node using the syntax C[text()=? I can get the B node using the C= syntax like this:
/root/A/B[C="helloEEEworld"]
But this doesn't work:
/root/A/B[C[text()="helloworld"]]
nor do these:
/root/A/B[C[text()="hello world"]]
/root/A/B[C[text()="helloEEEworld"]]
Hmmm...this works:
/root/A/B[C[text()="hello"]]
Why is that? Does text() only return the first text node? According to the W3C, text() returns all text node children of the context node.
text() really does return all text node children, as a node-set.
When you use /root/A/B[C[text()="hello"]], you are asking for the B node with a C child in which any direct child text node equals "hello".
In the same way, you can match it with:
/root/A/B[C[text()="world"]]
or explicitly specify that you want to match on the first or second direct child text node:
/root/A/B[C[text()[1]="hello"]]
/root/A/B[C[text()[2]="world"]]
If you want to match the node by its complete text content, you can use
/root/A/B[C[.="helloEEEworld"]]
or
/root/A/B[C="helloEEEworld"]
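As a rough illustration of the two notions, the following Python sketch (standard library only) separates the direct child text nodes of C from its full string-value. ElementTree has no text() step, so the child text nodes are collected by hand from .text and the children's .tail. The XML is the question's sample with whitespace trimmed.

```python
import xml.etree.ElementTree as ET

XML = """<root><A>
<B><C>hello<E>EEE</E>world</C><D>world</D></B>
<B><C>goodbye</C><D>mars</D></B>
</A></root>"""

root = ET.fromstring(XML)
c = root.find("A/B/C")

# Direct child text nodes of C, roughly what C/text() selects.
child_text_nodes = [t for t in [c.text, *(child.tail for child in c)] if t]

# String-value of C: all descendant text concatenated, what C="..." compares.
string_value = "".join(c.itertext())

# ElementTree's [tag='text'] predicate also compares the full text content.
matches = root.findall("A/B[C='helloEEEworld']")

print(child_text_nodes, string_value, len(matches))
```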
C in the predicate expression [C='hello'] returns all C elements that are direct children of the context element, which is B. So the entire predicate is a boolean expression comparing a node-set with a string (note that an element is a type of node in the XPath data model), and the behavior in this case is documented in the spec as follows:
If one object to be compared is a node-set and the other is a string, then the comparison will be true if and only if there is a node in the node-set such that the result of performing the comparison on the string-value of the node and the other string is true. If one object to be compared is a node-set and the other is a boolean, then the comparison will be true if and only if the result of performing the comparison on the boolean and on the result of converting the node-set to a boolean using the boolean function is true. [source]
C='hello' in /A/B[C='hello'] evaluates to true if any of the C elements, after conversion to a string, equals 'hello'. So it is more of a shortcut for C[string()='hello'], if you will.
"Hmmm...this works:
/root/A/B[C[text()="hello"]]
Why is that? Does text() only return the first text node? According to the W3C, text() returns all text node children of the context node."
text() in this context does not return just the first text node; it returns all direct child text nodes. This is because child:: is the default axis in XPath. Contrast your XPath with its equivalent verbose version:
/child::root/child::A/child::B[child::C[child::text()="hello"]]
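The "any node in the node-set" rule quoted from the spec can be mimicked in a few lines of Python. This is only a sketch on made-up data (a B with two C children is hypothetical, not from the question), and c_equals is an illustrative helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: the first B has two C children.
XML = "<A><B><C>hi</C><C>hello</C></B><B><C>bye</C></B></A>"
root = ET.fromstring(XML)

def c_equals(b, value):
    """True if ANY C child of b has a string-value equal to value."""
    return any("".join(c.itertext()) == value for c in b.findall("C"))

matched = [b for b in root.findall("B") if c_equals(b, "hello")]

# ElementTree's own [C='hello'] predicate follows the same existential rule.
same = matched == root.findall("B[C='hello']")
print(len(matched), same)
```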

Xpath 1.0: Cannot get the value of a specific element. Duplicate values returned

I have the following piece of XML:
<Resource>
<ResourceSummaryBag>
<Entry>
<ObjectName>NR_LDI</ObjectName>
<usage_prm>
<Entry>
<UsedEntries>98416</UsedEntries>
</Entry>
</usage_prm>
</Entry>
<Entry>
<ObjectName>R_LDI</ObjectName>
<usage_prm>
<Entry>
<UsedEntries>13265</UsedEntries>
</Entry>
</usage_prm>
</Entry>
</ResourceSummaryBag>
</Resource>
I want to extract usage_prm/UsedEntries values by ObjectName element.
If I use contains, I get duplicate values for 'R_LDI' object but I want the values for the specific R_LDI and NR_LDI objects.
Resource/ResourceSummaryBag/Entry[./ObjectName[contains(.,'R_LDI')]]/usage_prm/Entry/UsedEntries
Result: 98416 13265
Any solution?
Thanks in advance for any help.
Use = with normalize-space() instead of contains(), since contains(.,'R_LDI') also matches 'NR_LDI':
./ObjectName[normalize-space()='R_LDI']
normalize-space() strips leading and trailing white-space from a string, replaces sequences of whitespace characters by a single space, and returns the resulting string [MDN]
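The same contrast can be reproduced in Python as a sanity check. This is a sketch only: normalize_space below is a hand-written approximation of the XPath function (Python's split() treats a few more characters as whitespace), and used_entries is an illustrative helper name.

```python
import xml.etree.ElementTree as ET

XML = """<Resource><ResourceSummaryBag>
<Entry><ObjectName>NR_LDI</ObjectName>
  <usage_prm><Entry><UsedEntries>98416</UsedEntries></Entry></usage_prm></Entry>
<Entry><ObjectName>R_LDI</ObjectName>
  <usage_prm><Entry><UsedEntries>13265</UsedEntries></Entry></usage_prm></Entry>
</ResourceSummaryBag></Resource>"""

root = ET.fromstring(XML)

def normalize_space(s):
    # Approximation of XPath normalize-space(): trim and collapse whitespace.
    return " ".join(s.split())

def used_entries(match):
    # Only the outer Entry elements are direct children of ResourceSummaryBag.
    return [e.findtext("usage_prm/Entry/UsedEntries")
            for e in root.findall("ResourceSummaryBag/Entry")
            if match(e.findtext("ObjectName", ""))]

by_contains = used_entries(lambda name: "R_LDI" in name)
by_equals = used_entries(lambda name: normalize_space(name) == "R_LDI")
print(by_contains, by_equals)
```

The substring test picks up both objects, while the equality test returns only the value for R_LDI.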

Simplify specific XPath expression

I would like to know if the following XPath expression can be simplified:
//map[requester/@type='2' and requester/code]
Some test data:
<root>
<map>
<requester type="2">
<code>a</code>
<code>b</code>
</requester>
</map>
...
</root>
My objective is to get only map elements which have at least one requester with type attribute and value '2' and also have at least one code element.
For your use case, this is probably as simple as it can be. However, it doesn't match what you describe.
Here you are selecting map elements where
(1) there is a requester element with a type attribute equal to 2, and
(2) there is a requester element with a code element.
The requester elements in (1) and (2) are not necessarily the same.
For example, the map element in the following is selected:
<root>
<map>
<requester type="2"/>
<requester>
<code>a</code>
</requester>
</map>
</root>
If you want the elements in (1) and (2) to be the same, you should use (simplified slightly at the suggestion of kjhughes)
//map[requester[@type='2']/code]
Here we select all map elements which have a requester element which in turn has an attribute type with a value of 2 and a code element.
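A small Python sketch can make the distinction concrete. ElementTree's XPath subset has no nested predicates or `and`, so the outer condition is written in Python. The id attributes on map are hypothetical, added only to label the two cases.

```python
import xml.etree.ElementTree as ET

# Hypothetical id attributes label the two cases.
XML = """<root>
<map id="same"><requester type="2"><code>a</code></requester></map>
<map id="split"><requester type="2"/><requester><code>a</code></requester></map>
</root>"""

root = ET.fromstring(XML)

# Like //map[requester/@type='2' and requester/code]:
# the two conditions may be satisfied by different requester elements.
loose = [m.get("id") for m in root.iter("map")
         if m.find("requester[@type='2']") is not None
         and m.find("requester/code") is not None]

# Like //map[requester[@type='2']/code]:
# the same requester must have type 2 and contain a code element.
strict = [m.get("id") for m in root.iter("map")
          if m.find("requester[@type='2']/code") is not None]

print(loose, strict)
```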

How to get content from next node

I have an XML below -
<document>
<node name="Node 0 Text here" ID="01" >aa
</node>
<node name="Node 1 Text here" ID="11">bb
</node>
<node name="Node 2 Text here" ID="12">cc
</node>
<node name="Node 3 Text here" ID="22">dd
</node>
<node name="Node 4 Text here" ID="23">ee
</node>
</document>
I need to search content in a particular node within this XML.
If search keyword does not exist in that node, then I have to begin searching from the next node of current node, you could say sibling.
If that keyword does not exist in all the nodes after the current node then it should begin search from start..
I have to achieve this in my code behind- dotnet class. I have used -
XmlNodeList xmlNodes = xd.SelectNodes("//12/following-sibling::*");
Here, 12 refers to nodeid of the current node,which will be passed as an argument. But I am getting error.
Any help is appreciated.
I need to search content in a particular node within this XML
To get a node by matching its content, the XPath is:
node[contains(text(),'aa')]
This will return, for example, the first node, along with any other node whose text content contains aa.
If search keyword does not exist in that node, then I have to begin searching from the next node of current node, you could say sibling. If that keyword does not exist in all the nodes after the current node then it should begin search from start.
This requirement does not map directly onto XPath. The expression above returns all nodes matching the keyword. If you want only the first matched node, you can take it from the XmlNodeList afterwards, or get it directly from the XPath expression by changing it to:
node[contains(text(),'aa')][1]
12 refers to nodeid of the current node,which will be passed as an argument
That's not correct. To select the node by its ID attribute you should use, for instance:
node[@ID='12']/text()
This will get the content of the node with ID=12.
Use:
(/*/node[@ID='12']/following-sibling::*[contains(.,$pattern)][1]
|
/*/node[@ID='12']/preceding-sibling::*[contains(.,$pattern)][last()]
)
[last()]
This expression selects the last, in document order, of the two wanted selections: the nearest following sibling that contains the value of $pattern, and the earliest preceding sibling that contains it (preceding-sibling is a reverse axis, so [last()] picks the sibling closest to the start of the document). A following match, when one exists, therefore wins; otherwise the search wraps around to the start.
You need to substitute $pattern with the exact value you want to search for.
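For comparison, the same "search forward, then wrap around to the start" logic can be sketched procedurally. The example is in Python rather than .NET and uses only the standard library. find_with_wraparound is a hypothetical helper name, and like the XPath above it skips the current node itself.

```python
import xml.etree.ElementTree as ET

XML = """<document>
<node name="Node 0 Text here" ID="01">aa
</node>
<node name="Node 1 Text here" ID="11">bb
</node>
<node name="Node 2 Text here" ID="12">cc
</node>
<node name="Node 3 Text here" ID="22">dd
</node>
<node name="Node 4 Text here" ID="23">ee
</node>
</document>"""

def find_with_wraparound(doc, current_id, keyword):
    """Return the first sibling after current_id whose text contains
    keyword, wrapping around to the start; None if nothing matches."""
    nodes = list(doc)  # child elements in document order
    start = next((i for i, n in enumerate(nodes)
                  if n.get("ID") == current_id), None)
    if start is None:
        return None
    # Following siblings first, then the siblings before the current node.
    for n in nodes[start + 1:] + nodes[:start]:
        if keyword in "".join(n.itertext()):
            return n
    return None

doc = ET.fromstring(XML)
hit = find_with_wraparound(doc, "12", "aa")
print(hit.get("ID") if hit is not None else None)
```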

XQuery return text node if it contains given keyword

A test sample of my xml file is shown below:
test.xml
<feed>
<entry>
<title>Link ISBN</title>
<libx:libapp xmlns:libx="http://libx.org/xml/libx2" />
</entry>
<entry>
<title>Link Something</title>
<libx:module xmlns:libx="http://libx.org/xml/libx2" />
</entry>
</feed>
Now, I want to write an xquery which will find all <entry> elements which have <libx:libapp> as a child. Then, for all such entries return the title if the title contains a given keyword (such as Link). So, in my example xml document the xquery should return "Link ISBN".
My sample xquery is shown below:
samplequery.xq (here doc_name is the xml file shown above and libapp_matchkey is a keyword such as 'Link')
declare namespace libx='http://libx.org/xml/libx2';
declare variable $doc_name as xs:string external;
declare variable $libpp_matchkey as xs:string external;
let $feeds_doc := doc($doc_name)
for $entry in $feeds_doc/feed/entry
(: test whether entry has libx:libapp child and has "Link" in its title child :)
where ($entry/libx:libapp and $entry/title/text()[contains(.,$libapp_matchkey)])
return $entry/title/text()
This xquery is returning null instead of the expected result 'Link ISBN'. Why is that?
I want to write an xquery which will find all <entry> elements which have <libx:libapp> as a child. Then, for all such entries return the title if the title contains a given keyword (such as Link).
Just use:
/*/entry[libx:libapp]/title[contains(.,'Link')]/text()
Wrapping this XPath expression in XQuery we get:
declare namespace libx='http://libx.org/xml/libx2';
/*/entry[libx:libapp]/title[contains(.,'Link')]/text()
when applied on the provided XML document:
<feed>
<entry>
<title>Link ISBN</title>
<libx:libapp xmlns:libx="http://libx.org/xml/libx2" />
</entry>
<entry>
<title>Link Something</title>
<libx:module xmlns:libx="http://libx.org/xml/libx2" />
</entry>
</feed>
the wanted, correct result is produced:
Link ISBN
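If an XQuery processor is not at hand, the selection can be cross-checked with a short Python sketch using the standard library. The NS prefix mapping and the keyword variable are assumptions of this example, not part of the original query.

```python
import xml.etree.ElementTree as ET

XML = """<feed>
<entry>
<title>Link ISBN</title>
<libx:libapp xmlns:libx="http://libx.org/xml/libx2" />
</entry>
<entry>
<title>Link Something</title>
<libx:module xmlns:libx="http://libx.org/xml/libx2" />
</entry>
</feed>"""

# Prefix mapping used only for lookups in this sketch.
NS = {"libx": "http://libx.org/xml/libx2"}
keyword = "Link"

feed = ET.fromstring(XML)

# Keep entries that have a libx:libapp child and a title containing keyword.
titles = [e.findtext("title")
          for e in feed.findall("entry")
          if e.find("libx:libapp", NS) is not None
          and keyword in e.findtext("title", "")]

print(titles)
```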

Resources