I'm fairly new to XPath, so apologies in advance if I get some of the terminology wrong.
I would like to write an XPath query that selects the 'uncle' of each element with a specific name.
Say I have the following XML:
<aaa>
  <bbb>
    <first_uncle>
      uncle_bob
    </first_uncle>
    <ccc>
      <ddd>d_val_1</ddd>
    </ccc>
    <ccc>
      <ddd>d_val_2</ddd>
    </ccc>
    <ccc>
      <ddd>d_val_3</ddd>
    </ccc>
  </bbb>
  <bbb>
    <first_uncle>
      uncle_jack
    </first_uncle>
    <ccc>
      <ddd>d_val_4</ddd>
    </ccc>
    <ccc>
      <ddd>d_val_5</ddd>
    </ccc>
  </bbb>
</aaa>
I would like to have an output which lists the 'first_uncle' of each ddd.
Something like this:
uncle_bob
uncle_bob
uncle_bob
uncle_jack
uncle_jack
My attempt (//ccc/ddd/../../*[1]) only gave me a list of 'unique uncles':
uncle_bob
uncle_jack
Thanks!
In XPath 1.0, a single XPath expression can only select a set of actual nodes, and a node-set never contains the same node more than once. There are only two first_uncle elements in your XML, so no single expression can return five results.
So you would need to do this in two steps (pseudocode, since you haven't told us what language or XML library you're using):
var people = doc.select('/aaa/bbb/ccc/ddd');
foreach (var person in people) {
    var uncle = person.selectSingle('../../first_uncle');
    // use uncle
}
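As one concrete version of the two-step approach above, here is a minimal Java sketch using the JDK's built-in javax.xml.xpath API. The choice of Java, the class name UncleLookup and the method name unclesPerDdd are assumptions made for illustration; the question does not say which language or library is in use.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class UncleLookup {
    // The XML from the question, written without whitespace between elements.
    static final String XML =
        "<aaa>"
      + "<bbb><first_uncle>uncle_bob</first_uncle>"
      + "<ccc><ddd>d_val_1</ddd></ccc><ccc><ddd>d_val_2</ddd></ccc>"
      + "<ccc><ddd>d_val_3</ddd></ccc></bbb>"
      + "<bbb><first_uncle>uncle_jack</first_uncle>"
      + "<ccc><ddd>d_val_4</ddd></ccc><ccc><ddd>d_val_5</ddd></ccc></bbb>"
      + "</aaa>";

    // Returns one uncle name per ddd element, in document order.
    static List<String> unclesPerDdd(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Step 1: select every ddd element.
        NodeList ddds = (NodeList) xpath.evaluate(
                "/aaa/bbb/ccc/ddd", doc, XPathConstants.NODESET);

        // Step 2: evaluate a relative path from each ddd to its uncle.
        List<String> uncles = new ArrayList<>();
        for (int i = 0; i < ddds.getLength(); i++) {
            String uncle = xpath.evaluate("../../first_uncle", ddds.item(i));
            uncles.add(uncle.trim());
        }
        return uncles;
    }

    public static void main(String[] args) throws Exception {
        // Prints uncle_bob three times, then uncle_jack twice.
        unclesPerDdd(XML).forEach(System.out::println);
    }
}
```

The duplicates come from the loop in the host language, not from XPath: each relative evaluation returns a single node, and the list simply records it once per ddd.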
Related
I am new to xpath so I apologize in advance for how basic this question is.
How do I extract just the text from a specific element? For example, how would I extract just "text"
<h1>text</h1>
I tried the following but it seems to select everything including the tags instead of just the text.
//h1/text()
Thanks for your help
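import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Parse the XML file into a DOM document.
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
Document doc = docBuilder.parse(new File("src/myFile.xml"));

// Evaluating the path as a STRING returns the element's text, without tags.
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
String sessionId = (String) xpath.evaluate(
        "/Envelope/Body/LoginProcessResponse/loginResponse/sessionId",
        doc, XPathConstants.STRING);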
Here Envelope is my parent element, and I just traversed down to the required element (sessionId in my case).
Hope it helps.
This is more of an XSLT answer than an XPath answer, but many of the concepts apply either way.
The XPath expression
//h1/text()
seems to be correct. It does select all text() nodes that are direct children of <h1> elements.
One problem may be that the XSL default templates still copy all the other text() nodes to the output, as described in the W3C specification:
In the absence of a select attribute, the xsl:apply-templates instruction processes all of the children of the current node, including text nodes.
So to solve your problem, you have to define an explicit template that ignores all other text() nodes, like this:
<xsl:template match="text()" />
If you add this line to your XSL processing, the result will most likely be more pleasant to you.
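To see the effect of the empty text() template, here is a small Java sketch that runs a stylesheet through the JDK's javax.xml.transform API. The stylesheet, the sample input and the class name HeadingText are all made up for illustration, since the question does not show the original XSLT; the h1 template is one assumed way of emitting the heading text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class HeadingText {
    // Hypothetical stylesheet: output the text of h1, suppress all other text.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='h1'><xsl:value-of select='text()'/></xsl:template>"
      // Without this empty template, the built-in rule would copy every
      // other text node to the output as well.
      + "<xsl:template match='text()'/>"
      + "</xsl:stylesheet>";

    static String transform(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<html><body><p>noise</p><h1>text</h1><p>more</p></body></html>";
        System.out.println(transform(xml)); // prints "text"
    }
}
```

Removing the empty text() template from this sketch would let the contents of the p elements through to the output too.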
I'm using this XPath query:
//li[contains(@class, 'cmil_header')]/span[contains(@class, 'cmil_theatre')]
and the result of this query is:
Park
Saga Tokey
Latvia
Latvia
Skande
Paramount
Paramount
Paramount
Oslo
Oslo
...
I have been searching and have come to the conclusion that there is an option to select unique or distinct node values/items with XPath, but I can't get it to work.
I have managed to select a specific item with //li[contains(@class, 'cmil_header')][1]/span[contains(@class, 'cmil_theatre')] (Park in this case), and I thought //li[contains(@class, 'cmil_header')][distinct-values()]/span[contains(@class, 'cmil_theatre')] would work, but it does not.
My question:
How would my query be to reproduce:
Park
Saga Tokey
Latvia
Skande
Paramount
Oslo
...
Edit: pastebin with sample
http://pastebin.com/a3x7hRFu
XPath 1.0 solution (where there is no distinct-values function) that relies on the duplicates being sequential:
//li[contains(@class, 'cmil_header')]/span[contains(@class, 'cmil_theatre') and (not(../preceding-sibling::li[contains(@class, 'cmil_header')]) or ../preceding-sibling::li[contains(@class, 'cmil_header')][1]/span[contains(@class, 'cmil_theatre')]/text() != ./text())]
find all li nodes that contain the cmil_header class: //li[contains(@class, 'cmil_header')]
find the child span nodes that contain the cmil_theatre class: /span[contains(@class, 'cmil_theatre') and
where there is no previous li node containing the cmil_header class: (not(../preceding-sibling::li[contains(@class, 'cmil_header')])
or the previous li node containing the cmil_header class has a span node child that contains the cmil_theatre class: or ../preceding-sibling::li[contains(@class, 'cmil_header')][1]/span[contains(@class, 'cmil_theatre')]
and the text content of that span is not the same as the text content of... : /text() !=
...this span: ./text())]
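The breakdown above can be tried out with a short Java sketch. The sample markup is hypothetical (the real page is only available through the pastebin link) and is assumed to be well-formed XML; the class name SequentialDistinct is likewise made up. A non-header li is included to show that the [1] step picks the nearest preceding header, not simply the nearest li.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SequentialDistinct {
    // Hypothetical sample: duplicates always appear next to each other.
    static final String XML =
        "<ul>"
      + "<li class='cmil_header'><span class='cmil_theatre'>Park</span></li>"
      + "<li class='cmil_header'><span class='cmil_theatre'>Latvia</span></li>"
      + "<li class='cmil_item'>some film</li>"
      + "<li class='cmil_header'><span class='cmil_theatre'>Latvia</span></li>"
      + "<li class='cmil_header'><span class='cmil_theatre'>Oslo</span></li>"
      + "<li class='cmil_header'><span class='cmil_theatre'>Oslo</span></li>"
      + "</ul>";

    // The XPath 1.0 expression from the answer.
    static final String EXPR =
        "//li[contains(@class, 'cmil_header')]"
      + "/span[contains(@class, 'cmil_theatre') and ("
      + "not(../preceding-sibling::li[contains(@class, 'cmil_header')]) or "
      + "../preceding-sibling::li[contains(@class, 'cmil_header')][1]"
      + "/span[contains(@class, 'cmil_theatre')]/text() != ./text())]";

    static List<String> theatres(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList spans = (NodeList) xpath.evaluate(EXPR, doc, XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < spans.getLength(); i++) {
            names.add(spans.item(i).getTextContent());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(theatres(XML)); // prints [Park, Latvia, Oslo]
    }
}
```

If the same theatre reappeared later in the list after a different one, this expression would select it again, which is why the answer stresses that the duplicates must be sequential.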
I thought //li[contains(@class, 'cmil_header')][distinct-values()]/span[contains(@class, 'cmil_theatre')] would work, but not.
No, there is no way this could work. I find it hard to know what you were imagining. The most basic error is that distinct-values() expects an argument. More subtly, you really don't seem to have understood how predicates (expressions in square brackets) work.
What would work -- assuming your XPath processor supports XPath 2.0 -- is
distinct-values(//li[contains(@class, 'cmil_header')]/
    span[contains(@class, 'cmil_theatre')])
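If only an XPath 1.0 processor is available (the JDK's built-in javax.xml.xpath is one such case), a fallback is to select all the spans and remove duplicates in the host language. This is a different technique from distinct-values(), not an equivalent XPath expression. The sketch below is Java with a hypothetical sample document and a made-up class name, DistinctTheatres.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DistinctTheatres {
    // Hypothetical sample in which duplicates are not adjacent.
    static final String XML =
        "<ul>"
      + "<li class='cmil_header'><span class='cmil_theatre'>Park</span></li>"
      + "<li class='cmil_header'><span class='cmil_theatre'>Latvia</span></li>"
      + "<li class='cmil_header'><span class='cmil_theatre'>Park</span></li>"
      + "<li class='cmil_header'><span class='cmil_theatre'>Oslo</span></li>"
      + "<li class='cmil_header'><span class='cmil_theatre'>Latvia</span></li>"
      + "</ul>";

    static List<String> distinct(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList spans = (NodeList) xpath.evaluate(
                "//li[contains(@class, 'cmil_header')]"
              + "/span[contains(@class, 'cmil_theatre')]",
                doc, XPathConstants.NODESET);

        // LinkedHashSet drops repeats while keeping first-seen order.
        Set<String> seen = new LinkedHashSet<>();
        for (int i = 0; i < spans.getLength(); i++) {
            seen.add(spans.item(i).getTextContent());
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(distinct(XML)); // prints [Park, Latvia, Oslo]
    }
}
```

Unlike the sequential XPath 1.0 expression, this does not depend on duplicates being next to each other.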
Given the following XML, how could I get the list of directors who share the same LastName with another director of the same movie?
<MoviesLib>
  <Movie Title="Batman" Year="2013">
    <Directors>
      <Director>
        <Name>Robert</Name>
        <LastName>Zemeckis</LastName>
      </Director>
    </Directors>
  </Movie>
  <Movie Title="Gru" Year="2012">
    <Directors>
      <Director>
        <Name>john</Name>
        <LastName>tailer</LastName>
      </Director>
      <Director>
        <Name>Emma</Name>
        <LastName>Smith</LastName>
      </Director>
      <Director>
        <Name>Lana</Name>
        <LastName>Smith</LastName>
      </Director>
    </Directors>
  </Movie>
</MoviesLib>
For example, in this case the result would be: Emma Smith, Lana Smith
Thanks
The following XPath 2.0 expression should work:
for $d in //Director
return $d[../Director[not(. is $d) and LastName = $d/LastName]]
I can't come up with a single XPath 1.0 expression since it doesn't support for expressions (see the question How to get the context of outer predicate? for some background).
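For this particular document shape, one XPath 1.0 formulation that may work uses the sibling axes in place of the for expression: comparing a director's LastName against the LastName of its preceding and following Director siblings leaves the director itself out of the comparison. This is a sketch that is not part of the answer above, and it assumes all directors of a movie are siblings under one Directors element. The Java class name SameLastName is made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SameLastName {
    // The XML from the question, written without whitespace between elements.
    static final String XML =
        "<MoviesLib>"
      + "<Movie Title='Batman' Year='2013'><Directors>"
      + "<Director><Name>Robert</Name><LastName>Zemeckis</LastName></Director>"
      + "</Directors></Movie>"
      + "<Movie Title='Gru' Year='2012'><Directors>"
      + "<Director><Name>john</Name><LastName>tailer</LastName></Director>"
      + "<Director><Name>Emma</Name><LastName>Smith</LastName></Director>"
      + "<Director><Name>Lana</Name><LastName>Smith</LastName></Director>"
      + "</Directors></Movie>"
      + "</MoviesLib>";

    // A node-set comparison with = is true if any pair of nodes matches.
    static final String EXPR =
        "//Director[LastName = preceding-sibling::Director/LastName"
      + " or LastName = following-sibling::Director/LastName]";

    static List<String> directors(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList found = (NodeList) xpath.evaluate(EXPR, doc, XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < found.getLength(); i++) {
            // Build "Name LastName" for each selected Director.
            names.add(xpath.evaluate("Name", found.item(i)) + " "
                    + xpath.evaluate("LastName", found.item(i)));
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(directors(XML)); // prints [Emma Smith, Lana Smith]
    }
}
```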
I'm trying to test whether an attribute on an ancestor of an element is not equal to a string.
Here is my XML...
<aaa att="xyz">
  <bbb>
    <ccc/>
  </bbb>
</aaa>
<aaa att="mno">
  <bbb>
    <ccc/>
  </bbb>
</aaa>
If I'm acting on element ccc, I'm trying to test that the att attribute of its grandparent aaa doesn't equal "xyz".
I currently have this...
ancestor::aaa[not(contains(@att, 'xyz'))]
Thanks!
Assuming that by saying an ancestor of an element you're referring to an element with child elements, this XPath expression should do:
//*[*/ccc][@att != 'xyz']
It selects
all nodes
that have at least one <ccc> grandchild node
and that have an att attribute whose value is not xyz.
Update: Restricted test to grandparents of <ccc>.
Update 2: Adapted to your revised question:
//ccc[../parent::aaa/@att != 'xyz']
Selects
all <ccc> elements
that have a grandparent <aaa> with its attribute att set to a value that is not xyz
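A short Java sketch can confirm what this expression selects. The XML fragment from the question has two top-level elements, so it is wrapped here in a hypothetical root element, and a third aaa without an att attribute is added as an assumption to show an edge case: a != comparison against a missing attribute is false in XPath 1.0, whereas not(... = ...) is true. The class name AncestorAttr is made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class AncestorAttr {
    // Question's fragment wrapped in <root>; the last aaa (no att) is added
    // only to illustrate the missing-attribute case.
    static final String XML =
        "<root>"
      + "<aaa att='xyz'><bbb><ccc/></bbb></aaa>"
      + "<aaa att='mno'><bbb><ccc/></bbb></aaa>"
      + "<aaa><bbb><ccc/></bbb></aaa>"
      + "</root>";

    // Counts the nodes selected by the given node-set expression.
    static int count(String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        Double n = (Double) xpath.evaluate(
                "count(" + expr + ")", doc, XPathConstants.NUMBER);
        return n.intValue();
    }

    public static void main(String[] args) throws Exception {
        // Only the ccc under att='mno' matches: prints 1
        System.out.println(count("//ccc[../parent::aaa/@att != 'xyz']"));
        // The ccc under the aaa without att matches as well: prints 2
        System.out.println(count("//ccc[not(../parent::aaa/@att = 'xyz')]"));
    }
}
```

Which of the two forms is wanted depends on whether a grandparent with no att attribute at all should count as "not xyz".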
I have XML like this:
<AAA>
  <BBB aaa="111" bbb="222">
    <CCC/>
    <CCC xxx="555" yyy="666" zzz="777"/>
  </BBB>
  <BBB aaa="999">
    <CCC xxx="qq"/>
    <DDD xxx="ww"/>
    <EEE xxx="oo"/>
  </BBB>
  <BBB>
    <DDD xxx="oo"/>
  </BBB>
</AAA>
I want to get the first <CCC> element. But with the XPath expression //*/CCC[1] I get two <CCC> elements, each of which is the first element in its <BBB></BBB> context. How do I get the first element of the whole subset?
This one should work for you:
(//*/CCC)[1]
I want to get first element. But with XPath expression //*/CCC[1] I have got two elements. Each of them is the first element in <BBB></BBB> context. How to get first element in subset?
This is a FAQ:
The [] operator has a higher precedence (binds more tightly) than the // abbreviation, so //CCC[1] selects every CCC element that is the first CCC child of its parent.
Use:
(//CCC)[1]
This selects the first (in document order) CCC element in the XML document.
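The difference between the two forms is easy to check by counting what each selects. Here is a minimal Java sketch against the XML from the question; the class name FirstInDocument and the helper count are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class FirstInDocument {
    // The XML from the question, written without whitespace between elements.
    static final String XML =
        "<AAA>"
      + "<BBB aaa='111' bbb='222'><CCC/><CCC xxx='555' yyy='666' zzz='777'/></BBB>"
      + "<BBB aaa='999'><CCC xxx='qq'/><DDD xxx='ww'/><EEE xxx='oo'/></BBB>"
      + "<BBB><DDD xxx='oo'/></BBB>"
      + "</AAA>";

    // Counts the nodes selected by the given node-set expression.
    static int count(String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        Double n = (Double) xpath.evaluate(
                "count(" + expr + ")", doc, XPathConstants.NUMBER);
        return n.intValue();
    }

    public static void main(String[] args) throws Exception {
        // [1] applies per parent: one CCC from each BBB that has any. Prints 2
        System.out.println(count("//CCC[1]"));
        // Parentheses build the whole node-set first, then take its first node. Prints 1
        System.out.println(count("(//CCC)[1]"));
    }
}
```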