How to get nodes using xpath - xpath

When I have two sets of nodes with the same element name, for example:
<contacts>
<names>
...
</names>
<names>
...
</names>
</contacts>
Normally I'd use //contacts/names to get the node, but when several elements share the same name, how do I get the first, second, or nth one?

For the provided XML document use:
/contacts/names[1]
the above selects the first names element.
/contacts/names[2]
the above selects the second names element.
Try to avoid the // abbreviation as much as possible: it is usually grossly inefficient because it causes the whole (sub)tree rooted at the context node to be traversed.

You can do this to get the first and/or second specifically:
//contacts/names[1]
//contacts/names[2]

Use //contacts/names[n] to get the nth names node. For example: //contacts/names[1] gets the first names node while //contacts/names[2] gets the second names node, etc.
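
The positional predicates above can be tried out with a minimal sketch in Python's standard-library xml.etree.ElementTree. That module supports only a subset of XPath and evaluates paths relative to an element, so names[1] evaluated on the contacts root plays the role of /contacts/names[1]. The element contents "first" and "second" are hypothetical, since the question elides them.

```python
import xml.etree.ElementTree as ET

# The element contents are hypothetical; the original elides them with "...".
XML = """<contacts>
  <names>first</names>
  <names>second</names>
</contacts>"""

contacts = ET.fromstring(XML)  # the root element, <contacts>

# Positions in XPath predicates are 1-based.
first = contacts.find("names[1]")
second = contacts.find("names[2]")
missing = contacts.find("names[3]")  # no third element, so find() returns None

print(first.text, second.text, missing)
```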

Related

XPATH Select All Attributes attr Except One On Specific Element elem

I was selecting all attributes id and everything was going nicely then one day requirements changed and now I have to select all except one!
Given the following example:
<root>
<structs id="123">
<struct>
<comp>
<data id="asd"/>
</comp>
</struct>
</structs>
</root>
I want to select all id attributes except the one at /root/structs/struct/comp/data
Please note that the XML could be different.
That is, given any XML tree, I want to select all id attributes except the one on the element /root/structs/struct/comp/data
I tried the following:
//@id[not(ancestor::struct)] This kind of worked, but I want to give the full path of the ancestor, which I couldn't manage.
//@id[not(contains(name(), 'data'))] This didn't work because name() returns the name of the context node, which is the attribute, not its parent element.
The following should achieve what you're describing:
//@id[not(parent::data/parent::comp/parent::struct/parent::structs/parent::root)]
As you can see, it simply checks from bottom to top whether the id attribute's parent matches the path root/structs/struct/comp/data.
I think this should be sufficient for your needs, but it does not 100% ensure that the parent is at the path /root/structs/struct/comp/data because it could be, for example, at the path /someOtherHigherRoot/root/structs/struct/comp/data. I'm guessing that's not a possible scenario in your XML structure, but if you had to check for that, you could do this:
//@id[not(parent::data/parent::comp/parent::struct/parent::structs/parent::root[not(parent::*)])]
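
For readers without a full XPath engine, the same filter can be sketched procedurally. Python's standard-library ElementTree does not support selecting attribute nodes or not(), so the sketch below walks the tree, tracks each element's path from the document root, and skips the id on the excluded path. This corresponds to the stricter variant that anchors the path at the document root. The helper name ids_except is hypothetical.

```python
import xml.etree.ElementTree as ET

XML = """<root>
  <structs id="123">
    <struct>
      <comp>
        <data id="asd"/>
      </comp>
    </struct>
  </structs>
</root>"""

EXCLUDED = "/root/structs/struct/comp/data"


def ids_except(elem, excluded, path=""):
    """Collect id attribute values, skipping the element at the excluded path.

    Hypothetical helper: it rebuilds each element's absolute path while
    walking the tree, because ElementTree keeps no parent pointers.
    """
    here = f"{path}/{elem.tag}"
    found = []
    if "id" in elem.attrib and here != excluded:
        found.append(elem.attrib["id"])
    for child in elem:
        found.extend(ids_except(child, excluded, here))
    return found


ids = ids_except(ET.fromstring(XML), EXCLUDED)
print(ids)
```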

Choosing specific element in XPath

I have two elements with the same name, "reason". When I use //*:reason/text() I get both of them, but I need only the first one (not the one inside "details"). Please help.
<xml xmlns:gob="http://osb.yes.co.il/GoblinAudit">
<fault>
<ctx:fault xmlns:ctx="http://www.bea.com/wli/sb/context">
<ctx:errorCode>BEA-382500</ctx:errorCode>
<ctx:reason>OSB Service Callout action received SOAP Fault response</ctx:reason>
<ctx:details>
<ns0:ReceivedFaultDetail xmlns:ns0="http://www.bea.com/wli/sb/stages/transform/config">
<ns0:faultcode xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">soapenv:Server</ns0:faultcode>
<ns0:faultstring>BEA-380001: Internal Server Error</ns0:faultstring>
<ns0:detail>
<con:fault xmlns:con="http://www.bea.com/wli/sb/context">
<con:errorCode>BEA-380001</con:errorCode>
<con:reason>Internal Server Error</con:reason>
<con:location>
<con:node>RouteTo_FinancialControllerBS</con:node>
<con:path>response-pipeline</con:path>
</con:location>
</con:fault>
</ns0:detail>
</ns0:ReceivedFaultDetail>
</ctx:details>
<ctx:location>
<ctx:node>PipelinePairNode2</ctx:node>
<ctx:pipeline>PipelinePairNode2_request</ctx:pipeline>
<ctx:stage>set maintain offer</ctx:stage>
<ctx:path>request-pipeline</ctx:path>
</ctx:location>
</ctx:fault>
</fault>
</xml>
You are using //, which descends into every subtree and finds all occurrences of reason. You can be more specific about the path:
//fault/*:fault/*:reason/text()
This matches only the outer reason, not the inner one.
"...but i need the first one"
You can use a position index to get the first matched reason element:
(//*:reason)[1]/text()
" (not the one inside "details")"
The above can be expressed as finding the reason element that has no details ancestor:
//*:reason[not(ancestor::*:details)]/text()
For a large XML document, a more specific path (i.e. one that avoids // at the beginning) results in a more efficient XPath:
/xml/fault/*:fault/*:reason/text()
But for a small XML document it is just a matter of personal preference, since the improvement is likely to be negligible.
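
The difference between the descendant search and the explicit path can be seen in a small sketch using Python's standard-library ElementTree. The XML is a trimmed version of the document in the question (only the two reason elements are kept), and instead of the *: wildcard the sketch binds the prefix ctx to the namespace URI explicitly, which is an assumption about how one would write it with this library.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the document from the question.
XML = """<xml>
  <fault>
    <ctx:fault xmlns:ctx="http://www.bea.com/wli/sb/context">
      <ctx:reason>OSB Service Callout action received SOAP Fault response</ctx:reason>
      <ctx:details>
        <con:fault xmlns:con="http://www.bea.com/wli/sb/context">
          <con:reason>Internal Server Error</con:reason>
        </con:fault>
      </ctx:details>
    </ctx:fault>
  </fault>
</xml>"""

# Prefix-to-URI mapping used only by the path expressions below.
NS = {"ctx": "http://www.bea.com/wli/sb/context"}

root = ET.fromstring(XML)  # the <xml> element

# Descendant search: finds both reason elements (ctx and con share one URI).
all_reasons = [e.text for e in root.iterfind(".//ctx:reason", NS)]

# Explicit path relative to <xml>: finds only the outer reason.
outer_reason = root.find("fault/ctx:fault/ctx:reason", NS).text

print(all_reasons)
print(outer_reason)
```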

immediately preceding-sibling must contain attribute

Here is my XML file:
<w type="fruit-hard">apple</w>
<w type="fruit-soft">orange</w>
<w type="vegetable">carrot</w>
I need to find carrot's immediately preceding sibling whose type is fruit-soft. In Chrome (locally loaded XML file), when I try
$x("//w[@type='vegetable']/preceding-sibling::w[1]")
I get the "orange" element node as I want, but how do I require that its type be "fruit-soft"? My attempt (below) returns "false".
$x("//w[@type='vegetable']/preceding-sibling::w[1] and preceding-sibling::w[@type='fruit-soft']")
Your original XPath ...
//w[@type='vegetable']/preceding-sibling::w[1]
... is equivalent to
//w[@type='vegetable']/preceding-sibling::w[position()=1]
You can add additional criteria to the predicate as needed:
//w[@type='vegetable']/preceding-sibling::w[position()=1 and @type='fruit-soft']
Or you can add a separate predicate:
//w[@type='vegetable']/preceding-sibling::w[1][@type='fruit-soft']
Note that this attempt:
//w[@type='vegetable']/preceding-sibling::w[1] and preceding-sibling::w[@type='fruit-soft']
returns false because the parts on either side of the and are evaluated separately, converted to boolean, and combined to yield the final result. Supposing that the context node against which the expression is evaluated is the document root, there will never be a node matching preceding-sibling::w[@type='fruit-soft'], so the second operand is false. Moreover, even if there were such a node, the expression does not require the nodes matching the first part to be the same ones that match the second part.
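
The point that both conditions must apply to the same node can be illustrated procedurally. Python's standard-library ElementTree has no preceding-sibling axis, so the sketch below imitates preceding-sibling::w[1][@type='fruit-soft'] by hand for the children of a single parent (it does not search the whole tree the way //w does). The words wrapper element and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# <words> is a hypothetical wrapper; the original shows only the <w> siblings.
XML = """<words>
  <w type="fruit-hard">apple</w>
  <w type="fruit-soft">orange</w>
  <w type="vegetable">carrot</w>
</words>"""


def soft_fruit_before_vegetable(parent):
    """Imitate w[@type='vegetable']/preceding-sibling::w[1][@type='fruit-soft']."""
    siblings = list(parent)
    results = []
    for index, node in enumerate(siblings):
        if node.tag != "w" or node.get("type") != "vegetable":
            continue
        # preceding-sibling::w[1]: the nearest earlier sibling named w.
        nearest = next((s for s in reversed(siblings[:index]) if s.tag == "w"), None)
        # [@type='fruit-soft']: a second filter applied to that same node.
        if nearest is not None and nearest.get("type") == "fruit-soft":
            results.append(nearest)
    return results


root = ET.fromstring(XML)
matches = [w.text for w in soft_fruit_before_vegetable(root)]
print(matches)
```

If the fruit-soft element were not the nearest preceding sibling, the function would return an empty list, which matches the behaviour of the chained predicates.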

using xpath need to get to a child and get another node one level up

I am trying to traverse an XML document with XPath. I want to visit /group/isRequired[text()='Optional'] and travel one level up to grab the /bool node.
I tried a few things like the ones below but can't seem to get it right. I'd appreciate any input.
I basically want to verify the library node, the group/isRequired node and the bool node in one statement.
//root/sample[library[text()='2']]/group/isRequired[text()='Optional']//bool[text()='true']
//root/sample[library[text()='2']]/group/isRequired[text()='Optional']../bool[text()='true']
//root/sample[library[text()='2']]/group/isRequired[text()='Optional']/bool[text()='true']
//root/sample[library[text()='2']]/group/isRequired[text()='Optional']/../bool[text()='true']
<root>
<sample>
<id>1</id>
<library>2</library>
<ruleName>Default</ruleName>
<group>
<groupID>1</groupID>
<groupName>orange</groupName>
<isRequired>Optional</isRequired>
</group>
<variant>1</variant>
<bool>true</bool>
</sample>
</root>
You need to move two steps up:
/root/sample[library[text()='2']]/group/isRequired[text()='Optional']/../../bool[text()='true']
But it is much cleaner to put multiple conditions in one predicate:
/root/sample[library[text()='2'] and group/isRequired[text()='Optional'] and bool[text()='true']]
Simpler:
/root/sample[library = "2" and group/isRequired = "Optional" and bool = "true"]
You don't have to use text() to get the value of every node in the XPath. Depending on whether your XML has a schema, you may not need to put the literal values in quotes. Without a schema, everything is a string value, so I put them in quotes just to be safe.
You can go a different route: filter the sample node by its group/isRequired child, then continue from that sample node to the bool node:
//root/sample[library='2' and group/isRequired='Optional']/bool[.='true']
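
Both routes can be sketched with Python's standard-library ElementTree, which supports the .. step and simple child-value predicates but not and or path expressions inside a predicate. For that reason the second variant below checks group/isRequired in Python rather than in the path, which is an adaptation and not a literal translation of the XPath above. The [.='Optional'] form needs Python 3.7 or later.

```python
import xml.etree.ElementTree as ET

XML = """<root>
  <sample>
    <id>1</id>
    <library>2</library>
    <ruleName>Default</ruleName>
    <group>
      <groupID>1</groupID>
      <groupName>orange</groupName>
      <isRequired>Optional</isRequired>
    </group>
    <variant>1</variant>
    <bool>true</bool>
  </sample>
</root>"""

root = ET.fromstring(XML)

# Route 1: two steps up from isRequired, first to <group>, then to <sample>.
via_parent = root.findall(
    "sample[library='2']/group/isRequired[.='Optional']/../../bool"
)

# Route 2: filter the sample first, then step down to bool.
via_predicate = [
    sample.find("bool")
    for sample in root.findall("sample[library='2'][bool='true']")
    if sample.findtext("group/isRequired") == "Optional"
]

# A single step up only reaches <group>, which has no bool child.
one_step_up = root.findall("sample/group/isRequired/../bool")

print([b.text for b in via_parent], [b.text for b in via_predicate], one_step_up)
```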

xpath - matching value of child in current node with value of element in parent

Edit: I think I found the answer, but I'll leave this open for a bit to see if someone has a correction/improvement.
I'm using XPath in Talend's ETL tool. I have XML like this:
<root>
<employee>
<benefits>
<benefit>
<benefitname>CDE</benefitname>
<benefit_start>2/3/2004</benefit_start>
</benefit>
<benefit>
<benefitname>ABC</benefitname>
<benefit_start>1/1/2001</benefit_start>
</benefit>
</benefits>
<dependent>
<benefits>
<benefit>
<benefitname>ABC</benefitname>
</benefit>
</dependent>
When parsing benefits for dependents, I want to get elements present in the employee's benefit element. So in the example above, I want to get 1/1/2001 for the dependent's start date. I want 1/1/2001, not 2/3/2004, because the dependent's benefit has benefitname ABC, matching the employee's benefit with the same benefitname.
What XPath, relative to /root/employee/dependent/benefits/benefit, will yield the value of benefit_start for the benefit under the parent employee that has the same benefit name as the dependent's benefit? (Note that I don't know ahead of time what the literal value will be. I can't just look for 'ABC'; I have to match whatever value is in the dependent's benefitname element.)
I'm trying:
../../../benefits/benefit[benefitname=??what??]/benefit_start
I don't know how to refer to the current node's ancestor in the middle of the XPath (since I think "." at the point where I have ??what?? will refer to the benefit node of the employee/benefits).
EDIT: I think what I want is "current()/benefitname" where the ??what?? is. It seems to work with Saxon; I haven't tried it in the ETL tool yet.
Your XML is malformed, and I don't think you've described your situation very well: the XPath you're trying has a bunch of ../../s at the beginning, but you haven't said what the context node is, whether you're iterating through certain nodes, or what.
Supposing the current context node were an employee element, you could select benefit_starts that match dependent benefits with
benefits/benefit[benefitname = ../../dependent/benefits/benefit/benefitname]
/benefit_start
If the current context node is a benefit element in a dependents section, and you want to get the corresponding benefit_start for just the current benefit element, you can do:
../../../benefits/benefit[benefitname = current()/benefitname]/benefit_start
Which is what I think you've already discovered.
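
The matching logic can also be sketched outside XPath, for example when the tool at hand has no current() function. The version below uses Python's standard-library ElementTree and does the name comparison in Python. The XML is a well-formed reconstruction of the example: the closing tags missing from the question are filled in here as an assumption.

```python
import xml.etree.ElementTree as ET

# Well-formed reconstruction of the question's XML; the closing tags that the
# question omits are assumed here.
XML = """<root>
  <employee>
    <benefits>
      <benefit>
        <benefitname>CDE</benefitname>
        <benefit_start>2/3/2004</benefit_start>
      </benefit>
      <benefit>
        <benefitname>ABC</benefitname>
        <benefit_start>1/1/2001</benefit_start>
      </benefit>
    </benefits>
    <dependent>
      <benefits>
        <benefit>
          <benefitname>ABC</benefitname>
        </benefit>
      </benefits>
    </dependent>
  </employee>
</root>"""

employee = ET.fromstring(XML).find("employee")

# Index the employee's own benefits by name (direct benefits child only).
start_by_name = {
    b.findtext("benefitname"): b.findtext("benefit_start")
    for b in employee.findall("benefits/benefit")
}

# For each dependent benefit, look up the employee benefit with the same name.
dependent_starts = {}
for dep_benefit in employee.findall("dependent/benefits/benefit"):
    name = dep_benefit.findtext("benefitname")
    dependent_starts[name] = start_by_name.get(name)

print(dependent_starts)
```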

Resources