Select XML Node by position - xpath

I have the following XML structure
<Root>
<BundleItem>
<Item>1</Item>
<Item>2</Item>
<Item>3</Item>
</BundleItem>
<Item>4</Item>
<Item>5</Item>
<Item>6</Item>
<BundleItem>
<Item>7</Item>
<Item>8</Item>
<Item>9</Item>
</BundleItem>
</Root>
And by providing the following xPath
//Item[1]
I am selecting
<Item>1</Item>
<Item>4</Item>
<Item>7</Item>
My goal is to select only <Item>1</Item> or <Item>7</Item> regardless of the parent element where they are found and only depending on the position, which i am providing in the xPath.
Is it possible to do that only by using the position and without providing additional criterias in the xPath ?

//Item[1] selects the all the first child elements that are <Item/> regardless of their parent.
To get the two items you are looking for you could use //Item[text() = 1 or text() = 7].
A good tutorial can be found at w3schools.com and you can play with XPath expressions over your XML input here. (I am not affiliated with either of these resources but find them useful.)

Related

XPath to retrive XML tag value without Namespace and prefix

I have the following XML -
<d><m:properties xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices">
<d:AllTexts/>
<d:BomFlag/>
<d:OrderNumber>9489</d:OrderNumber>
<d:LineNumber>000000</d:LineNumber>
<d:VcFlag>Y</d:VcFlag>
<d:PricingFlag/>
<d:TextType>H</d:TextType>
<d:TextId>ZC01</d:TextId>
<d:TextLineNo>1</d:TextLineNo>
<d:TextLine>ecom header text 1</d:TextLine>
and trying to retrieve the TextLine nodelist as based on TextId = ZC01 -
<TextLine>ecom header text1</TextLine>
when I applied the xpath as --> //m:properties[d:TextId = 'ZC01']/d:TextLine
I get the output as -
<d:TextLine xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices">ecom header text 1</d:TextLine>
how can I remove the prefix and namespace? I tried using local-name(), but that didn't work
May be used it wrong way.
Thank you for your help!
Thanks
Sugata
XPath is a selection language: it can only retrieve nodes that are actually there, it can't change them in any way. If the selected element has a prefix and namespace in the original, then it will have a prefix and namespace in the result.
However, you need to distinguish what the XPath selects (a node) from the way it the result is displayed. This depends on the application that is evaluating the XPath. The two popular ways of displaying a node selected by an XPath expression are (a) by serialising the node as XML (which is what we see in your case), and (b) by showing a path to the selected node, such as /d/m:properties/d:TextLine. You haven't told us how you are evaluating the XPath expression or displaying its result, and you may have options here.
But perhaps you should consider XSLT or XQuery, which (unlike XPath) allow you to construct new XML that differs from your original.

XPATH Select All Attributes attr Except One On Specific Element elem

I was selecting all attributes id and everything was going nicely then one day requirements changed and now I have to select all except one!
Given the following example:
<root>
<structs id="123">
<struct>
<comp>
<data id="asd"/>
</comp>
</struct>
</structs>
</root>
I want to select all attributes id except the one at /root/structs/struct/comp/data
Please note that the Xml could be different.
Meaning, what I really want is: given any Xml tree, I want to select all attributes id except the one on element /root/structs/struct/comp/data
I tried the following:
//#id[not(ancestor::struct)] It kinda worked but I want to provide a full xpath to the ancestor axis which I couldn't
//#id[not(contains(name(), 'data'))] It didn't work because name selector returns the name of the underlying node which is the attribute not its parent element
The following should achieve what you're describing:
//#id[not(parent::data/parent::comp/parent::struct/parent::structs/parent::root)]
As you can see, it simply checks from bottom to top whether the id attribute's parent matches the path root/structs/struct/comp/data.
I think this should be sufficient for your needs, but it does not 100% ensure that the parent is at the path /root/structs/struct/comp/data because it could be, for example, at the path /someOtherHigherRoot/root/structs/struct/comp/data. I'm guessing that's not a possible scenario in your XML structure, but if you had to check for that, you could do this:
//#id[not(parent::data/parent::comp/parent::struct/parent::structs/parent::root[not(parent::*)])]

Xpath child of multiple types

I have this xpath:
.//*[#id='some_id']/td//div
and now I want to select any child of the div that is of certain type, for example every child that is either a label or span. Something like this
.//*[#id='some_id']/td//div/(label|span)/.......
but that is not valid xpath. How can I do that (wthout writing two full xpaths for the given 2 example for child types)
descendant:: finds on all level below, to find only children use
.//*[#id='some_id']/td//div/*[self::label or self::span]
you need to use
descendant::
to select child elements of particular element. look at the below example,
.//*[#id='some_id']/td//div/descendant::label[#class='some-class']
the above xpath will get all label with class "some-class" which is actually the child of ".//*[#id='some_id']/td//div/" element.
to find multiple child elements then use below xpath,
.//*[#id='some_id']/td//div/descendant::*[local-name()='label' or local-name='span']

xpath - matching value of child in current node with value of element in parent

Edit: I think I found the answer but I'll leave the open for a bit to see if someone has a correction/improvement.
I'm using xpath in Talend's etl tool. I have xml like this:
<root>
<employee>
<benefits>
<benefit>
<benefitname>CDE</benefitname>
<benefit_start>2/3/2004</benefit_start>
</benefit>
<benefit>
<benefitname>ABC</benefitname>
<benefit_start>1/1/2001</benefit_start>
</benefit>
</benefits>
<dependent>
<benefits>
<benefit>
<benefitname>ABC</benefitname>
</benefit>
</dependent>
When parsing benefits for dependents, I want to get elements present in the employee's
benefit element. So in the example above, I want to get 1/1/2001 for the dependent's
start date. I want 1/1/2001, not 2/3/2004, because the dependent's benefit has benefitname ABC, matching the employee's benefit with the same benefitname.
What xpath, relative to /root/employee/dependent/benefits/benefit, will yield the value of
benefit_start for the benefit under parent employee that has the same benefit name as the
dependent benefit name? (Note I don't know ahead of time what the literal value will be, I can't just look for 'ABC', I have to match whatever value is in the dependent's benefitname element.
I'm trying:
../../../benefits/benefit[benefitname=??what??]/benefit_start
I don't know how to refer to the current node's ancestor in the middle of
the xpath (since I think "." at the point I have ??what?? will refer to
the benefit node of the employee/benefits.
EDIT: I think what I want is "current()/benefitname" where the ??what?? is. Seems to work with saxon, I haven't tried it in the etl tool yet.
Your XML is malformed, and I don't think you've described your siduation very well (the XPath you're trying has a bunch of ../../s at the beginning, but you haven't said what the context node is, whether you're iterating through certain nodes, or what.
Supposing the current context node were an employee element, you could select benefit_starts that match dependent benefits with
benefits/benefit[benefitname = ../../dependent/benefits/benefit/benefitname]
/benefit_start
If the current context node is a benefit element in a dependents section, and you want to get the corresponding benefit_start for just the current benefit element, you can do:
../../../benefits/benefit[benefitname = current()/benefitname]/benefit_start
Which is what I think you've already discovered.

XPath concat multiple nodes

I'm not very familiar with xpath. But I was working with xpath expressions and setting them in a database. Actually it's just the BAM tool for biztalk.
Anyway, I have an xml which could look like:
<File>
<Element1>element1<Element1>
<Element2>element2<Element2>
<Element3>
<SubElement>sub1</SubElement>
<SubElement>sub2</SubElement>
<SubElement>sub3</SubElement>
<Element3>
</File>
I was wondering if there is a way to use an xpath expression of getting all the SubElements concatted? At the moment, I am using:
/*[local-name()='File']/*[local-name()='Element3']/*[local-name()='SubElement']
This works if it only has one index. But apparently my xml sometimes has more nodes, so it gives NULL. I could just use
/*[local-name()='File']/*[local-name()='Element3']/*[local-name()='SubElement'][0]
but I need all the nodes. Is there a way to do this?
Thanks a lot!
Edit: I changed the XML, I was wrong, it's different, it should look like this:
<item>
<element1>el1</element1>
<element2>el2</element2>
<element3>el3</element3>
<element4>
<subEl1>subel1a</subEl1>
<subEl2>subel2a</subEl2>
</element4>
<element4>
<subEl1>subel1b</subEl1>
<subEl2>subel2b</subEl2>
</element4>
</item>
And I need to have a one line code to get a result like: "subel2a subel2b";
I need the one line because I set this xpath expression as an xml attribute (not my choice, it's specified). I tried string-join but it's not really working.
string-join(/file/Element3/SubElement, ',')
/File/Element3/SubElement will match all of the SubElement elements in your sample XML. What are you using to evaluate it?
If your evaluation method is subject to the "first node rule", then it will only match the first one. If you are using a method that returns a nodeset, then it will return all of them.
You can get all SubElements by using:
//SubElement
But this won't keep them grouped together how you want. You will want to do a query for all elements that contain a SubElement (basically do a search for the parent of any SubElements).
//parent::SubElement
Once you have that, you could (depending on your programming language) loop through the parents and concatenate the SubElements.

Resources