XPath with MSXML, "scope" of XPath-expressions - xpath

I have trouble understanding how XPath expressions behave in Microsoft XML Core Services 6.0 (MSXML).
I've broken the problem down to the simplest case. Take the following XML file:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<element name="E1A1">
<subEle value="1a"/>
<subEle value="1b"/>
<subEle value="1c"/>
</element>
<element name="E2A1">
<subEle value="2a"/>
<subEle value="2b"/>
<subEle value="3b"/>
</element>
<element name="E3A1">
<subEle value="3a"/>
<subEle value="3b"/>
<subEle value="3c"/>
</element>
</root>
I want to get the "value" attributes per "element". I will use pseudo-code to describe
my problem and focus on the important parts, so I will not show how I initialize
the Msxml2.DOMDocument variable etc.
First, I get all "element" nodes that have a name attribute:
oNodeList = oDom.selectNodes("//element[@name]")
The result of the selectNodes call is a node list, whose items I access one by one
in a for loop. In this loop, I execute another selectNodes call that (or so I thought) gives me
the "subEle"s for each "element":
for i from 1 to oNodeList.length
oNodeMain = oNodeList.nextNode()
oNodeResList = oNodeMain.selectNodes("//subEle")
msgInfo("n items", oNodeResList.length)
endFor
And here is the problem: the selectNodes call in the loop seems to have ALL "subEle"s
in scope; the message box pops up three times, telling me each time that the length of the node list is 9.
I would have expected it to pop up three times, each time reporting a length of 3 (because
every "element" has exactly 3 "subEle"s), since I'm calling selectNodes on "oNodeMain",
which gets the next node in each iteration. Maybe I just need to modify the XPath expression in the loop and
not use the "//", because it works then, but I have no idea why.
The program I use for this is Paradox 11, and I use MSXML via OLE.
Is this behaviour "normal"? Where is my misunderstanding? Any suggestions on how to achieve what I'm
trying to do are welcome.

Don't use an absolute path starting with /. Use a relative path instead: oNodeMain.selectNodes("subEle") selects all subEle child elements of oNodeMain, and oNodeMain.selectNodes(".//subEle") selects all descendant subEle elements of oNodeMain.
A path starting with // always searches from the root node (also called the document node), no matter which node you call selectNodes on.
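To see the difference between relative and document-wide searches outside of MSXML, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. This is an assumption-laden stand-in, not MSXML itself: ElementTree only supports a subset of XPath and does not accept an absolute path such as //subEle on an element, so the document-wide search is written as .//subEle from the root element instead.

```python
import xml.etree.ElementTree as ET

# Same structure as the XML file in the question.
XML = """<root>
  <element name="E1A1"><subEle value="1a"/><subEle value="1b"/><subEle value="1c"/></element>
  <element name="E2A1"><subEle value="2a"/><subEle value="2b"/><subEle value="3b"/></element>
  <element name="E3A1"><subEle value="3a"/><subEle value="3b"/><subEle value="3c"/></element>
</root>"""

root = ET.fromstring(XML)
elements = root.findall(".//element[@name]")

# "subEle" is relative: only the children of each <element> are found.
per_element = [len(el.findall("subEle")) for el in elements]

# ".//subEle" is relative too: descendants of the context node only.
per_element_desc = [len(el.findall(".//subEle")) for el in elements]

# Searching from the root element sees every subEle in the document.
whole_document = len(root.findall(".//subEle"))

print(per_element, per_element_desc, whole_document)
```

The per-element counts correspond to what the loop in the question was expected to report, and the whole-document count corresponds to what // actually returned.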

Related

BizTalk Mapping Fields to a Sequence

I am getting started with BizTalk and Visual Studio. My input schema looks something like this:
<root>
<order>
<orderid>
<orderdate>
...
...
and the output schema
<order>
<header:sequence>
<element name="orderid">
<element name="orderdate">
...
...
</header:sequence>
In short, in output, the header is a sequence of complex types and individual nodes in the source are enumerated as the sequence in the output.
How do we solve this in Visual Studio?
You need a Looping functoid with an input link from each source element being mapped and an output link to the repeating destination element. Then add two links from each source element: the first is a standard link (Copy text value) to the destination element, and the second goes to the name attribute, for which you change the link type to Copy name.
Input
<root>
<order>
<orderid>1234567890</orderid>
<orderdate>2020-01-28</orderdate>
</order>
</root>
Output
<order>
<header>
<element name="orderid">1234567890</element>
<element name="orderdate">2020-01-28</element>
</header>
</order>
Note: You can change the order of the output by reordering the inputs in the Configure Looping Functoid dialog.
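For readers who want to check what the map is supposed to produce, here is a rough sketch of the same transformation in plain Python. This is not BizTalk code and the function name to_header_sequence is made up for illustration; it only mirrors the idea that each source field becomes one repeating element carrying the field's name and text.

```python
import xml.etree.ElementTree as ET

SOURCE = """<root>
  <order>
    <orderid>1234567890</orderid>
    <orderdate>2020-01-28</orderdate>
  </order>
</root>"""

def to_header_sequence(source_xml: str) -> ET.Element:
    """Hypothetical helper: turn each field of <order> into <element name="...">."""
    src_order = ET.fromstring(source_xml).find("order")
    out_order = ET.Element("order")
    header = ET.SubElement(out_order, "header")
    # One output <element> per source field; the loop plays the role of the Looping functoid.
    for field in src_order:
        item = ET.SubElement(header, "element", name=field.tag)  # like "Copy name"
        item.text = field.text                                   # like "Copy text value"
    return out_order

result = to_header_sequence(SOURCE)
print(ET.tostring(result, encoding="unicode"))
```

The fields come out in source order here; in the map itself the order is controlled by the functoid's inputs, as noted above.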

Finding the position index of a comment()

Faced with this:
<div>
some text
<!-- this is the hook comment-->
target part 1
target part 2
<!-- this is another comment-->
some other text
</div>
I'm trying to get to the desired output of:
target part 1
target part 2
The number of comments and text elements is unknown, but the target text always comes after the comment containing hook. So the idea is to find the position() of the relevant comment(), and get the next element.
There are some previous questions about finding the position of an element containing a certain text or by attribute, but comment() is an odd duck and I can't modify the answers there to this situation. For example, trying a variation on the answers:
//comment()[contains(string(),'hook')]/preceding::*
or using preceding-sibling::*, returns nothing.
So I decided to try something else. A count(//node()) of the xml returns 6. And //node()[2] returns the relevant comment(). But when I try to get the position of that comment by using index-of() (which should return 2)
index-of(//node(),//comment()[contains(string(),'hook')])
it returns 3!
Of course, I can disregard that and use the 3 index position as the position for the target text (instead of incrementing 2 by 1), but I was wondering, first, why is the outcome what it is and, second, does it have any unintended consequences.
There is no need to first find the position() of the nodes if you want to get the nodes between two comments (FYI: position() depends on the whole node-set you selected).
You can select the nodes directly; here they are text() nodes. So a sample file like
<?xml version="1.0" encoding="UTF-8"?>
<root>
<div>
some text
<!-- this is the hook comment-->
target part 1
target part 2
<!-- this is another comment-->
some other text
<!-- this is another comment-->
no one needs this
<!-- this is another comment-->
this is also useless
<!-- this is another hook comment-->
second target text
<!-- this is another comment-->
again some useless crap
<!-- this is another comment-->
and the last piece that noone needs
</div>
</root>
can be queried with the following expression
//comment()[contains(string(),'hook')]/following-sibling::text()[preceding-sibling::comment()[1][contains(string(),'hook')]]
to result in
target part 1
target part 2
second target text
If you only want the first block, restrict the expression to the first item:
(//comment()[contains(string(),'hook')]/following-sibling::text()[preceding-sibling::comment()[1][contains(string(),'hook')]])[1]
Its result is
target part 1
target part 2
as desired.
If you can use XPath 2.0, you can append /position() to the expressions above to get positions. But, as mentioned above, these are relative to the selected sequence of nodes, not to the document as a whole. So the result for the first expression would be 1 2.
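If no XPath engine with comment() support is at hand, the same selection can be approximated by walking siblings in a DOM. The following is a minimal sketch using Python's xml.dom.minidom on the div from the question; the function name text_after_hook is invented for illustration, and it assumes the comments and the target text are direct children of the document element, with only text between consecutive comments.

```python
from xml.dom import Node, minidom

XML = """<div>
some text
<!-- this is the hook comment-->
target part 1
target part 2
<!-- this is another comment-->
some other text
</div>"""

def text_after_hook(xml_text: str, marker: str = "hook") -> list[str]:
    """Hypothetical helper: text nodes that directly follow a comment containing marker."""
    doc = minidom.parseString(xml_text)
    found = []
    for node in doc.documentElement.childNodes:
        if node.nodeType == Node.COMMENT_NODE and marker in node.data:
            nxt = node.nextSibling
            # Keep the sibling only if it is text, i.e. the block right after the hook.
            if nxt is not None and nxt.nodeType == Node.TEXT_NODE:
                found.append(nxt.data.strip())
    return found

targets = text_after_hook(XML)
print(targets)
```

Note that both target lines live in a single text node, which is why one selected node is enough to get "target part 1" and "target part 2" together.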

using xpath need to get to a child and get another node one level up

I am trying to traverse an XML document with XPath. I want to visit /group/isRequired[text()='Optional'] and travel one level up to grab the /bool node.
I tried a few things like the ones below but can't seem to get it right. I'd appreciate any input.
I basically want to verify the library node, the group + isRequired node, and the bool node in one statement.
//root/sample[library[text()='2']]/group/isRequired[text()='Optional']//bool[text()='true']
//root/sample[library[text()='2']]/group/isRequired[text()='Optional']../bool[text()='true']
//root/sample[library[text()='2']]/group/isRequired[text()='Optional']/bool[text()='true']
//root/sample[library[text()='2']]/group/isRequired[text()='Optional']/../bool[text()='true']
<root>
<sample>
<id>1</id>
<library>2</library>
<ruleName>Default</ruleName>
<group>
<groupID>1</groupID>
<groupName>orange</groupName>
<isRequired>Optional</isRequired>
</group>
<variant>1</variant>
<bool>true</bool>
</sample>
</root>
You need to move two steps up:
/root/sample[library[text()='2']]/group/isRequired[text()='Optional']/../../bool[text()='true']
But it is much cleaner to put multiple conditions in one predicate:
/root/sample[library[text()='2'] and group/isRequired[text()='Optional'] and bool[text()='true']]
Simpler:
/root/sample[library = "2" and group/isRequired = "Optional" and bool = "true"]
You don't have to use text() to get the value of every node in the XPath. Depending on whether your XML has a schema, you may not need to put the literal values in quotes. Without one, everything is a string value, so I put them in quotes just to be safe.
You can go a different route: filter the sample node by its group/isRequired child, then continue from that sample node to the bool node:
//root/sample[library='2' and group/isRequired='Optional']/bool[.='true']
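The "filter the sample, then look at bool" idea can be tried quickly in Python. This is only a sketch with the standard library's xml.etree.ElementTree, which supports a limited XPath subset (no "and", no nested paths inside a predicate), so one condition stays in the path and the other two are checked in Python rather than in the expression itself.

```python
import xml.etree.ElementTree as ET

XML = """<root>
  <sample>
    <id>1</id>
    <library>2</library>
    <ruleName>Default</ruleName>
    <group>
      <groupID>1</groupID>
      <groupName>orange</groupName>
      <isRequired>Optional</isRequired>
    </group>
    <variant>1</variant>
    <bool>true</bool>
  </sample>
</root>"""

root = ET.fromstring(XML)

matches = [
    sample
    for sample in root.findall("sample[library='2']")     # predicate on a direct child
    if sample.findtext("group/isRequired") == "Optional"   # nested condition, checked in Python
    and sample.findtext("bool") == "true"                  # sibling of group, no need to go "up"
]
print(len(matches))
```

Because the conditions are all evaluated from the sample node, there is never a need to step down into group and back up again.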

Xpath for an element , all ancestors of which have the same name up to a point

I have an XML that looks like the following:
xml tree
I need those tag elements that have only son elements as their ancestors. The only non-son ancestor allowed is the root element parent. After parent, no ancestor of tag can be anything other than son. This XPath would therefore return <tag id="t1" /> and <tag id="t2" />.
//son//tag would be one solution. Another would be //tag[ancestor::son]. You could use /descendant:: in place of //; there are differences in the order in which results are reported. There are other variants; which one is best depends on the exact context in which you're doing this.
I should have posted this earlier, or maybe it does not matter. Here is the nasty-looking XPath I wrote to solve this:
/parent/(descendant::tag except(descendant::element() except descendant::son)/descendant::tag)
Hope someone would suggest a better looking alternative.
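Since the original tree was only posted as an image, here is a sketch on a made-up document (the element other and the ids t3 and t4 are assumptions, not from the question). It implements one reading of the requirement, namely that every ancestor between a tag and the root parent must be a son, by walking up the ancestors in Python instead of expressing it in XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; only t1 and t2 have nothing but <son> between them and <parent>.
XML = """<parent>
  <son><tag id="t1"/><son><tag id="t2"/></son></son>
  <other><son><tag id="t3"/></son></other>
  <son><other><tag id="t4"/></other></son>
</parent>"""

root = ET.fromstring(XML)

# ElementTree has no parent pointers, so build a child -> parent lookup first.
parent_of = {child: parent for parent in root.iter() for child in parent}

def only_son_ancestors(node: ET.Element) -> bool:
    """True if every ancestor below the root element is named 'son'."""
    current = parent_of[node]
    while current is not root:
        if current.tag != "son":
            return False
        current = parent_of[current]
    return True

ids = [t.get("id") for t in root.iter("tag") if only_son_ancestors(t)]
print(ids)
```

On this sample, t3 and t4 each have a son ancestor but also a non-son one, so a check that only asks for some son ancestor would not be enough to exclude them.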

Wrapping an XML element with its ancestor nodes/tags

I can't navigate the XML doc programmatically and I need a one-line XPath solution for reasons I describe at the end.
I am working with an XML schema that looks something like the one below. (This is something I have to use as-is.)
<Root>
<!-- Child 1 -->
<Child>
<Name>Joe</Name>
<Age>12</Age>
</Child>
<!-- Child 2 -->
<Child>
<Name>Mike</Name>
<Age>25</Age>
</Child>
<!-- Child 3 -->
<Child>
<Name>Jane</Name>
<Age>20</Age>
</Child>
</Root>
Assuming I'm already at the "Joe" node (i.e. the Name element inside Child 1), I need to define an XPath query that will "wrap" that node as follows:
<Root>
<!-- Child 1 -->
<Child>
<Name>Joe</Name>
<Age>12</Age>
</Child>
</Root>
I've tried various combinations of ancestor, string-join, concat, etc., but can't seem to find the solution that "wraps" the element correctly. (The way I was using ancestor was returning all Child nodes, for example, which is not what I need.)
Some other considerations:
The solution has to be a one-line XPath query, if that's possible (for reasons given below).
It has to be generic enough to work for any Child element (i.e., it can't assume that I'm always at the first or second or third child, for example).
From the example above, you can see that I don't actually need the actual Root node per-se, just its tag (i.e. I don't want all Child nodes under it). However, I do need the actual Child node (so that I get the Name and Age).
NOTE: For what it's worth, I can't actually navigate the XML programmatically. I am using a library (whose code I cannot change) in which I have to define everything in terms of one-line XPath queries within a configuration file. It will essentially navigate through all of the Name elements, so my solution has to work from that point.
XPath is a query language.
This, among other things means that the evaluation of an XPath expression never modifies the XML document.
So, the answer is: Modifying an XML document or creating a new document cannot be done using only XPath.
Such transformations are very easy and natural to specify with XSLT.
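To make the distinction concrete: XPath can locate the Child that holds a given Name, but building the wrapped document has to happen in some host language. The sketch below uses Python's xml.etree.ElementTree rather than XSLT, purely as an illustration; the function name wrap_child is made up, and it does not help with the one-line-XPath constraint of the library in the question.

```python
import copy
import xml.etree.ElementTree as ET

XML = """<Root>
  <Child><Name>Joe</Name><Age>12</Age></Child>
  <Child><Name>Mike</Name><Age>25</Age></Child>
  <Child><Name>Jane</Name><Age>20</Age></Child>
</Root>"""

def wrap_child(xml_text: str, name: str) -> ET.Element:
    """Hypothetical helper: new <Root> containing only the Child with the given Name."""
    source = ET.fromstring(xml_text)
    child = source.find(f"Child[Name='{name}']")  # the XPath part: selection only
    if child is None:
        raise ValueError(f"no Child with Name {name!r}")
    wrapped = ET.Element(source.tag)              # the construction part: a new, empty Root
    wrapped.append(copy.deepcopy(child))          # copy, so the source tree is not modified
    return wrapped

wrapped = wrap_child(XML, "Joe")
print(ET.tostring(wrapped, encoding="unicode"))
```

The selection step is a query and leaves the source untouched; only the explicit construction step creates new nodes.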

Resources