XPath: find datetime text nodes

I have some XML with elements that contain values with mixed data types. For example:
<someroot>
<event>
<dt>21.10.15 08:00</dt>
</event>
<event>
<dt>10:00</dt>
</event>
<event>
<dt>21/10/15 08:00</dt>
</event>
</someroot>
How can I find all the text nodes that contain only datetime values? Note that dt elements can be at different depths and may just contain times.

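XPath 1.0 doesn't provide a way to match text nodes against a mask or regex.
It does have useful string functions such as starts-with(), contains() and translate(); ends-with() and the regex function matches() are only available from XPath 2.0 onward.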
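When the XPath engine has no regex support, one common workaround is to select the candidate elements with a simple path and do the pattern check in the host language. The following is only a sketch of that idea in Python's standard library; the regular expression, the extra "not a date" element and the <nested> wrapper are assumptions added for illustration, not part of the original data.

```python
import re
import xml.etree.ElementTree as ET

XML = """<someroot>
  <event><dt>21.10.15 08:00</dt></event>
  <event><dt>10:00</dt></event>
  <event><nested><dt>21/10/15 08:00</dt></nested></event>
  <event><dt>not a date</dt></event>
</someroot>"""

# Assumed mask: an optional dd.mm.yy or dd/mm/yy date, then a required hh:mm time.
DATETIME_RE = re.compile(r"(?:\d{2}[./]\d{2}[./]\d{2} )?\d{2}:\d{2}")

def datetime_texts(root):
    """Return the text of every <dt> whose whole content matches the mask."""
    hits = []
    for dt in root.iter("dt"):  # iter() visits <dt> elements at any depth
        text = (dt.text or "").strip()
        if DATETIME_RE.fullmatch(text):
            hits.append(text)
    return hits

found = datetime_texts(ET.fromstring(XML))
print(found)
```

The mask only checks the shape of the value (digits and separators), not whether it is a valid calendar date.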

Related

XPath: how to select two specific children of the same parent

Good night, friends!
Let's suppose we have an XML document with 30 items like the ones below, and I want to get just the name and URL of an item whose name contains the word Richard.
<channel>
<item>
<name>Brian</name>
<lastname>Connor</lastname>
<age>40</age>
<enclosure url="http://www.brian.com"/>
</item>
<item>
<name>Richard</name>
<lastname>Wendell</lastname>
<age>38</age>
<enclosure url="http://www.richard.com"/>
</item>
</channel>
How can I do that using XPath?
I tried:
"//channel/item[name[contains(text(),'Richard')]]" but it returns just the name and I don't know how to select the url information together.
Please excuse my English!
Your approach does not work because you are selecting a subtree (in this case an item) of the XML tree, which contains more information than you want. If you want just a subset of the values in ONE XPath expression, you have to select them separately and then concatenate them appropriately, e.g.
concat('name=', //channel/item[contains(name, 'Richard')]/name, ' url=', //channel/item[contains(name, 'Richard')]/enclosure/@url)
The example will allow you to alter the additional formatting easily.
By the way: your XML input was malformed. I corrected this.
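If the expression is evaluated from a host language rather than as a single XPath string, the same selection can be done in two steps: pick the matching item, then read both values from it. A minimal sketch in Python follows; the helper name name_and_url is hypothetical.

```python
import xml.etree.ElementTree as ET

XML = """<channel>
  <item>
    <name>Brian</name>
    <lastname>Connor</lastname>
    <age>40</age>
    <enclosure url="http://www.brian.com"/>
  </item>
  <item>
    <name>Richard</name>
    <lastname>Wendell</lastname>
    <age>38</age>
    <enclosure url="http://www.richard.com"/>
  </item>
</channel>"""

def name_and_url(channel, needle):
    """Return (name, url) pairs for every item whose name contains needle."""
    pairs = []
    for item in channel.findall("item"):
        name = item.findtext("name", default="")
        if needle in name:
            enclosure = item.find("enclosure")
            url = enclosure.get("url") if enclosure is not None else None
            pairs.append((name, url))
    return pairs

pairs = name_and_url(ET.fromstring(XML), "Richard")
print(pairs)
```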

XMLPath Query for nested XML fragment

I'm trying to write an XPath query to pull data from an XML document. Unfortunately the document has an XML fragment embedded in it that has been entity-escaped (< has become &lt;, > has become &gt;, etc.).
An example of the xml doc is:
<OrderData xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Id>1</Id>
<RawData>&lt;?xml version="1.0" encoding="UTF-16"?&gt;
&lt;Data xmlns="nnn-mmm-com"&gt;
&lt;Order Action="Remove"&gt;
&lt;Instrument InstID="1"/&gt;&lt;/Order&gt;
&lt;/Data&gt;
</RawData>
</OrderData>
I'm trying to extract the following values:
Id
Action
InstID
Getting the Id is no problem, but drilling into the fragment inside RawData is proving beyond me. Any pointers gratefully received
(I'm planning to execute the xpath query in Hive using Hive-XML-SerDe which is xpath 1.0)
Thanks
With XPath 3.1 you can parse the embedded XML document and turn it into a node tree, which you can then process using path expressions. So:
/OrderData/RawData/parse-xml(.)/*:Data/*:Order/*:Instrument/@InstID
should get what you want.
You didn't say what version of XPath your library supports, which usually means that it only supports 1.0, so you may need to find a different library.
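Where only XPath 1.0 is available, the usual fallback is to parse in two passes outside XPath: read RawData as a string, then parse that string as a second document. The sketch below shows the idea in Python rather than Hive. It assumes that RawData holds entity-escaped markup as the question describes and that Instrument is an empty element nested inside Order. The XML declaration is stripped first, because its encoding no longer applies once the text is an in-memory string.

```python
import re
import xml.etree.ElementTree as ET

OUTER = """<OrderData xmlns:xsd="http://www.w3.org/2001/XMLSchema"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Id>1</Id>
  <RawData>&lt;?xml version="1.0" encoding="UTF-16"?&gt;
&lt;Data xmlns="nnn-mmm-com"&gt;
  &lt;Order Action="Remove"&gt;
    &lt;Instrument InstID="1"/&gt;
  &lt;/Order&gt;
&lt;/Data&gt;
  </RawData>
</OrderData>"""

NS = {"d": "nnn-mmm-com"}  # prefix "d" is arbitrary; the URI comes from the fragment

def extract(outer_xml):
    """Return (Id, Action, InstID) from the outer document and its embedded fragment."""
    outer = ET.fromstring(outer_xml)
    raw = outer.findtext("RawData")  # the parser has already unescaped &lt; and &gt;
    # Drop the embedded XML declaration before the second parse.
    inner_text = re.sub(r"^\s*<\?xml.*?\?>", "", raw, count=1, flags=re.S)
    inner = ET.fromstring(inner_text.strip())  # root element is {nnn-mmm-com}Data
    order = inner.find("d:Order", NS)
    instrument = order.find("d:Instrument", NS)
    return outer.findtext("Id"), order.get("Action"), instrument.get("InstID")

result = extract(OUTER)
print(result)
```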

XSLT Sort parent node based on specific attribute of a child

First post ever. I have done lots of searching but cannot find an answer that is specific enough or, more importantly, relevant enough. Note that I am a business analyst, not a developer, so I may be missing some understanding here.
We produce XML that we then process to produce a report. Where data can be represented by a table, the XML contains details for the table title (ELEMENT_HEADING), table header row (PROMPTS), then repeating nodes representing the rows (DATA) and columns (VALUES).
The problem I am facing is I need to sort the DATA node based on a text value of the node where the node has a specific attribute value.
In the sample XML provided below, I need to sort the DATA nodes in ascending order by the text of the VALUE element that has the attribute @pic='TRORGPCNT', i.e. the DATA node with a TRORGPCNT of 10 should appear before the DATA node with 90. Then, when the report is produced, the table rows are in ascending percentage order.
I hope I have explained myself clearly enough :)
Any tips on how I might accomplish this?
Sample XML:
<PROPOSAL_ELEMENT multi="Y" pec="TEACHRESP" elem_mandatory="N" elem_visible="Y">
<ELEMENT_HEADING pec="TEACHRESP">Teaching Responsibility</ELEMENT_HEADING>
<PROMPTS>
<PROMPT pic="TRORGUN" item_mandatory="Y" item_visible="Y">Faculty or School with teaching responsibility</PROMPT>
<PROMPT pic="TRORGPCNT" item_mandatory="Y" item_visible="Y">Teaching responsibility %</PROMPT>
</PROMPTS>
<DATA elem_mandatory="N" elem_visible="Y" delete_ind="N">
<VALUES>
<VALUE pic="TRORGUN" item_mandatory="Y" item_visible="Y" item_description="FACULTY OF NURSING AND HEALTH" display_in_summary_tab="Y" summary_order="">FACULTY OF NURSING AND HEALTH</VALUE>
<VALUE pic="TRORGPCNT" item_mandatory="Y" item_visible="Y" item_description="" display_in_summary_tab="Y" summary_order="">90</VALUE>
</VALUES>
</DATA>
<DATA elem_mandatory="N" elem_visible="Y" delete_ind="N">
<VALUES>
<VALUE pic="TRORGUN" item_mandatory="Y" item_visible="Y" item_description="FACULTY OF ARTS" display_in_summary_tab="Y" summary_order="">FACULTY OF ARTS</VALUE>
<VALUE pic="TRORGPCNT" item_mandatory="Y" item_visible="Y" item_description="" display_in_summary_tab="Y" summary_order="">10</VALUE>
</VALUES>
</DATA>
Sorting in XSLT is accomplished using the xsl:sort instruction, which must appear as the first child of the for-each or apply-templates that selects the nodes you want to sort. If you're selecting the set of DATA element nodes then an appropriate sorting instruction would be
<xsl:sort select="VALUES/VALUE[@pic='TRORGPCNT']"
data-type="number" />
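For readers who want to check the expected ordering outside an XSLT processor, the same numeric sort key can be sketched in Python. This is not a replacement for xsl:sort, just an illustration of what the select expression and data-type="number" do; the sample is trimmed to the attributes that matter and the helper name is hypothetical.

```python
import xml.etree.ElementTree as ET

XML = """<PROPOSAL_ELEMENT pec="TEACHRESP">
  <ELEMENT_HEADING pec="TEACHRESP">Teaching Responsibility</ELEMENT_HEADING>
  <DATA>
    <VALUES>
      <VALUE pic="TRORGUN">FACULTY OF NURSING AND HEALTH</VALUE>
      <VALUE pic="TRORGPCNT">90</VALUE>
    </VALUES>
  </DATA>
  <DATA>
    <VALUES>
      <VALUE pic="TRORGUN">FACULTY OF ARTS</VALUE>
      <VALUE pic="TRORGPCNT">10</VALUE>
    </VALUES>
  </DATA>
</PROPOSAL_ELEMENT>"""

def sort_data_by_percent(parent):
    """Reorder the DATA children of parent by their TRORGPCNT value, ascending."""
    def percent(data):
        # Same key as the xsl:sort select, converted to a number.
        return float(data.findtext("VALUES/VALUE[@pic='TRORGPCNT']"))
    rows = sorted(parent.findall("DATA"), key=percent)
    for row in rows:
        parent.remove(row)
    parent.extend(rows)  # re-append in sorted order after the other children

root = ET.fromstring(XML)
sort_data_by_percent(root)
units = [d.findtext("VALUES/VALUE[@pic='TRORGUN']") for d in root.findall("DATA")]
print(units)
```

Converting to a number matters: sorted as text, "100" would come before "90".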

How to find all nodes of a specific type in XPath

Let's say I have the following form data instance in my view.xml:
<xhtml:html xmlns:xhtml="http://www.w3.org/1999/xhtml"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xxforms="http://orbeon.org/oxf/xml/xforms"
xmlns:exforms="http://www.exforms.org/exf/1-0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<xhtml:head>
<xforms:instance id="instanceData">
<form xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<fruits>
<fruit>
<fruit-name>Mango</fruit-name>
</fruit>
<fruit>
<fruit-name>Apple</fruit-name>
</fruit>
<fruit>
<fruit-name>Banana</fruit-name>
</fruit>
</fruits>
</form>
</xforms:instance>
</xhtml:head>
I want to select all the fruit names from the above instance.
I tried the following ways but it always selects the first fruit.
instance('instanceData')/fruits/fruit[*]/fruit-name
instance('instanceData')/fruits/fruit/fruit-name
instance('instanceData')/fruits/fruit[position()>0]/fruit-name
Please suggest a way to overcome this in XPath.
Try this:
"//fruit-name"
It finds all fruit-name elements wherever they are in the document hierarchy.
If you want to select all the <fruit-name> from the instance instanceData (<xforms:instance id="instanceData">) that looks like the one you have in your question, the following should do it:
instance('instanceData')/fruits/fruit/fruit-name
If this doesn't work, one common reason is that you have a default namespace declaration in the document that contains your instance, like: xmlns="http://www.w3.org/1999/xhtml". If you have this, you need to undo that default namespace declaration on where you declare the instance, with:
<xforms:instance xmlns="" id="instanceData">
(And if this is the issue, my advice is not to use default namespace declarations. Ever. Instead declare xmlns:xhtml="http://www.w3.org/1999/xhtml" and use the xhtml prefix everywhere.)
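The effect of an inherited default namespace can be reproduced outside XForms. The sketch below uses Python's ElementTree, not Orbeon: the same unprefixed path finds nothing when the instance elements inherit the XHTML namespace, and works once the namespace is removed or the path is qualified. The prefix "x" is an arbitrary choice for the example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

WITH_DEFAULT_NS = """<form xmlns="http://www.w3.org/1999/xhtml">
  <fruits><fruit><fruit-name>Mango</fruit-name></fruit></fruits>
</form>"""

# The same instance with the default namespace declaration removed.
NO_NS = WITH_DEFAULT_NS.replace(' xmlns="%s"' % XHTML_NS, "")

PATH = "fruits/fruit/fruit-name"

# Unprefixed steps only match elements in no namespace.
inherited = ET.fromstring(WITH_DEFAULT_NS).findall(PATH)
plain = ET.fromstring(NO_NS).findall(PATH)
# Qualifying every step with a prefix bound to the XHTML namespace also works.
qualified = ET.fromstring(WITH_DEFAULT_NS).findall(
    "x:fruits/x:fruit/x:fruit-name", {"x": XHTML_NS}
)

print(len(inherited), len(plain), len(qualified))
```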
First:
It may just be a typo, but note that your XML has a wrong closing tag:
<service>
Second:
Your XPath is perfectly valid, but when you evaluate it you need to iterate over the result, because it is a sequence of nodes and not a single value.
E.g. in JDOM, compare methods of the kind Element.selectObject vs. selectSingleNodes vs. selectAsArray.
In your XForms you need to iterate over the result set to get the list of fruits.
If you want only the fruit names, you could try
instance('instanceData')/fruits/fruit/fruit-name/text()
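The difference between a single-node API and a node-sequence API is easy to see in any host language. As a sketch in Python's ElementTree (not XForms), find() returns only the first match for a path, while findall() returns every match and has to be iterated:

```python
import xml.etree.ElementTree as ET

XML = """<form>
  <fruits>
    <fruit><fruit-name>Mango</fruit-name></fruit>
    <fruit><fruit-name>Apple</fruit-name></fruit>
    <fruit><fruit-name>Banana</fruit-name></fruit>
  </fruits>
</form>"""

PATH = "fruits/fruit/fruit-name"
root = ET.fromstring(XML)

# Single-node call: only the first matching element comes back.
first_only = root.find(PATH).text

# Sequence call: every matching element comes back, so iterate over the result.
all_names = [node.text for node in root.findall(PATH)]

print(first_only, all_names)
```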

XPath concat multiple nodes

I'm not very familiar with XPath, but I have been working with XPath expressions and storing them in a database. Actually it's just the BAM tool for BizTalk.
Anyway, I have an xml which could look like:
<File>
<Element1>element1</Element1>
<Element2>element2</Element2>
<Element3>
<SubElement>sub1</SubElement>
<SubElement>sub2</SubElement>
<SubElement>sub3</SubElement>
</Element3>
</File>
I was wondering if there is a way to use an XPath expression to get all the SubElements concatenated. At the moment, I am using:
/*[local-name()='File']/*[local-name()='Element3']/*[local-name()='SubElement']
This works if it only has one index. But apparently my xml sometimes has more nodes, so it gives NULL. I could just use
/*[local-name()='File']/*[local-name()='Element3']/*[local-name()='SubElement'][0]
but I need all the nodes. Is there a way to do this?
Thanks a lot!
Edit: I changed the XML, I was wrong, it's different, it should look like this:
<item>
<element1>el1</element1>
<element2>el2</element2>
<element3>el3</element3>
<element4>
<subEl1>subel1a</subEl1>
<subEl2>subel2a</subEl2>
</element4>
<element4>
<subEl1>subel1b</subEl1>
<subEl2>subel2b</subEl2>
</element4>
</item>
And I need to have a one line code to get a result like: "subel2a subel2b";
I need the one line because I set this xpath expression as an xml attribute (not my choice, it's specified). I tried string-join but it's not really working.
string-join(/file/Element3/SubElement, ',')
/File/Element3/SubElement will match all of the SubElement elements in your sample XML. What are you using to evaluate it?
If your evaluation method is subject to the "first node rule", then it will only match the first one. If you are using a method that returns a nodeset, then it will return all of them.
You can get all SubElements by using:
//SubElement
But this won't keep them grouped together how you want. You will want to do a query for all elements that contain a SubElement (basically do a search for the parent of any SubElements).
//SubElement/parent::*
Once you have that, you could (depending on your programming language) loop through the parents and concatenate the SubElements.
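As a rough illustration of that loop, here is a Python sketch that produces the "subel2a subel2b" string from the edited XML in the question. XPath 1.0 has no string-join() (it was added in XPath 2.0), so the concatenation happens in the host language; the helper name join_sub_elements is hypothetical.

```python
import xml.etree.ElementTree as ET

XML = """<item>
  <element1>el1</element1>
  <element2>el2</element2>
  <element3>el3</element3>
  <element4>
    <subEl1>subel1a</subEl1>
    <subEl2>subel2a</subEl2>
  </element4>
  <element4>
    <subEl1>subel1b</subEl1>
    <subEl2>subel2b</subEl2>
  </element4>
</item>"""

def join_sub_elements(root, path, sep=" "):
    """Concatenate the text of every node matched by path, in document order."""
    return sep.join((node.text or "") for node in root.findall(path))

joined = join_sub_elements(ET.fromstring(XML), "element4/subEl2")
print(joined)
```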
