How to get parent node of a particular node using XPATH?

I have the below xml file and I need to use an xpath to get "amount" value based on "linenumber".
<Doc>
<Documents>
<Document1>
<linenumber>800</linenumber>
<amount>100.00</amount>
<name>fee1</name>
</Document1>
<Document2>
<linenumber>801</linenumber>
<amount>200.00</amount>
<name>fee2</name>
</Document2>
<Document3>
<linenumber>802</linenumber>
<amount>300.00</amount>
<name>fee3</name>
</Document3>
<Document4>
<linenumber>803</linenumber>
<amount>400.00</amount>
<name>fee4</name>
</Document4>
</Documents>
</Doc>
I tried the XPath specified in the function 'GetDocumentField' below, but the call GetDocumentField(801, "amount") did not return any value. I can't use LINQ since the project targets .NET Framework 2.0. Can anyone suggest how to write this XPath query? Thanks!
Private Function GetDocumentField(ByVal line As Integer, ByVal field As String) As String
Return GetValueFromXPath(String.Format("//linenumber[.='{0}']/parent::node()/{1}", line, field))
End Function
Private Function GetValueFromXPath(ByVal xpath As String) As String
Dim node As XmlNode
node = InputXml.SelectSingleNode(xpath)
Return GetValueFromNode(node)
End Function
Private Function GetValueFromNode(ByVal node As XmlNode) As String
If node Is Nothing Then
Return String.Empty
End If
If node.InnerText Is Nothing Then
Return String.Empty
End If
Return node.InnerText
End Function

Use this XPath expression:
/*/*/*[linenumber = '801']/amount
But before this, correct the presented non-well-formed XML document (you probably didn't notice that there was a parsing exception when you loaded InputXml). More specifically, all amount elements have no closing tag.
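As a rough cross-check of the expression above, here is a minimal Python sketch using only the standard library's xml.etree.ElementTree, which supports a limited XPath subset. The helper name get_document_field and the shortened sample document are assumptions made for illustration; this is not the asker's VB.NET code.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample document (two of the four entries).
XML = """<Doc>
  <Documents>
    <Document1><linenumber>800</linenumber><amount>100.00</amount><name>fee1</name></Document1>
    <Document2><linenumber>801</linenumber><amount>200.00</amount><name>fee2</name></Document2>
  </Documents>
</Doc>"""


def get_document_field(root, line, field):
    """Return the text of `field` in the element whose <linenumber> equals `line`.

    Hypothetical helper: it selects any element that has a matching
    <linenumber> child, then steps down to the requested sibling field.
    """
    node = root.find(f".//*[linenumber='{line}']/{field}")
    # Mirror the original GetValueFromNode: empty string when nothing matches.
    return node.text if node is not None and node.text else ""


root = ET.fromstring(XML)
amount = get_document_field(root, 801, "amount")
missing = get_document_field(root, 999, "amount")
print(amount)
```

The predicate sits on the parent element, so there is no need to select linenumber first and then climb back up with the parent axis.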

Related

Xpath sibling filter based on value of element in current node

Is there an Xpath to find a cousin node that has an element that matches the value of an element in the current node?
Please see below - I am iterating over each "Order" node and want to return the value of LocationID from the Collection node that has the same OrderLoadRef value as the order. For the first order it should return "AAA", for the second it should return "BBB".
The XPath works if I change the value of the OrderLoadRef manually, but how do I set it to the value of the OrderLoadRef in the current Order element? I've tried using the self axis, but I think that by the time we get to the condition, "self" is the Collection node, not the Order.
I can't hard code relative collection / order node positions as there could be a variable number of these nodes in the XML that my parser receives.
XDocument xDoc = XDocument.Parse(@"<DocRoot>
<Load>
<Collections>
<Collection>
<OrderLoadRef>1</OrderLoadRef>
<LocationID>AAA</LocationID>
</Collection>
<Collection>
<OrderLoadRef>2</OrderLoadRef>
<LocationID>BBB</LocationID>
</Collection>
</Collections>
<Orders>
<Order>
<OrderRef>1521505</OrderRef>
<OrderLoadRef>1</OrderLoadRef>
</Order>
<Order>
<OrderRef>1521505_2</OrderRef>
<OrderLoadRef>2</OrderLoadRef>
</Order>
</Orders>
</Load>
</DocRoot>");
List<XElement> orders = xDoc.XPathSelectElements("//Order").ToList();
foreach(XElement order in orders)
{
string locationId = order.XPathSelectElement("parent::Orders/parent::Load/Collections/Collection[OrderLoadRef = {OrderLoadRef from current order element}]/LocationID").Value;
}
Edited to add: I need this to be a purely XPath solution as I'm not able to alter the C# code in the parser. More than happy to be told it's not possible, but wanted to make sure before I relayed the message!
As Mads said, XPath 3 and later (i.e. the current version 3.1) allows you to use a let expression so e.g.
for $order in /DocRoot/Load/Orders/Order
return
let $col := /DocRoot/Load/Collections/Collection[OrderLoadRef = $order/OrderLoadRef]/LocationID
return $col
is pure XPath 3 and returns (for your sample) the two LocationID elements:
<LocationID>AAA</LocationID>
<LocationID>BBB</LocationID>
On .NET, XmlPrime and Saxon.NET support XPath 3.1 and XQuery 3.1. As far as I know, only XmlPrime has C# extension methods that work against XDocument; Saxon.NET does allow XPath 3.1 against its own XDM tree model or against System.Xml.XmlDocument.
XPath 3.0 (and later) supports let expressions, which would allow you to do what you want. You could bind a variable to the OrderLoadRef of the context node and use it within a predicate that selects the desired Collection by its OrderLoadRef.
For a static XPath 1.0 expression, I don't think you can achieve what you want. You would need to construct the XPath using the context node information.
Inside your foreach loop, create a variable for the Order's OrderLoadRef value. Use that value to construct the XPath that you then evaluate to select the locationId:
foreach(XElement order in orders)
{
string orderLoadRef = order.XPathSelectElement("OrderLoadRef").Value;
string locationId = order.XPathSelectElement("ancestor::Load/Collections/Collection[OrderLoadRef = " + orderLoadRef + "]/LocationID").Value;
//do something with the locationId
}
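The same "read the value, then build the path" idea can be sketched in Python with the standard library's xml.etree.ElementTree. This is only an illustration of the technique under that library's limited XPath subset, not a replacement for the C# code; the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

XML = """<DocRoot><Load>
<Collections>
<Collection><OrderLoadRef>1</OrderLoadRef><LocationID>AAA</LocationID></Collection>
<Collection><OrderLoadRef>2</OrderLoadRef><LocationID>BBB</LocationID></Collection>
</Collections>
<Orders>
<Order><OrderRef>1521505</OrderRef><OrderLoadRef>1</OrderLoadRef></Order>
<Order><OrderRef>1521505_2</OrderRef><OrderLoadRef>2</OrderLoadRef></Order>
</Orders>
</Load></DocRoot>"""

root = ET.fromstring(XML)
location_ids = []
for order in root.findall("./Load/Orders/Order"):
    # Step 1: read the key from the current Order.
    load_ref = order.findtext("OrderLoadRef")
    # Step 2: build a path whose predicate carries that key as a literal.
    path = f"./Load/Collections/Collection[OrderLoadRef='{load_ref}']/LocationID"
    # Step 3: evaluate the constructed path against the document root.
    location_ids.append(root.findtext(path))

print(location_ids)
```

Because the key is spliced into the path as a string literal, this approach assumes the value contains no quote characters.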

How to take xpath to Get Text from class inside th

I have the following XPath :
//table[@class='ui-jqgrid-htable']/thead/tr/th//text()
And I'm trying to get the text from it with the following command :
String LabelName = driver.findElement(By.xpath("//table[@class='ui-jqgrid htable']/thead/tr/th//text()")).getText();
But it's not printing text, the result is blank. Could you help me please ?
The text() step in your XPath selects text nodes, not elements, and findElement expects the expression to select an element. Your element path ends at //table[@class='ui-jqgrid-htable']/thead/tr/th. Locate that element and call getText() on it.
Also, a table would have many headers. Using findElement will only return the first one.
If you want to get all headers use
driver.findElements(By.xpath("//table[@class='ui-jqgrid-htable']/thead/tr/th"))
and loop through the list to getText of individual element.
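The "select the elements, then read their text" pattern can be illustrated without a browser. The sketch below uses Python's standard xml.etree.ElementTree on a small, well-formed table that is invented for the example; it is not Selenium, and the header labels are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the grid header markup.
HTML = """<div>
<table class="ui-jqgrid-htable">
<thead><tr><th><div>Id</div></th><th><div>Name</div></th></tr></thead>
</table>
</div>"""

root = ET.fromstring(HTML)

# Stop the path at the <th> elements instead of ending it with text().
headers = root.findall(".//table[@class='ui-jqgrid-htable']/thead/tr/th")

# Gather the text of each header, including text in nested elements.
labels = ["".join(th.itertext()).strip() for th in headers]
print(labels)
```

Collecting all matches and looping over them corresponds to using findElements rather than findElement in the Selenium code.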

I can't extract the node text with a Xpath

I have a XML file (test.xml) like this one:
<?xml version="1.0" encoding="ISO-8859-1"?>
<s2xResponse>
<s2xData>
<Name>This is the name</Name>
<InfocomData>
<DateOfUpdate day="07" month="02" year="2018">20180207</DateOfUpdate>
<CompanyName>MY COMPANY</CompanyName>
<TaxCode FlagCheck="0">XXXYYYWWWZZZ</TaxCode>
</InfocomData>
<AssessmentSummary>
<Rating Code="2">Rating Description for Code 2</Rating>
</AssessmentSummary>
<AssessmentData>
<SectorialDistribution>
<CompaniesNumber>11650</CompaniesNumber>
<ScoreDistribution />
<CervedScoreDistribution>
<DistributionData>
<Rating Code="1">SICUREZZA</Rating>
<Percentage>1.91</Percentage>
</DistributionData>
<DistributionData>
<Rating Code="2">SOLVIBILITA' ELEVATA</Rating>
<Percentage>35.56</Percentage>
</DistributionData>
</CervedScoreDistribution>
</SectorialDistribution>
</AssessmentData>
</s2xData>
</s2xResponse>
I'm trying to get the "Name" node text ("This is the name") with a U-SQL script using the XmlExtractor. The following is the code I'm using:
USE TestXML; // It contains the registered assembly
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];
@xml = EXTRACT xml_text string
FROM "textxpath/test.xml"
USING Extractors.Text(rowDelimiter: "^", quoting: false);
@xml_cleaned =
SELECT
xml_text.Replace("\r\n", "").Replace("\t", " ") AS xml_text
FROM @xml;
@values =
SELECT Microsoft.Analytics.Samples.Formats.Xml.XPath.Evaluate(xml_text, "s2xResponse/s2xData/Name")[1] AS value
FROM @xml_cleaned;
OUTPUT @values TO @"outputs/test_xpath.txt" USING Outputters.Text(quoting: false);
But I'm getting this runtime error:
Execution failed with error '1_SV1_Extract Error :
'{"diagnosticCode":195887116,"severity":"Error","component":"RUNTIME","source":"User","errorId":"E_RUNTIME_USER_EXPRESSIONEVALUATION","message":"Error
while evaluating expression
Microsoft.Analytics.Samples.Formats.Xml.XPath.Evaluate(xml_text.Replace(\"\r\n\",
\"\").Replace(\"\t\", \" \"),
\"s2xResponse/s2xData/Name\")[1]","description":"Inner exception from
user expression: Index was out of range. Must be non-negative and less
than the size of the collection.
I get the same error even if I use a zero index for the Evaluate result ([0]).
What's wrong with my query?
The problem here is that you are applying the subscript [1] to the result of XPath.Evaluate, which I believe will be returning the Name nodes. However, you are applying the [1] subscript in code, not in XPath, so the subscript is likely to be zero based, and not 1-based as it is in XPath, hence the Index out of range error.
Here's one solution: apply the subscript in the XPath itself (where it is 1-based) and select the text() there:
.Evaluate("s2xResponse/s2xData/Name[1]/text()")
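The difference between the two kinds of subscript can be seen in a small Python sketch with the standard library's xml.etree.ElementTree. It only illustrates the indexing rules on a trimmed copy of the sample document; it says nothing about what the U-SQL XPath.Evaluate helper actually returns.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<s2xResponse><s2xData><Name>This is the name</Name></s2xData></s2xResponse>"
)

# Subscript applied in code: the result is an ordinary list, indexed from 0.
names = root.findall("s2xData/Name")
first_by_code = names[0].text

# Subscript applied inside the path: XPath positions start at 1.
first_by_xpath = root.find("s2xData/Name[1]").text

# With a single match, index 1 in code is already past the end of the list.
try:
    names[1]
    out_of_range = False
except IndexError:
    out_of_range = True

print(first_by_code, first_by_xpath, out_of_range)
```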
Is there a particular reason you want to use the Evaluate method? I got this to work using the XmlDomExtractor, which would allow you to extract multiple values from the XML, e.g.
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];
DECLARE @inputFile string = "/input/input100.xml";
@input =
EXTRACT Name string
FROM @inputFile
USING new Microsoft.Analytics.Samples.Formats.Xml.XmlDomExtractor(rowPath : "/s2xResponse",
columnPaths : new SQL.MAP<string, string>{
{ "s2xData/Name", "Name" },
}
);
@output =
SELECT *
FROM @input;

xpath: check if current elements position is second in order

Background:
I have an XML document with the following structure:
<body>
<section>content</section>
<section>content</section>
<section>content</section>
<section>content</section>
</body>
Using xpath I want to check if a <section> element is the second element and if so apply some function.
Question:
How do I check if a <section> element is the second element in the body element?
../section[position()=2]
If you want to know if the second element in the body is named section then you can do this:
local-name(/body/child::element()[2]) eq "section"
That will return either true or false.
However, you then asked how you can check this and, if it is true, apply some function. In XPath you cannot author your own functions; you can only do that in XQuery or XSLT. So let me assume for a moment that you wish to call an existing XPath function on the value of the second element if it is a section. Here is an example of applying the lower-case function:
if(local-name(/body/child::element()[2]) eq "section")then
lower-case(/body/child::element()[2])
else()
However, this can be simplified, as lower-case and many other functions take a value with a minimum cardinality of zero. This means that you can just apply the function to a path expression, and if the path does not match anything then the function typically returns an empty sequence, in the same way as a path that does not match will. So, this is semantically equivalent to the above:
lower-case(/body/child::element()[2][local-name(.) eq "section"])
If you are in XQuery or XSLT and are writing your own functions, I would encourage you to write functions that will accept a minimum cardinality of zero, just like lower-case does. By doing this you can chain functions together, and if there is no input data (i.e. from a path expression that does not match anything), there is no output data. This leads to a very nice functional programming style.
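The "accept zero items and pass emptiness through" style described above can be sketched in Python, with None standing in for the empty sequence. Both function names are hypothetical, and the check is done with plain child indexing rather than XPath 2.0, so treat this as an analogy rather than an equivalent expression.

```python
import xml.etree.ElementTree as ET

body = ET.fromstring(
    "<body><section>FIRST</section><section>SECOND</section>"
    "<section>THIRD</section></body>"
)


def second_child_if_named(parent, name):
    """Return the second child element if it has the given tag, else None."""
    children = list(parent)
    if len(children) >= 2 and children[1].tag == name:
        return children[1]
    return None


def lower_case(elem):
    """Lower-case an element's text; no input simply yields no output."""
    if elem is None or elem.text is None:
        return None
    return elem.text.lower()


# The two calls chain without an explicit if/else around them.
result = lower_case(second_child_if_named(body, "section"))
no_match = lower_case(second_child_if_named(body, "div"))
print(result, no_match)
```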
Question: How do I check if a <section> element is the second element in the body element?
Using C#, you can use the XPathNodeIterator class to traverse the selected nodes, and use its CurrentPosition property to inspect the position of the current node:
XPathNodeIterator.CurrentPosition
Example:
const string xmlStr = @"<body>
<section>1</section>
<section>2</section>
<section>3</section>
<section>4</section>
</body>";
using (var stream = new StringReader(xmlStr))
{
var document = new XPathDocument(stream);
XPathNavigator navigator = document.CreateNavigator();
XPathNodeIterator nodes = navigator.Select("/body/section");
// Walk every selected <section>; CurrentPosition is 1-based.
while (nodes.MoveNext())
{
if (nodes.CurrentPosition == 2)
{
//DO SOMETHING WITH THE VALUE AT THIS POSITION
var currentValue = nodes.Current.Value;
}
}
}

Nokogiri: find tag, get attributes and replace tag

I'm working with Nokogiri and I'm a newbie. I'm parsing an HTML document to match some placeholders, and after each match I must replace the widget placeholder with some generated HTML.
I create this method:
doc = Nokogiri::HTML.fragment(raw)
matches = doc.xpath(".//widget")
if matches.present?
matches.each do |match|
media_replace(..)
end
else
self.body = raw
end
I have some matches, and every match has these attributes:
matches.first.attributes
{"data_id"=>#(Attr:0x3fdd42e2cebc { name = "data_id", value = "5" }),
"data_type"=>#(Attr:0x3fdd42e2ce94 { name = "data_type", value = "gallery" })}
How can I extract these attribute values (gallery and 5) to pass them to my media_replace method?
The media_replace method returns HTML to me: how can I replace every 'match' with the returned HTML?
To get attribute values from a node you can use the [] method. For example:
media_replace(match['data_id'], match['data_type'])
To replace the node, use the replace or swap methods (assuming media_replace returns a string or other compatible object):
new_html = media_replace(...)
match.replace(new_html)
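For comparison, the same two steps (read the attributes, then swap the node for generated markup) can be sketched in Python with the standard library's xml.etree.ElementTree. This is an analogy, not Nokogiri: the media_replace function here is a hypothetical stand-in that returns an element rather than a string, and the sample markup is invented.

```python
import xml.etree.ElementTree as ET


def media_replace(data_id, data_type):
    """Hypothetical stand-in: build the markup that replaces a widget."""
    element = ET.Element("div", {"class": data_type})
    element.text = f"media {data_id}"
    return element


root = ET.fromstring(
    '<div><p>intro</p><widget data_id="5" data_type="gallery"/><p>outro</p></div>'
)

# ElementTree nodes do not know their parent, so walk parents and children.
for parent in list(root.iter()):
    for index, child in enumerate(list(parent)):
        if child.tag == "widget":
            # Attribute values are read by name, much like match['data_id'].
            new_node = media_replace(child.get("data_id"), child.get("data_type"))
            new_node.tail = child.tail  # keep any text that followed the widget
            parent[index] = new_node    # swap the placeholder in place

print(ET.tostring(root, encoding="unicode"))
```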
